Using the Task Parallel Library for I/O-bound processing

Problem description:

I was wondering if you could clarify something.

I am writing a tool whose only job is to retrieve data from a database (SQL Server) and create txt files. I am talking about 500,000 txt files.

Everything works fine.

However, I was wondering whether using the Task Parallel Library could reduce the time it takes to create these files.

I know (I have read) that the TPL is not meant to be used for I/O-bound processing and that it will most likely perform the same as sequential code.

Is this true?

Also, in an initial attempt using a simple "parallel foreach", I was getting an error: cannot access the file because it is in use.

Any suggestions?

You do not parallelize I/O-bound processes.

The reason is simple: the CPU is not the bottleneck. No matter how many threads you start, you only have ONE disk to write to, and that is the slowest part.

So all you need to do is iterate over the files and write them one at a time. You can start a separate worker thread to do this work, or use async I/O, to keep the UI responsive.
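A minimal sketch of that approach, not a definitive implementation: one sequential loop that writes each file with asynchronous file I/O, started from a worker task so the caller is not blocked. The names TextFileExporter and WriteAllAsync, the (file name, content) record shape, and the output directory are assumptions made up for illustration; in the real tool the records would come from the SQL Server query.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class TextFileExporter
{
    // Writes one .txt file per record, strictly one after another.
    // "records" is a hypothetical stand-in for rows read from the database:
    // Key = file name without extension, Value = file content.
    public static async Task<int> WriteAllAsync(
        IEnumerable<KeyValuePair<string, string>> records, string outputDir)
    {
        Directory.CreateDirectory(outputDir);
        int written = 0;

        foreach (var record in records)
        {
            string path = Path.Combine(outputDir, record.Key + ".txt");

            // useAsync: true asks the OS for asynchronous (overlapped) I/O,
            // so no thread is blocked while the disk is busy.
            using (var stream = new FileStream(path, FileMode.Create,
                       FileAccess.Write, FileShare.None, 4096, useAsync: true))
            using (var writer = new StreamWriter(stream, Encoding.UTF8))
            {
                await writer.WriteAsync(record.Value).ConfigureAwait(false);
                await writer.FlushAsync().ConfigureAwait(false);
            }

            written++;
        }

        return written;
    }
}

public static class Program
{
    public static void Main()
    {
        // Sample data only; replace with the rows from the real query.
        var records = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("customer-1", "first file"),
            new KeyValuePair<string, string>("customer-2", "second file"),
        };
        string outputDir = Path.Combine(Path.GetTempPath(), "export-demo");

        // Task.Run moves the whole loop onto a worker thread. A console app
        // can block on the result; a UI app would await the task instead.
        int count = Task.Run(() => TextFileExporter.WriteAllAsync(records, outputDir))
                        .GetAwaiter().GetResult();

        Console.WriteLine(count + " files written to " + outputDir);
    }
}
```

Because every file gets its own path and only one file is open at a time, two writers never compete for the same file.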