Detailed difference between Java 8 ForkJoinPool and Executors.newWorkStealingPool?
What is the low-level difference between using:
ForkJoinPool pool = new ForkJoinPool(X);
and
ExecutorService ex = Executors.newWorkStealingPool(X);
where X is the desired level of parallelism, i.e. the number of threads running.
According to the docs, the two look similar. Please also tell me which one is more appropriate and safer for normal use. I have 130 million entries to write into a BufferedWriter and then sort by the first column using Unix sort.
If possible, also let me know how many threads to use.
Note: My system has 8 processor cores and 32 GB RAM.
Work stealing is a technique used by modern thread pools to reduce contention on the work queue.
A classical thread pool has a single queue: each pool thread locks the queue, dequeues a task, and then unlocks the queue. If the tasks are short and there are many of them, there is a lot of contention on the queue. Using a lock-free queue really helps here, but doesn't solve the problem entirely.
Modern thread pools use work stealing: each thread has its own queue. When a pool thread produces a task, it enqueues it on its own queue. When a pool thread wants to dequeue a task, it first tries its own queue, and if that is empty it "steals" work from the queues of other threads. This greatly reduces contention inside the thread pool and improves performance.
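The per-thread queue idea can be illustrated with a toy scheduler. This is only a minimal sketch and not how ForkJoinPool is implemented: the class name ToyWorkStealing and the method runAll are made up for illustration, the tasks never spawn further tasks, and all work is deliberately placed on worker 0 so that the other workers have something to steal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical toy scheduler, for illustration only.
class ToyWorkStealing {

    /** Runs every task exactly once on `workers` threads; returns how many tasks were stolen. */
    static int runAll(List<Runnable> tasks, int workers) {
        // One deque per worker instead of a single shared queue.
        List<ConcurrentLinkedDeque<Runnable>> queues = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            queues.add(new ConcurrentLinkedDeque<>());
        }
        // Assumption for the demo: all tasks start on worker 0's queue.
        for (Runnable t : tasks) {
            queues.get(0).addLast(t);
        }

        AtomicInteger stolen = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            final int self = i;
            Thread th = new Thread(() -> {
                while (true) {
                    // 1. Prefer the tail of our own queue.
                    Runnable task = queues.get(self).pollLast();
                    // 2. Otherwise try to steal from the head of another queue.
                    for (int k = 1; task == null && k < workers; k++) {
                        task = queues.get((self + k) % workers).pollFirst();
                        if (task != null) {
                            stolen.incrementAndGet();
                        }
                    }
                    if (task == null) {
                        return; // every queue was empty; tasks here never create new tasks
                    }
                    task.run();
                }
            });
            threads.add(th);
            th.start();
        }
        for (Thread th : threads) {
            try {
                th.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return stolen.get();
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger();
        List<Runnable> tasks = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            tasks.add(counter::incrementAndGet);
        }
        int stolen = runAll(tasks, 4);
        System.out.println("ran=" + counter.get() + " stolen=" + stolen);
    }
}
```

The number of stolen tasks varies from run to run, since it depends on how quickly each thread starts.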
newWorkStealingPool creates a work-stealing thread pool whose thread count defaults to the number of available processors.
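One way to probe how the two constructions from the question relate is to inspect the object that newWorkStealingPool hands back. The sketch below assumes an OpenJDK-based runtime, where, as far as I know, newWorkStealingPool returns a ForkJoinPool configured in async (FIFO) mode; the API only promises an ExecutorService, so other implementations could behave differently. The class name PoolInspection and the helper describe are made up for this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;

// Hypothetical helper, for illustration only.
class PoolInspection {

    /** Describes the concrete pool behind an ExecutorService if it is a ForkJoinPool. */
    static String describe(ExecutorService ex) {
        if (ex instanceof ForkJoinPool) {
            ForkJoinPool fjp = (ForkJoinPool) ex;
            return "ForkJoinPool parallelism=" + fjp.getParallelism()
                    + " asyncMode=" + fjp.getAsyncMode();
        }
        // Assumption: not reached on OpenJDK, kept for other runtimes.
        return ex.getClass().getName();
    }

    public static void main(String[] args) {
        ForkJoinPool direct = new ForkJoinPool(4);
        ExecutorService stealing = Executors.newWorkStealingPool(4);
        ExecutorService defaults = Executors.newWorkStealingPool();
        try {
            System.out.println(describe(direct));
            System.out.println(describe(stealing));
            System.out.println(describe(defaults));
            System.out.println("cores=" + Runtime.getRuntime().availableProcessors());
        } finally {
            direct.shutdown();
            stealing.shutdown();
            defaults.shutdown();
        }
    }
}
```

If the output matches that assumption, the practical difference between the two calls is mostly the queue ordering mode and the declared type you program against, rather than a different pool implementation.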
newWorkStealingPool presents a new problem. If I have four logical cores, then the pool will have four threads in total. If my tasks block, for example on synchronous IO, my CPUs are underutilized. What I want is four active threads at any given moment: for example, four threads doing AES encryption and another 140 threads waiting for IO to finish.
This is what ForkJoinPool provides: if your task spawns new tasks and then waits for them to finish, the pool will inject new active threads in order to saturate the CPU. It is worth mentioning that ForkJoinPool uses work stealing too.
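The "spawn subtasks, then wait for them" pattern described above might look like the following RecursiveTask. It is a minimal sketch: the class name RangeSum, the threshold of 10,000, and the parallelism of 4 are arbitrary choices for illustration, not recommendations.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example task: sums the integers in the half-open range [from, to).
class RangeSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 10_000; // arbitrary cut-off for this sketch
    private final long from;
    private final long to;

    RangeSum(long from, long to) {
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: sum sequentially.
            long sum = 0;
            for (long i = from; i < to; i++) {
                sum += i;
            }
            return sum;
        }
        long mid = from + (to - from) / 2;
        RangeSum left = new RangeSum(from, mid);
        RangeSum right = new RangeSum(mid, to);
        left.fork();                      // make the left half available to other workers
        long rightSum = right.compute();  // compute the right half in this thread
        return left.join() + rightSum;    // wait for the left half to finish
    }

    /** Creates a pool with the given parallelism, runs the task, and shuts the pool down. */
    static long sum(long from, long to, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new RangeSum(from, to));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(0, 1_000_000, 4)); // prints 499999500000
    }
}
```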
Which one to use? If you work with the fork-join model or you know your tasks block indefinitely, use ForkJoinPool. If your tasks are short and mostly CPU-bound, use newWorkStealingPool.
All that said, modern applications tend to use a thread pool sized to the number of available processors, combined with asynchronous IO and lock-free containers to avoid blocking. This (usually) gives the best performance.