C# multithreaded application - structure?

Problem description:

I'm building an application that checks whether links are accessible (live). My question is how to keep the threads "always busy". What I mean: the app runs 100 threads (created with a for loop, for example) with 100 different URLs. When one of the threads finishes its job (checking whether a URL is available), it should get a new URL and start again immediately, so the 100 threads work non-stop until all URLs are checked.

How can I do this?

What you are looking for is called the producer-consumer model. You have a pool of resources that contains the list of URLs to check; one thread fills that pool, and your consumer threads pull from it. If you have .NET 4, Parallel.ForEach does most of the work for you.
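To make the model concrete, here is a minimal sketch of a hand-rolled version with a fixed number of worker threads, which is closest to what the question describes. The class name `UrlWorkerPool`, the method `CheckAll`, and the `checkUrl` callback are hypothetical names made up for illustration; the callback stands in for whatever link check you actually perform and is assumed not to throw.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class UrlWorkerPool
{
    // Starts `workerCount` threads that keep pulling URLs from a shared queue
    // until the queue is marked complete and has been drained.
    // `checkUrl` is a hypothetical callback standing in for the real link check.
    public static ConcurrentDictionary<string, bool> CheckAll(
        IEnumerable<string> urls, int workerCount, Func<string, bool> checkUrl)
    {
        var results = new ConcurrentDictionary<string, bool>();
        using (var queue = new BlockingCollection<string>())
        {
            var workers = new List<Thread>();
            for (int i = 0; i < workerCount; i++)
            {
                var worker = new Thread(() =>
                {
                    // Blocks while the queue is empty; ends once adding is
                    // complete and no items are left.
                    foreach (string url in queue.GetConsumingEnumerable())
                    {
                        results[url] = checkUrl(url);
                    }
                });
                worker.Start();
                workers.Add(worker);
            }

            // Producer side: fill the pool, then signal that no more work is coming.
            foreach (string url in urls)
            {
                queue.Add(url);
            }
            queue.CompleteAdding();

            // Wait for every consumer to finish its last item.
            foreach (Thread worker in workers)
            {
                worker.Join();
            }
        }
        return results;
    }
}
```

Each worker picks up the next URL as soon as it finishes the previous one, so no thread sits idle while work remains. Calling `UrlWorkerPool.CheckAll(urls, 100, MyCheck)` would mirror the 100-thread setup from the question.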

Using 100 threads is also very likely not the optimum number of threads; just let the Task Parallel Library manage the thread count for you.

Here is an example for the case where the list is pre-populated and no more items are added while the threads are running.

using System.Collections.Generic;
using System.Threading.Tasks;

//Parallel.ForEach will block until it is done, so you may want to run this function on a background worker.
public void StartThreads()
{
    List<string> myListOfUrls = GetUrls();

    Parallel.ForEach(myListOfUrls, ProcessUrl);
}


private void ProcessUrl(string url)
{
    //Do your work here, this code will be run from multiple threads.
}
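As an illustration of what the body of such a worker method might look like, the sketch below checks each URL with an HTTP HEAD request and collects the outcome in a thread-safe dictionary. It assumes `HttpClient` is available (.NET 4.5 or later), and the names `LinkChecker`, `IsLive`, and `CheckAll` are hypothetical. Treating any 2xx response as "live" is also an assumption; some servers reject HEAD, so a GET fallback may be needed in practice.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;

public static class LinkChecker
{
    // A single HttpClient is shared; it supports concurrent requests.
    private static readonly HttpClient Client =
        new HttpClient { Timeout = TimeSpan.FromSeconds(10) };

    // Returns true when the URL answers a HEAD request with a 2xx status.
    public static bool IsLive(string url)
    {
        // Reject anything that is not an absolute http/https URL without touching the network.
        if (!Uri.TryCreate(url, UriKind.Absolute, out var uri) ||
            (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps))
        {
            return false;
        }

        try
        {
            using (var request = new HttpRequestMessage(HttpMethod.Head, uri))
            using (var response = Client.SendAsync(request).GetAwaiter().GetResult())
            {
                return response.IsSuccessStatusCode;
            }
        }
        catch (HttpRequestException)
        {
            return false; // DNS failure, refused connection, and similar errors
        }
        catch (OperationCanceledException)
        {
            return false; // the request timed out
        }
    }

    // Checks all URLs in parallel and records one result per URL.
    public static IDictionary<string, bool> CheckAll(IEnumerable<string> urls)
    {
        var results = new ConcurrentDictionary<string, bool>();
        Parallel.ForEach(urls, url => { results[url] = IsLive(url); });
        return results;
    }
}
```

The blocking wait on `SendAsync` keeps the example in the same synchronous style as the code above. Because Parallel.ForEach also uses the calling thread, this is best called from a background thread rather than a UI thread.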

If you need to populate the collection as it runs, replace List<string> with a concurrent collection such as BlockingCollection<string>.

BlockingCollection<string> myListOfUrls = new BlockingCollection<string>();

//Parallel.ForEach will block until it is done, so you may want to run this function on a background worker.
public void StartThreads()
{
    if(myListOfUrls.IsCompleted)
    {
        //The collection has emptied itself and you told it you were done using it; you will either need to throw an exception or make a new collection.
        //Use IsAddingCompleted to check whether you told it that you are done with it while there may still be members left to process.
        throw new InvalidOperationException();
    }

    //We create a Partitioner to remove the buffering behavior of Parallel.ForEach; this gives better performance with a BlockingCollection.
    var partitioner = Partitioner.Create(myListOfUrls.GetConsumingEnumerable(), EnumerablePartitionerOptions.NoBuffering);
    Parallel.ForEach(partitioner, ProcessUrl);
}

public void StopThreads()
{
    //Signals that no more items will be added; the consumers finish once the collection is drained.
    myListOfUrls.CompleteAdding();
}

public void AddUrl(string url)
{
    myListOfUrls.Add(url);
}

private void ProcessUrl(string url)
{
    //Do your work here; this code will be run from multiple threads.
}


I also want to add that automatic thread scheduling may not be the best choice either; it may impose limits that could be raised. See this comment from the original question:

For those who said/upvoted that 100 threads is a terrible idea: On my dual core 2GB RAM XP machine Parallel.ForEach never created more than 5 threads (unless I set ThreadPool.SetMinThreads) and creating 100 threads always resulted in ~30-40% faster operation. So don't leave everything to Parallel.ForEach. PS: My test code: WebClient wc = new WebClient(); var s = wc.DownloadString(url); (Google's home page) – L.B
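If you want to experiment with the limits mentioned in that comment, the following sketch shows one way to do it. It raises the thread pool's minimum worker count with ThreadPool.SetMinThreads and caps the loop with ParallelOptions.MaxDegreeOfParallelism. The names `ThrottledRunner` and `Run` and the `processUrl` callback are hypothetical. Note that MaxDegreeOfParallelism is only an upper bound, so the runtime may still use fewer threads, and that SetMinThreads changes a process-wide setting.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledRunner
{
    // Runs `processUrl` for every URL, allowing up to `threads` concurrent workers.
    public static void Run(IEnumerable<string> urls, Action<string> processUrl, int threads)
    {
        // Let the pool create up to `threads` workers on demand instead of
        // adding them slowly. This setting affects the whole process.
        ThreadPool.GetMinThreads(out int minWorker, out int minIo);
        ThreadPool.SetMinThreads(Math.Max(minWorker, threads), minIo);

        // Upper bound on concurrency for this loop only; not a guarantee.
        var options = new ParallelOptions { MaxDegreeOfParallelism = threads };
        Parallel.ForEach(urls, options, processUrl);
    }
}
```

Whether this actually beats the default scheduling depends on the workload. For I/O-bound work such as downloading pages, more threads than cores can help, as the comment reports, but it is worth measuring on your own machine.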