Does Parallel.ForEach Limit the Number of Active Threads

Does Parallel.ForEach limit the number of active threads?

No, it won't start 1000 threads; yes, it will limit how many threads are used. Parallel Extensions uses an appropriate number of cores, based on how many you physically have and how many are already busy. It allocates work for each core and then uses a technique called work stealing, which lets each thread process its own queue efficiently and resort to expensive cross-thread access only when it really needs to.

Have a look at the PFX Team Blog for loads of information about how it allocates work and all kinds of other topics.

Note that in some cases you can specify the degree of parallelism you want, too.

Count number of threads used by Parallel.ForEach

You could use a (thread-safe) list to store the IDs of the used threads and count them:

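// Requires: using System.Collections.Concurrent; using System.Linq;
//           using System.Threading; using System.Threading.Tasks;
ConcurrentBag<int> threadIDs = new ConcurrentBag<int>();
Parallel.ForEach(myList, item => {
    // Record the ID of whichever thread runs this iteration.
    threadIDs.Add(Thread.CurrentThread.ManagedThreadId);
    doStuff(item);
});

// The bag holds one entry per item, so count only the distinct IDs.
int usedThreads = threadIDs.Distinct().Count();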

This does have a performance impact (especially the thread-safety logic of ConcurrentBag), but I can't tell how big it is. The relative effect depends on how much work doStuff itself does. If that method consists of only a few statements, this thread-counting solution may even change the number of threads used.
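As a self-contained variation on the snippet above, the following sketch stores each thread ID at most once in a ConcurrentDictionary, so the collection stays small no matter how many items there are. The class name ThreadCountDemo, the method CountDistinctThreads, and the Thread.SpinWait call standing in for doStuff are all hypothetical and only serve the illustration. The measurement caveat still applies.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadCountDemo
{
    // Runs a trivial loop body over itemCount items and returns how many
    // distinct managed threads executed at least one iteration.
    public static int CountDistinctThreads(int itemCount)
    {
        // Keyed by thread ID, so each thread is stored at most once.
        var seenThreadIds = new ConcurrentDictionary<int, bool>();

        Parallel.ForEach(Enumerable.Range(0, itemCount), item =>
        {
            seenThreadIds.TryAdd(Environment.CurrentManagedThreadId, true);
            Thread.SpinWait(10_000); // hypothetical stand-in for doStuff(item)
        });

        return seenThreadIds.Count;
    }

    public static void Main()
    {
        int usedThreads = CountDistinctThreads(1000);
        Console.WriteLine($"1000 items were processed by {usedThreads} thread(s).");
    }
}
```

The printed number varies from run to run and from machine to machine, which is why no expected output is shown.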

How can I limit Parallel.ForEach?

You can specify a MaxDegreeOfParallelism in a ParallelOptions parameter:

Parallel.ForEach(
    listOfWebpages,
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    webpage => { Download(webpage); }
);

MSDN: Parallel.ForEach

MSDN: ParallelOptions.MaxDegreeOfParallelism
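To check that the limit is actually respected, one option is to track how many loop bodies run at the same moment. The sketch below does this with Interlocked counters. The names ConcurrencyLimitDemo and MeasurePeak are hypothetical, and Thread.Sleep merely simulates a slow operation such as Download.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrencyLimitDemo
{
    // Returns the highest number of loop bodies observed running at once.
    public static int MeasurePeak(int maxDegree, int itemCount)
    {
        int current = 0; // bodies running right now
        int peak = 0;    // highest value of 'current' seen so far

        Parallel.ForEach(
            Enumerable.Range(0, itemCount),
            new ParallelOptions { MaxDegreeOfParallelism = maxDegree },
            item =>
            {
                int now = Interlocked.Increment(ref current);

                // Raise 'peak' to 'now' unless another thread already stored a higher value.
                int seen;
                while (now > (seen = Volatile.Read(ref peak)))
                {
                    Interlocked.CompareExchange(ref peak, now, seen);
                }

                Thread.Sleep(20); // simulated slow work
                Interlocked.Decrement(ref current);
            });

        return peak;
    }

    public static void Main()
    {
        int peak = MeasurePeak(maxDegree: 4, itemCount: 40);
        Console.WriteLine($"Peak concurrency with a limit of 4: {peak}");
    }
}
```

The peak should never exceed the configured limit, but it may be lower, since the setting is an upper bound rather than a target.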

Parallel.ForEach() - How many threads?

From documentation:

The MaxDegreeOfParallelism limits the number of concurrent operations
run by Parallel method calls that are passed this ParallelOptions
instance to the set value, if it is positive. If
MaxDegreeOfParallelism is -1, then there is no limit placed on the
number of concurrently running operations.

The property setter throws only ArgumentOutOfRangeException (when the value is 0 or less than -1), and the type of MaxDegreeOfParallelism is int. Nothing stops you from setting it higher than the number of cores, so the loop can use more threads than there are cores.

I can confirm this: I ran a lot of XML generation processes, and Task Manager showed exactly the number of processes I had set.
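The range check on the setter can be tried out directly. In this small sketch, MaxDegreeValidationDemo and IsAccepted are hypothetical names, and the factor of 4 is an arbitrary way to get a value above the core count.

```csharp
using System;
using System.Threading.Tasks;

public static class MaxDegreeValidationDemo
{
    // Returns true if ParallelOptions accepts the value, false if the
    // setter rejects it with ArgumentOutOfRangeException.
    public static bool IsAccepted(int degree)
    {
        try
        {
            var options = new ParallelOptions { MaxDegreeOfParallelism = degree };
            return options.MaxDegreeOfParallelism == degree;
        }
        catch (ArgumentOutOfRangeException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // A value well above the number of logical cores on this machine.
        int aboveCoreCount = Environment.ProcessorCount * 4;

        foreach (int degree in new[] { -2, -1, 0, 1, aboveCoreCount })
        {
            string verdict = IsAccepted(degree) ? "accepted" : "rejected";
            Console.WriteLine($"{degree}: {verdict}");
        }
    }
}
```

Here -2 and 0 should be rejected, while -1, 1, and the value above the core count should be accepted.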

Why isn't Parallel.ForEach running multiple threads?

It's by design that Parallel.ForEach may use fewer threads than requested to achieve better performance. According to MSDN:

By default, the Parallel.ForEach and Parallel.For methods can use a variable number of tasks. That's why, for example, the ParallelOptions class has a MaxDegreeOfParallelism property instead of a "MinDegreeOfParallelism" property. The idea is that the system can use fewer threads than requested to process a loop.

The .NET thread pool adapts dynamically to changing workloads by allowing the number of worker threads for parallel tasks to change over time. At run time, the system observes whether increasing the number of threads improves or degrades overall throughput and adjusts the number of worker threads accordingly.

Number of threads being used during Parallel.ForEach

From the outside, this behavior is non-deterministic.

The Parallel class uses the thread pool, which decides when to create new threads and when to reuse existing ones. This improves performance and reduces the amount of memory required.

Therefore, some threads may (temporarily) still exist after the Parallel.ForEach completes, and they can be reused for other thread pool work. As far as I know, however, you cannot predict whether, or how many, threads will continue to exist.
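One way to watch this from the outside is to sample the pool size before and after a loop. The sketch below assumes .NET Core 3.0 or later, where the ThreadPool.ThreadCount property is available. PoolThreadsDemo and RunAndSample are hypothetical names, and Thread.Sleep only simulates work.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class PoolThreadsDemo
{
    // Samples the number of thread pool threads before and right after a loop.
    public static (int Before, int After) RunAndSample()
    {
        int before = ThreadPool.ThreadCount;

        Parallel.ForEach(Enumerable.Range(0, 200), item => Thread.Sleep(10));

        // Threads created for the loop are not torn down immediately.
        int after = ThreadPool.ThreadCount;
        return (before, after);
    }

    public static void Main()
    {
        var (before, after) = RunAndSample();
        Console.WriteLine($"Pool threads before the loop: {before}");
        Console.WriteLine($"Pool threads right after the loop: {after}");
    }
}
```

Both numbers depend on the machine and on what else the process is doing, so treat them as an observation rather than a guarantee.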

Should I always use Parallel.ForEach because more threads MUST speed up everything?

No, it doesn't make sense for every foreach. Some reasons:

  • Your code may not actually be parallelizable (for example, if you're using the "results so far" for the next iteration and the order is important)
  • If you're aggregating (e.g. summing values) then there are ways of using Parallel.ForEach for this, but you shouldn't just do it blindly
  • If your work will complete very fast anyway, there's no benefit, and it may well slow things down

Basically nothing in threading should be done blindly. Think about where it actually makes sense to parallelize. Oh, and measure the impact to make sure the benefit is worth the added complexity. (It will be harder for things like debugging.) TPL is great, but it's no free lunch.
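For the aggregation case mentioned in the list, one commonly used approach is the Parallel.ForEach overload with thread-local state: each task keeps its own subtotal and merges it into the shared total only once, when it finishes. This is a minimal sketch rather than a recommendation for every workload. ParallelSumDemo and Sum are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelSumDemo
{
    public static long Sum(int[] values)
    {
        long total = 0;

        Parallel.ForEach(
            values,
            () => 0L,                                          // initial subtotal for each task
            (value, loopState, subtotal) => subtotal + value,  // no locking inside the loop body
            subtotal => Interlocked.Add(ref total, subtotal)); // merge once per task

        return total;
    }

    public static void Main()
    {
        int[] values = Enumerable.Range(1, 1000).ToArray();
        Console.WriteLine(Sum(values)); // prints 500500
    }
}
```

For a sum this cheap, a plain loop or values.Sum() is likely faster, which is exactly the point about measuring before parallelizing.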


