What Does MaxDegreeOfParallelism Do

What does MaxDegreeOfParallelism do?

The answer is that it is the upper limit for the entire parallel operation, irrespective of the number of cores.

So even if you don't use the CPU because you are waiting on IO or a lock, no extra tasks will run in parallel, only the maximum that you specify.

To find this out, I wrote this piece of test code. There is an artificial lock in there to stimulate the TPL to use more threads. The same will happen when your code is waiting on IO or a database.

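using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        var locker = new Object();
        int count = 0; // number of loop bodies currently executing
        Parallel.For
            (0
            , 1000
            , new ParallelOptions { MaxDegreeOfParallelism = 2 }
            , (i) =>
                {
                    Interlocked.Increment(ref count);
                    // Artificial contention: only one body at a time gets past here.
                    lock (locker)
                    {
                        Console.WriteLine("Number of active threads:" + count);
                        Thread.Sleep(10);
                    }
                    Interlocked.Decrement(ref count);
                }
            );
    }
}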

If I don't specify MaxDegreeOfParallelism, the console logging shows that up to around 8 tasks are running at the same time. Like this:

Number of active threads:6
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:6
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7

It starts lower, increases over time and at the end it is trying to run 8 at the same time.

If I limit it to some arbitrary value (say 2), I get

Number of active threads:2
Number of active threads:1
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2

Oh, and this is on a quad-core machine.

What is the meaning of the MaxDegreeOfParallelism = -1 in Parallel operations in .NET 6?

The documentation deliberately says only that -1 means the number of concurrent operations will not be artificially limited. It doesn't say that all actions will start immediately.

The thread pool manager normally keeps the number of available threads at the number of cores (or logical processors, which is twice the number of cores with hyper-threading), as this is considered the optimum number of threads (I think the exact number is [number of cores/logical processors + 1]). This means that when you start executing your actions, this is the number of threads available to start work immediately.

The thread pool manager runs periodically (twice a second), and if none of the threads have completed, a new one is added (or one is removed in the reverse situation, when there are too many threads).

A good way to see this in action is to run your experiment twice in quick succession. In the first run, the number of concurrent jobs at the beginning should be around [number of cores/logical processors + 1], and in the second run it should be the total number of jobs (because the threads created to service the first run are still available):

Here's a modified version of your code:

using System.Diagnostics;

Stopwatch sw = Stopwatch.StartNew();
int concurrency = 0;
Action action = new Action(() =>
{
    var current = Interlocked.Increment(ref concurrency);
    Console.WriteLine(@$"Started at {sw.ElapsedMilliseconds} with concurrency {current}");
    Thread.Sleep(10_000); // block the thread for 10 seconds
    current = Interlocked.Decrement(ref concurrency);
});

Action[] actions = Enumerable.Repeat(action, 12).ToArray();
var options = new ParallelOptions() { MaxDegreeOfParallelism = -1 };
Parallel.Invoke(options, actions); // first run: "cold" thread pool

Parallel.Invoke(options, actions); // second run: "hot" thread pool

Output:

Started at 114 with concurrency 8
Started at 114 with concurrency 1
Started at 114 with concurrency 2
Started at 114 with concurrency 3
Started at 114 with concurrency 4
Started at 114 with concurrency 6
Started at 114 with concurrency 5
Started at 114 with concurrency 7
Started at 114 with concurrency 9
Started at 1100 with concurrency 10
Started at 2097 with concurrency 11
Started at 3100 with concurrency 12
Started at 13110 with concurrency 1
Started at 13110 with concurrency 2
Started at 13110 with concurrency 3
Started at 13110 with concurrency 5
Started at 13110 with concurrency 7
Started at 13110 with concurrency 9
Started at 13110 with concurrency 10
Started at 13110 with concurrency 11
Started at 13110 with concurrency 4
Started at 13110 with concurrency 12
Started at 13110 with concurrency 6
Started at 13110 with concurrency 8

My computer has 4 cores (8 logical processors). When the jobs run on a "cold" TaskScheduler.Default, 8+1 of them start immediately, and after that a new thread is added periodically.

When the second batch runs "hot", all jobs start at the same time.

Parallel.ForEachAsync

When a similar example is run with Parallel.ForEachAsync, the behaviour is different: the work is done at a constant level of parallelism. Please note that this is not about threads, because if you await Task.Delay (so the thread is not blocked), the number of parallel jobs stays the same.

If we peek at the source code for the overload taking ParallelOptions, it passes parallelOptions.EffectiveMaxConcurrencyLevel as dop to the private method which does the real work.

public static Task ForEachAsync<TSource>(IEnumerable<TSource> source!!, ParallelOptions parallelOptions!!, Func<TSource, CancellationToken, ValueTask> body!!)
{
return ForEachAsync(source, parallelOptions.EffectiveMaxConcurrencyLevel, ...);
}

If we peek further we can see that:

  • "dop" is documented as 'A integer indicating how many operations to allow to run in parallel.'.
  • when dop is negative, the actual level of parallelism is DefaultDegreeOfParallelism.
/// <param name="dop">A integer indicating how many operations to allow to run in parallel.</param>
(...)
private static Task ForEachAsync<TSource>(IEnumerable<TSource> source, int dop,
{
...

if (dop < 0)
{
dop = DefaultDegreeOfParallelism;
}

One last peek, and we can see the final value is Environment.ProcessorCount.

private static int DefaultDegreeOfParallelism => Environment.ProcessorCount;

This is the current behaviour, and I am not sure whether it will stay like this in .NET 7.
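The constant level of parallelism described for Parallel.ForEachAsync can be observed with a small sketch like the one below. It is only an illustration, assuming .NET 6 or later; the names ForEachAsyncDemo and MeasurePeakAsync are made up for this example and are not part of any library. It runs non-blocking jobs and reports the highest number that were in flight at once.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, named for this example only.
public static class ForEachAsyncDemo
{
    // Runs `itemCount` non-blocking jobs and returns the highest number
    // of jobs that were in flight at the same moment.
    public static async Task<int> MeasurePeakAsync(int maxDegreeOfParallelism, int itemCount)
    {
        int active = 0, peak = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegreeOfParallelism };

        await Parallel.ForEachAsync(Enumerable.Range(0, itemCount), options, async (item, ct) =>
        {
            int now = Interlocked.Increment(ref active);

            // Record a new peak with a compare-and-swap loop.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)) &&
                   Interlocked.CompareExchange(ref peak, now, seen) != seen)
            {
            }

            await Task.Delay(50, ct); // waits without blocking a thread
            Interlocked.Decrement(ref active);
        });

        return peak;
    }
}
```

Calling `await ForEachAsyncDemo.MeasurePeakAsync(3, 12)` should never return more than 3. Passing -1 should be capped at Environment.ProcessorCount if the source shown above still applies to your runtime version.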

How does MaxDegreeOfParallelism work?

MaxDegreeOfParallelism refers to the maximum number of worker tasks that will be scheduled at any one time by a parallel loop.

The degree of parallelism is automatically managed by the implementation of the Parallel class, the default task scheduler, and the .NET thread pool. The throughput is optimized under a wide range of conditions.

For a very large degree of parallelism, you may also want to use the ThreadPool class's SetMinThreads method so that these threads are created without delay. If you don't, the thread pool's thread injection algorithm may limit how quickly threads can be added to the pool of worker threads used by the parallel loop, and it may take longer than you want to create the required number of threads.
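As a rough sketch of the SetMinThreads suggestion, the helper below raises the worker-thread minimum before a large parallel loop starts. ThreadPoolWarmup and EnsureMinWorkerThreads are hypothetical names chosen for this example; only ThreadPool.GetMinThreads and ThreadPool.SetMinThreads are real APIs. Whether raising the minimum is a good idea depends on your workload, so treat this as a starting point rather than a recommendation.

```csharp
using System;
using System.Threading;

// Hypothetical helper, named for this example only.
public static class ThreadPoolWarmup
{
    // Raises the worker-thread minimum to at least `desiredWorkers`,
    // leaving the I/O completion-port minimum untouched.
    // Returns false if the thread pool rejected the request.
    public static bool EnsureMinWorkerThreads(int desiredWorkers)
    {
        ThreadPool.GetMinThreads(out int minWorkers, out int minIo);
        if (minWorkers >= desiredWorkers)
        {
            return true; // already high enough, nothing to change
        }
        return ThreadPool.SetMinThreads(desiredWorkers, minIo);
    }
}
```

For example, calling `ThreadPoolWarmup.EnsureMinWorkerThreads(32)` before a Parallel.For with MaxDegreeOfParallelism = 32 and blocking bodies should let the pool create those threads on demand instead of injecting them gradually.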

Is ParallelOptions.MaxDegreeOfParallelism applied globally over multiple concurrent Parallel calls?

ParallelOptions.MaxDegreeOfParallelism is not applied globally. If you have enough cores and the scheduler sees fit, the nested MDP values multiply, with each For able to spin up that many tasks (if the workloads are unconstrained).

Consider this example, in which each of 3 outer tasks can start 3 more tasks. Each loop is limited by the MDP option of 3.

int k = 0;
ParallelOptions po = new ParallelOptions();
po.MaxDegreeOfParallelism = 3;

Parallel.For(0, 10, po, (i) =>
{
    Parallel.For(0, 10, po, (j) =>
    {
        Interlocked.Increment(ref k);
        Console.WriteLine(k);
        Thread.Sleep(2000);
        Interlocked.Decrement(ref k);
    });
    Thread.Sleep(2000);
});

Output

1
2
3
4
7
5
6
8
9
9
5
6
7
9
9
8
8
9
...

If MDP were global, you would see at most 3, I guess; since it's not, you get 9s.

ParallelOptions with MaxDegreeOfParallelism and actual number of threads

MaxDegreeOfParallelism specifies how many threads (or rather, tasks) can run at the same time. It doesn't specify which threads that should be.

From the docs:

Gets or sets the maximum number of concurrent tasks enabled by this ParallelOptions instance.

During your execution of Parallel.ForEach you may run on many different threads (and therefore see many different thread IDs). That depends on the underlying thread scheduler.

You are, however, guaranteed that your block of code will at most have MaxDegreeOfParallelism executions at the same time.

While you basically have an example already, here is a slightly modified version of your code to show what I mean.

using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

var dict = new ConcurrentDictionary<int, string>();
var po = new ParallelOptions { MaxDegreeOfParallelism = 3 };
int count = 0, maxval = 0;
Parallel.ForEach(Enumerable.Range(1, 10000000), po, (d) =>
{
    Interlocked.Increment(ref count);
    // Remember every thread id that executed at least one iteration.
    dict.AddOrUpdate(Thread.CurrentThread.ManagedThreadId, "", (id, old) => old);

    lock (dict)
    {
        maxval = Math.Max(maxval, count);
    }
    Interlocked.Decrement(ref count);
});

Console.WriteLine("Count: " + count);
Console.WriteLine("Max: " + maxval);
Console.WriteLine("Thread ids: " + String.Join(", ", dict.Select(d => d.Key)));

It should result in an output similar to this:

Count: 0

Max: 3

Thread ids: 1, 4, 5, 6, 7, 8, 9

No matter how many times you run this code, the max value should never go above 3. On the other hand, the thread ids will change often.

Will Parallel.ForEach process in order with MaxDegreeOfParallelism=1?

First, it is correct that Microsoft's official documentation on parallel programming states that the execution order is not guaranteed.

The Parallel.ForEach method does not guarantee the order of execution. Unlike a sequential ForEach loop, the incoming values aren't always processed in order.

It would be best to use Parallel.ForEach as the public API is designed: to process items in a parallel manner. If you need to process items sequentially, you're much better off using a regular foreach loop. The intent is clearer than using MaxDegreeOfParallelism = 1.

With that being said, for curiosity's sake, I took a look at the source code for .NET 4.7.1. The short answer is yes, the items will be processed sequentially if MaxDegreeOfParallelism = 1. However, you shouldn't rely on this for future implementations, because it may not always be this way.

  1. Taking a look at Parallel.ForEach and following it through, you'll eventually see that the collection to be iterated over is partitioned (this process is slightly different depending on whether it is a TSource[], List<TSource>, or an IEnumerable<TSource>).

  2. Task.SavedStateForNextReplica and Task.SavedStateFromPreviousReplica are overridden in ParallelForReplicaTask in order to communicate state between tasks running in parallel. In this case, they are used to communicate which partition the task should iterate over.

  3. Finally, let's take a look at Task.ExecuteSelfReplicating. ParallelForReplicatingTask overrides ShouldReplicate based on the degree of parallelism specified as well as the task scheduler's MaximumConcurrencyLevel. So, with MaxDegreeOfParallelism = 1, it will only create a single child task. As such, this task will only operate over the single partition which was created.

So, to answer your question: as of writing, Parallel.ForEach with MaxDegreeOfParallelism = 1 will enumerate the collection from beginning to end for a TSource[], from beginning to end for an IList<TSource>, and use GetEnumerator for an IEnumerable<TSource>, with slightly different paths depending on whether the IEnumerable<TSource> can be cast to an OrderablePartitioner<TSource> or not. These three paths are determined in Parallel.ForEachWorker.

I strongly encourage you to browse through the source code on your own to see for yourself.

I hope this is able to answer your question, but it's really important to remember: don't rely on this. It is very possible that this implementation can change in the future.

Parallel.ForEach MaxDegreeOfParallelism Strange Behavior with Increasing Chunking

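You can't use Parallel.For or Parallel.ForEach with async delegates. They treat an async lambda as fire-and-forget, so the loop body "finishes" at the first await and MaxDegreeOfParallelism no longer limits the real work. (Parallel.ForEachAsync, covered above, was added in .NET 6 for this case.)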

Since you already have a "pipeline" style of architecture, I recommend looking into TPL Dataflow. A single ActionBlock may be all that you need, and once you have that working, other blocks in TPL Dataflow may replace other parts of your pipeline.
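A single ActionBlock used this way might look like the sketch below. It is an illustration only: DataflowSketch, ProcessAllAsync, and handler are hypothetical names, and it assumes the System.Threading.Tasks.Dataflow library is available (on .NET Framework it comes from the NuGet package of the same name). The block's own MaxDegreeOfParallelism option limits how many handler calls are in flight.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical helper, named for this example only.
public static class DataflowSketch
{
    // Processes every item with an async handler, allowing at most
    // `maxParallel` handler invocations to be in flight at once.
    public static async Task ProcessAllAsync<T>(
        IEnumerable<T> items, Func<T, Task> handler, int maxParallel)
    {
        var block = new ActionBlock<T>(handler, new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = maxParallel
        });

        foreach (var item in items)
        {
            // The default capacity is unbounded, so Post queues the item
            // without waiting (unless the block has already faulted).
            block.Post(item);
        }

        block.Complete();       // signal that no more input will arrive
        await block.Completion; // wait for queued items; surfaces handler exceptions
    }
}
```

Unlike Parallel.ForEach, the block awaits each handler call, so the limit applies to the asynchronous work itself rather than to the synchronous part before the first await.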

If you prefer to stick with your existing buffer, then you should use asynchronous concurrency instead of Parallel:

// Must be async Task (not void) because the body uses await.
private async Task Process() {
    var throttler = new SemaphoreSlim(8); // at most 8 reports processed at once
    var tasks = _buffer.GetConsumingEnumerable()
        .Select(async report =>
        {
            await throttler.WaitAsync();
            try {
                await _handler.ProcessAsync(report).ConfigureAwait(false);
            } catch (Exception e) {
                if (_config.IsDevelopment) {
                    throw;
                }

                _logger.LogError(e, "GPS Report Service");
            }
            finally {
                throttler.Release();
            }
        })
        .ToList();
    await Task.WhenAll(tasks);
}

