How to Correctly Write Parallel.For With Async Methods

Parallel.For() doesn't work well with async methods. If you don't need to limit the degree of parallelism (i.e. you're okay with all of the tasks executing at the same time), you can simply start all the Tasks and then wait for them to complete:

var tasks = Enumerable.Range(0, elevations.Count())
    .Select(i => BuildSheetsAsync(userID, elevations[i], includeLabels));
List<Bitmap> allSheets = (await Task.WhenAll(tasks)).SelectMany(x => x).ToList();
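
If you do need to limit the degree of parallelism, one common option is to gate each operation with a SemaphoreSlim. The following is only a minimal sketch under assumptions: WorkAsync is a hypothetical stand-in for an async method such as BuildSheetsAsync, and the limit of 3 is arbitrary.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledExample
{
    // Hypothetical stand-in for a real async operation (e.g. BuildSheetsAsync).
    private static async Task<int> WorkAsync(int i)
    {
        await Task.Delay(50);
        return i * 2;
    }

    // Starts 'count' operations, but lets at most 'maxConcurrency' run at once.
    public static async Task<int[]> RunThrottledAsync(int count, int maxConcurrency)
    {
        using var gate = new SemaphoreSlim(maxConcurrency);

        var tasks = Enumerable.Range(0, count).Select(async i =>
        {
            await gate.WaitAsync();          // wait for a free slot
            try
            {
                return await WorkAsync(i);
            }
            finally
            {
                gate.Release();              // free the slot even if WorkAsync throws
            }
        });

        // Results come back in the same order as the input sequence.
        return await Task.WhenAll(tasks);
    }

    public static async Task Main()
    {
        int[] results = await RunThrottledAsync(10, 3);
        Console.WriteLine(string.Join(", ", results));
    }
}
```

All of the tasks are still created up front; the semaphore only controls how many of them are past the WaitAsync call at any moment.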

async/await and Parallel.For loop

First of all, neither version makes any sense. Parallel.For is for running CPU-bound (or possibly blocking IO-bound) operations. You're using it for starting asynchronous operations.

You're not waiting for the operation to complete, and you say that's intentional, but it's also dangerous: if an exception happens in ReadSensorsAsync, you have no way of catching it.

Since starting an async operation should be very fast, you don't need Parallel.For to start many of them at once; a normal for loop is enough:

for (int i = 3; i < 38; i++)
{
    ReadSensorsAsync(i);
}

(But again, I do not recommend ignoring the returned Task.)
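
To make that concrete, here is a rough sketch of keeping the returned Tasks and awaiting them so that a failure can be caught. The ReadSensorsAsync below is a hypothetical stand-in (it simply fails for sensor 5), not the method from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ObserveExceptionsExample
{
    // Hypothetical stand-in for ReadSensorsAsync; fails for one sensor.
    private static async Task ReadSensorsAsync(int sensor)
    {
        await Task.Delay(10);
        if (sensor == 5)
        {
            throw new InvalidOperationException($"Sensor {sensor} failed");
        }
    }

    public static async Task<string> ReadAllAsync()
    {
        // Start every operation with a plain for loop, but keep the Tasks.
        var tasks = new List<Task>();
        for (int i = 3; i < 38; i++)
        {
            tasks.Add(ReadSensorsAsync(i));
        }

        try
        {
            // Awaiting WhenAll rethrows the first exception from the failed tasks.
            await Task.WhenAll(tasks);
            return "ok";
        }
        catch (InvalidOperationException ex)
        {
            return ex.Message;
        }
    }

    public static async Task Main()
    {
        Console.WriteLine(await ReadAllAsync()); // prints "Sensor 5 failed"
    }
}
```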


As for timing, the big difference is probably because you're ignoring warmup: when you call ReadSensorsAsync for the first time, it has to be JIT compiled, which, for such simple operations, will skew the results significantly.

Here are numbers from my machine, the format is "running for the first time"; "running for the second time":

  • calling ReadSensorsAsync once (for comparison): 7.6 ms; 0.04 ms
  • for: 7.5 ms; 0.05 ms
  • Parallel.For without await: 8.0 ms; 0.5 ms
  • Parallel.For with await: 11 ms; 2.6 ms

As you can see, using Parallel.For only adds overhead. And using it with await adds even more overhead, because starting an async method requires creating a state machine, which takes some time.

Write an async method with Parallel.Foreach loop that does call another async method to pull record

For one thing, you can pretend that Parallel.ForEach awaits your async functions, but it doesn't. Instead, you want to write something like this:

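// Run MethodB for every customer concurrently and collect the results.
var processedCustomers = await Task.WhenAll(customers.Select(customer => MethodB(customer)));

// Merge on a single thread: List<T>.AddRange is not thread-safe.
foreach (var processedCustomer in processedCustomers)
{
    inboundCustomersFiles.AddRange(processedCustomer);
}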

Task.WhenAll plays the role here that Parallel.ForEach was meant to play, but it's awaitable: the task it returns completes only after every task you passed to it has completed. Hence, when your await Task.WhenAll finishes, all the inner tasks have finished as well.

the process in methodB customerRecord takes time

That is very ambiguous. If you mean it takes server and/or IO time, then that's fine; that's what async is for. If you mean it takes your CPU time (i.e. it processes data locally for a long time), then you should spin up a task on a thread pool and await its completion. Note that this is not necessarily the default thread pool! Especially if you're writing an ASP.NET (Core) application, you want a dedicated thread pool just for this stuff.
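
For the CPU-bound case, the simplest form of "spin up a task and await it" looks roughly like the sketch below. Note the assumption: Task.Run queues the work to the default thread pool, so this does not show the dedicated pool mentioned above. SumSquares is a hypothetical placeholder for the real processing.

```csharp
using System;
using System.Threading.Tasks;

public static class CpuBoundExample
{
    // Hypothetical CPU-bound work: sum of squares from 1 to n.
    private static long SumSquares(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++)
        {
            total += (long)i * i;
        }
        return total;
    }

    // Offloads the computation to a (default) thread pool thread
    // and returns a Task the caller can await.
    public static Task<long> SumSquaresAsync(int n)
    {
        return Task.Run(() => SumSquares(n));
    }

    public static async Task Main()
    {
        long result = await SumSquaresAsync(1000);
        Console.WriteLine(result); // prints 333833500
    }
}
```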

Running async methods in parallel

Is there a better way to run async methods in parallel, or are tasks a good approach?

Yes, the "best" approach is to use the Task.WhenAll method. However, your second approach should also have run in parallel; I have created a .NET Fiddle that demonstrates this.

Consider the following:

public Task<Thing[]> GetThingsAsync()
{
    var first = GetExpensiveThingAsync();
    var second = GetExpensiveThingAsync();

    return Task.WhenAll(first, second);
}

Note

It is preferred to use the "Async" suffix: instead of GetThings and GetExpensiveThing, we should have GetThingsAsync and GetExpensiveThingAsync respectively.

Effects of async within a parallel for loop

In your code there is no part that is executed asynchronously.

  • In MainCaller, you start a Task and immediately Wait for it to finish.
    This is a blocking operation which only introduces the extra overhead of calling
    GetAllLists in another Task.
  • In this Task you start a new Task (by calling GetAllLists) but immediately
    wait for this Task to finish by waiting for its Result (which is also blocking).
  • In the Task started by GetAllLists you have the Parallel.ForEach loop which starts
    several new Tasks. Each of these 'for' Tasks will start another Task by calling
    MyMethod and immediately waiting for its result.

The net result is that your code completely executes synchronously. The only parallelism is introduced in the Parallel.For loop.

Hint: a useful thread on this topic: Using async/await for multiple tasks

Additionally, your code contains a serious bug:
Each Task created by the Parallel.For loop will eventually add its partial List to the ReturnList by calling AddRange. AddRange is not thread-safe, so you need some synchronisation mechanism (e.g. a lock), or your ReturnList may get corrupted or end up missing some of the results. See also: Is the List<T>.AddRange() thread safe?
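
As an illustration of the lock-based fix, here is a small sketch. The names (Collect, returnList, sync) and the way each partial list is produced are assumptions made for the example, not the code from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class LockedAddRangeExample
{
    public static List<int> Collect(int partitions, int itemsPerPartition)
    {
        var returnList = new List<int>();
        var sync = new object();

        Parallel.ForEach(Enumerable.Range(0, partitions), p =>
        {
            // Build the partial list outside the lock so the work stays parallel.
            List<int> partial = Enumerable
                .Range(p * itemsPerPartition, itemsPerPartition)
                .ToList();

            // Only the shared mutation is serialised.
            lock (sync)
            {
                returnList.AddRange(partial);
            }
        });

        return returnList;
    }

    public static void Main()
    {
        Console.WriteLine(Collect(100, 50).Count); // prints 5000
    }
}
```

The order of items in the result depends on scheduling; only the count and the contents are deterministic.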

Parallel foreach with asynchronous lambda

If you just want simple parallelism, you can do this:

var bag = new ConcurrentBag<object>();
var tasks = myCollection.Select(async item =>
{
    // some pre stuff
    var response = await GetData(item);
    bag.Add(response);
    // some post stuff
});
await Task.WhenAll(tasks);
var count = bag.Count;

If you need something more complex, check out Stephen Toub's ForEachAsync post.
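
If you are on .NET 6 or later, the built-in Parallel.ForEachAsync covers much of the same ground. The sketch below uses that framework method rather than the implementation from the post; GetData here is a hypothetical stand-in, and the limit of 4 is an arbitrary choice.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ForEachAsyncExample
{
    // Hypothetical stand-in for a real asynchronous call.
    private static async Task<int> GetData(int item, CancellationToken ct)
    {
        await Task.Delay(20, ct);
        return item * 10;
    }

    public static async Task<int> RunAsync()
    {
        var bag = new ConcurrentBag<int>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };

        // The body returns a ValueTask, so the loop really awaits each item.
        await Parallel.ForEachAsync(Enumerable.Range(0, 20), options, async (item, ct) =>
        {
            int response = await GetData(item, ct);
            bag.Add(response);
        });

        return bag.Count;
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunAsync()); // prints 20
    }
}
```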

Passing async method into Parallel.ForEach

This code works only because DoAsyncJob isn't really an asynchronous method. async doesn't make a method work asynchronously. Awaiting a completed task like that returned by Task.FromResult is synchronous too. async Task Main doesn't contain any asynchronous code, which results in a compiler warning.

An example that demonstrates how Parallel.ForEach doesn't work with asynchronous methods should call a real asynchronous method:

static async Task Main(string[] args)
{
    var results = new ConcurrentDictionary<string, int>();

    Parallel.ForEach(Enumerable.Range(0, 100), async index =>
    {
        var res = await DoAsyncJob(index);
        results.TryAdd(index.ToString(), res);
    });
    Console.WriteLine($"Items in dictionary {results.Count}");
}

static async Task<int> DoAsyncJob(int i)
{
    await Task.Delay(100);
    return i * 10;
}

The result will be

Items in dictionary 0

Parallel.ForEach has no overload that accepts a Task-returning delegate such as Func<T, Task>; its loop bodies are Action delegates. This means it can't await any asynchronous operations.

async index is accepted because it's implicitly an async void delegate. As far as Parallel.ForEach is concerned, it's just an Action<int>.

The result is that Parallel.ForEach starts 100 async void invocations and never waits for them to complete. That's why the dictionary is still empty when its count is printed and the application terminates.
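
For comparison, here is a sketch of the same work done with Select and Task.WhenAll, which does wait for every operation. The class and method names are assumptions for the example; DoAsyncJob mirrors the one above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllFixExample
{
    private static async Task<int> DoAsyncJob(int i)
    {
        await Task.Delay(100);
        return i * 10;
    }

    public static async Task<int> RunAsync()
    {
        var results = new ConcurrentDictionary<string, int>();

        // Each lambda returns a Task, so nothing is fire-and-forget.
        var tasks = Enumerable.Range(0, 100).Select(async index =>
        {
            var res = await DoAsyncJob(index);
            results.TryAdd(index.ToString(), res);
        });

        // Completes only after all 100 operations have finished.
        await Task.WhenAll(tasks);
        return results.Count;
    }

    public static async Task Main()
    {
        Console.WriteLine($"Items in dictionary {await RunAsync()}"); // prints "Items in dictionary 100"
    }
}
```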

Async method parallel execution

Start all of the tasks first, then use await Task.WhenAll(task1, task2, task3); to wait for them together, so that they run concurrently.

At the moment, you are still executing the tasks in a serial fashion by awaiting each one individually.
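
The difference between the two styles can be measured with a rough sketch like this one. WorkAsync is a hypothetical operation that takes about 200 ms; exact timings will vary by machine, so no specific numbers are claimed.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class SerialVersusConcurrentExample
{
    // Hypothetical async operation that takes about 200 ms.
    private static async Task<int> WorkAsync(int id)
    {
        await Task.Delay(200);
        return id;
    }

    // Each operation starts only after the previous one has finished.
    public static async Task<TimeSpan> RunSeriallyAsync()
    {
        var sw = Stopwatch.StartNew();
        await WorkAsync(1);
        await WorkAsync(2);
        await WorkAsync(3);
        return sw.Elapsed;
    }

    // All three operations are started before any of them is awaited.
    public static async Task<TimeSpan> RunConcurrentlyAsync()
    {
        var sw = Stopwatch.StartNew();
        var task1 = WorkAsync(1);
        var task2 = WorkAsync(2);
        var task3 = WorkAsync(3);
        await Task.WhenAll(task1, task2, task3);
        return sw.Elapsed;
    }

    public static async Task Main()
    {
        TimeSpan serial = await RunSeriallyAsync();
        TimeSpan concurrent = await RunConcurrentlyAsync();
        Console.WriteLine($"Serial:     {serial.TotalMilliseconds:F0} ms");
        Console.WriteLine($"Concurrent: {concurrent.TotalMilliseconds:F0} ms");
    }
}
```

The serial version should take roughly the sum of the three delays, while the concurrent version should take roughly as long as a single delay.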

Edit:

To demonstrate what Panagiotis means about "there are no tasks in this code":

Without tasks:

class Program
{
    static void Main(string[] args)
    {
        var mainTask = MainAsync(args);
        mainTask.GetAwaiter().GetResult();
    }

    static async Task MainAsync(string[] args)
    {
        var task1 = Method1();
        var task2 = Method2();
        var task3 = Method3();

        await Task.WhenAll(task1, task2, task3);

        Console.ReadLine();
    }

    // No real asynchronous work here: each method runs to completion
    // synchronously before the next one is even called.
    public static async Task<long> Method1()
    {
        var returnValue = 0L;
        for (long i = 0; i < 5; i++)
        {
            Console.WriteLine($"Method 1 : {i}");
            returnValue += i;
        }

        return await Task.FromResult(returnValue);
    }

    public static async Task<long> Method2()
    {
        var returnValue = 0L;
        for (long i = 0; i < 5; i++)
        {
            Console.WriteLine($"Method 2 : {i}");
            returnValue += i;
        }

        return await Task.FromResult(returnValue);
    }

    public static async Task<long> Method3()
    {
        var returnValue = 0L;
        for (long i = 0; i < 5; i++)
        {
            Console.WriteLine($"Method 3 : {i}");
            returnValue += i;
        }

        return await Task.FromResult(returnValue);
    }
}

Output:

Method 1 : 0
Method 1 : 1
Method 1 : 2
Method 1 : 3
Method 1 : 4
Method 2 : 0
Method 2 : 1
Method 2 : 2
Method 2 : 3
Method 2 : 4
Method 3 : 0
Method 3 : 1
Method 3 : 2
Method 3 : 3
Method 3 : 4

With tasks (i.e. properly await a real asynchronous action):

class Program
{
    static void Main(string[] args)
    {
        var mainTask = MainAsync(args);
        mainTask.GetAwaiter().GetResult();
    }

    static async Task MainAsync(string[] args)
    {
        var task1 = Method1();
        var task2 = Method2();
        var task3 = Method3();

        await Task.WhenAll(task1, task2, task3);

        Console.ReadLine();
    }

    // Task.Delay is a real asynchronous operation, so each method yields
    // at the await and the three loops interleave.
    public static async Task<long> Method1()
    {
        var returnValue = 0L;
        for (long i = 0; i < 5; i++)
        {
            Console.WriteLine($"Method 1 : {i}");
            returnValue += i;
            await Task.Delay(1);
        }

        return await Task.FromResult(returnValue);
    }

    public static async Task<long> Method2()
    {
        var returnValue = 0L;
        for (long i = 0; i < 5; i++)
        {
            Console.WriteLine($"Method 2 : {i}");
            returnValue += i;
            await Task.Delay(1);
        }

        return await Task.FromResult(returnValue);
    }

    public static async Task<long> Method3()
    {
        var returnValue = 0L;
        for (long i = 0; i < 5; i++)
        {
            Console.WriteLine($"Method 3 : {i}");
            returnValue += i;
            await Task.Delay(1);
        }

        return await Task.FromResult(returnValue);
    }
}

Output:

Method 1 : 0
Method 2 : 0
Method 1 : 1
Method 3 : 0
Method 3 : 1
Method 1 : 2
Method 2 : 1
Method 3 : 2
Method 1 : 3
Method 2 : 2
Method 2 : 3
Method 1 : 4
Method 3 : 3
Method 3 : 4
Method 2 : 4

