Parallel.ForEach vs Task.Factory.StartNew

Parallel.ForEach vs Task.Factory.StartNew

The first option, Parallel.ForEach, is much better.

Parallel.ForEach, internally, uses a Partitioner<T> to distribute your collection into work items. It will not do one task per item, but rather batch this to lower the overhead involved.

The second option will schedule a single Task per item in your collection. While the results will be (nearly) the same, this introduces far more overhead than necessary, especially for large collections, and makes the overall run time slower.
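As a rough illustration of the two options being compared, here is a minimal sketch. The class name LoopOptions, the Square work item and the result arrays are assumptions made for the example; they are not code from the question.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class LoopOptions
{
    // Hypothetical per-item work; stands in for whatever the loop body does.
    private static int Square(int x) => x * x;

    // Option 1: Parallel.ForEach partitions the input and blocks until all items are done.
    public static int[] WithParallelForEach(int[] items)
    {
        var results = new int[items.Length];
        Parallel.ForEach(Enumerable.Range(0, items.Length), i =>
        {
            results[i] = Square(items[i]);
        });
        return results;
    }

    // Option 2: one Task per item, followed by an explicit wait for all of them.
    public static int[] WithTaskPerItem(int[] items)
    {
        var results = new int[items.Length];
        var tasks = new Task[items.Length];
        for (int i = 0; i < items.Length; i++)
        {
            int index = i; // copy the loop variable so each lambda captures its own value
            tasks[index] = Task.Factory.StartNew(() =>
            {
                results[index] = Square(items[index]);
            });
        }
        Task.WaitAll(tasks);
        return results;
    }
}
```

Both methods produce the same array; the difference is how many Task objects get created and scheduled along the way.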

FYI - The Partitioner used can be controlled by using the appropriate overloads to Parallel.ForEach, if so desired. For details, see Custom Partitioners on MSDN.
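One way to use such an overload is to hand Parallel.ForEach explicit index ranges created with Partitioner.Create. The sketch below is only an illustration: the class name, the summing workload and the rangeSize parameter are assumptions, not something taken from the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class RangePartitionExample
{
    // Sums 0..count-1 by giving Parallel.ForEach index ranges instead of single items.
    public static long SumWithRangePartitioner(int count, int rangeSize)
    {
        long total = 0;
        // Each element produced by this partitioner is a Tuple<int, int>
        // describing the half-open range [Item1, Item2).
        var ranges = Partitioner.Create(0, count, rangeSize);
        Parallel.ForEach(ranges, range =>
        {
            long local = 0;
            for (int i = range.Item1; i < range.Item2; i++)
            {
                local += i;
            }
            Interlocked.Add(ref total, local); // one synchronized update per range
        });
        return total;
    }
}
```

Calling RangePartitionExample.SumWithRangePartitioner(1000, 100) processes ten ranges of 100 indices each, so the delegate is invoked ten times rather than a thousand.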

The main difference, at runtime, is that the second option (Task.Factory.StartNew) will act asynchronously. This can be duplicated using Parallel.ForEach by doing:

Task.Factory.StartNew( () => Parallel.ForEach<Item>(items, item => DoSomething(item)));

By doing this, you still take advantage of the partitioners, but don't block until the operation is complete.

Parallel.ForEach vs Task.Run and Task.WhenAll

In this case, the second method will asynchronously wait for the tasks to complete instead of blocking.
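For reference, the Task.Run plus Task.WhenAll pattern being discussed looks roughly like the sketch below. The class name, the DoSomething body and the int result are assumptions for illustration; the question's own code is not reproduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class TaskRunPerItem
{
    // Hypothetical CPU-bound work on one string.
    private static int DoSomething(string s) => s.Length;

    // One Task.Run call per item: as many tasks as items, awaited without blocking the caller.
    public static async Task<int[]> ProcessAllAsync(IEnumerable<string> strings)
    {
        IEnumerable<Task<int>> tasks = strings.Select(s => Task.Run(() => DoSomething(s)));
        return await Task.WhenAll(tasks); // results are returned in input order
    }
}
```

A caller would write something like `int[] lengths = await TaskRunPerItem.ProcessAllAsync(strings);` and remain free to do other work until the await completes.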

However, there is a disadvantage to using Task.Run in a loop. With Parallel.ForEach, a Partitioner is created to avoid making more tasks than necessary. Task.Run will always make a single task per item (since that is what you are asking it to do), whereas the Parallel class batches work so that you create fewer tasks than total work items. This can provide significantly better overall performance, especially if the loop body does only a small amount of work per item.

If this is the case, you can combine both options by writing:

await Task.Run(() => Parallel.ForEach(strings, s =>
{
    DoSomething(s);
}));

Note that this can also be written in this shorter form:

await Task.Run(() => Parallel.ForEach(strings, DoSomething));

Task.Factory.StartNew vs. Parallel.Invoke

The most important difference between these two is that Parallel.Invoke will wait for all the actions to complete before continuing with the code, whereas StartNew will move on to the next line of code, allowing the tasks to complete in their own good time.
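The blocking difference can be observed directly. In the sketch below, the class name and the 200 ms sleep are assumptions chosen to make the effect visible; each method reports whether the work had already finished when the call returned.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class InvokeVsStartNew
{
    // Returns true if the work had finished by the time Parallel.Invoke returned.
    public static bool InvokeWaits()
    {
        bool done = false;
        Parallel.Invoke(
            () => { Thread.Sleep(200); Volatile.Write(ref done, true); },
            () => { Thread.Sleep(200); });
        return Volatile.Read(ref done);
    }

    // Returns true if the work had finished by the time StartNew returned.
    public static bool StartNewWaits()
    {
        bool done = false;
        Task task = Task.Factory.StartNew(() =>
        {
            Thread.Sleep(200);
            Volatile.Write(ref done, true);
        });
        bool finishedOnReturn = Volatile.Read(ref done); // read before waiting
        task.Wait(); // wait here only so the task does not outlive the method
        return finishedOnReturn;
    }
}
```

InvokeWaits always reports true, while StartNewWaits is expected to report false because the task is still sleeping when StartNew hands control back.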

This semantic difference should be your first (and probably only) consideration. But for informational purposes, here's a benchmark:

/* This is a benchmarking template I use in LINQPad when I want to do a
 * quick performance test. Just give it a couple of actions to test and
 * it will give you a pretty good idea of how long they take compared
 * to one another. It's not perfect: You can expect a 3% error margin
 * under ideal circumstances. But if you're not going to improve
 * performance by more than 3%, you probably don't care anyway.*/
void Main()
{
    // Enter setup code here
    var actions2 =
    (from i in Enumerable.Range(1, 10000)
    select (Action)(() => {})).ToArray();

    var awaitList = new Task[actions2.Length];
    var actions = new[]
    {
        new TimedAction("Task.Factory.StartNew", () =>
        {
            // Enter code to test here
            int j = 0;
            foreach(var action in actions2)
            {
                awaitList[j++] = Task.Factory.StartNew(action);
            }
            Task.WaitAll(awaitList);
        }),
        new TimedAction("Parallel.Invoke", () =>
        {
            // Enter code to test here
            Parallel.Invoke(actions2);
        }),
    };
    const int TimesToRun = 100; // Tweak this as necessary
    TimeActions(TimesToRun, actions);
}


#region timer helper methods
// Define other methods and classes here
public void TimeActions(int iterations, params TimedAction[] actions)
{
    Stopwatch s = new Stopwatch();
    int length = actions.Length;
    var results = new ActionResult[actions.Length];
    // Perform the actions in their initial order.
    for(int i = 0; i < length; i++)
    {
        var action = actions[i];
        var result = results[i] = new ActionResult{Message = action.Message};
        // Do a dry run to get things ramped up/cached
        result.DryRun1 = s.Time(action.Action, 10);
        result.FullRun1 = s.Time(action.Action, iterations);
    }
    // Perform the actions in reverse order.
    for(int i = length - 1; i >= 0; i--)
    {
        var action = actions[i];
        var result = results[i];
        // Do a dry run to get things ramped up/cached
        result.DryRun2 = s.Time(action.Action, 10);
        result.FullRun2 = s.Time(action.Action, iterations);
    }
    results.Dump();
}

public class ActionResult
{
    public string Message {get;set;}
    public double DryRun1 {get;set;}
    public double DryRun2 {get;set;}
    public double FullRun1 {get;set;}
    public double FullRun2 {get;set;}
}

public class TimedAction
{
    public TimedAction(string message, Action action)
    {
        Message = message;
        Action = action;
    }
    public string Message {get;private set;}
    public Action Action {get;private set;}
}

public static class StopwatchExtensions
{
    public static double Time(this Stopwatch sw, Action action, int iterations)
    {
        sw.Restart();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();

        return sw.Elapsed.TotalMilliseconds;
    }
}
#endregion

Results:

Message               | DryRun1 | DryRun2 | FullRun1 | FullRun2
----------------------------------------------------------------
Task.Factory.StartNew | 43.0592 | 50.847  | 452.2637 | 463.2310
Parallel.Invoke       | 10.5717 |  9.948  | 102.7767 | 101.1158

As you can see, using Parallel.Invoke can be roughly 4.5x faster than waiting for a bunch of newed-up tasks to complete. Of course, that's when your actions do absolutely nothing. The more each action does, the less of a difference you'll notice.

Task.StartNew() vs Parallel.ForEach : Multiple Web Requests Scenario

> makes the web request calls (unrelated, so could be fired in parallel)

What you actually want is to call them concurrently, not in parallel. That is, "at the same time", not "using multiple threads".

> The existing code appears to consume too many threads

Yeah, I think so too. :)

> Considering this is all "Async IO" work and not "CPU bound" work

Then it should all be done asynchronously, and not using task parallelism or other parallel code.

As Antii pointed out, you should make your asynchronous code asynchronous:

public async Task ProcessRequestAsync(...);

Then what you want to do is consume it using asynchronous concurrency (Task.WhenAll), not parallel concurrency (StartNew/Run/Parallel):

await Task.WhenAll(list.Select(x => ProcessRequestAsync(x)));
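Put together, the pattern might look like the following sketch. Task.Delay stands in for the real web request, and the class name, the int ids and the returned values are assumptions made purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ConcurrentRequests
{
    // Stand-in for a real web request; Task.Delay simulates I/O latency
    // without holding a thread while waiting.
    public static async Task<int> ProcessRequestAsync(int id)
    {
        await Task.Delay(200);
        return id * 10;
    }

    // Starts every request, then awaits them all together.
    public static Task<int[]> ProcessAllAsync(IEnumerable<int> ids)
    {
        return Task.WhenAll(ids.Select(id => ProcessRequestAsync(id)));
    }
}
```

Because every simulated request is started before any of them is awaited, the total wait is close to the latency of a single request rather than the sum of all of them, and no thread is blocked in the meantime.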

Task.Factory.StartNew or Parallel.ForEach for many long-running tasks?

Perhaps you aren't aware of this, but the members in the Parallel class are simply (complicated) wrappers around Task objects. In case you're wondering, the Parallel class creates the Task objects with TaskCreationOptions.None. However, the MaxDegreeOfParallelism would affect those task objects no matter what creation options were passed to the task object's constructor.

TaskCreationOptions.LongRunning gives a "hint" to the underlying TaskScheduler that it might perform better with oversubscription of the threads. Oversubscription is good for threads with high latency, for example I/O, because it assigns more than one thread (yes, thread, not task) to a single core so that the core always has something to do instead of sitting idle while a thread is in a waiting state. On the TaskScheduler that uses the ThreadPool, LongRunning tasks run on their own dedicated thread (the only case where you have a thread per task); otherwise tasks run normally, with scheduling and work stealing (really, what you want here anyway).
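The dedicated-thread behavior can be checked with a small sketch like the one below. It assumes the default thread-pool scheduler (TaskScheduler.Default); the class and method names are made up for the example, and other schedulers are free to treat the hint differently.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningExample
{
    // Starts a task on the default scheduler with the given options and reports
    // whether its body executed on a thread-pool thread.
    public static bool RunsOnThreadPool(TaskCreationOptions options)
    {
        Task<bool> task = Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            CancellationToken.None,
            options,
            TaskScheduler.Default);
        return task.Result;
    }
}
```

With TaskCreationOptions.None the body is expected to run on a pool thread, while with TaskCreationOptions.LongRunning it is expected to run on a separate, dedicated thread.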

MaxDegreeOfParallelism controls the number of concurrent operations run. It's similar to specifying the maximum number of partitions that the data will be split into and processed from. If TaskCreationOptions.LongRunning could be specified, all it would do is limit the number of tasks running at a single time, similar to a TaskScheduler whose maximum concurrency level is set to that value.

You might want Parallel.ForEach. However, setting MaxDegreeOfParallelism to such a high number won't actually guarantee that that many threads run at once, since the tasks are still controlled by the ThreadPoolTaskScheduler. That scheduler will keep the number of threads running at once to the smallest amount possible, which I suppose is the biggest difference between the two methods. You could write (and specify) your own TaskScheduler that mimics the max-degree-of-parallelism behavior and have the best of both worlds, but I doubt that's something you're interested in doing.
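The fact that MaxDegreeOfParallelism is an upper bound, not a guarantee, can be observed with a sketch such as this one. The class name, the 20 ms sleep and the peak-tracking logic are assumptions for illustration only.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class DegreeOfParallelismExample
{
    // Runs a blocking body over itemCount items and reports the highest number
    // of bodies that were ever executing at the same time.
    public static int ObservedPeakConcurrency(int itemCount, int maxDegree)
    {
        int current = 0;
        int peak = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.ForEach(Enumerable.Range(0, itemCount), options, item =>
        {
            int now = Interlocked.Increment(ref current);
            // Record a new peak if this is the most concurrency seen so far.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)))
            {
                if (Interlocked.CompareExchange(ref peak, now, seen) == seen) break;
            }
            Thread.Sleep(20); // simulate blocking work
            Interlocked.Decrement(ref current);
        });

        return peak;
    }
}
```

The returned peak never exceeds maxDegree, but with a large maxDegree it may well stay below it, because the scheduler decides how many threads to actually use.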

My guess is that, depending on latency and the number of actual requests you need to do, using tasks will perform better in many(?) cases, though wind up using more memory, while parallel will be more consistent in resource usage. Of course, async I/O will perform monstrously better than either of these two options, but I understand you can't do that because you're using legacy libraries. So, unfortunately, you'll be stuck with mediocre performance no matter which one of those you choose.

A real solution would be to figure out a way to make async I/O happen; since I don't know the situation, I don't think I can be more helpful than that. With async I/O, your program (read: thread) continues execution while the kernel waits for the I/O operation to complete (this is also known as using I/O completion ports). Because the thread is not in a waiting state, the runtime can do more work on fewer threads, which usually results in an optimal relationship between the number of cores and the number of threads. Adding more threads, as much as I wish it would, does not equate to better performance (in fact, it can often hurt performance because of things like context switching).

However, this entire answer is useless in determining a final answer to your question, though I hope it will give you some needed direction. You won't know what performs better until you profile it. If you don't try them both (I should clarify that I mean the Task without the LongRunning option, letting the scheduler handle thread switching) and profile them to determine what is best for your particular use case, you're selling yourself short.

Parallel.Foreach vs Foreach and Task in local variable

You do not define any local variable in Parallel.ForEach: item is nothing more than a formal parameter. The implementation of Parallel.ForEach is what has to handle the variables, whether they are local, captured or something else.

There is no need to define a local variable for the formal parameter of Parallel.ForEach: the code that calls your anonymous delegate handles the variable and passes it to your function.

However, in C# 4 you might need to use a local variable if you capture another variable, for example:

void DoSomething(ItemType item, OtherType other) {
}

void YourFunction(IEnumerable<ItemType> items, IEnumerable<OtherType> others) {

    foreach (var otherItem in others) {
        var localOtherItem = otherItem;
        Parallel.ForEach(items, item => DoSomething(item, localOtherItem));
    }
}

You can see the difference above: localOtherItem is taken from the context where the anonymous function is defined: that is called a closure. Whereas the items in items are passed simply as a method parameter to the anonymous function.
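The effect of capturing a shared variable can be reproduced in any C# version with a for loop, whose loop variable is still shared across iterations (foreach gives each iteration its own variable from C# 5 onward). The sketch below is an illustration only; the class and method names are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureCapture
{
    // Every lambda captures the same loop variable, so all of them see its final value.
    public static int[] WithoutLocalCopy()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            funcs.Add(() => i);
        }
        return funcs.Select(f => f()).ToArray();
    }

    // Each lambda captures its own copy, so the values at capture time are preserved.
    public static int[] WithLocalCopy()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int local = i;
            funcs.Add(() => local);
        }
        return funcs.Select(f => f()).ToArray();
    }
}
```

WithoutLocalCopy returns 3, 3, 3 because the lambdas run after the loop has finished, whereas WithLocalCopy returns 0, 1, 2.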

In short: the item in Parallel.ForEach and the item in C# foreach are two very different problems.

Task Factory for each loop with await

This is a typical problem that C# 8.0 Async Streams are going to solve very soon.

Until C# 8.0 is released, you can use the AsyncEnumerator library:

using System.Collections.Async;

class Program
{
    private async Task SQLBulkLoader() {

        await indicators.file_list.ParallelForEachAsync(async fileListObj =>
        {
            ...
            await s.WriteToServerAsync(dataTableConversion);
            ...
        },
        maxDegreeOfParalellism: 3,
        cancellationToken: default);
    }

    static void Main(string[] args)
    {
        Program worker = new Program();
        worker.SQLBulkLoader().GetAwaiter().GetResult();
    }
}

I do not recommend using Parallel.ForEach and Task.WhenAll as those functions are not designed for asynchronous streaming.

Task.StartNew Parallel.ForEach doesn't await

Parallel.ForEach doesn't work with async. It expects an Action but in order to await the async method it needs to get a Func<Task>.
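A small sketch makes the problem visible. The async lambda below is converted to an Action<int> (that is, async void), so Parallel.ForEach returns as soon as every body has reached its first await. The class name, the CountdownEvent bookkeeping and the delay value are assumptions for illustration.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncLambdaPitfall
{
    // Returns how many bodies had completed at the moment Parallel.ForEach returned.
    public static int CompletedWhenLoopReturns(int itemCount, int delayMs)
    {
        int completed = 0;
        using (var allDone = new CountdownEvent(itemCount))
        {
            // The loop only sees each body run up to its first await.
            Parallel.ForEach(Enumerable.Range(0, itemCount), async i =>
            {
                await Task.Delay(delayMs);
                Interlocked.Increment(ref completed);
                allDone.Signal();
            });

            int snapshot = Volatile.Read(ref completed);
            allDone.Wait(); // let the detached bodies finish before returning
            return snapshot;
        }
    }
}
```

With a delay of a few hundred milliseconds the snapshot is expected to be 0, showing that the loop did not wait for the awaited work.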

You can use TPL Dataflow's ActionBlock instead, which was built with async in mind. You give it a delegate (async or not) to perform on each item, you configure the block's parallelism (and bounded capacity if necessary), and you post your items into it:

var block = new ActionBlock<string>(async url => 
{
    Uri uri = new Uri(url);
    string filename = System.IO.Path.GetFileName(uri.LocalPath);

    using (HttpClient client = new HttpClient())
    using (HttpResponseMessage response = await client.GetAsync(url))
    using (HttpContent content = response.Content)
    {
       // Copy the response body to a file.
       using (var fileStream = new FileStream(config.M_F_P + filename, FileMode.Create, FileAccess.Write))
       {
           await content.CopyToAsync(fileStream);
       }
    }
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 2 } );

foreach (var url in urls)
{
    block.Post(url);
}

block.Complete();
await block.Completion;
// done

Manual threads vs Parallel.Foreach in task scheduler

This answer is relevant if the Task class in your code has nothing to do with System.Threading.Tasks.Task.

As a simple rule, use Parallel.ForEach to run tasks that will end eventually, such as executing some work in parallel with some other work.

Use threads when they run a routine for the whole life of the application.

So it looks like in your case you should use the threads approach.
