Creating threads - Task.Factory.StartNew vs new Thread()

There is a big difference. Tasks are scheduled on the ThreadPool and may even be executed synchronously if appropriate.

If you have long-running background work, you should say so by passing the appropriate task creation option (TaskCreationOptions.LongRunning).

You should prefer the Task Parallel Library over explicit thread handling, as it is more optimized. You also get additional features such as continuations.
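To make the continuation feature concrete, here is a minimal sketch. The class and method names (ContinuationSketch, SumThenFormat) are made up for illustration; only Task.Factory.StartNew and ContinueWith come from the TPL itself.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ContinuationSketch
{
    // Hypothetical helper: sums 1..n in a task, then formats the result
    // in a continuation that runs once the first task has finished.
    public static Task<string> SumThenFormat(int n)
    {
        Task<int> sum = Task.Factory.StartNew(
            () => Enumerable.Range(1, n).Sum());

        // ContinueWith receives the completed antecedent task.
        return sum.ContinueWith(t => "Sum = " + t.Result);
    }

    public static void Main()
    {
        Console.WriteLine(SumThenFormat(10).Result); // prints "Sum = 55"
    }
}
```

Doing the same with raw threads would need a Join plus some shared variable to hand the result over, which is the kind of plumbing the answer above suggests avoiding.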

What is the difference between Task.Factory.StartNew and new Thread().Start()?

  • Task.Factory.StartNew : Starts a new task that will run on a thread pool thread or may run on the same thread. If it runs on a thread pool thread, the thread is returned to the pool when done. Thread creation/destruction is an expensive process.

  • new Thread().Start() : Will always run in a new thread, therefore, it is more expensive.
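The two bullets above can be checked with a small sketch along these lines. The class and method names are hypothetical; Thread.IsThreadPoolThread is the standard property used to tell the two cases apart. The scheduler is passed explicitly so the example does not depend on where it is called from.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class PoolVersusDedicated
{
    // Runs a delegate through Task.Factory.StartNew on the default scheduler
    // and reports whether it executed on a thread pool thread.
    public static bool TaskRanOnPoolThread()
    {
        return Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            CancellationToken.None,
            TaskCreationOptions.None,
            TaskScheduler.Default).Result;
    }

    // Runs a delegate on a freshly created thread and reports the same flag.
    public static bool ThreadRanOnPoolThread()
    {
        bool onPool = true;
        var thread = new Thread(() => { onPool = Thread.CurrentThread.IsThreadPoolThread; });
        thread.Start();
        thread.Join(); // wait so the flag is safe to read
        return onPool;
    }

    public static void Main()
    {
        Console.WriteLine("Task on pool thread:   {0}", TaskRanOnPoolThread());
        Console.WriteLine("Thread on pool thread: {0}", ThreadRanOnPoolThread());
    }
}
```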

The exception means that there is an ongoing data set being read from somewhere else, and you are trying to open another data reader at the same time. When using tasks, maybe you are executing one task after the other, and that is why you don't get the exception.

The exception is not threading related. You can get the same exception by opening a data reader and then trying to open a new one without closing the first.

I would suggest reviewing your code with that in mind, and making sure you need threading before using it. Overusing multithreading creates performance problems and incredibly ugly bugs.

Is Task.Factory.StartNew() guaranteed to use another thread than the calling thread?

I mailed Stephen Toub - a member of the PFX Team - about this question. He's come back to me really quickly, with a lot of detail - so I'll just copy and paste his text here. I haven't quoted it all, as reading a large amount of quoted text ends up getting less comfortable than vanilla black-on-white, but really, this is Stephen - I don't know this much stuff :) I've made this answer community wiki to reflect that all the goodness below isn't really my content:

If you call Wait() on a Task that's completed, there won't be any blocking (it'll just throw an exception if the task completed with a TaskStatus other than RanToCompletion, or otherwise return as a nop). If you call Wait() on a Task that's already executing, it must block as there’s nothing else it can reasonably do (when I say block, I'm including both true kernel-based waiting and spinning, as it'll typically do a mixture of both). Similarly, if you call Wait() on a Task that has the Created or WaitingForActivation status, it’ll block until the task has completed. None of those is the interesting case being discussed.

The interesting case is when you call Wait() on a Task in the WaitingToRun state, meaning that it’s previously been queued to a TaskScheduler but that TaskScheduler hasn't yet gotten around to actually running the Task's delegate yet. In that case, the call to Wait will ask the scheduler whether it's ok to run the Task then-and-there on the current thread, via a call to the scheduler's TryExecuteTaskInline method. This is called inlining. The scheduler can choose to either inline the task via a call to base.TryExecuteTask, or it can return 'false' to indicate that it is not executing the task (often this is done with logic like...

return SomeSchedulerSpecificCondition() ? false : TryExecuteTask(task);

The reason TryExecuteTask returns a Boolean is that it handles the synchronization to ensure a given Task is only ever executed once). So, if a scheduler wants to completely prohibit inlining of the Task during Wait, it can just be implemented as return false; If a scheduler wants to always allow inlining whenever possible, it can just be implemented as:

return TryExecuteTask(task);

In the current implementation (both .NET 4 and .NET 4.5, and I don’t personally expect this to change), the default scheduler that targets the ThreadPool allows for inlining if the current thread is a ThreadPool thread and if that thread was the one to have previously queued the task.

Note that there isn't arbitrary reentrancy here, in that the default scheduler won’t pump arbitrary threads when waiting for a task... it'll only allow that task to be inlined, and of course any inlining that task in turn decides to do. Also note that Wait won’t even ask the scheduler in certain conditions, instead preferring to block. For example, if you pass in a cancelable CancellationToken, or if you pass in a non-infinite timeout, it won’t try to inline because it could take an arbitrarily long amount of time to inline the task's execution, which is all or nothing, and that could end up significantly delaying the cancellation request or timeout. Overall, TPL tries to strike a decent balance here between wasting the thread that’s doing the Wait'ing and reusing that thread for too much. This kind of inlining is really important for recursive divide-and-conquer problems (e.g. QuickSort) where you spawn multiple tasks and then wait for them all to complete. If such were done without inlining, you’d very quickly deadlock as you exhaust all threads in the pool and any future ones it wanted to give to you.

Separate from Wait, it’s also (remotely) possible that the Task.Factory.StartNew call could end up executing the task then and there, iff the scheduler being used chose to run the task synchronously as part of the QueueTask call. None of the schedulers built into .NET will ever do this, and I personally think it would be a bad design for scheduler, but it’s theoretically possible, e.g.:

protected override void QueueTask(Task task)
{
    // QueueTask returns void, so the result of TryExecuteTask is discarded.
    TryExecuteTask(task);
}

The overload of Task.Factory.StartNew that doesn’t accept a TaskScheduler uses the scheduler from the TaskFactory, which in the case of Task.Factory targets TaskScheduler.Current. This means if you call Task.Factory.StartNew from within a Task queued to this mythical RunSynchronouslyTaskScheduler, it would also queue to RunSynchronouslyTaskScheduler, resulting in the StartNew call executing the Task synchronously. If you’re at all concerned about this (e.g. you’re implementing a library and you don’t know where you’re going to be called from), you can explicitly pass TaskScheduler.Default to the StartNew call, use Task.Run (which always goes to TaskScheduler.Default), or use a TaskFactory created to target TaskScheduler.Default.
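A rough sketch of the scenario just described, under stated assumptions: the RunSynchronouslyTaskScheduler below is a guess at what the "mythical" scheduler would look like (it is not a built-in type), and SchedulerSketch and Observe are illustrative names. It records whether StartNew without a scheduler had already finished when it returned, and whether passing TaskScheduler.Default really runs the delegate under the default scheduler.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical scheduler: executes every task synchronously inside QueueTask.
// For illustration only; the text above calls this a bad design.
public sealed class RunSynchronouslyTaskScheduler : TaskScheduler
{
    protected override void QueueTask(Task task)
    {
        TryExecuteTask(task);
    }

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return TryExecuteTask(task);
    }

    protected override IEnumerable<Task> GetScheduledTasks()
    {
        return Enumerable.Empty<Task>(); // nothing is ever left waiting in a queue
    }
}

public static class SchedulerSketch
{
    // Returns two observations made inside a task on the synchronous scheduler:
    // [0] StartNew without a scheduler had completed by the time it returned;
    // [1] StartNew with TaskScheduler.Default ran under the default scheduler.
    public static bool[] Observe()
    {
        var seen = new bool[2];
        Task outer = Task.Factory.StartNew(() =>
        {
            // No scheduler given, so TaskScheduler.Current (the synchronous one) is used.
            Task implicitChild = Task.Factory.StartNew(() => { });
            seen[0] = implicitChild.IsCompleted;

            Task<bool> explicitChild = Task.Factory.StartNew(
                () => TaskScheduler.Current == TaskScheduler.Default,
                CancellationToken.None,
                TaskCreationOptions.None,
                TaskScheduler.Default);
            seen[1] = explicitChild.Result;
        },
        CancellationToken.None,
        TaskCreationOptions.None,
        new RunSynchronouslyTaskScheduler());

        outer.Wait();
        return seen;
    }

    public static void Main()
    {
        bool[] seen = Observe();
        Console.WriteLine("Implicit StartNew ran synchronously: {0}", seen[0]);
        Console.WriteLine("Explicit StartNew used Default:      {0}", seen[1]);
    }
}
```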


EDIT: Okay, it looks like I was completely wrong, and a thread which is currently waiting on a task can be hijacked. Here's a simpler example of this happening:

using System;
using System.Threading;
using System.Threading.Tasks;

namespace ConsoleApplication1 {
    class Program {
        static void Main() {
            for (int i = 0; i < 10; i++)
            {
                Task.Factory.StartNew(Launch).Wait();
            }
        }

        static void Launch()
        {
            Console.WriteLine("Launch thread: {0}",
                              Thread.CurrentThread.ManagedThreadId);
            Task.Factory.StartNew(Nested).Wait();
        }

        static void Nested()
        {
            Console.WriteLine("Nested thread: {0}",
                              Thread.CurrentThread.ManagedThreadId);
        }
    }
}

Sample output:

Launch thread: 3
Nested thread: 3
Launch thread: 3
Nested thread: 3
Launch thread: 3
Nested thread: 3
Launch thread: 3
Nested thread: 3
Launch thread: 4
Nested thread: 4
Launch thread: 4
Nested thread: 4
Launch thread: 4
Nested thread: 4
Launch thread: 4
Nested thread: 4
Launch thread: 4
Nested thread: 4
Launch thread: 4
Nested thread: 4

As you can see, there are lots of times when the waiting thread is reused to execute the new task. This can happen even if the thread has acquired a lock. Nasty re-entrancy. I am suitably shocked and worried :(
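If the re-entrancy is a concern, Stephen's note that Wait does not ask the scheduler to inline when given a non-infinite timeout suggests one workaround. The sketch below assumes that behaviour holds for the runtime in use (it is an implementation detail, not a documented guarantee); the class and method names and the 30 second timeout are arbitrary choices for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class NoInlineWaitSketch
{
    // Returns { launchThreadId, nestedThreadId }.
    public static int[] LaunchWithTimedWait()
    {
        return Task.Factory.StartNew(() =>
        {
            int launchId = Thread.CurrentThread.ManagedThreadId;

            Task<int> nested = Task.Factory.StartNew(
                () => Thread.CurrentThread.ManagedThreadId);

            // Assumption from the text above: with a finite timeout, Wait blocks
            // instead of running the nested task inline on this thread.
            if (!nested.Wait(TimeSpan.FromSeconds(30)))
                throw new TimeoutException("nested task did not finish in time");

            return new[] { launchId, nested.Result };
        }).Result;
    }

    public static void Main()
    {
        int[] ids = LaunchWithTimedWait();
        Console.WriteLine("Launch thread: {0}", ids[0]);
        Console.WriteLine("Nested thread: {0}", ids[1]);
    }
}
```

The trade-off is the one the quoted text warns about: the waiting thread now sits idle, so deeply recursive code written this way can run the pool short of threads.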

Newly created threads using Task.Factory.StartNew start very slowly

I found out that the thread pool can be unwilling to start more than one new thread every 500 msec once the number of thread pool threads in use exceeds a specific value. However, increasing the minimum to 100 with ThreadPool.SetMinThreads, even though it is not recommended, lets me create 100 threads without the 500 msec delay.

Here's what helped me:

  • http://alexpinsker.blogspot.com/2009/06/threadpool.html

  • http://msdn.microsoft.com/en-us/library/system.threading.threadpool.setminthreads%28v=vs.100%29.aspx

  • https://stackoverflow.com/a/13186389/600559

Edit:

Here's what I ended up doing in App.xaml.cs (in the constructor):

// Get thread pool information
int workerThreadsMin, completionPortThreadsMin;
ThreadPool.GetMinThreads(out workerThreadsMin, out completionPortThreadsMin);
int workerThreadsMax, completionPortThreadsMax;
ThreadPool.GetMaxThreads(out workerThreadsMax, out completionPortThreadsMax);

// Adjust min threads
ThreadPool.SetMinThreads(workerThreadsMax, completionPortThreadsMin);

Is Task.Factory.StartNew() guaranteed to create at least one new thread?

It depends on what you mean by "immediately" but I think it's reasonable to assume that the TPL isn't going to hijack your currently executing thread to synchronously run the code in your task, if that's what you mean. At least not with the normal scheduler... you could probably write your own scheduler which does do so, but you can normally assume that StartNew will schedule the task rather than just running it inline.

difference between new thread and task start new?

One key difference is that the Task approach will utilise the thread pool.

This is important as it means that you will only be creating as many threads as absolutely necessary. Where possible, existing threads will be re-used, giving the performance benefit of not having to create fresh threads.

If you are creating lots of threads for relatively short-running operations, the above benefit becomes more important. If, however, there are only one or a few long-running operations, the benefit is smaller.
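The reuse can be observed with a small sketch like the following. PoolReuseSketch and DistinctThreadsUsed are hypothetical names, and the exact count printed will vary by machine and runtime, so no expected output is shown.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class PoolReuseSketch
{
    // Runs `count` short tasks one after another on the default scheduler
    // and returns how many distinct threads ended up executing them.
    public static int DistinctThreadsUsed(int count)
    {
        var ids = new HashSet<int>();
        for (int i = 0; i < count; i++)
        {
            int id = Task.Factory.StartNew(
                () => Thread.CurrentThread.ManagedThreadId,
                CancellationToken.None,
                TaskCreationOptions.None,
                TaskScheduler.Default).Result;
            ids.Add(id);
        }
        return ids.Count;
    }

    public static void Main()
    {
        Console.WriteLine("1000 tasks ran on {0} distinct thread(s)",
                          DistinctThreadsUsed(1000));
    }
}
```

Creating 1000 Thread objects for the same work would, by contrast, create 1000 threads.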

Task.Factory.StartNew starts with a great delay despite having available threads in threadpool

It's not the MAX worker threads value you need to look at - it's the MIN value you get via ThreadPool.GetMinThreads().

The max value is the absolute maximum number of threads that can be active. The min value is the number to always keep active. If you try to start a thread when the number of active threads is less than max (and greater than min), you'll see a 2-second delay.

You can change the minimum number of threads if absolutely necessary (which it is in some circumstances) but generally speaking if you find yourself needing to do that, you might need to think about redesigning your multithreading so that you don't need to.
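If you do decide to raise the minimum, a more conservative variant than setting it to the maximum might look like this sketch. The helper name EnsureMinWorkerThreads and the value 16 are illustrative assumptions; the ThreadPool calls themselves are the standard ones.

```csharp
using System;
using System.Threading;

public static class MinThreadsSketch
{
    // Raises the worker-thread minimum to at least `desired` (capped at the
    // current maximum) and leaves the completion-port minimum unchanged.
    // Returns the worker-thread minimum that is in effect afterwards.
    public static int EnsureMinWorkerThreads(int desired)
    {
        ThreadPool.GetMinThreads(out int minWorker, out int minIo);
        ThreadPool.GetMaxThreads(out int maxWorker, out _);

        int target = Math.Min(Math.Max(minWorker, desired), maxWorker);
        if (target != minWorker && !ThreadPool.SetMinThreads(target, minIo))
            throw new InvalidOperationException("SetMinThreads rejected the value");

        ThreadPool.GetMinThreads(out minWorker, out _);
        return minWorker;
    }

    public static void Main()
    {
        Console.WriteLine("Worker-thread minimum is now {0}",
                          EnsureMinWorkerThreads(16));
    }
}
```

Because the helper never lowers the minimum, calling it repeatedly is harmless, but the caution above about redesigning instead still applies.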

As the Microsoft documentation states:

By default, the minimum number of threads is set to the number of processors on a system. You can use the SetMinThreads method to increase the minimum number of threads. However, unnecessarily increasing these values can cause performance problems. If too many tasks start at the same time, all of them might appear to be slow. In most cases, the thread pool will perform better with its own algorithm for allocating threads. Reducing the minimum to less than the number of processors can also hurt performance.

ThreadPool.QueueUserWorkItem vs Task.Factory.StartNew

If you're going to start a long-running task with TPL, you should specify TaskCreationOptions.LongRunning, which will mean it doesn't schedule it on the thread-pool. (EDIT: As noted in comments, this is a scheduler-specific decision, and isn't a hard and fast guarantee, but I'd hope that any sensible production scheduler would avoid scheduling long-running tasks on a thread pool.)

You definitely shouldn't schedule a large number of long-running tasks on the thread pool yourself. I believe that these days the default size of the thread pool is pretty large (because it's often abused in this way) but fundamentally it shouldn't be used like this.

The point of the thread pool is to avoid short tasks taking a large hit from creating a new thread, compared with the time they're actually running. If the task will be running for a long time, the impact of creating a new thread will be relatively small anyway - and you don't want to end up potentially running out of thread pool threads. (It's less likely now, but I did experience it on earlier versions of .NET.)

Personally if I had the option, I'd definitely use TPL on the grounds that the Task API is pretty nice - but do remember to tell TPL that you expect the task to run for a long time.
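As a minimal sketch of "telling TPL": the helper below (the names are made up) starts a task with TaskCreationOptions.LongRunning on the default scheduler and reports whether it ended up on a thread pool thread. With the default scheduler in current .NET implementations it is given a dedicated thread, but as noted above that is a scheduler decision rather than a guarantee.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningSketch
{
    // Starts a LongRunning task on the default scheduler and returns whether
    // its delegate executed on a thread pool thread.
    public static bool LongRunningUsesPoolThread()
    {
        return Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            CancellationToken.None,
            TaskCreationOptions.LongRunning, // hint: do not tie up a pool thread
            TaskScheduler.Default).Result;
    }

    public static void Main()
    {
        Console.WriteLine("LongRunning task on pool thread: {0}",
                          LongRunningUsesPoolThread());
    }
}
```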

EDIT: As noted in comments, see also the PFX team's blog post on choosing between the TPL and the thread pool:

In conclusion, I’ll reiterate what the CLR team’s ThreadPool developer has already stated:

Task is now the preferred way to queue work to the thread pool.

EDIT: Also from comments, don't forget that TPL allows you to use custom schedulers, if you really want to...

Regarding usage of Task.Start() , Task.Run() and Task.Factory.StartNew()

Task.Run is a shorthand for Task.Factory.StartNew with specific safe arguments:

Task.Factory.StartNew(
    action,
    CancellationToken.None,
    TaskCreationOptions.DenyChildAttach,
    TaskScheduler.Default);

It was added in .NET 4.5 to help with the increasingly frequent usage of async and offloading work to the ThreadPool.
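Side by side, the two forms look like this. The wrapper names ViaRun and ViaStartNew are hypothetical; the argument list is the one given above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class RunEquivalenceSketch
{
    // The short form.
    public static Task<int> ViaRun(Func<int> work)
    {
        return Task.Run(work);
    }

    // The long form with the same arguments Task.Run supplies.
    public static Task<int> ViaStartNew(Func<int> work)
    {
        return Task.Factory.StartNew(
            work,
            CancellationToken.None,
            TaskCreationOptions.DenyChildAttach,
            TaskScheduler.Default);
    }

    public static void Main()
    {
        Console.WriteLine(ViaRun(() => 6 * 7).Result);      // prints 42
        Console.WriteLine(ViaStartNew(() => 6 * 7).Result); // prints 42
    }
}
```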

Task.Factory.StartNew (added with the TPL in .NET 4.0) is much more robust. You should only use it if Task.Run isn't enough, for example when you want to use TaskCreationOptions.LongRunning (though that is unnecessary when the delegate is async; more on that on my blog: LongRunning Is Useless For Task.Run With async-await). More on Task.Factory.StartNew in Task.Run vs Task.Factory.StartNew.

Don't ever create a Task and call Start() unless you find an extremely good reason to do so. It should only be used if you have some part that needs to create tasks but not schedule them and another part that schedules without creating. That's almost never an appropriate solution and could be dangerous. More in "Task.Factory.StartNew" vs "new Task(...).Start"

In conclusion, mostly use Task.Run, use Task.Factory.StartNew if you must and never use Start.


