Why Does the EF 6 Tutorial Use Asynchronous Calls

Why does the EF 6 tutorial use asynchronous calls?

In order to decide whether to go async or sync, compare the benefits and costs:

Async:

  • You almost never exhaust the thread pool with async (the circumstances would have to be extreme)
  • Pretty much arbitrary levels of concurrency (concurrent requests and operations)
  • Saves about 1 MB of memory for each thread saved
  • Safe intra-request concurrency thanks to the SynchronizationContext
  • Can increase throughput by low double-digit percentages in high-load cases by reducing OS scheduling overhead. That said, almost no production app is under high CPU load, because if it were, it would be close to unavailability (in case of a load spike the app starts dropping requests)

Sync:

  • Simpler code: await makes 99% of the cases (almost) as simple as synchronous code. That said, the 10+ async questions each day on Stack Overflow tell a different story. Edge cases arise when you deviate from the simple path, and also when you use legacy libraries that, for example, require you to hand them a synchronous callback
  • Less work for coding and debugging
  • Profiler-friendly (You can profile the app or just pause the debugger and see what the app is doing right now. Not possible with async.)
  • Interoperates perfectly with legacy code and libraries

Choose async with ASP.NET if you are calling high-latency services. A web service is likely to be high latency. An OLTP database is almost always low-latency.

Choose async if your application benefits from very high levels of concurrency (100+). Most applications do not have such high levels, or their back-end services would not sustain such an amount of load. No point in making the web app scale but overload the back-end. All systems in the call chain must benefit from a high degree of concurrency in order for async to be beneficial.

Typical high-latency services (good cases for async):

  • Web-services
  • Waiting (e.g. sleep)
  • Throttling (SemaphoreSlim, ...)
  • Some cloud services (Azure)
  • Long-running queries to the database (e.g. reporting or ETL)

Typical low-latency services (good cases for sync):

  • Database calls: Most OLTP queries are low-latency because you can assume the database server to not be overloaded. No point in throwing 100s of concurrent queries at it. Doesn't make them complete any faster.
  • File system: The same as databases.

These are categorized by the typical case. All of these can be in the opposite category as well.

You can mix sync and async in the same app. Use async when it is at its sweet spot.
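As a rough illustration of the high-latency case, the sketch below throttles many simulated slow calls with SemaphoreSlim. SlowCallAsync, RunAsync and the 100 ms delay are hypothetical stand-ins rather than a real service; the only point is that the pending waits do not each occupy a thread.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledCalls
{
    // Hypothetical stand-in for a high-latency call (e.g. a web service).
    private static async Task<int> SlowCallAsync(int id)
    {
        await Task.Delay(100); // no thread is blocked while waiting
        return id * 2;
    }

    // Starts all calls, but lets at most maxConcurrency of them run at once.
    public static async Task<int[]> RunAsync(int count, int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = Enumerable.Range(0, count).Select(async id =>
            {
                await gate.WaitAsync(); // asynchronous wait for a free slot
                try { return await SlowCallAsync(id); }
                finally { gate.Release(); }
            }).ToArray();

            return await Task.WhenAll(tasks); // results keep the input order
        }
    }

    public static void Main()
    {
        int[] results = RunAsync(200, 20).GetAwaiter().GetResult();
        Console.WriteLine(results.Length); // 200
    }
}
```

Whether this pays off still depends on the back-end being able to sustain the chosen concurrency level.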

So why are Microsoft and the Entity Framework team promoting async usage? Here comes the subjective part of this answer: It might be Microsoft's internal policy. They might anticipate EF usage in client apps (for which async is great). Or they don't realize that async database calls are almost always a waste of developers' time, with no benefit. Most people don't realize this because async is considered the way to go these days.

Entity Framework 6 - Enforce asynchronous queries, compile time prevent synchronous calls

Following up on this: I never found a solution that can detect this at compile time, but I was able to do it in code in the DataContext:

    public EfMyCustomContext(string connectionString)
        : base(string.Format(CONNECTION_STRING, connectionString))
    {
    #if DEBUG
        this.Database.Log = LogDataBaseCall;
    #endif
    }

    #if DEBUG
    private void LogDataBaseCall(string s)
    {
        if (s.Contains("Executing "))
        {
            if (!s.Contains("asynchronously"))
            {
                // This code was not executed asynchronously.
                // Please look at the stack trace, and identify what needs
                // to be loaded. Note, an entity.SomeOtherEntityOrCollection can be loaded
                // with the eConnect API call entity.SomeOtherEntityOrCollectionLoadAsync() before using the
                // object that is going to hit the sub object. This is the most common mistake
                // and this breakpoint will help you identify all synchronous code.
                // eConnect does not want any synchronous code in the code base.
                System.Diagnostics.Debugger.Break();
            }
        }
    }
    #endif

Hope this helps someone else. I would still love to have an option that catches this at compile time.
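The string check in the logger above can be pulled out into a small function that is easy to unit test. This is only a sketch: SyncQueryDetector and IsSynchronousExecution are hypothetical names, and the sample log lines are modeled on EF6's log output, so their exact wording is an assumption.

```csharp
using System;

public static class SyncQueryDetector
{
    // True when a log line reports a command execution without the async marker.
    public static bool IsSynchronousExecution(string logLine)
    {
        if (string.IsNullOrEmpty(logLine)) return false;
        return logLine.Contains("Executing ")
            && !logLine.Contains("asynchronously");
    }

    public static void Main()
    {
        // Assumed sample lines; real EF6 output may differ in detail.
        Console.WriteLine(IsSynchronousExecution("-- Executing at 10/8/2013 10:55:41 AM -07:00"));                // True
        Console.WriteLine(IsSynchronousExecution("-- Executing asynchronously at 10/8/2013 10:55:41 AM -07:00")); // False
    }
}
```

The Database.Log callback could then call this function and break, log, or throw, depending on how strict the debug build should be.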

EF6 Async Methods Confusion

From the same EF docs you quoted:

For the moment, EF will detect if the developer attempts to execute
two async operations at one time and throw.

So, this code should work even if there's a thread switch after await, because it's still executed sequentially:

var dbContext = new DbContext();
var something = await dbContext.someEntities.FirstOrDefaultAsync(e => e.Id == 1);
var morething = await dbContext.someEntities.FirstOrDefaultAsync(e => e.Id == 2);

At least, this is how it is expected to work. If sequential execution still produces a threading-related exception, this should be reported as an EF bug.

On the other hand, the following code will most likely fail, because we introduce parallelism:

var dbContext = new DbContext();
var somethingTask = dbContext.someEntities.FirstOrDefaultAsync(e => e.Id == 1);
var morethingTask = dbContext.someEntities.FirstOrDefaultAsync(e => e.Id == 2);

await Task.WhenAll(somethingTask, morethingTask);

var something = somethingTask.Result;
var morething = morethingTask.Result;

You need to make sure you don't use the same DbContext with more than one pending EF operation.
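To make the difference between the two fragments concrete, here is a minimal sketch of a context-like class that permits only one pending operation at a time. SingleOperationContext, QueryAsync and OverlapDemo are hypothetical, the delay simulates a database round trip, and this is not EF's actual implementation; the exception type is chosen only for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a context that allows one pending operation at a time.
public sealed class SingleOperationContext
{
    private int _busy; // 0 = idle, 1 = an operation is pending

    public async Task<int> QueryAsync(int id)
    {
        // Atomically claim the context; fail if it is already claimed.
        if (Interlocked.CompareExchange(ref _busy, 1, 0) != 0)
            throw new NotSupportedException("A second operation started before the previous one completed.");
        try
        {
            await Task.Delay(50); // simulated database round trip
            return id;
        }
        finally
        {
            Volatile.Write(ref _busy, 0); // release the context
        }
    }
}

public static class OverlapDemo
{
    public static async Task<string> DemoAsync()
    {
        var context = new SingleOperationContext();

        // Sequential: each call is awaited before the next starts, so this succeeds.
        int a = await context.QueryAsync(1);
        int b = await context.QueryAsync(2);

        // Overlapping: the second call starts while the first is still pending.
        Task<int> first = context.QueryAsync(3);
        try
        {
            await context.QueryAsync(4);
            return "no exception";
        }
        catch (NotSupportedException)
        {
            await first; // let the first operation finish
            return "sequential ok (" + a + ", " + b + "), overlapping call threw";
        }
    }

    public static void Main()
    {
        Console.WriteLine(DemoAsync().GetAwaiter().GetResult());
    }
}
```

A thread switch after await does not trip the guard; only a second call issued before the first one completes does.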

Update: the first code fragment actually works fine with EF v6.1.0, as expected.

ASP.NET Entity Framework 6 - EF6, mixing async and sync in the same unit of work

Your code, as it is, is "correct", meaning it does not have a fundamental flaw.

Let's analyze what happens, along with my 2 cents on each point:

  • When you query the database (the FirstAsync call), using await means the instructions are executed sequentially, but the thread is not blocked (asynchronous call) and can be used for other things. This is probably a good thing: in a server application, you may want your threads to be able to process other requests while waiting for the database response.

  • The non-async SaveChanges will, however, block the thread. This in itself is not an error, but since it is also an I/O operation against the database, it might block the thread for a significant time. It would indeed seem more "consistent" to await an async version here as well.

Is that an issue? It depends. Is your application heavily used? Are your users experiencing poor responsiveness under heavy load? If so, using async on save might be a potential improvement.

Otherwise, and as suggested in the linked MSDN article, my advice would be not to worry too much until you know it has an impact.

The mix of usages is not per se a problem.

Personally, I would go for an async SaveChanges as well, for consistency and on the assumption that a DB write is an I/O operation that could be slow.

If it's a desktop application, just make sure that you are not blocking the UI thread.

Multi-async in Entity Framework 6?

The exception explains clearly that there is only one asynchronous operation per context allowed at a time.

So, you either have to await them one at a time as the error message suggests:

var banner = await context.Banners.ToListAsync();
var newsGroup = await context.NewsGroups.ToListAsync();

Or you can use multiple contexts:

var banner = context1.Banners.ToListAsync();
var newsGroup = context2.NewsGroups.ToListAsync();
await Task.WhenAll(banner, newsGroup);

EF Data Context - Async/Await & Multithreading

We have a stalemate situation here. AspNetSynchronizationContext, which is responsible for the threading model of an ASP.NET Web API execution environment, does not guarantee that the asynchronous continuation after await will take place on the same thread. The whole idea of this is to make ASP.NET apps more scalable, so that fewer threads from the ThreadPool are blocked by pending synchronous operations.

However, the DataContext class (part of LINQ to SQL) is not thread-safe, so it shouldn't be used where a thread switch may potentially occur across DataContext API calls. A separate using construct per asynchronous call will not help, either:

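    // SomeEntity is a placeholder for the entity type; "var" cannot be declared without an initializer.
    SomeEntity something;
    using (var dataContext = new DataContext())
    {
        something = await dataContext.someEntities.FirstOrDefaultAsync(e => e.Id == 1);
    }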

That's because DataContext.Dispose might be executed on a different thread from the one the object was originally created on, and this is not something DataContext would expect.

If you'd like to stick with the DataContext API, calling it synchronously appears to be the only feasible option. I'm not sure whether that statement should be extended to the whole EF API, but I suppose any child objects created with the DataContext API are probably not thread-safe either. Thus, in ASP.NET their using scope should be limited to the code between two adjacent await calls.

It might be tempting to offload a bunch of synchronous DataContext calls to a separate thread with await Task.Run(() => { /* do DataContext stuff here */ }). However, that'd be a known anti-pattern, especially in the context of ASP.NET where it might only hurt performance and scalability, as it would not reduce the number of threads required to fulfill the request.

Unfortunately, while the asynchronous architecture of ASP.NET is great, it remains incompatible with some established APIs and patterns (e.g., here is a similar case). That's especially sad, because we're not dealing with concurrent API access here, i.e. no more than one thread is trying to access a DataContext object at the same time.

Hopefully, Microsoft will address that in the future versions of the Framework.

[UPDATE] On a large scale though, it might be possible to offload the EF logic to a separate process (run as a WCF service) which would provide a thread-safe async API to the ASP.NET client logic. Such a process can be orchestrated with a custom synchronization context as an event machine, similar to Node.js. It may even run a pool of Node.js-like apartments, each apartment maintaining the thread affinity for EF objects. That would still make it possible to benefit from the async EF API.

[UPDATE] Here is some attempt to find a solution to this problem.

Async/Await with Entity Framework 6.1.1 and impersonation

Set legacyImpersonationPolicy to false and alwaysFlowImpersonationPolicy to true inside your web.config, then restart IIS:

    <configuration>
      <runtime>
        <legacyImpersonationPolicy enabled="false"/>
        <alwaysFlowImpersonationPolicy enabled="true"/>
      </runtime>
    </configuration>

Entity Framework async operation takes ten times as long to complete

I found this question very interesting, especially since I'm using async everywhere with Ado.Net and EF 6. I was hoping someone would give an explanation for this question, but it didn't happen. So I tried to reproduce this problem on my side. I hope some of you will find this interesting.

First, the good news: I reproduced it :) And the difference is enormous, a factor of 8...

[Screenshot: first results]

At first I suspected something related to CommandBehavior, since I had read an interesting article about async with Ado, which says:

"Since non-sequential access mode has to store the data for the entire row, it can cause issues if you are reading a large column from the server (such as varbinary(MAX), varchar(MAX), nvarchar(MAX) or XML)."

I suspected that the ToList() calls used CommandBehavior.SequentialAccess and the async ones used CommandBehavior.Default (non-sequential, which can cause issues). So I downloaded EF6's sources and put breakpoints everywhere (wherever CommandBehavior was used, of course).

Result: nothing. All the calls are made with CommandBehavior.Default... So I tried to step into the EF code to understand what happens, and... ouch. I have never seen such delegating code; everything seems to be lazily executed...

So I tried to do some profiling to understand what happens...

And I think I have something...

Here's the model used to create the table I benchmarked, with 3500 rows in it and 256 KB of random data in each varbinary(MAX) column (EF 6.1 - CodeFirst - CodePlex):

    public class TestContext : DbContext
    {
        public TestContext()
            : base(@"Server=(localdb)\v11.0;Integrated Security=true;Initial Catalog=BENCH") // Local instance; single backslash inside a verbatim string
        {
        }

        public DbSet<TestItem> Items { get; set; }
    }

    public class TestItem
    {
        public int ID { get; set; }
        public string Name { get; set; }
        public byte[] BinaryData { get; set; }
    }

And here's the code I used to create the test data, and benchmark EF.

    using (TestContext db = new TestContext())
    {
        if (!db.Items.Any())
        {
            foreach (int i in Enumerable.Range(0, 3500)) // Fill 3500 lines
            {
                byte[] dummyData = new byte[1 << 18]; // with 256 Kbyte
                new Random().NextBytes(dummyData);
                db.Items.Add(new TestItem() { Name = i.ToString(), BinaryData = dummyData });
            }
            await db.SaveChangesAsync();
        }
    }

    using (TestContext db = new TestContext()) // EF Warm Up
    {
        var warmItUp = db.Items.FirstOrDefault();
        warmItUp = await db.Items.FirstOrDefaultAsync();
    }

    Stopwatch watch = new Stopwatch();
    using (TestContext db = new TestContext())
    {
        watch.Start();
        var testRegular = db.Items.ToList();
        watch.Stop();
        Console.WriteLine("non async : " + watch.ElapsedMilliseconds);
    }

    using (TestContext db = new TestContext())
    {
        watch.Restart();
        var testAsync = await db.Items.ToListAsync();
        watch.Stop();
        Console.WriteLine("async : " + watch.ElapsedMilliseconds);
    }

    using (var connection = new SqlConnection(CS)) // CS: connection string of the benchmark database
    {
        await connection.OpenAsync();
        using (var cmd = new SqlCommand("SELECT ID, Name, BinaryData FROM dbo.TestItems", connection))
        {
            watch.Restart();
            List<TestItem> itemsWithAdo = new List<TestItem>();
            var reader = await cmd.ExecuteReaderAsync(CommandBehavior.SequentialAccess);
            while (await reader.ReadAsync())
            {
                var item = new TestItem();
                item.ID = (int)reader[0];
                item.Name = (String)reader[1];
                item.BinaryData = (byte[])reader[2];
                itemsWithAdo.Add(item);
            }
            watch.Stop();
            Console.WriteLine("ExecuteReaderAsync SequentialAccess : " + watch.ElapsedMilliseconds);
        }
    }

    using (var connection = new SqlConnection(CS))
    {
        await connection.OpenAsync();
        using (var cmd = new SqlCommand("SELECT ID, Name, BinaryData FROM dbo.TestItems", connection))
        {
            watch.Restart();
            List<TestItem> itemsWithAdo = new List<TestItem>();
            var reader = await cmd.ExecuteReaderAsync(CommandBehavior.Default);
            while (await reader.ReadAsync())
            {
                var item = new TestItem();
                item.ID = (int)reader[0];
                item.Name = (String)reader[1];
                item.BinaryData = (byte[])reader[2];
                itemsWithAdo.Add(item);
            }
            watch.Stop();
            Console.WriteLine("ExecuteReaderAsync Default : " + watch.ElapsedMilliseconds);
        }
    }

    using (var connection = new SqlConnection(CS))
    {
        await connection.OpenAsync();
        using (var cmd = new SqlCommand("SELECT ID, Name, BinaryData FROM dbo.TestItems", connection))
        {
            watch.Restart();
            List<TestItem> itemsWithAdo = new List<TestItem>();
            var reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess);
            while (reader.Read())
            {
                var item = new TestItem();
                item.ID = (int)reader[0];
                item.Name = (String)reader[1];
                item.BinaryData = (byte[])reader[2];
                itemsWithAdo.Add(item);
            }
            watch.Stop();
            Console.WriteLine("ExecuteReader SequentialAccess : " + watch.ElapsedMilliseconds);
        }
    }

    using (var connection = new SqlConnection(CS))
    {
        await connection.OpenAsync();
        using (var cmd = new SqlCommand("SELECT ID, Name, BinaryData FROM dbo.TestItems", connection))
        {
            watch.Restart();
            List<TestItem> itemsWithAdo = new List<TestItem>();
            var reader = cmd.ExecuteReader(CommandBehavior.Default);
            while (reader.Read())
            {
                var item = new TestItem();
                item.ID = (int)reader[0];
                item.Name = (String)reader[1];
                item.BinaryData = (byte[])reader[2];
                itemsWithAdo.Add(item);
            }
            watch.Stop();
            Console.WriteLine("ExecuteReader Default : " + watch.ElapsedMilliseconds);
        }
    }

For the regular EF call (.ToList()), the profiling seems "normal" and is easy to read:

[Screenshot: ToList trace]

Here we find the 8.4 seconds we measured with the Stopwatch (profiling slows things down). We also find HitCount = 3500 along the call path, which is consistent with the 3500 rows in the test. On the TDS parser side, things start to get worse, since we see 118 353 calls to the TryReadByteArray() method, which is where the buffering loop occurs (an average of 33.8 calls for each 256 KB byte[]).

For the async case, it's really, really different. First, the .ToListAsync() call is scheduled on the ThreadPool and then awaited. Nothing amazing here. But now, here's the async hell on the ThreadPool:

[Screenshot: ToListAsync hell]

First, in the first case we had just 3500 hit counts along the full call path; here we have 118 371. Moreover, you have to imagine all the synchronization calls I didn't put on the screenshot...

Second, in the first case we had "just" 118 353 calls to the TryReadByteArray() method; here we have 2 050 210 calls! That's 17 times more (on a test with a large 1 MB array, it's 160 times more).
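The per-row averages and the ratio quoted above can be rechecked from the raw profiler counts. The snippet below is only that arithmetic; the class and method names are made up for illustration.

```csharp
using System;
using System.Globalization;

public static class ProfilerRatios
{
    // Ratio of two counts, rounded to one decimal place.
    public static double Ratio(long numerator, long denominator)
    {
        return Math.Round((double)numerator / denominator, 1);
    }

    public static void Main()
    {
        const long rows = 3500;
        const long syncReads = 118353;   // TryReadByteArray() calls with ToList()
        const long asyncReads = 2050210; // TryReadByteArray() calls with ToListAsync()

        var inv = CultureInfo.InvariantCulture;
        Console.WriteLine(Ratio(syncReads, rows).ToString(inv));       // 33.8 calls per row (sync)
        Console.WriteLine(Ratio(asyncReads, rows).ToString(inv));      // 585.8 calls per row (async)
        Console.WriteLine(Ratio(asyncReads, syncReads).ToString(inv)); // 17.3 times more calls
    }
}
```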

Moreover, there are:

  • 120 000 Task instances created
  • 727 519 Interlocked calls
  • 290 569 Monitor calls
  • 98 283 ExecutionContext instances, with 264 481 Captures
  • 208 733 SpinLock calls

My guess is that the buffering is done in an async way (and not a good one), with parallel Tasks trying to read data from the TDS. Too many Tasks are created just to parse the binary data.

As a preliminary conclusion, we can say async is great and EF6 is great, but EF6's use of async in its current implementation adds a major overhead on the performance side, the threading side, and the CPU side (12% CPU usage in the ToList() case and 20% in the ToListAsync() case, for work that takes 8 to 10 times longer; I ran it on an old i7 920).

While doing some tests, I was thinking about this article again and I noticed something I had missed:

"For the new asynchronous methods in .Net 4.5, their behavior is exactly the same as with the synchronous methods, except for one notable exception: ReadAsync in non-sequential mode."

What?!

So I extended my benchmarks to include Ado.Net with regular and async calls, and with CommandBehavior.SequentialAccess and CommandBehavior.Default, and here's a big surprise:

[Screenshot: with ado]

We have the exact same behavior with Ado.Net! Facepalm...

My definitive conclusion is: there's a bug in the EF 6 implementation. It should toggle the CommandBehavior to SequentialAccess when an async call is made over a table containing a varbinary(max) column. The problem of creating too many Tasks, which slows down the process, is on the Ado.Net side. The EF problem is that it doesn't use Ado.Net as it should.

So now you know: instead of using the EF6 async methods, you are better off calling EF in the regular non-async way and then using a TaskCompletionSource<T> to return the result in an async way.
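A minimal sketch of that workaround, assuming a synchronous loader: LoadItems stands in for a call such as db.Items.ToList(), and both it and LoadItemsAsync are hypothetical names. Note that the calling thread is still blocked while the query runs; the wrapper only keeps a Task-returning signature.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SyncBehindTask
{
    // Hypothetical stand-in for a synchronous EF query such as db.Items.ToList().
    private static List<string> LoadItems()
    {
        return new List<string> { "a", "b", "c" };
    }

    // Runs the query synchronously, then hands the result back through a Task.
    public static Task<List<string>> LoadItemsAsync()
    {
        var tcs = new TaskCompletionSource<List<string>>();
        try
        {
            tcs.SetResult(LoadItems());
        }
        catch (Exception ex)
        {
            tcs.SetException(ex); // surface failures through the task, as an async method would
        }
        return tcs.Task;
    }

    public static void Main()
    {
        List<string> items = LoadItemsAsync().GetAwaiter().GetResult();
        Console.WriteLine(items.Count); // 3
    }
}
```

When no exception handling is needed, Task.FromResult(LoadItems()) gives the same already-completed task with less code.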

Note 1: I edited my post because of a shameful error. I ran my first test over the network, not locally, and the limited bandwidth distorted the results. Here are the updated results.

Note 2: I didn't extend my test to other use cases (e.g. nvarchar(max) with a lot of data), but chances are the same behavior happens.

Note 3: Something usual for the ToList() case is the 12% CPU (1/8 of my CPU = 1 logical core). Something unusual is the maximum of 20% for the ToListAsync() case, as if the scheduler could not use all the threads. It's probably due to the excessive number of Tasks created, or maybe a bottleneck in the TDS parser, I don't know...


