Why Is Inserting Entities in EF 4.1 So Slow Compared to ObjectContext

As already indicated by Ladislav in the comment, you need to disable automatic change detection to improve performance:

context.Configuration.AutoDetectChangesEnabled = false;

This change detection is enabled by default in the DbContext API.

The reason DbContext behaves so differently from the ObjectContext API is that, when automatic change detection is enabled, many more functions of the DbContext API call DetectChanges internally than functions of the ObjectContext API do.

The functions that call DetectChanges by default are:

  • The Add, Attach, Find, Local, or Remove members on DbSet
  • The GetValidationErrors, Entry, or SaveChanges members on DbContext
  • The Entries method on DbChangeTracker

In particular, Add calls DetectChanges, and that is responsible for the poor performance you experienced.

In contrast to this, the ObjectContext API only calls DetectChanges automatically in SaveChanges, but not in AddObject and the other corresponding methods mentioned above. That's the reason why the default performance of ObjectContext is faster.

Why did they introduce this default automatic change detection in DbContext in so many functions? I am not sure, but disabling it and calling DetectChanges manually at the proper points is considered advanced and can easily introduce subtle bugs into your application, so use it with care.
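
If you do disable it, one cautious pattern is to turn automatic detection off only around the bulk operation, call DetectChanges once yourself, and restore the default afterwards. A minimal sketch of that idea (MyContext, MyEntity and entitiesToInsert are illustrative names, not from the question):

using (var context = new MyContext())
{
    try
    {
        // Turn off automatic DetectChanges for the bulk operation.
        context.Configuration.AutoDetectChangesEnabled = false;

        foreach (var entity in entitiesToInsert)
            context.Set<MyEntity>().Add(entity); // no DetectChanges call per Add now

        // Run change detection once, explicitly, before saving.
        context.ChangeTracker.DetectChanges();
        context.SaveChanges();
    }
    finally
    {
        // Restore the default so later code on this context behaves as expected.
        context.Configuration.AutoDetectChangesEnabled = true;
    }
}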

DbContext is very slow when adding and deleting

Try to add this to your DbContext tests:

dbContext.Configuration.AutoDetectChangesEnabled = false;

// Now do all your changes

dbContext.ChangeTracker.DetectChanges();
dbContext.SaveChanges();

and try to run your tests again.

There was an architectural change in the DbContext API: it checks for changes in entities every time you Add, Attach or Remove anything from the context. In the ObjectContext API this detection ran only when you triggered SaveChanges. It is a better solution for the most common scenarios, but it requires special handling for mass data processing.

Adding an object to the entity framework context takes about 1.5 seconds

As requested in the comments (in case this isn't closed as a duplicate), the slowdown was related to automatic change detection, which is on by default in the DbContext API.

To disable automatic change detection:

context.Configuration.AutoDetectChangesEnabled = false;

A much more complete description (which I certainly can't improve on here) can be found in this accepted answer:

Why is inserting entities in EF 4.1 so slow compared to ObjectContext?

What causes .Attach() to be slow in EF4?

I can confirm this slow behaviour and I also found the main reason. I've made a little test with the following model ...

public class MyClass
{
    public int Id { get; set; }
    public string P1 { get; set; }
    // ... properties P2 to P49, all of type string
    public string P50 { get; set; }
}

public class MyContext : DbContext
{
    public DbSet<MyClass> MyClassSet { get; set; }
}

... and this test program ...

using (var context = new MyContext())
{
    var list = new List<MyClass>();
    for (int i = 0; i < 1000; i++)
    {
        var m = new MyClass()
        {
            Id = i + 1,
            P1 = "Some text ....................................",
            // ... initialize P2 to P49, all with the same text
            P50 = "Some text ...................................."
        };
        list.Add(m);
    }

    Stopwatch watch = new Stopwatch();
    watch.Start();
    foreach (var entity in list)
    {
        context.Set<MyClass>().Attach(entity);
        context.Entry(entity).State = System.Data.EntityState.Modified;
    }
    watch.Stop();
    long time = watch.ElapsedMilliseconds;
}

Test 1

Exactly the code above:

--> time = 29.2 sec


Test 2

Comment out the line ...

//context.Entry(entity).State = System.Data.EntityState.Modified;

--> time = 15.3 sec


Test 3

Comment out the line ...

//context.Set<MyClass>().Attach(entity);

--> time = 57.3 sec

This result is very strange: I expected that calling Attach wouldn't be necessary, because changing the state attaches the entity anyway. Yet leaving Attach out made the loop even slower.


Test 4

Remove properties P6 to P50 (so we only have 5 strings in the entity), original code:

--> time = 3.4 sec

So, yes, obviously the number of properties strongly matters.


Test 5

Add the following line before the loop (model again with all 50 properties):

context.Configuration.AutoDetectChangesEnabled = false;

--> time = 1.4 sec


Test 6

Again with AutoDetectChangesEnabled = false but with only 5 properties:

--> time = 1.3 sec

So, without change tracking the number of properties doesn't matter so much anymore.


Conclusion

By far most of the time seems to be spent taking the snapshot of the attached object's properties for the change tracking mechanism. If you don't need it, disable change tracking for your code snippet. (I guess in your code you really don't need change tracking, because by setting the entity's state to Modified you basically mark all properties as changed anyway. So all columns get sent to the database in an update statement.)
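
Putting the findings together, a fast version of the original loop would essentially be Test 5 plus the final save (a sketch, reusing context and list from the test program above):

// Skip the per-call property snapshot taken by change tracking.
context.Configuration.AutoDetectChangesEnabled = false;

foreach (var entity in list)
{
    // Attach and mark as Modified; all columns are sent in the UPDATE anyway.
    context.Set<MyClass>().Attach(entity);
    context.Entry(entity).State = System.Data.EntityState.Modified;
}

context.SaveChanges();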

Edit

The test times above are in Debug mode, but Release mode doesn't make a big difference (for instance: Test 1 = 28.7 sec, Test 5 = 0.9 sec).

Inserting many rows with Entity Framework is extremely slow

One easy method is to use the EntityFramework.BulkInsert extension.

You can then do:

// Add all workers to database
var workforce = allWorkers.Values
    .Select(i => new Worker
    {
        Reference = i.EMPLOYEE_REF,
        Skills = i.GetSkills().Select(s => dbSkills[s]).ToArray(),
        DefaultRegion = "wa",
        DefaultEfficiency = i.TECH_EFFICIENCY
    });

db.BulkInsert(workforce);

Entity Framework is Too Slow. What are my options?

You should start by profiling the SQL commands actually issued by Entity Framework. Depending on your configuration (POCO, self-tracking entities) there is a lot of room for optimization. You can debug the SQL commands (which shouldn't differ between debug and release mode) using the ObjectSet<T>.ToTraceString() method. If you encounter a query that requires further optimization, you can use some projections to give EF more information about what you are trying to accomplish.
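
For instance, the generated SQL for a query can be inspected like this (a sketch, assuming the same db context and Product entity as in the example below; the LINQ query is cast to ObjectQuery<Product> to reach ToTraceString):

var query = (ObjectQuery<Product>)db.Products.Where(p => p.Id == 10);
Console.WriteLine(query.ToTraceString()); // prints the SQL that EF will execute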

Projection example:

Product product = db.Products.SingleOrDefault(p => p.Id == 10);
// executes SELECT * FROM Products WHERE Id = 10

ProductDto dto = new ProductDto();
foreach (Category category in product.Categories)
// executes SELECT * FROM Categories WHERE ProductId = 10
{
    dto.Categories.Add(new CategoryDto { Name = category.Name });
}

Could be replaced with:

var query = from p in db.Products
            where p.Id == 10
            select new
            {
                p.Name,
                Categories = from c in p.Categories select c.Name
            };
ProductDto dto = new ProductDto();
foreach (var categoryName in query.Single().Categories)
// Executes SELECT p.Id, c.Name FROM Products as p, Categories as c WHERE p.Id = 10 AND p.Id = c.ProductId
{
    dto.Categories.Add(new CategoryDto { Name = categoryName });
}

I just typed that off the top of my head, so this isn't exactly how it would be executed, but EF actually does some nice optimizations if you tell it everything you know about the query (in this case, that we will need the category names). This isn't the same as eager loading (db.Products.Include("Categories")), because projections can further reduce the amount of data to load.
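
For comparison, the eager-loading version mentioned above (a sketch) pulls back the complete Category entities rather than just their names:

// Eager loading: retrieves all columns of the related categories in one query.
Product product = db.Products.Include("Categories")
                             .SingleOrDefault(p => p.Id == 10);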

Fastest Way of Inserting in Entity Framework

Regarding your remark in the comments on your question:

"...SavingChanges (for each
record
)..."

That's the worst thing you can do! Calling SaveChanges() for each record slows bulk inserts down extremely. I would try a few simple tests which will very likely improve the performance:

  • Call SaveChanges() once after ALL records.
  • Call SaveChanges() after, for example, 100 records.
  • Call SaveChanges() after, for example, 100 records, then dispose the context and create a new one.
  • Disable change detection.

For bulk inserts I am working and experimenting with a pattern like this:

using (TransactionScope scope = new TransactionScope())
{
    MyDbContext context = null;
    try
    {
        context = new MyDbContext();
        context.Configuration.AutoDetectChangesEnabled = false;

        int count = 0;
        foreach (var entityToInsert in someCollectionOfEntitiesToInsert)
        {
            ++count;
            context = AddToContext(context, entityToInsert, count, 100, true);
        }

        context.SaveChanges();
    }
    finally
    {
        if (context != null)
            context.Dispose();
    }

    scope.Complete();
}

private MyDbContext AddToContext(MyDbContext context,
    Entity entity, int count, int commitCount, bool recreateContext)
{
    context.Set<Entity>().Add(entity);

    if (count % commitCount == 0)
    {
        context.SaveChanges();
        if (recreateContext)
        {
            context.Dispose();
            context = new MyDbContext();
            context.Configuration.AutoDetectChangesEnabled = false;
        }
    }

    return context;
}

I have a test program which inserts 560,000 entities (9 scalar properties, no navigation properties) into the DB. With this code it works in less than 3 minutes.

For the performance it is important to call SaveChanges() after "many" records ("many" meaning around 100 or 1,000). It also improves performance to dispose the context after SaveChanges and create a new one. This clears the context of all entities; SaveChanges doesn't do that, and the entities are still attached to the context in state Unchanged. It is the growing number of attached entities in the context that slows down the insertion step by step. So, it is helpful to clear it after some time.

Here are a few measurements for my 560,000 entities:

  • commitCount = 1, recreateContext = false: many hours (That's your current procedure)
  • commitCount = 100, recreateContext = false: more than 20 minutes
  • commitCount = 1000, recreateContext = false: 242 sec
  • commitCount = 10000, recreateContext = false: 202 sec
  • commitCount = 100000, recreateContext = false: 199 sec
  • commitCount = 1000000, recreateContext = false: out of memory exception
  • commitCount = 1, recreateContext = true: more than 10 minutes
  • commitCount = 10, recreateContext = true: 241 sec
  • commitCount = 100, recreateContext = true: 164 sec
  • commitCount = 1000, recreateContext = true: 191 sec

The behaviour in the first test above is that the performance is very non-linear and decreases extremely over time. ("Many hours" is an estimate; I never finished this test and stopped at 50,000 entities after 20 minutes.) This non-linear behaviour is not as significant in the other tests.


