How to Do a Bulk Insert -- Linq to Entities

Sometimes you simply have to mix approaches: use SqlBulkCopy for this part of your repository (it plugs directly into the bulk-copy API), Entity Framework for some of the rest, and, if necessary, a bit of direct ADO.NET. Ultimately the goal is to get the job done.

Efficient way to do bulk insert/update with Entity Framework

Just don't use Entity Framework in this case. Use a stored procedure instead (how exactly depends on the EF version/approach you use; you might have to extend your DbContext or add a mapping from the entity model).

If you're using SQL Server, then in your stored procedure use the MERGE command, which efficiently does exactly what you need: insert the row if it doesn't exist, or update it if it does. Everything happens in a single, efficient SQL statement.
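A minimal sketch of the idea, assuming EF 4.1+ with a DbContext; the table, columns, and MyDbContext name are placeholders, and in practice you would put the MERGE inside a stored procedure and map or call it from the context:

// Upsert a single row with MERGE, executed through the DbContext.
// dbo.Customers, Id and Name are placeholder names.
// Requires: using System.Data.SqlClient;
public static void Upsert(MyDbContext context, int id, string name)
{
    const string sql = @"
        MERGE dbo.Customers AS target
        USING (SELECT @Id AS Id, @Name AS Name) AS source
            ON target.Id = source.Id
        WHEN MATCHED THEN
            UPDATE SET Name = source.Name
        WHEN NOT MATCHED THEN
            INSERT (Id, Name) VALUES (source.Id, source.Name);";

    context.Database.ExecuteSqlCommand(sql,
        new SqlParameter("@Id", id),
        new SqlParameter("@Name", name));
}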

Bulk Insert with Linq to Sql (vb.net)

If this works at all like LINQ to Entities, the slowness here is actually from creating the change tracking objects for each record.

In that case, you can set DbContext.Configuration.AutoDetectChangesEnabled = False to turn off automatic change detection. Change detection is then deferred until the loop is complete and the entire context is processed in one go.
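A minimal sketch of that pattern in C# (entity and set names are illustrative, assuming an EF 4.1+ DbContext):

using (var context = new MyDbContext())
{
    // Turn off automatic change detection for the duration of the loop.
    context.Configuration.AutoDetectChangesEnabled = false;
    try
    {
        foreach (var row in rowsToInsert)
        {
            context.Orders.Add(new Order { /* map row -> entity here */ });
        }

        context.ChangeTracker.DetectChanges();   // run detection once, after the loop
        context.SaveChanges();
    }
    finally
    {
        // Restore the default so the rest of the application behaves normally.
        context.Configuration.AutoDetectChangesEnabled = true;
    }
}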

See this article for more information on the EF side:
DbContext AutoDetectChangesEnabled set to false detecting changes

As for your problem with LINQ to SQL, you might look into the DataContext.ObjectTrackingEnabled property as outlined in this article:
http://msdn.microsoft.com/en-us/library/system.data.linq.datacontext.objecttrackingenabled(v=vs.90).aspx

Setting this property to false before your loop and back to true afterwards (but before SubmitChanges) might help performance.

C# & EF: bulkinsert related objects

EntityFramework.BulkInsert is a very good library which supports simple scenarios. However, the library is limited and is no longer supported.

So far, there is only one good workaround, and that is to use a library which supports everything!

Disclaimer: I'm the owner of the project Entity Framework Extensions

This library supports everything including all associations and inheritance.

For example, to save multiple entities in different tables, you can use BulkSaveChanges, which works exactly like SaveChanges but way faster!

// Easy to use
context.BulkSaveChanges();

// Easy to customize
context.BulkSaveChanges(bulk => bulk.BatchSize = 100);

The library also does more than inserting. It supports all bulk operations (a short sketch follows the list):

  • BulkInsert
  • BulkUpdate
  • BulkDelete
  • BulkMerge
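
A minimal sketch of those operations (the extension methods come from the Entity Framework Extensions library above; the context and entity collections are placeholders):

using (var context = new MyDbContext())
{
    context.BulkInsert(newCustomers);        // insert only
    context.BulkUpdate(changedCustomers);    // update only
    context.BulkDelete(obsoleteCustomers);   // delete only
    context.BulkMerge(allCustomers);         // insert or update ("upsert")
}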

However, unlike EntityFramework.BulkInsert, this library is not free.

EDIT: Answer to subquestion

You say way faster - do you have any metrics or a link to metrics

@Mark: You can look at the metrics on our website's homepage. We report BulkSaveChanges to be at least 15x faster than SaveChanges.

However, metrics are heavily biased. Too many things can affect them, such as indexes, triggers, latency, etc.!

People usually report performance improvements of 25x, 50x, even 80x!

One thing people usually forget when benchmarking is to call our library once before the test to allow for JIT compilation! Like Entity Framework, the first hit to the library may take several ms.
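
A minimal benchmarking sketch along those lines (MyDbContext, Customer, and customersToInsert are placeholders; AddRange assumes EF6):

// Throwaway warm-up call so JIT compilation and model start-up are not measured.
using (var warmUpContext = new MyDbContext())
{
    warmUpContext.Customers.Add(new Customer { Name = "warm-up" });
    warmUpContext.BulkSaveChanges();
}

var watch = System.Diagnostics.Stopwatch.StartNew();
using (var context = new MyDbContext())
{
    context.Customers.AddRange(customersToInsert);   // the real workload
    context.BulkSaveChanges();
}
watch.Stop();
Console.WriteLine("BulkSaveChanges: " + watch.ElapsedMilliseconds + " ms");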

Entity Framework, Bulk Inserts, and Maintaining Relationships

I had a bad experience with huge context saves. All those recommendations about saving in iterations of 100 rows or 1000 rows, then disposing the context or clearing the list and detaching objects, assigning null to everything, etc. - it is all bullshit. We had a requirement to insert millions of rows daily into many tables. You definitely should not use Entity Framework under these conditions: you will be fighting memory leaks and a decrease in insertion speed as the iterations proceed.

Our first improvement was creating stored procedures and adding them to the model. That is 100 times faster than Context.SaveChanges(), and there are no leaks and no decrease in speed over time.
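
A minimal sketch of that approach, assuming the stored procedure has been imported into the EDMX model as a function import (the context name, procedure name, and parameters are all hypothetical):

// usp_InsertDailyRow is a hypothetical function import generated on the context.
using (var context = new MyEntities())
{
    foreach (var row in rows)
    {
        context.usp_InsertDailyRow(row.Date, row.Amount, row.SourceId);
    }
}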

But it was not sufficient for us, and we decided to use SqlBulkCopy. It is super fast - 1000 times faster than using stored procedures.

So my suggestion is:

  • if you have many rows to insert but the count is below something like 50,000 rows, use stored procedures imported into the model;
  • if you have hundreds of thousands of rows, go and try SqlBulkCopy.

Here is some code:

// Unwrap the raw SqlConnection from the ObjectContext's EntityConnection.
EntityConnection ec = (EntityConnection)Context.Connection;
SqlConnection sc = (SqlConnection)ec.StoreConnection;

// The store connection may be closed while EF is not using it.
if (sc.State != ConnectionState.Open)
    sc.Open();

var copy = new SqlBulkCopy(sc, SqlBulkCopyOptions.CheckConstraints, null);

copy.DestinationTableName = "TableName";
copy.ColumnMappings.Add("SourceColumn", "DBColumn");   // source DataTable column -> destination column
copy.WriteToServer(dataTable);                          // dataTable holds the rows to insert
copy.Close();
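
For completeness, here is one way the dataTable above might be built from a list of objects (the item type and its properties are placeholders; requires System.Data):

// Build an in-memory DataTable whose columns match the destination table.
var dataTable = new DataTable();
dataTable.Columns.Add("SourceColumn", typeof(string));
dataTable.Columns.Add("Amount", typeof(decimal));

foreach (var item in items)
{
    dataTable.Rows.Add(item.SourceColumn, item.Amount);
}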

If you use a DbTransaction with the context, you can manage to bulk insert using that transaction as well, but it takes some hacks.
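
A minimal sketch of one way to share a transaction between EF and SqlBulkCopy, assuming EF6 (where DbContextTransaction exposes UnderlyingTransaction); MyDbContext, TableName, and dataTable are placeholders:

using (var context = new MyDbContext())
using (var tx = context.Database.BeginTransaction())
{
    var sqlConnection = (SqlConnection)context.Database.Connection;
    var sqlTransaction = (SqlTransaction)tx.UnderlyingTransaction;

    using (var copy = new SqlBulkCopy(sqlConnection, SqlBulkCopyOptions.CheckConstraints, sqlTransaction))
    {
        copy.DestinationTableName = "TableName";
        copy.WriteToServer(dataTable);
    }

    // Regular EF work on this context participates in the same transaction.
    context.SaveChanges();
    tx.Commit();
}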

bulk insert and update with ADO.NET Entity Framework

You can do multiple inserts this way.

I've seen the exception you're getting in cases where the model (EDMX) is not set up correctly. You either don't have a primary key (EntityKey in EF terms) on that table, or the designer has tried to guess what the EntityKey should be. In the latter case, you'll see two or more properties in the EDM Designer with keys next to them.

Make sure the ImportDoorAccess table has a single primary key and refresh the model.


