Fastest Way of Inserting in Entity Framework
Regarding your remark in the comments to your question:
"...SavingChanges (for each record)..."
That's the worst thing you can do! Calling SaveChanges() for each record slows bulk inserts down enormously. I would run a few simple tests, which will very likely improve performance:
- Call SaveChanges() once after ALL records.
- Call SaveChanges() after, for example, 100 records.
- Call SaveChanges() after, for example, 100 records, then dispose the context and create a new one.
- Disable change detection.
For bulk inserts I am working and experimenting with a pattern like this:
    using (TransactionScope scope = new TransactionScope())
    {
        MyDbContext context = null;
        try
        {
            context = new MyDbContext();
            context.Configuration.AutoDetectChangesEnabled = false;

            int count = 0;
            foreach (var entityToInsert in someCollectionOfEntitiesToInsert)
            {
                ++count;
                context = AddToContext(context, entityToInsert, count, 100, true);
            }

            context.SaveChanges();
        }
        finally
        {
            if (context != null)
                context.Dispose();
        }

        scope.Complete();
    }

    private MyDbContext AddToContext(MyDbContext context,
        Entity entity, int count, int commitCount, bool recreateContext)
    {
        context.Set<Entity>().Add(entity);

        if (count % commitCount == 0)
        {
            context.SaveChanges();
            if (recreateContext)
            {
                context.Dispose();
                context = new MyDbContext();
                context.Configuration.AutoDetectChangesEnabled = false;
            }
        }

        return context;
    }
I have a test program which inserts 560,000 entities (9 scalar properties, no navigation properties) into the DB. With this code it finishes in less than 3 minutes.
For performance it is important to call SaveChanges() after "many" records ("many" being around 100 or 1000). It also improves performance to dispose the context after SaveChanges and create a new one. This clears the context of all entities. SaveChanges doesn't do that; the entities remain attached to the context in state Unchanged. It is the growing number of attached entities in the context that slows down the insertion step by step. So it is helpful to clear it from time to time.
Here are a few measurements for my 560,000 entities:
- commitCount = 1, recreateContext = false: many hours (That's your current procedure)
- commitCount = 100, recreateContext = false: more than 20 minutes
- commitCount = 1000, recreateContext = false: 242 sec
- commitCount = 10000, recreateContext = false: 202 sec
- commitCount = 100000, recreateContext = false: 199 sec
- commitCount = 1000000, recreateContext = false: out of memory exception
- commitCount = 1, recreateContext = true: more than 10 minutes
- commitCount = 10, recreateContext = true: 241 sec
- commitCount = 100, recreateContext = true: 164 sec
- commitCount = 1000, recreateContext = true: 191 sec
In the first test above, performance is highly non-linear and degrades severely over time. ("Many hours" is an estimate; I never finished this test and stopped at 50,000 entities after 20 minutes.) This non-linear behaviour is far less pronounced in all the other tests.
Bulk Insert with Entity Framework 6
You can use the following library:
https://github.com/MikaelEliasson/EntityFramework.Utilities
It works well for simple bulk inserts and updates.
You should also look at the following post if you want to find out about other options to achieve bulk insert:
Fastest Way of Inserting in Entity Framework
How to increase insert speed using Bulk insert using AddRange and then SaveChanges in Entity Framework
It doesn't matter whether you use Add in a foreach or AddRange; the problem lies in the SaveChanges method, which, I think, stores changes to tracked entities one by one. There are libraries out there that perform a real bulk insert of entities, using SqlBulkCopy under the hood.
Link to EF Core library: EFCore.BulkExtensions
EDIT:
For EF6 I found this nuget: EntityFramework6.BulkInsert but I haven't personally used it so I can't say anything about it.
EDIT 2: I simplified this above. Using AddRange over Add does reduce the time spent adding entities to the change tracker, but SaveChanges can still take a very long time, so it is not a solution.
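As a rough illustration of what such libraries do under the hood, the sketch below flattens a list of entities into a DataTable, which is one of the inputs SqlBulkCopy.WriteToServer accepts. The Person class, its properties, and the "People" table and column names are hypothetical and only serve the example; the SqlBulkCopy call itself is shown as a comment because it needs a live SQL Server connection.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical entity used only for illustration.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class BulkCopySketch
{
    // Flattens entities into a DataTable whose columns mirror the target table.
    public static DataTable ToDataTable(IEnumerable<Person> people)
    {
        var table = new DataTable("People");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));

        foreach (var person in people)
        {
            table.Rows.Add(person.Id, person.Name);
        }
        return table;
    }

    // With a real connection string, the table could then be sent to the
    // server in one bulk operation, for example:
    //
    //   using (var bulk = new SqlBulkCopy(connectionString))
    //   {
    //       bulk.DestinationTableName = "People"; // assumed table name
    //       bulk.WriteToServer(table);
    //   }
}
```

The point of the design is that the rows bypass the change tracker and SaveChanges entirely, which is why this route scales better than tuning Add or AddRange.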
Improving bulk insert performance in Entity framework
There is room for several improvements (if you are using DbContext):
Set:
    yourContext.Configuration.AutoDetectChangesEnabled = false;
    yourContext.Configuration.ValidateOnSaveEnabled = false;
Call SaveChanges() in batches of 100 inserts, or try batches of 1000 items and compare the performance.
Since the context stays the same during all these inserts and keeps growing, you can rebuild your context object every 1000 inserts:
    var yourContext = new YourContext();
I think this is the big gain.
Making these improvements in a data import process of mine took it from 7 minutes to 6 seconds.
The actual numbers may not be 100 or 1000 in your case; try it and tweak it.
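The batching and context-rebuilding steps above can be sketched as a small generic helper. This is only one possible shape, not a definitive implementation: the names BatchedInsertSketch, InsertInBatches, createContext, and saveBatch are made up for this example, and the helper deliberately knows nothing about Entity Framework, so the context type is just anything disposable.

```csharp
using System;
using System.Collections.Generic;

public static class BatchedInsertSketch
{
    // Splits 'items' into batches of 'batchSize' and hands each batch to
    // 'saveBatch' together with a freshly created context, which is disposed
    // afterwards. Returns the number of batches processed.
    public static int InsertInBatches<TContext, TEntity>(
        IEnumerable<TEntity> items,
        int batchSize,
        Func<TContext> createContext,
        Action<TContext, IReadOnlyList<TEntity>> saveBatch)
        where TContext : IDisposable
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        int batches = 0;
        var buffer = new List<TEntity>(batchSize);

        foreach (var item in items)
        {
            buffer.Add(item);
            if (buffer.Count == batchSize)
            {
                Flush(buffer, createContext, saveBatch);
                batches++;
            }
        }

        // Save whatever is left over after the last full batch.
        if (buffer.Count > 0)
        {
            Flush(buffer, createContext, saveBatch);
            batches++;
        }
        return batches;
    }

    private static void Flush<TContext, TEntity>(
        List<TEntity> buffer,
        Func<TContext> createContext,
        Action<TContext, IReadOnlyList<TEntity>> saveBatch)
        where TContext : IDisposable
    {
        // A new context per batch keeps the change tracker small.
        using (var context = createContext())
        {
            saveBatch(context, buffer.ToArray());
        }
        buffer.Clear();
    }
}
```

With EF6, the call might look roughly like this, where YourContext and Record are assumed names:

    BatchedInsertSketch.InsertInBatches(records, 1000,
        () => new YourContext(),
        (ctx, batch) =>
        {
            ctx.Configuration.AutoDetectChangesEnabled = false;
            ctx.Configuration.ValidateOnSaveEnabled = false;
            ctx.Set<Record>().AddRange(batch);
            ctx.SaveChanges();
        });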
Insert huge number of rows into database using Entity Framework
When I add my seeding method to Configuration.cs and run the update-database command, it takes less than 5 minutes to insert all rows.
It works best when AddRange() is called only once.
    dbContext.Configuration.AutoDetectChangesEnabled = false;
    dbContext.Configuration.ValidateOnSaveEnabled = false;
    dbContext.ReportData.AddRange(recordsList);
    dbContext.SaveChanges();
Entity Framework insertion performance
All the common tricks, such as:
- AutoDetectChangesEnabled = false
- Use AddRange over Add
- Etc.
will not work, as you have already noticed, because the performance problem lies not in Entity Framework but in SQL Azure.
SQL Azure may look pretty cool at first, but it's slow as hell unless you pay for a very good Premium database tier.
As Evk recommended, you should try executing a simple SQL command such as "SELECT 1"; you will probably notice that it takes more than 100 ms, which is ridiculously slow.
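To put a number on that round trip, a small timing helper along the following lines could be used. It is a sketch only: RoundTripSketch and MedianMilliseconds are names invented for this example, and the helper times any delegate rather than a specific database call.

```csharp
using System;
using System.Diagnostics;

public static class RoundTripSketch
{
    // Runs 'roundTrip' 'samples' times and returns the median duration in
    // milliseconds. The median is less sensitive to a slow first call
    // (connection opening, warm-up) than the average.
    public static double MedianMilliseconds(Action roundTrip, int samples)
    {
        if (samples <= 0)
            throw new ArgumentOutOfRangeException(nameof(samples));

        var timings = new double[samples];
        for (int i = 0; i < samples; i++)
        {
            var watch = Stopwatch.StartNew();
            roundTrip();
            watch.Stop();
            timings[i] = watch.Elapsed.TotalMilliseconds;
        }

        Array.Sort(timings);
        int mid = samples / 2;
        return samples % 2 == 1
            ? timings[mid]
            : (timings[mid - 1] + timings[mid]) / 2.0;
    }
}
```

With an already open DbConnection named connection (an assumption here), the delegate could create a command, set its CommandText to "SELECT 1", and call ExecuteScalar(). If the median stays well above 100 ms, no Entity Framework tuning will compensate for it.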
Solution:
- Move to a better SQL Azure Tier
- Move away from SQL Azure
Disclaimer: I'm the owner of the project Entity Framework Extensions
Another solution is to use this library, which batches multiple queries and bulk operations. However, even though this library is very fast, you will still need a better SQL Azure tier, since it looks like every database round trip takes more than 200 ms in your case.
Fastest way of inserting many parent and child records
Disclaimer: I'm the owner of the project Entity Framework Extensions
Here is the fastest way of inserting, updating, deleting, and merging. You can even make it easier and use BulkSaveChanges over SaveChanges.
    // Using BulkSaveChanges
    using (var db = new MyDBContext())
    {
        db.ScenarioCategory.AddRange(categories);
        db.BulkSaveChanges();
    }

    // Using BulkInsert on parent then child
    using (var db = new MyDBContext())
    {
        db.BulkInsert(categories);
        db.BulkInsert(categories.SelectMany(x => x.Items));
    }