Insert 2 Million Rows into SQL Server Quickly

Insert 2 million rows into SQL Server quickly

You can try the SqlBulkCopy class.

Lets you efficiently bulk load a SQL Server table with data from
another source.

There is a cool blog post about how you can use it.
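For reference, a minimal sketch of SqlBulkCopy usage, assuming a hypothetical dbo.TargetTable with Id and Name columns (the connection string is a placeholder too):

```csharp
using System.Data;
using Microsoft.Data.SqlClient; // System.Data.SqlClient on older stacks

static class BulkCopyExample
{
    public static void Main()
    {
        // Source rows; SqlBulkCopy also accepts any IDataReader for streaming.
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        for (int i = 0; i < 2_000_000; i++)
            table.Rows.Add(i, "row " + i);

        using var conn = new SqlConnection("Server=.;Database=Demo;Integrated Security=true");
        conn.Open();

        using var bulk = new SqlBulkCopy(conn)
        {
            DestinationTableName = "dbo.TargetTable",
            BatchSize = 10_000,  // rows per round trip
            BulkCopyTimeout = 0  // no timeout for a long-running copy
        };
        bulk.WriteToServer(table);
    }
}
```

BatchSize and BulkCopyTimeout are worth tuning at this scale, and passing an IDataReader to WriteToServer avoids materializing all rows in memory first.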

How to insert 2 million rows into two tables in SQL Server in less than 10 seconds

The fastest way to insert large numbers of rows is bulk insertion (SqlBulkCopy or other bulk APIs). I can see that you are using MERGE, which cannot be combined with bulk copy, so this design forces table-valued parameters (TVPs), as you are using them right now. TVPs are a bit slower because they cost more CPU. You can also try bulk inserting into temp tables and then running MERGE from there, as sketched below. It is my understanding that a TVP physically is a temp table anyway; there is no true streaming going on. All the data that you stream into it from your C# code is simply inserted by the server into an automatically managed table.
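A rough sketch of that staging-table variant, assuming a staging table dbo.Staging that mirrors a hypothetical target dbo.Target (all object names here are made up):

```csharp
using System.Data;
using Microsoft.Data.SqlClient;

static class StageAndMergeExample
{
    public static void Run(string connectionString, IDataReader source)
    {
        using var conn = new SqlConnection(connectionString);
        conn.Open();

        // 1) Bulk copy the raw rows into the staging table.
        using (var bulk = new SqlBulkCopy(conn) { DestinationTableName = "dbo.Staging" })
        {
            bulk.WriteToServer(source);
        }

        // 2) Apply the upsert logic server-side, staging -> target.
        using var merge = new SqlCommand(@"
            MERGE dbo.Target AS t
            USING dbo.Staging AS s ON t.Id = s.Id
            WHEN MATCHED THEN UPDATE SET t.Name = s.Name
            WHEN NOT MATCHED THEN INSERT (Id, Name) VALUES (s.Id, s.Name);", conn);
        merge.ExecuteNonQuery();
    }
}
```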

The TVP streaming (SqlMetaData) you implemented is correct; in my experience it is the fastest way to transmit TVP data.
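For readers who have not seen the pattern, a sketch of that streaming approach; the table type dbo.RowType and procedure dbo.InsertRows are assumptions, and the SqlDataRecord/SqlMetaData namespace is Microsoft.Data.SqlClient.Server (Microsoft.SqlServer.Server with the older System.Data.SqlClient):

```csharp
using System.Collections.Generic;
using System.Data;
using Microsoft.Data.SqlClient;
using Microsoft.Data.SqlClient.Server; // SqlDataRecord, SqlMetaData

static class TvpExample
{
    // Yields one reused SqlDataRecord per row; rows go to the server as the
    // enumerable is consumed, without buffering the whole set first.
    static IEnumerable<SqlDataRecord> Stream(IEnumerable<(int Id, string Name)> rows)
    {
        var meta = new[]
        {
            new SqlMetaData("Id", SqlDbType.Int),
            new SqlMetaData("Name", SqlDbType.NVarChar, 100)
        };
        var record = new SqlDataRecord(meta);
        foreach (var (id, name) in rows)
        {
            record.SetInt32(0, id);
            record.SetString(1, name);
            yield return record;
        }
    }

    public static void InsertViaTvp(SqlConnection conn, IEnumerable<(int Id, string Name)> rows)
    {
        using var cmd = new SqlCommand("dbo.InsertRows", conn) { CommandType = CommandType.StoredProcedure };
        var p = cmd.Parameters.Add("@Rows", SqlDbType.Structured);
        p.TypeName = "dbo.RowType"; // the user-defined table type
        p.Value = Stream(rows);
        cmd.ExecuteNonQuery();
    }
}
```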

You will need to parallelize. Empirically, it is hard to exceed about 100k rows per second under optimal conditions for fairly simple rows; at that point the CPU is saturated on one core. You can insert in parallel on multiple cores under certain documented conditions: there are requirements on the index structure, and you might also encounter locking issues. A sure way to solve those is to insert into independent tables or partitions, but of course this forces you to change the other queries that run against these tables.
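To illustrate the independent-tables idea, a sketch of parallel bulk copies with one worker per partition; GetPartitionReader is a hypothetical per-partition data source, and the dbo.Target_N naming scheme is an assumption:

```csharp
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

static class ParallelLoadExample
{
    public static void Load(string connectionString, int workers)
    {
        Parallel.For(0, workers, partition =>
        {
            using var conn = new SqlConnection(connectionString); // one connection per worker
            conn.Open();
            using var bulk = new SqlBulkCopy(conn)
            {
                // Independent per-partition tables sidestep lock contention.
                DestinationTableName = $"dbo.Target_{partition}",
                BatchSize = 10_000
            };
            // Hypothetical: supply the slice of rows belonging to this partition.
            bulk.WriteToServer(GetPartitionReader(partition));
        });
    }

    static System.Data.IDataReader GetPartitionReader(int partition)
        => throw new System.NotImplementedException("plug in your per-partition source");
}
```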

If you must perform complex logic when inserting you could still insert into fresh tables and then perform the logic when querying. This is more work and error-prone but it might allow you to meet your latency requirement.

I hope these ideas help you get on the right path. Feel free to comment.

SQL Server - Insert 2M+ records in SQL script with 7000 rows per insert

To properly answer your question, exactly as it is asked: no, there is no way natively within the SQL interface to overcome the INSERT limitation. You will need to create a programmatic solution. I have listed some technologies in my comment, such as Python, PowerShell, and .NET. You could also paste together a solution with BCP, BULK INSERT, SSIS, or some other BI tool. Here are some links for C# that talk about bulk inserting a large dataset:

Insert 2 million rows into SQL Server quickly

Bulk inserting with SQL Server (C#)

A similar question was also asked, and the accepted answer there suggests using the SQL Server Import Wizard:

Import Wizard - Bulk Insert
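To make the programmatic route concrete: a single INSERT ... VALUES statement is limited to 1,000 rows, so one workaround is to re-chunk the rows below that limit from client code. A rough C# sketch, assuming a two-column CSV and a hypothetical dbo.Target table (SqlBulkCopy would still be faster):

```csharp
using System.IO;
using System.Linq;
using Microsoft.Data.SqlClient;

static class ChunkedInsertExample
{
    public static void Run(SqlConnection conn, string csvPath)
    {
        var rows = File.ReadLines(csvPath).Select(line => line.Split(','));
        // Enumerable.Chunk requires .NET 6+; 1,000 is the VALUES row limit.
        foreach (var chunk in rows.Chunk(1000))
        {
            var values = string.Join(",", chunk.Select(c =>
                $"({int.Parse(c[0])}, N'{c[1].Replace("'", "''")}')")); // naive escaping; sketch only
            using var cmd = new SqlCommand($"INSERT INTO dbo.Target (Id, Name) VALUES {values};", conn);
            cmd.ExecuteNonQuery();
        }
    }
}
```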

Fastest way to insert 1 million rows in SQL Server

I think what you are looking for is Bulk Insert if you prefer using SQL.
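For the SQL route, a BULK INSERT call looks roughly like this (wrapped in C# to match the other snippets; the file path and table are placeholders, and the file must be readable from the server side):

```csharp
using Microsoft.Data.SqlClient;

static class BulkInsertExample
{
    public static void Run(SqlConnection conn)
    {
        using var cmd = new SqlCommand(@"
            BULK INSERT dbo.Target
            FROM 'C:\data\rows.csv'
            WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n',
                  BATCHSIZE = 100000, TABLOCK);", conn);
        cmd.CommandTimeout = 0; // large loads can exceed the default timeout
        cmd.ExecuteNonQuery();
    }
}
```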

Or there is also the ADO.NET batch operations option, which keeps the logic in your C# application. This article is also very complete.
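A sketch of what that batching looks like with SqlDataAdapter.UpdateBatchSize, assuming a hypothetical dbo.Target table and a DataTable of pending rows:

```csharp
using System.Data;
using Microsoft.Data.SqlClient;

static class AdoBatchExample
{
    public static void Run(SqlConnection conn, DataTable pendingRows)
    {
        using var adapter = new SqlDataAdapter();
        adapter.InsertCommand = new SqlCommand(
            "INSERT INTO dbo.Target (Id, Name) VALUES (@Id, @Name)", conn);
        adapter.InsertCommand.Parameters.Add("@Id", SqlDbType.Int, 0, "Id");
        adapter.InsertCommand.Parameters.Add("@Name", SqlDbType.NVarChar, 100, "Name");
        adapter.InsertCommand.UpdatedRowSource = UpdateRowSource.None; // required for batching
        adapter.UpdateBatchSize = 1000; // rows sent per round trip
        adapter.Update(pendingRows);    // rows must be in the Added state
    }
}
```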

Update

Yes, I'm afraid BULK INSERT only works with imported files (read from within the database server).

I had a similar experience in a Java project where we needed to insert millions of rows (the data came from outside the application, by the way).

The database was Oracle, so naturally we used Oracle's multi-row insert. It turned out that the Java batch updates (so-called "bulk updates") were much faster than Oracle's multi-valued insert.

My suggestion is:

  • Compare the performance of SQL Server's multi-value insert (where you can read from inside your database, via a procedure if you like) against the ADO.NET batch insert.

If the data you are going to manipulate comes from outside your application (i.e., it is not already in the database), I would say just go for the ADO.NET batch inserts. I think that is your case.

Note: Keep in mind that batch inserts usually operate with the same query. That is what makes them so fast.
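In ADO.NET terms, "the same query" amounts to one prepared, parameterized INSERT reused for every row inside a single transaction. A sketch under the same hypothetical dbo.Target schema as above:

```csharp
using System.Collections.Generic;
using System.Data;
using Microsoft.Data.SqlClient;

static class ReusedInsertExample
{
    public static void Run(SqlConnection conn, IEnumerable<(int Id, string Name)> rows)
    {
        using var tx = conn.BeginTransaction();
        using var cmd = new SqlCommand(
            "INSERT INTO dbo.Target (Id, Name) VALUES (@Id, @Name)", conn, tx);
        var id = cmd.Parameters.Add("@Id", SqlDbType.Int);
        var name = cmd.Parameters.Add("@Name", SqlDbType.NVarChar, 100);
        cmd.Prepare(); // one plan, reused for every row

        foreach (var (i, n) in rows)
        {
            id.Value = i;
            name.Value = n;
            cmd.ExecuteNonQuery();
        }
        tx.Commit();
    }
}
```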


