Libraries for Ado.Net to Rapidly Bulk Insert Data into a Database from a .CSV File

Libraries for ADO.NET to rapidly bulk insert data into a database from a .csv file?

The ADO.net SqlBulkCopy class enables quick, mass upload of records into a table:

    DataTable dt = s_EmptyUploadTable.Copy();
foreach (var itm in yourList) {
DataRow row = dt.NewRow();
row["Field1"] = itm.Field1;
row["Field2"] = itm.Field2;
dt.Rows.Add(row);
}
using (SqlConnection cn = new SqlConnection(yourConnectionString)) {
cn.Open();
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(cn)) {
bulkCopy.DestinationTableName = "dbo.YourActualSQLServerTableName";
bulkCopy.WriteToServer(dt);
}
cn.Close();
}

Upload CSV file to SQL server

1st off, You don't need programming stuff. You can directly upload CSV files into SQL Database with SQL management tools. However, if you really need do it through programming, Just read below.

Personally, I think this approach is the most efficient and easiest way to do through programming.

In general, you can achieve it in two steps

1st step is to read the CSV file and hold the records as a DataTable.

2nd step is store the retrieved DataTable into SQL Database Table as a Bulk Entry

This is a function that returns CSV File Data as a DataTable. Call and Keep it in the memory and you can do whatever you want with it.

This function is going to return CSV Read file into DataTable.

private static DataTable GetDataTabletFromCSVFile(string csv_file_path)
{
DataTable csvData = new DataTable();
try
{
using(TextFieldParser csvReader = new TextFieldParser(csv_file_path))
{
csvReader.SetDelimiters(new string[] { "," });
csvReader.HasFieldsEnclosedInQuotes = true;
string[] colFields = csvReader.ReadFields();
foreach (string column in colFields)
{
DataColumn datecolumn = new DataColumn(column);
datecolumn.AllowDBNull = true;
csvData.Columns.Add(datecolumn);
}
while (!csvReader.EndOfData)
{
string[] fieldData = csvReader.ReadFields();
//Making empty value as null
for (int i = 0; i < fieldData.Length; i++)
{
if (fieldData[i] == "")
{
fieldData[i] = null;
}
}
csvData.Rows.Add(fieldData);
}
}
}
catch (Exception ex)
{
return null;
}
return csvData;
}
}

SQLBulkCopy - Use this function to insert the Retrieved DataTable into Sql Table

static void InsertDataIntoSQLServerUsingSQLBulkCopy(DataTable csvFileData)
{
using(SqlConnection dbConnection = new SqlConnection("Data Source=ProductHost;Initial Catalog=yourDB;Integrated Security=SSPI;"))
{
dbConnection.Open();
using (SqlBulkCopy s = new SqlBulkCopy(dbConnection))
{
s.DestinationTableName = "Your table name";
foreach (var column in csvFileData.Columns)
s.ColumnMappings.Add(column.ToString(), column.ToString());
s.WriteToServer(csvFileData);
}
}

Source

How to read a CSV file into a .NET Datatable

Here's an excellent class that will copy CSV data into a datatable using the structure of the data to create the DataTable:

A portable and efficient generic parser for flat files

It's easy to configure and easy to use. I urge you to take a look.

c# code performance importing and analyzing csv files

Consider uploading the data to staging tables like so and performing the validation using set logic rather than row logic. This avoids having to parse into objects at all and should both shift the CPU overhead to a different server (if your application server is underpowered) and reduce the CPU strain required.

SQL Server Bulk insert of CSV file with inconsistent quotes

You are going to need to preprocess the file, period.

If you really really need to do this, here is the code. I wrote this because I absolutely had no choice. It is utility code and I'm not proud of it, but it works. The approach is not to get SQL to understand quoted fields, but instead manipulate the file to use an entirely different delimiter.

EDIT: Here is the code in a github repo. It's been improved and now comes with unit tests! https://github.com/chrisclark/Redelim-it

This function takes an input file and will replace all field-delimiting commas (NOT commas inside quoted-text fields, just the actual delimiting ones) with a new delimiter. You can then tell sql server to use the new field delimiter instead of a comma. In the version of the function here, the placeholder is <TMP> (I feel confident this will not appear in the original csv - if it does, brace for explosions).

Therefore after running this function you import in sql by doing something like:

BULK INSERT MyTable
FROM 'C:\FileCreatedFromThisFunction.csv'
WITH
(
FIELDTERMINATOR = '<*TMP*>',
ROWTERMINATOR = '\n'
)

And without further ado, the terrible, awful function that I apologize in advance for inflicting on you (edit - I've posted a working program that does this instead of just the function on my blog here):

Private Function CsvToOtherDelimiter(ByVal InputFile As String, ByVal OutputFile As String) As Integer

Dim PH1 As String = "<*TMP*>"

Dim objReader As StreamReader = Nothing
Dim count As Integer = 0 'This will also serve as a primary key'
Dim sb As New System.Text.StringBuilder

Try
objReader = New StreamReader(File.OpenRead(InputFile), System.Text.Encoding.Default)
Catch ex As Exception
UpdateStatus(ex.Message)
End Try

If objReader Is Nothing Then
UpdateStatus("Invalid file: " & InputFile)
count = -1
Exit Function
End If

'grab the first line
Dim line = reader.ReadLine()
'and advance to the next line b/c the first line is column headings
If hasHeaders Then
line = Trim(reader.ReadLine)
End If

While Not String.IsNullOrEmpty(line) 'loop through each line

count += 1

'Replace commas with our custom-made delimiter
line = line.Replace(",", ph1)

'Find a quoted part of the line, which could legitimately contain commas.
'In that case we will need to identify the quoted section and swap commas back in for our custom placeholder.
Dim starti = line.IndexOf(ph1 & """", 0)
If line.IndexOf("""",0) = 0 then starti=0

While starti > -1 'loop through quoted fields

Dim FieldTerminatorFound As Boolean = False

'Find end quote token (originally a ",)
Dim endi As Integer = line.IndexOf("""" & ph1, starti)

If endi < 0 Then
FieldTerminatorFound = True
If endi < 0 Then endi = line.Length - 1
End If

While Not FieldTerminatorFound

'Find any more quotes that are part of that sequence, if any
Dim backChar As String = """" 'thats one quote
Dim quoteCount = 0
While backChar = """"
quoteCount += 1
backChar = line.Chars(endi - quoteCount)
End While

If quoteCount Mod 2 = 1 Then 'odd number of quotes. real field terminator
FieldTerminatorFound = True
Else 'keep looking
endi = line.IndexOf("""" & ph1, endi + 1)
End If
End While

'Grab the quoted field from the line, now that we have the start and ending indices
Dim source = line.Substring(starti + ph1.Length, endi - starti - ph1.Length + 1)

'And swap the commas back in
line = line.Replace(source, source.Replace(ph1, ","))

'Find the next quoted field
' If endi >= line.Length - 1 Then endi = line.Length 'During the swap, the length of line shrinks so an endi value at the end of the line will fail
starti = line.IndexOf(ph1 & """", starti + ph1.Length)

End While

line = objReader.ReadLine

End While

objReader.Close()

SaveTextToFile(sb.ToString, OutputFile)

Return count

End Function


Related Topics



Leave a reply



Submit