SQL Server Bulk Insert CSV with Data Having Comma

sql server Bulk insert csv with data having comma

The short answer (prior to SQL Server 2017) is: you can't do that. See http://technet.microsoft.com/en-us/library/ms188365.aspx.

"Importing Data from a CSV file

Comma-separated value (CSV) files are not supported by SQL Server bulk-import operations. However, in some cases, a CSV file can be used as the data file for a bulk import of data into SQL Server. For information about the requirements for importing data from a CSV data file, see Prepare Data for Bulk Export or Import (SQL Server)."

The general solution is that you must convert your CSV file into one that can be successfully imported. You can do that in many ways, such as by creating the file with a different delimiter (such as TAB) or by opening your file in a tool that understands CSV (such as Excel or many scripting languages) and exporting it with a unique delimiter (such as TAB), from which you can then BULK INSERT.
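One way to sketch that conversion, assuming Python is available, is to let the csv module handle the quoted commas and re-emit the rows TAB-delimited (the file names here are just examples):

```python
import csv

def csv_to_tsv(src_path, dst_path):
    """Convert a quoted CSV file to a TAB-delimited file that
    BULK INSERT can load with FIELDTERMINATOR = '\\t'."""
    with open(src_path, newline="") as src, \
         open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)                 # understands quoted commas
        writer = csv.writer(dst, delimiter="\t")
        for row in reader:
            writer.writerow(row)
```

This only works if no field actually contains a TAB; pick whatever delimiter you know never appears in your data.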

Import CSV data in SQL, but cell data contains a comma; how do I handle it in SQL?

It's difficult without test data, but I was able to reproduce the issue from the question. A possible solution is the FORMAT = 'CSV' option, available from SQL Server 2017:

CSV file:

Address, Customer
"City name, Country", Customer Name

Statement:

BULK INSERT #TempTable
FROM 'e:\filetesting.csv'
WITH
(
FIRSTROW = 2,
FORMAT = 'CSV',
FIELDTERMINATOR = ',', --CSV field delimiter
ROWTERMINATOR = '\n', --row terminator
TABLOCK
)

Result:

Address              Customer
------------------------------------
City name, Country   Customer Name

Commas within CSV Data

If there is a comma in a column, then that column should be surrounded by single or double quotes. If that quoted column itself contains a single or double quote, the quote must be escaped, either by doubling it (the standard CSV convention) or by preceding it with an escape character, usually a \

Example format of CSV


ID - address - name
1, "Some Address, Some Street, 10452", 'David O\'Brian'
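As a sketch of how a standard CSV writer applies these rules, here is Python's csv module producing the quoting automatically. Note it follows the common doubled-quote convention rather than backslash escaping:

```python
import csv, io

buf = io.StringIO()
writer = csv.writer(buf)
# Fields containing commas or quotes are quoted automatically;
# an embedded double quote is escaped by doubling it.
writer.writerow([1, "Some Address, Some Street, 10452",
                 'David "Dave" O\'Brian'])
print(buf.getvalue())
# 1,"Some Address, Some Street, 10452","David ""Dave"" O'Brian"
```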

Bulk Insert Partially Quoted CSV File in SQL Server

Unfortunately, SQL Server interprets the quoted comma as a delimiter. This applies to both BCP and BULK INSERT.

From http://msdn.microsoft.com/en-us/library/ms191485%28v=sql.100%29.aspx

If a terminator character occurs within the data, it is interpreted as
a terminator, not as data, and the data after that character is
interpreted as belonging to the next field or record. Therefore,
choose your terminators carefully to make sure that they never appear
in your data.

Bulk insert csv - column value itself has commas

If I had this particular issue and the creation of the CSV files was not under my control, I would resort to a Perl script like this:

open(my $fhin, "<", "MyFile.csv") or die $!;
open(my $fhout, ">", "MyQFile.csv") or die $!;

while (my $line = <$fhin>) {
    chomp($line);
    # Quote all four fields; only the third may contain commas
    $line =~ s/^([^,]*),([^,]*),(.*),([^,]*)$/\"$1\",\"$2\",\"$3\",\"$4\"/;
    print $fhout $line . "\n";
}

close($fhin);
close($fhout);

Note that the above regular expression can handle only one "problem" column of this kind. If there are any others, there is no possibility of programmatically assigning correct quotation to such columns (without more information...).
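For reference, the same idea translated to Python, with the same limitation: it assumes exactly four columns where only the third may contain embedded commas.

```python
import re

def quote_line(line):
    """Quote all four fields of a CSV line where only the third
    field may contain embedded commas. Mirrors the Perl regex
    above and shares its single-problem-column limitation."""
    return re.sub(r'^([^,]*),([^,]*),(.*),([^,]*)$',
                  r'"\1","\2","\3","\4"', line)

print(quote_line('1,foo,Some Address, Some Street, 10452,last'))
# "1","foo","Some Address, Some Street, 10452","last"
```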

Bulk Insert issues with CSV file

It seems you have ',' as data in your source CSV file. A similar question is answered at the link below:

sql server Bulk insert csv with data having comma

Comma's causing a problem using BULK INSERT and a Format File

You've done it right as far as I can see - taking knowledge from here:

http://www.sqlservercentral.com/Forums/Topic18289-8-1.aspx#bm87418

Essentially, changing the separator to "\",\"" should be enough, since the comma in the middle of the last field is a bare , and not "," (quote-comma-quote).

Try setting the first and last separators as in the link ("\",\"" and "\"\r") and see if that helps.

Or, preprocess the files: replace "," with some junk like ##$##, replace the remaining , with . (or some other character), and then replace ##$## back with "," before importing. Unless the , is vital in the last field, a dot usually does the trick.
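That preprocessing step can be sketched like this. The ##$## sentinel and the comma-to-dot substitution are the suggestion from the answer above, not a general-purpose fix:

```python
def neutralize_commas(line, junk="##$##"):
    """Protect the quoted delimiters ('","' sequences), neutralize
    the remaining commas, then restore the protected delimiters."""
    line = line.replace('","', junk)   # protect real field boundaries
    line = line.replace(",", ".")      # commas inside fields become dots
    return line.replace(junk, '","')   # restore the boundaries

print(neutralize_commas('"a","b, c","d"'))
# "a","b. c","d"
```

This is lossy (commas inside fields become dots), so only use it when the embedded commas don't matter.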

Bulk Insert CSV into SQL Server 2012 - exclude few comma data starts with double quote

Regarding your problem: from the searching I've done on this, I believe that quotation marks ("") are a problem for imports.

You might need to write a script that processes your data and removes speech marks or extra commas before you do the import to your database.

SQL Server Bulk insert of CSV file with inconsistent quotes

You are going to need to preprocess the file, period.

If you really really need to do this, here is the code. I wrote this because I absolutely had no choice. It is utility code and I'm not proud of it, but it works. The approach is not to get SQL to understand quoted fields, but instead manipulate the file to use an entirely different delimiter.

EDIT: Here is the code in a github repo. It's been improved and now comes with unit tests! https://github.com/chrisclark/Redelim-it

This function takes an input file and will replace all field-delimiting commas (NOT commas inside quoted-text fields, just the actual delimiting ones) with a new delimiter. You can then tell SQL Server to use the new field delimiter instead of a comma. In the version of the function here, the placeholder is <*TMP*> (I feel confident this will not appear in the original CSV - if it does, brace for explosions).

Therefore after running this function you import in sql by doing something like:

BULK INSERT MyTable
FROM 'C:\FileCreatedFromThisFunction.csv'
WITH
(
FIELDTERMINATOR = '<*TMP*>',
ROWTERMINATOR = '\n'
)

And without further ado, the terrible, awful function that I apologize in advance for inflicting on you (edit - I've posted a working program that does this instead of just the function on my blog here):

Private Function CsvToOtherDelimiter(ByVal InputFile As String, ByVal OutputFile As String, Optional ByVal hasHeaders As Boolean = True) As Integer

'hasHeaders: whether the first line contains column headings (which are skipped)
Dim PH1 As String = "<*TMP*>"

Dim objReader As StreamReader = Nothing
Dim count As Integer = 0 'This will also serve as a primary key'
Dim sb As New System.Text.StringBuilder

Try
objReader = New StreamReader(File.OpenRead(InputFile), System.Text.Encoding.Default)
Catch ex As Exception
UpdateStatus(ex.Message)
End Try

If objReader Is Nothing Then
UpdateStatus("Invalid file: " & InputFile)
Return -1
End If

'grab the first line
Dim line = objReader.ReadLine()
'and advance to the next line b/c the first line is column headings
If hasHeaders Then
line = Trim(objReader.ReadLine)
End If

While Not String.IsNullOrEmpty(line) 'loop through each line

count += 1

'Replace commas with our custom-made delimiter
line = line.Replace(",", PH1)

'Find a quoted part of the line, which could legitimately contain commas.
'In that case we will need to identify the quoted section and swap commas back in for our custom placeholder.
Dim starti = line.IndexOf(PH1 & """", 0)
If line.IndexOf("""", 0) = 0 Then starti = 0

While starti > -1 'loop through quoted fields

Dim FieldTerminatorFound As Boolean = False

'Find end quote token (originally a ",)
Dim endi As Integer = line.IndexOf("""" & PH1, starti)

If endi < 0 Then
FieldTerminatorFound = True
endi = line.Length - 1
End If

While Not FieldTerminatorFound

'Find any more quotes that are part of that sequence, if any
Dim backChar As String = """" 'thats one quote
Dim quoteCount = 0
While backChar = """"
quoteCount += 1
backChar = line.Chars(endi - quoteCount)
End While

If quoteCount Mod 2 = 1 Then 'odd number of quotes. real field terminator
FieldTerminatorFound = True
Else 'keep looking
endi = line.IndexOf("""" & PH1, endi + 1)
End If
End While

'Grab the quoted field from the line, now that we have the start and ending indices
Dim source = line.Substring(starti + PH1.Length, endi - starti - PH1.Length + 1)

'And swap the commas back in
line = line.Replace(source, source.Replace(PH1, ","))

'Find the next quoted field
' If endi >= line.Length - 1 Then endi = line.Length 'During the swap, the length of line shrinks so an endi value at the end of the line will fail
starti = line.IndexOf(PH1 & """", starti + PH1.Length)

End While

'Accumulate the processed line for output
sb.AppendLine(line)

line = objReader.ReadLine

End While

objReader.Close()

SaveTextToFile(sb.ToString, OutputFile)

Return count

End Function

