SQL Bulk Import from CSV

Import CSV file into SQL Server

Based on: SQL Server CSV Import

1) The CSV data may contain commas inside a field (e.g. a description
column), so how can the import handle that data?

Solution

If you're using , (comma) as the delimiter, there is no way to differentiate between a comma that is a field terminator and a comma inside your data. I would use a different FIELDTERMINATOR, such as ||. The code would look like the sketch below, and this will handle commas and single slashes in the data perfectly.
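A minimal sketch, assuming the same Schools.csv file and SchoolsTemp staging table used in point 3 below, re-exported with || as the delimiter:

BULK INSERT SchoolsTemp
FROM 'C:\CSVData\Schools.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = '||', --two-character delimiter that won't collide with commas in the data
ROWTERMINATOR = '\n',
TABLOCK
)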

2) If the client creates the CSV from Excel, then fields that contain
commas are enclosed in " ... " (double quotes) [as in the example
below], so how can the import handle this?

Solution

If you're using BULK INSERT, there is no way to handle the double quotes during the load; the data will be
inserted into the rows with the double quotes included.
After inserting the data into the table, you can replace those double quotes with an empty string:

UPDATE YourTable -- placeholder for the table you loaded into
SET columnhavingdoublequotes = REPLACE(columnhavingdoublequotes, '"', '')

3) How do we track rows with bad data that the import skips?
(Does the import skip rows that are not importable?)

Solution

Rows that aren't loaded into the table because of invalid data or formatting can be
handled with the ERRORFILE option: specify an error file name, and the rows that fail
will be written to that file. (Note that by default the load aborts after 10 bad rows;
raise MAXERRORS if you need to tolerate more.) The code should look like this:

BULK INSERT SchoolsTemp
FROM 'C:\CSVData\Schools.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',', --CSV field delimiter
ROWTERMINATOR = '\n', --row delimiter; moves control to the next row
ERRORFILE = 'C:\CSVDATA\SchoolsErrorRows.csv', --rows that fail are written here
TABLOCK
)

SQL Bulk import from CSV

I know this isn't a real solution, but I use a dummy staging table for the import with NVARCHAR set for everything. Then I do an INSERT which strips out the " characters and does the type conversions. It isn't pretty, but it does the job; a sketch is below.
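A minimal sketch of that approach; the staging table, final table, column names, and file path are all hypothetical:

-- Staging table: every column NVARCHAR so nothing is rejected on load
CREATE TABLE dbo.ImportStaging
(
Name NVARCHAR(255),
Amount NVARCHAR(255)
)

BULK INSERT dbo.ImportStaging
FROM 'C:\CSVData\data.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2)

-- Strip the quote characters and convert to the real types on the way out;
-- TRY_CONVERT (SQL Server 2012+) returns NULL instead of failing on bad values
INSERT INTO dbo.FinalTable (Name, Amount)
SELECT REPLACE(Name, '"', ''),
TRY_CONVERT(INT, REPLACE(Amount, '"', ''))
FROM dbo.ImportStaging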

Bulk insert csv file with semicolon as delimiter

The SQL Server import facilities are very intolerant of bad data and even just formatting variations or options. In my career, I have literally spent thousands of work-hours trying to develop and debug import procedures for customers. I can tell you right now, that trying to fix this with SQL alone is both difficult and time-consuming.

When you have this problem (bad data and/or inconsistent formatting) it is almost always easier to find or develop a more flexible tool to pre-process the data into the rigid standard that SQL expects. So I would say that if Excel can parse it then just use Excel automation to pre-process them and then use SQL to import the Excel output. If that's not practical for you, then I'd advise writing your own tool in some client language (C#, Vb, Java, Python, etc.) to pre-process the files.

You can do it in SQL (and I have done it many times), but I promise you that it is a long complicated trek.
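That said, if the only issue is the semicolon itself and the data is otherwise clean and consistent, the delimiter is just a FIELDTERMINATOR setting; a minimal sketch, with a hypothetical staging table and file path:

BULK INSERT dbo.MyStaging
FROM 'C:\CSVData\data.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ';', --semicolon-delimited file
ROWTERMINATOR = '\n',
TABLOCK
)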

SSIS has more flexible error handling for problems like this, but if you are not already familiar with it, it has a very steep learning curve, and your first SSIS project is likely to be very time-consuming as well.

BULK INSERT with CSV File in SQL Server

OK, hopefully all this helps. I took what you posted in the comment and turned it into a stored procedure. The scenario is that you have a flat file that has a consistent format which you'll load to the same table each time (dbo.custCompInfo_Tab). That table will be merged into the final destination (dbo.Daily_Sync) matching on CompanyName. I added a few data columns to illustrate the merge further.

SETUP:

CREATE TABLE dbo.Daily_Sync
(CompanyName VARCHAR(10)
, UserId INT
, col1 INT
, col2 INT
, col3 INT
)

CREATE TABLE dbo.custCompInfo_Tab
(CompanyName VARCHAR(10)
, col1 INT
, col2 INT
, col3 INT
)

I have two data files to load, TrialBalance.txt and TrialBalance2.txt. They contain the following data:

TrialBalance.txt

abc,1,2,3
def,4,5,6
qwe,7,8,9
asd,10,11,12
zxc,13,14,15

TrialBalance2.txt

abc,1,2,3
def,20,21,22
qwe,7,8,9
xcv,10,11,12
xbv,13,14,15

I created a stored procedure that truncates the staging table, loads the table with the data from the file path passed in and then merges it into the destination.

CREATE PROCEDURE dbo.loadDailyData
    @FullFilePath NVARCHAR(MAX)
AS
BEGIN
    DECLARE @sql NVARCHAR(MAX)

    -- Clear the staging table before each load
    TRUNCATE TABLE dbo.custCompInfo_Tab

    -- BULK INSERT won't accept a variable as the file path,
    -- so the statement has to be built dynamically
    SET @sql = N'BULK INSERT dbo.custCompInfo_Tab FROM ''' + @FullFilePath
        + ''' WITH ( FIELDTERMINATOR ='','',ROWTERMINATOR = ''\n'',FIRSTROW = 1 )'

    SELECT @sql -- echo the generated statement for debugging

    EXEC sp_executesql @sql

    -- Merge the staged rows into the destination, matching on CompanyName
    MERGE INTO dbo.Daily_Sync AS TGT
    USING
        (SELECT CompanyName
            , USER_ID() usrid
            , col1
            , col2
            , col3
         FROM dbo.custCompInfo_Tab
        ) AS SRC
    ON TGT.CompanyName = SRC.CompanyName
    WHEN MATCHED
        THEN UPDATE
            SET TGT.col1 = SRC.col1
              , TGT.col2 = SRC.col2
              , TGT.col3 = SRC.col3
    WHEN NOT MATCHED
        THEN INSERT (CompanyName
                   , UserId
                   , col1
                   , col2
                   , col3
                   )
             VALUES (SRC.CompanyName
                   , SRC.usrid
                   , SRC.col1
                   , SRC.col2
                   , SRC.col3
                   );

END

Dynamic SQL is used here because BULK INSERT won't accept a variable for the file path; other than the row counts, the generated BULK INSERT string is all that the procedure returns.

Finally, we can see the tables before and after:

SELECT  *
FROM dbo.custCompInfo_Tab
SELECT *
FROM dbo.Daily_Sync

EXEC dbo.loadDailyData @FullFilePath = 'D:\xmlData\TrialBalance.txt'

SELECT *
FROM dbo.custCompInfo_Tab
SELECT *
FROM dbo.Daily_Sync

EXEC dbo.loadDailyData @FullFilePath = 'D:\xmlData\TrialBalance2.txt'

SELECT *
FROM dbo.custCompInfo_Tab
SELECT *
FROM dbo.Daily_Sync

RESULTS:

(Screenshots omitted.) After the first load, dbo.Daily_Sync contains the five rows from TrialBalance.txt. After the second load, def is updated to 20, 21, 22, the new xcv and xbv rows are inserted, and asd and zxc remain untouched, since the MERGE has no WHEN NOT MATCHED BY SOURCE clause and therefore never deletes.

Can't BULK INSERT Into SQL Table From CSV File

You need to use ROWTERMINATOR='0x0a'

Your code will become:

BULK INSERT dbo.t_process_order_import
FROM 'C:\Root\Product Data\H888 ProcOrd.csv'
WITH
(
FIRSTROW = 2, -- as 1st one is header
FIELDTERMINATOR = '|',
ROWTERMINATOR = '0x0a',
TABLOCK
)

As suggested, I'll improve this answer with my source:
https://learn.microsoft.com/en-us/sql/relational-databases/import-export/specify-field-and-row-terminators-sql-server?view=sql-server-ver16

See the paragraph "Specifying \n as a Row Terminator for Bulk Import".

Quoting the part that matters for this question:

When you specify \n as a row terminator for bulk import, or implicitly use the default row terminator, bcp and the BULK INSERT statement expect a carriage return-line feed combination (CRLF) as the row terminator. If your source file uses a line feed character only (LF) as the row terminator - as is typical in files generated on Unix and Linux computers - use hexadecimal notation to specify the LF row terminator. For example, in a BULK INSERT statement [...]

I want to bulk import from CSV file in sql but \n for new line is not working in SQL

I did it with the help of @DVO. Thank you @DVO for the answer; it is working fine as per your instructions. I used Notepad++ to see the hidden characters and handled them accordingly.
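For anyone hitting the same thing: the fix usually comes down to matching ROWTERMINATOR to whatever hidden characters Notepad++ shows at the end of each line. A sketch of the two common cases, with a hypothetical table and file path:

-- Lines end in CR LF (typical Windows export); '\n' is read as CR+LF by BULK INSERT
BULK INSERT dbo.MyTable
FROM 'C:\CSVData\data.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')

-- Lines end in LF only (typical Unix/Linux export); use hexadecimal notation
BULK INSERT dbo.MyTable
FROM 'C:\CSVData\data.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a')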


