Import CSV file into SQL Server
SQL Server CSV Import
1) The CSV data may contain a , (comma) inside a field (e.g. in a description), so how can the import handle such data?
Solution
If you're using a , (comma) as the delimiter, there is no way to differentiate between a comma that is a field terminator and a comma inside your data. I would use a different FIELDTERMINATOR, such as ||, which handles embedded commas (and single slashes) without ambiguity.
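The code for the answer was omitted here; a sketch of what the BULK INSERT looks like with a || terminator (SchoolsTemp and the file path follow the Schools example used elsewhere in this post):

```sql
BULK INSERT SchoolsTemp
FROM 'C:\CSVData\Schools.csv'
WITH
(
    FIRSTROW = 2,
    FIELDTERMINATOR = '||', -- two-character terminator; embedded commas are now plain data
    ROWTERMINATOR = '\n',
    TABLOCK
);
```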
2) If the client creates the CSV from Excel, then fields that contain a comma are enclosed in " ... " (double quotes) [as in the example below], so how can the import handle this?
Solution
If you're using BULK INSERT, there is no built-in way to handle the double quotes: the data will be inserted into the rows with the quotes intact. After loading the data you can strip the quotes with REPLACE:
UPDATE yourtable
SET columnhavingdoublequotes = REPLACE(columnhavingdoublequotes, '"', '')
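As a side note, on SQL Server 2017 and later, BULK INSERT can parse quoted fields itself via FORMAT = 'CSV' and FIELDQUOTE, which avoids the post-load cleanup. A sketch reusing the Schools example from this post:

```sql
BULK INSERT SchoolsTemp
FROM 'C:\CSVData\Schools.csv'
WITH
(
    FORMAT = 'CSV',   -- SQL Server 2017+: RFC 4180-style CSV parsing
    FIELDQUOTE = '"', -- strips the quotes Excel adds around fields
    FIRSTROW = 2
);
```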
3) How do we track rows with bad data that the import skips (does the import skip rows that are not importable)?
Solution
Rows that aren't loaded into the table because of invalid data or format can be handled with the ERRORFILE property: specify an error file name, and the rows that fail will be written to that file. The code should look like this:
BULK INSERT SchoolsTemp
FROM 'C:\CSVData\Schools.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',', --CSV field delimiter
ROWTERMINATOR = '\n', --Use to shift the control to next row
ERRORFILE = 'C:\CSVDATA\SchoolsErrorRows.csv',
TABLOCK
)
SQL Bulk import from CSV
I know this isn't a real solution, but I use a dummy staging table for the import, with NVARCHAR for every column. Then I do an INSERT which strips out the " characters and does the conversions. It isn't pretty, but it does the job.
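A minimal sketch of that staging-table approach (table and column names here are made up for illustration):

```sql
-- Staging table: everything NVARCHAR so the bulk load never fails on conversion
CREATE TABLE dbo.Schools_Staging
(SchoolName NVARCHAR(200)
, Students  NVARCHAR(50)
);

BULK INSERT dbo.Schools_Staging
FROM 'C:\CSVData\Schools.csv'
WITH (FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

-- Strip the quotes and convert types while copying to the real table
INSERT INTO dbo.Schools (SchoolName, Students)
SELECT REPLACE(SchoolName, '"', '')
     , CAST(REPLACE(Students, '"', '') AS INT)
FROM dbo.Schools_Staging;
```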
Bulk insert csv file with semicolon as delimiter
The SQL Server import facilities are very intolerant of bad data and even just formatting variations or options. In my career, I have literally spent thousands of work-hours trying to develop and debug import procedures for customers. I can tell you right now, that trying to fix this with SQL alone is both difficult and time-consuming.
When you have this problem (bad data and/or inconsistent formatting), it is almost always easier to find or develop a more flexible tool to pre-process the data into the rigid standard that SQL expects. So I would say that if Excel can parse it, then just use Excel automation to pre-process the files and then use SQL to import the Excel output. If that's not practical for you, then I'd advise writing your own tool in some client language (C#, VB, Java, Python, etc.) to pre-process the files.
You can do it in SQL (and I have done it many times), but I promise you that it is a long complicated trek.
SSIS has more flexible error-handling for problems like this, but if you are not already familiar and using it, it has a very steep learning curve and your first SSIS project is likely to be very time-consuming also.
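For the semicolon-delimiter case itself, when the file is clean and consistently formatted, a plain BULK INSERT with FIELDTERMINATOR = ';' is enough (the table name and file path below are placeholders):

```sql
BULK INSERT dbo.TargetTable
FROM 'C:\CSVData\data.csv'
WITH
(
    FIRSTROW = 2,          -- skip the header row
    FIELDTERMINATOR = ';', -- semicolon-delimited fields
    ROWTERMINATOR = '\n',
    TABLOCK
);
```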
BULK INSERT with CSV File in SQL Server
OK, hopefully all this helps. I took what you posted in the comment and turned it into a stored procedure. The scenario is that you have a flat file with a consistent format which you'll load into the same staging table (dbo.custCompInfo_Tab) each time. That table will then be merged into the final destination (dbo.Daily_Sync), matching on CompanyName. I added a few data columns to illustrate the merge further.
SETUP:
CREATE TABLE dbo.Daily_Sync
(CompanyName VARCHAR(10)
, UserId INT
, col1 INT
, col2 INT
, col3 INT
)
CREATE TABLE dbo.custCompInfo_Tab
(CompanyName VARCHAR(10)
, col1 INT
, col2 INT
, col3 INT
)
I have two data files to load, TrialBalance.txt and TrialBalance2.txt. They contain the following data:
TrialBalance.txt
abc,1,2,3
def,4,5,6
qwe,7,8,9
asd,10,11,12
zxc,13,14,15
TrialBalance2.txt
abc,1,2,3
def,20,21,22
qwe,7,8,9
xcv,10,11,12
xbv,13,14,15
I created a stored procedure that truncates the staging table, loads the table with the data from the file path passed in and then merges it into the destination.
CREATE PROCEDURE dbo.loadDailyData
@FullFilePath NVARCHAR(MAX)
AS
BEGIN
DECLARE @sql NVARCHAR(MAX)
TRUNCATE TABLE dbo.custCompInfo_Tab
SET @sql = N'BULK INSERT dbo.custCompInfo_Tab FROM ''' + @FullFilePath
+ ''' WITH ( FIELDTERMINATOR ='','',ROWTERMINATOR = ''\n'',FIRSTROW = 1 )'
SELECT @sql
EXEC sp_executesql @sql
MERGE INTO dbo.Daily_Sync AS TGT
USING
(SELECT CompanyName
, USER_ID() usrid
, col1
, col2
, col3
FROM dbo.custCompInfo_Tab
) AS SRC
ON TGT.Companyname = SRC.CompanyName
WHEN MATCHED
THEN UPDATE
SET TGT.Companyname = SRC.companyname
, TGT.col1 = SRC.col1
, TGT.col2 = SRC.col2
, TGT.col3 = SRC.col3
WHEN NOT MATCHED
THEN INSERT (companyname
, UserId
, col1
, col2
, col3
)
VALUES (SRC.CompanyName
, SRC.usrid
, SRC.col1
, SRC.col2
, SRC.col3
);
END
Dynamic SQL is used here to build the statement; other than the row counts, the BULK INSERT string is all that is returned (the SELECT @sql lets you see exactly what will be executed).
Finally, we can see the tables before and after:
SELECT *
FROM dbo.custCompInfo_Tab
SELECT *
FROM dbo.Daily_Sync
EXEC dbo.loadDailyData @FullFilePath = 'D:\xmlData\TrialBalance.txt'
SELECT *
FROM dbo.custCompInfo_Tab
SELECT *
FROM dbo.Daily_Sync
EXEC dbo.loadDailyData @FullFilePath = 'D:\xmlData\TrialBalance2.txt'
SELECT *
FROM dbo.custCompInfo_Tab
SELECT *
FROM dbo.Daily_Sync
Can't BULK INSERT Into SQL Table From CSV File
You need to use ROWTERMINATOR='0x0a'
Your code will become:
BULK INSERT dbo.t_process_order_import
FROM 'C:\Root\Product Data\H888 ProcOrd.csv'
WITH
(
FIRSTROW = 2, -- as 1st one is header
FIELDTERMINATOR = '|',
ROWTERMINATOR = '0x0a',
TABLOCK
)
As suggested, I'm improving the answer with my source:
https://learn.microsoft.com/en-us/sql/relational-databases/import-export/specify-field-and-row-terminators-sql-server?view=sql-server-ver16
See the paragraph "Specifying \n as a Row Terminator for Bulk Import"; here is the part that matters for this question:
When you specify \n as a row terminator for bulk import, or implicitly use the default row terminator, bcp and the BULK INSERT statement expect a carriage return-line feed combination (CRLF) as the row terminator. If your source file uses only a line feed character (LF) as the row terminator - as is typical in files generated on Unix and Linux computers - use hexadecimal notation to specify the LF row terminator, as in the BULK INSERT statement above.
I want to bulk import from CSV file in sql but \n for new line is not working in SQL
I did it with the help of @DVO. Thank you, @dvo, for the answer. It is working fine as per your instructions. I used Notepad++ to see the hidden characters and handled them accordingly.