Special characters displaying incorrectly after BULK INSERT
You need to run BULK INSERT with the CODEPAGE = 'ACP' option, which converts string data from the Windows code page (1252) to the SQL Server code page:
BULK INSERT dbo.temp FROM 'C:\Temp\file.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', CODEPAGE = 'ACP');
If you are bringing in UTF-8 data on a new enough version of SQL Server (2016 or later):
[...] , CODEPAGE = '65001');
You may also need to specify DATAFILETYPE = 'char|native|widechar|widenative'.
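As a side note, the data file itself has to actually be in the encoding that CODEPAGE names. If it is not, it can be re-encoded before the import; a minimal Python sketch (file names are placeholders, and the sample row is only for demonstration):

```python
# Demonstration: write a Windows-1252 encoded CSV, then re-encode it
# to UTF-8 so that BULK INSERT ... CODEPAGE = '65001' reads it correctly.
# File names here are placeholders.
with open("file_cp1252.csv", "w", encoding="cp1252") as f:
    f.write("id,name\n1,José\n")          # sample single-byte input

with open("file_cp1252.csv", "r", encoding="cp1252") as src:
    data = src.read()

with open("file_utf8.csv", "w", encoding="utf-8") as dst:
    dst.write(data)                        # same text, now UTF-8 bytes
```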
Accented characters not correctly imported with BULK INSERT
It was answered in the comment. Did you try it?
http://msdn.microsoft.com/en-us/library/ms189941.aspx
Option DATAFILETYPE = 'widenative'

Based on Esailiga's comment: did the text get truncated before or after the bulk import? I agree it sounds like the CSV file itself is single-byte. Unicode requires the option DATAFILETYPE = 'widenative'; if the CSV file is single-byte, there is no magic translation back.

What is too bad is that é is extended ASCII and supported by SQL Server's char type, so that is more evidence the problem is in the CSV itself.
SELECT CAST('é' AS char(1))
Notice this works because é is extended ASCII (code point < 255).
Sounds like you need to go back to the source.
The ? in SQL Server output stands for an unknown (unrepresentable) character, the same role � plays in Notepad.
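That ? appears whenever a character cannot be represented in the target single-byte code page. Outside SQL Server, the same degradation is easy to reproduce in Python:

```python
# 'é' exists in the Windows-1252 code page...
assert "é".encode("cp1252") == b"\xe9"

# ...but not in plain 7-bit ASCII: an unrepresentable character
# degrades to '?', just like in SQL Server output.
assert "é".encode("ascii", errors="replace") == b"?"
```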
How to write UTF-8 characters using bulk insert in SQL Server?
You can't. You should first use an N-type data field (nchar/nvarchar), convert your file to UTF-16, and then import it. The database does not support UTF-8 (UTF-8 collations only arrived with SQL Server 2019).
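The "convert your file to UTF-16" step can be scripted; a Python sketch under the assumption that the input is valid UTF-8 (file names are placeholders, and the sample row is only for demonstration):

```python
# Convert a UTF-8 file to UTF-16 so it can be bulk-inserted
# with DATAFILETYPE = 'widechar'. File names are placeholders.
with open("data_utf8.csv", "w", encoding="utf-8") as f:
    f.write("id,name\n1,Renée\n")          # create sample UTF-8 input

with open("data_utf8.csv", "r", encoding="utf-8") as src, \
     open("data_utf16.csv", "w", encoding="utf-16") as dst:
    dst.write(src.read())                   # Python prepends a BOM for utf-16
```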
Bulk insert csv - column value itself has commas
If I had this particular issue and the creation of the CSV files was not under my control, I would resort to a Perl script like this:
# Quote every field so that embedded commas in the third
# column no longer break the CSV parsing.
open(my $fhin,  "<", "MyFile.csv")  or die "Cannot open input: $!";
open(my $fhout, ">", "MyQFile.csv") or die "Cannot open output: $!";
while (my $line = <$fhin>) {
    chomp($line);
    # The greedy (.*) absorbs the extra commas of the third column.
    $line =~ s/^([^,]*),([^,]*),(.*),([^,]*)$/"$1","$2","$3","$4"/;
    print $fhout $line . "\n";
}
close($fhin);
close($fhout);
Note that the above regular expression can handle only one "problem" column of this kind. If there are any others, there is no way to programmatically assign the correct quoting to such columns (without more information...).
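If Perl is not at hand, the identical regex idea carries over to Python; this is just a sketch of the same substitution (the function name is mine):

```python
import re

def quote_fields(line):
    """Wrap all four logical fields in double quotes; the greedy (.*)
    lets the third field keep its embedded commas."""
    return re.sub(r'^([^,]*),([^,]*),(.*),([^,]*)$',
                  r'"\1","\2","\3","\4"', line)
```

For example, `quote_fields('1,foo,desc, with, commas,9')` yields `"1","foo","desc, with, commas","9"`.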
BULK INSERT command is silently exiting without any error/exception after processing 315 batches in MSSQL
- The issue is related to the database cursor object.
- The root cause is that the cursor object fills up, so I had to traverse it with pagination-style reads.
- Because the cursor was full, I was not able to capture the latest output of my stored procedure. In my case the cursor filled up after 315 result sets (one per return statement from the stored procedure, e.g. output of an insert statement such as "10 rows inserted").
Sample Python code:
import pyodbc

# Connect with an ODBC connection string (driver, server and
# credentials depend on your environment).
conn = pyodbc.connect(connection_string)

query = "a stored procedure"  # placeholder for the actual EXEC call

# Execute the query and get the cursor object
cursor = conn.execute(query)

# Read the first result set, then drain every remaining one
# so the cursor never fills up
result = cursor.fetchall() if cursor.description else None
while cursor.nextset():
    result = cursor.fetchall() if cursor.description else None
How to properly convert accented characters using FOR XML
I don't know for sure, but I'd assume that this is not possible without a hack.

SQL Server treats XML as 2-byte UTF-16 internally, so any character displayable in that range does not need escaping. It is a different matter for unprintable characters, or for characters with a declarative meaning such as <, > and & (or quotes, in some cases).

I'd probably cast to NVARCHAR(MAX) and then do the replacement at string level at the point where you export this data. You won't be able to cast the result back to XML:
--Push your test XML into a variable
DECLARE @xml XML=
(
SELECT 'é' AS Accent
FOR XML PATH('')
);
SELECT @xml; --<Accent>é</Accent>
--Cast it to string and do the replacement on string-level
SELECT REPLACE(CAST(@xml AS NVARCHAR(MAX)),'é', 'é') --<Accent>é</Accent>
--Trying to cast this result back to XML fails
SELECT CAST(REPLACE(CAST(@xml AS NVARCHAR(MAX)),'é', 'é') AS XML);
XML parsing: line 1, character 16, well formed check: undeclared entity
Obviously SQL Server's XML engine doesn't even know this entity...
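For what it's worth, `&eacute;` is an HTML entity, not one of XML's five predefined entities (`&lt;`, `&gt;`, `&amp;`, `&quot;`, `&apos;`), so any conforming XML parser rejects it. This is easy to confirm outside SQL Server with Python's standard parser:

```python
import xml.etree.ElementTree as ET

# A numeric character reference is valid XML...
assert ET.fromstring("<Accent>&#233;</Accent>").text == "é"

# ...but &eacute; is an undeclared entity in plain XML,
# so the parser rejects it just like SQL Server does.
try:
    ET.fromstring("<Accent>&eacute;</Accent>")
    raise AssertionError("should not parse")
except ET.ParseError:
    pass
```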