Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 4 (Year)
Try using a format file, since your data file only has 4 columns. Otherwise, try OPENROWSET or use a staging table.
myTestFormatFiles.Fmt
may look like:
9.0
4
1       SQLCHAR       0       3       ","      1     StudentNo     ""
2       SQLCHAR       0       100     ","      2     FirstName     SQL_Latin1_General_CP1_CI_AS
3       SQLCHAR       0       100     ","      3     LastName      SQL_Latin1_General_CP1_CI_AS
4       SQLCHAR       0       4       "\r\n"   4     Year          ""
Note that a character-mode data file uses SQLCHAR as the host file data type for every field, even for columns that are INT in the table.
(source: microsoft.com)
This tutorial on skipping a column with BULK INSERT
may also help.
Your statement then would look like:
USE xta9354
GO
BULK INSERT xta9354.dbo.Students
FROM 'd:\userdata\xta9_Students.txt'
WITH (FORMATFILE = 'C:\myTestFormatFiles.Fmt')
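Before re-running the statement, it can help to verify that every line of the data file really matches the format file. A minimal Python sketch of that pre-check (the 4-column layout comes from the format file above; everything else is illustrative):

```python
import csv

def find_bad_rows(lines):
    """Return (line_number, row) pairs that would fail the two INT columns."""
    bad = []
    for lineno, row in enumerate(csv.reader(lines), start=1):
        if len(row) != 4:
            bad.append((lineno, row))
            continue
        student_no, _first, _last, year = row
        # columns 1 (StudentNo) and 4 (Year) must be plain integers
        if not (student_no.strip().lstrip('+-').isdigit()
                and year.strip().lstrip('+-').isdigit()):
            bad.append((lineno, row))
    return bad
```

You could run it as `find_bad_rows(open(r'd:\userdata\xta9_Students.txt'))` to see exactly which lines trip the Year conversion.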
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 1 (ID)
Seems like some value is being inserted into a column that does not accept it. So in your ID column, which requires an INT, you potentially have another character in the value. You can try making every column NVARCHAR(MAX), loading all the data, and then querying with ISNUMERIC() to see what is in the column that isn't an INT.
CREATE TABLE [dbo].[testpostingvoucher](
    [ID] NVARCHAR(MAX) NULL,
    [date] NVARCHAR(MAX) NULL,
    [checkdigit] NVARCHAR(MAX) NULL,
    [credit] NVARCHAR(MAX) NULL -- probably want DECIMAL(18,2) rather than INT once cleaned
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
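One caveat with ISNUMERIC(): it returns 1 for values such as '1e4', '$5', or '1,000', which still fail CAST(... AS INT), so on SQL Server 2012+ a filter like WHERE TRY_CAST([ID] AS INT) IS NULL is more reliable. The same strict check, sketched in Python for inspecting the raw file before loading (the sample values are illustrative):

```python
def is_strict_int(value):
    """True only if the string would survive CAST(value AS INT) in SQL Server."""
    value = value.strip()
    if value[:1] in ('+', '-'):
        value = value[1:]
    return value.isdigit()

# all of these fail a strict INT check, although several pass ISNUMERIC()
suspects = ['1e4', '$5', '1,000', '3.14', '']
non_ints = [v for v in suspects if not is_strict_int(v)]
```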
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 4 in Azure Synapse
I’m guessing you may have a header row that should be skipped. Drop your external table and then drop and recreate the external file format as follows:
CREATE EXTERNAL FILE FORMAT [SynapseDelimitedTextFormat]
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',',
        USE_TYPE_DEFAULT = FALSE,
        FIRST_ROW = 2
    )
)
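FIRST_ROW = 2 simply keeps the header line away from the type-conversion stage. A small Python illustration of why a header trips this kind of error (the data and column names are made up):

```python
import csv
import io

data = "Date,Amount\n2021-01-05,10\n2021-01-06,20\n"

def load(text, skip_header):
    reader = csv.reader(io.StringIO(text))
    if skip_header:
        next(reader)  # the analogue of FIRST_ROW = 2
    return [int(amount) for _date, amount in reader]

# without the skip, int('Amount') raises ValueError -- the Python
# counterpart of the "invalid character" bulk load error
try:
    load(data, skip_header=False)
    ok_without_skip = True
except ValueError:
    ok_without_skip = False

amounts = load(data, skip_header=True)
```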
CSV import failing - Bulk load data conversion error (type mismatch or invalid character for the specified codepage)
The solution was to create a table matching the CSV I was reading. I then read each CSV row into a dictionary, appended the row dictionary to a list, and used SQLAlchemy's session.bulk_insert_mappings to bulk insert the list of row dictionaries:
import pyodbc
import sqlalchemy as db
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session,sessionmaker,mapper
from sqlalchemy import MetaData,create_engine
from datetime import datetime
import urllib
import os
import glob
import chardet
import re
import csv
def rtn_date(dtStr):
    if dtStr == '':
        return None
    try:
        # parse the source format, then emit an ISO date string
        return datetime.strptime(dtStr, '%d/%m/%Y').strftime('%Y-%m-%d')
    except ValueError:
        return None
log_file =r'XXXXXX.txt'
database = 'XXXXXXX'
driver_name ='ODBC Driver 11 for SQL Server'
# one '?' before the query string, extra options joined with '&', and spaces in
# the driver name encoded as '+'; this assumes `database` names an ODBC DSN
engine = create_engine('mssql+pyodbc://{0}?driver={1}&trusted_connection=yes'.format(database, driver_name.replace(' ', '+')))
connection= engine.connect()
Base = automap_base()
Base.prepare(engine, reflect=True)
date_test_table = Base.classes.XXXXXXXXX
Session = sessionmaker(bind=engine)
session = Session()
#define folder where all csv files are
folder =r'XXXXXXX'
os.chdir(folder)
extension ='csv'
skipped =[]
all_filenames =[i for i in glob.glob('*.{}'.format(extension))]
for i in range(len(all_filenames)):
    try:
        with open(all_filenames[i], 'r', newline='') as csv_file:
            csv_reader = csv.reader(csv_file)
            buffer = []
            firstline = True
            # file names containing digits get File_Type 'Normal'; the rest 'MonthEnd'
            if len(re.findall(r'\d+', str(all_filenames[i]))) > 0:
                strType = 'Normal'
            else:
                strType = 'MonthEnd'
            for row in csv_reader:
                if firstline:  # skip the header row
                    firstline = False
                    continue
                buffer.append({
                    'Date': row[0],
                    'ISIN': row[1],
                    'CUSIP': row[2],
                    'Ticker': row[3],
                    'Issuer': row[4],
                    'Coupon': row[5],
                    'Final_Maturity': row[6],
                    'Workout_date': row[7],
                    'Expected_Remaining_Life': row[8],
                    'Time_to_Maturity': row[9],
                    'Coupon_Frequency': row[10],
                    'Notional_Amount_Unconstrained': row[11],
                    'Notional_Amount_Constrained': row[12],
                    'PIK_Original_Amount_Issued': row[13],
                    'PIK_Factor': row[14],
                    'Redemption_Factor': row[15],
                    'Bid_Price': row[16],
                    'Ask_Price': row[17],
                    'Accrued_Interest': row[18],
                    'Coupon_Payment': row[19],
                    'Coupon_Adjustment': row[20],
                    'Ex_Dividend_Flag': row[21],
                    'Dirty_Price': row[22],
                    'Market_Value_Unconstrained': row[23],
                    'Market_Value_Constrained': row[24],
                    'Cash_Payment_Unconstrained': row[25],
                    'Cash_Payment_Constrained': row[26],
                    'Street_Yield_to_Maturity': row[27],
                    'Annual_Yield_to_Maturity': row[28],
                    'Semi_Annual_Yield_to_Maturity': row[29],
                    'Street_Yield_to_Worst': row[30],
                    'Annual_Yield_to_Worst': row[31],
                    'Semi_Annual_Yield_to_Worst': row[32],
                    'OAS_Street_Yield': row[33],
                    'OAS_Annual_Yield': row[34],
                    'OAS_Semi_Annual_Yield': row[35],
                    'Annual_Benchmark_Spread': row[36],
                    'Semi_Annual_Benchmark_Spread': row[37],
                    'Z_Spread': row[38],
                    'OAS_Spread': row[39],
                    'Asset_Swap_Margin': row[40],
                    'Simple_Margin': row[41],
                    'Discount_Margin': row[42],
                    'Duration_to_Maturity': row[43],
                    'Street_Modified_Duration_to_Maturity': row[44],
                    'Annual_Modified_Duration_to_Maturity': row[45],
                    'Semi_Annual_Modified_Duration_to_Maturity': row[46],
                    'Duration_to_Worst': row[47],
                    'Street_Modified_Duration_to_Worst': row[48],
                    'Annual_Modified_Duration_to_Worst': row[49],
                    'Semi_Annual_Modified_Duration_to_Worst': row[50],
                    'OAS_Duration': row[51],
                    'OAS_Modified_Duration': row[52],
                    'OAS_Annual_Modified_Duration': row[53],
                    'OAS_Semi_Annual_Modified_Duration': row[54],
                    'Spread_Duration': row[55],
                    'Z_Spread_Duration': row[56],
                    'Street_Convexity_to_Maturity': row[57],
                    'Annual_Convexity_to_Maturity': row[58],
                    'Semi_Annual_Convexity_to_Maturity': row[59],
                    'Street_Convexity_to_Worst': row[60],
                    'Annual_Convexity_to_Worst': row[61],
                    'Semi_Annual_Convexity_to_Worst': row[62],
                    'OAS_Convexity': row[63],
                    'Benchmark_ISIN': row[64],
                    'Daily_Return': row[65],
                    'Month_to_Date_Return': row[66],
                    'Quarter_to_Date_Return': row[67],
                    'Year_to_Date_Return': row[68],
                    'Daily_Excess_Return': row[69],
                    'Month_to_date_Excess_Return': row[70],
                    'Level_0': row[71],
                    'Level_1': row[72],
                    'Level_2': row[73],
                    'Level_3': row[74],
                    'Level_4': row[75],
                    'Level_5': row[76],
                    'Debt': row[77],
                    'Rating': row[78],
                    'Is_Callable': row[79],
                    'Is_Core_index': row[80],
                    'Is_Crossover': row[81],
                    'Is_Fixed_to_Float': row[82],
                    'Is_FRN': row[83],
                    'Is_Hybrid': row[84],
                    'Is_Perpetual': row[85],
                    'Is_PIK': row[86],
                    'Is_Sinking': row[87],
                    'Is_Zero_Coupon': row[88],
                    '_1_3_years': row[89],
                    '_1_5_years': row[90],
                    '_1_10_years': row[91],
                    '_3_5_years': row[92],
                    '_5_7_years': row[93],
                    '_5_10_years': row[94],
                    '_7_10_years': row[95],
                    '_5_years': row[96],
                    '_10_years': row[97],
                    'Source': str(all_filenames[i]),
                    'File_Type': strType
                })
            session.bulk_insert_mappings(date_test_table, buffer)
            session.commit()
    except Exception as e:
        print(e)
        skipped.append('{0}\n'.format(str(all_filenames[i])))  # '\n', not r'\n'
    print('processed {0} out of {1} files'.format(i + 1, len(all_filenames)))
session.close()
if len(skipped) > 0:
    # 'w' mode truncates the log, so no separate truncate step is needed
    with open(log_file, 'w') as skip_log:
        skip_log.writelines(skipped)
Bulk Load Data Conversion Error - Can't Find Answer
It's likely that your data has an error in it; that is, there is a character or value that can't be converted explicitly to NUMERIC or DECIMAL. One way to check and fix this is to:
- Change [Delta_SM_RR] numeric(10,5) to [Delta_SM_RR] nvarchar(256)
- Run the bulk insert
- Find your error row:
select * from Example_Table where [Delta_SM_RR] like '%[^-.0-9]%'
- Fix the data at the source, or
delete from Example_Table where [Delta_SM_RR] like '%[^-.0-9]%'
The last two statements return/delete rows where there is something other than a digit, period, or hyphen.
For your date column you can follow the same logic: change the column to VARCHAR, then use ISDATE() to find the values which can't be converted.
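The LIKE pattern '%[^-.0-9]%' translates directly into a regular expression, so the same scan can be run on the raw file before loading. A sketch (sample values are illustrative):

```python
import re

# matches any character that is not a hyphen, period, or digit,
# mirroring the T-SQL pattern '%[^-.0-9]%'
NON_NUMERIC = re.compile(r'[^-.0-9]')

def bad_values(values):
    """Return the values the SELECT ... LIKE '%[^-.0-9]%' query would flag."""
    return [v for v in values if NON_NUMERIC.search(v)]
```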
Error: Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 1 (Date).
Configured the "date" column to be "smalldatetime" instead of "datetime". Tried it, and everything imported 100%.
There are two potential methods to do this:
- Method 1: Discard the formatting file, it will read it in fine without it (this definitely works).
- Method 2: If LINQ-to-Entities insists on generating a "datetime" in this column, then change the formatting file to specify "smalldatetime" so it reads in properly (I haven't tried this, but I imagine it might work).
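If you go the smalldatetime route, keep its limits in mind: it stores dates from 1900-01-01 through 2079-06-06 and has minute precision (seconds are rounded). A quick pre-check of incoming values in Python (the parse format is an assumption about your data):

```python
from datetime import datetime

# documented smalldatetime range, with a conservative end-of-day bound
SMALLDATETIME_MIN = datetime(1900, 1, 1)
SMALLDATETIME_MAX = datetime(2079, 6, 6, 23, 59)

def fits_smalldatetime(value, fmt='%Y-%m-%d %H:%M:%S'):
    """True if the string parses with fmt and falls inside smalldatetime's range."""
    try:
        parsed = datetime.strptime(value, fmt)
    except ValueError:
        return False
    return SMALLDATETIME_MIN <= parsed <= SMALLDATETIME_MAX
```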