Bulk Load Data Conversion Error (Type Mismatch or Invalid Character for the Specified Codepage) for Row 1, Column 4 (Year)

Try using a format file since your data file only has 4 columns. Otherwise, try OPENROWSET or use a staging table.

myTestFormatFiles.Fmt may look like:

9.0
4
1       SQLINT        0       3       ","      1     StudentNo      ""
2       SQLCHAR       0       100     ","      2     FirstName      SQL_Latin1_General_CP1_CI_AS
3       SQLCHAR       0       100     ","      3     LastName       SQL_Latin1_General_CP1_CI_AS
4       SQLINT        0       4       "\r\n"   4     Year           ""

[Sample image of the format file field layout (source: microsoft.com)]

This tutorial on skipping a column with BULK INSERT may also help.

Your statement then would look like:

USE xta9354
GO
BULK INSERT xta9354.dbo.Students
FROM 'd:\userdata\xta9_Students.txt'
WITH (FORMATFILE = 'C:\myTestFormatFiles.Fmt')
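
If you'd rather go the OPENROWSET route mentioned above, a minimal sketch reusing the same paths and the column names from the format file might look like this (assuming the Students table exposes matching columns):

-- Sketch only: paths and column names come from the format file example above
INSERT INTO xta9354.dbo.Students (StudentNo, FirstName, LastName, [Year])
SELECT StudentNo, FirstName, LastName, [Year]
FROM OPENROWSET(
        BULK 'd:\userdata\xta9_Students.txt',
        FORMATFILE = 'C:\myTestFormatFiles.Fmt'
     ) AS src;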

Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 1 (ID)

It seems like some value is being inserted into a column that does not accept it. So in your ID column, which requires an INT, you probably have a non-numeric character somewhere in the data. You can make every column NVARCHAR(MAX), load all the data, and then query to see which values in that column aren't valid integers using ISNUMERIC().

CREATE TABLE [dbo].[testpostingvoucher](
    [ID] NVARCHAR(MAX) NULL,
    [date] NVARCHAR(MAX) NULL,
    [checkdigit] NVARCHAR(MAX) NULL,
    [credit] NVARCHAR(MAX) NULL -- probably want DECIMAL(18,2) rather than INT once the data is clean
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
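
Once the data is loaded into the all-NVARCHAR staging table above, a query along these lines can surface the ID values that are not clean integers (table and column names are the ones from the sketch above):

-- ISNUMERIC alone also accepts values like '.', '$' and '1e5',
-- so the LIKE filter tightens the check to digits only
SELECT *
FROM [dbo].[testpostingvoucher]
WHERE ISNUMERIC([ID]) = 0
   OR [ID] LIKE '%[^0-9]%';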

Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 4 in Azure Synapse

I’m guessing you may have a header row that should be skipped. Drop your external table and then drop and recreate the external file format as follows:


CREATE EXTERNAL FILE FORMAT [SynapseDelimitedTextFormat]
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = ',',
        USE_TYPE_DEFAULT = FALSE,
        FIRST_ROW = 2
    )
);
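
Because the external table depends on the file format, the table has to be dropped first and recreated last. Here is a sketch of the full sequence; MyExternalTable, MyDataSource, the column list, and the LOCATION are placeholders for your own objects:

-- placeholder names: substitute your own external table, data source and folder
DROP EXTERNAL TABLE [dbo].[MyExternalTable];
DROP EXTERNAL FILE FORMAT [SynapseDelimitedTextFormat];

-- recreate the file format with FIRST_ROW = 2 as shown above, then the table that uses it
CREATE EXTERNAL TABLE [dbo].[MyExternalTable] (
    [Col1] INT,
    [Col2] NVARCHAR(100),
    [Col3] NVARCHAR(100),
    [Col4] INT
)
WITH (
    LOCATION = '/input/',
    DATA_SOURCE = [MyDataSource],
    FILE_FORMAT = [SynapseDelimitedTextFormat]
);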

CSV import failing - Bulk load data conversion error (type mismatch or invalid character for the specified codepage)

The solution was to create a table matching the CSV I was reading. I then read each CSV row into a dictionary and appended the row dictionary to a list. Finally, I used SQLAlchemy's session.bulk_insert_mappings to bulk insert the list of row dictionaries:

import pyodbc
import sqlalchemy as db
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session,sessionmaker,mapper
from sqlalchemy import MetaData,create_engine
from datetime import datetime

import urllib
import os
import glob
import chardet
import re
import csv

def rtn_date(dtStr):
    if dtStr == '':
        return None
    else:
        try:
            # parse the d/m/Y date string; anything unparseable falls through to None
            rtnDate = datetime.strptime(dtStr, '%d/%m/%Y')
            return rtnDate
        except (TypeError, ValueError):
            return None

log_file =r'XXXXXX.txt'
database = 'XXXXXXX'
driver_name ='ODBC Driver 11 for SQL Server'

# note: '://' after the dialect and '&' between query options; add your server/DSN as needed
engine = create_engine('mssql+pyodbc://{0}?driver={1}&trusted_connection=yes'.format(database, driver_name.replace(' ', '+')))
connection = engine.connect()

Base = automap_base()
Base.prepare(engine, reflect=True)

date_test_table = Base.classes.XXXXXXXXX
Session = sessionmaker(bind=engine)
session = Session()

#define folder where all csv files are
folder =r'XXXXXXX'
os.chdir(folder)
extension ='csv'

skipped =[]
all_filenames =[i for i in glob.glob('*.{}'.format(extension))]

for i in range(len(all_filenames)):

    try:

        with open(all_filenames[i], 'r') as csv_file:
            csv_reader = csv.reader(csv_file)

            buffer = []

            firstline = True

            for row in csv_reader:
                if firstline:  # skip the header line
                    firstline = False
                    continue

                if len(re.findall(r'\d+', str(all_filenames[i]))) > 0:
                    strType = 'Normal'
                else:
                    strType = 'MonthEnd'

                buffer.append({
'Date': row[0],
'ISIN': row[1],
'CUSIP': row[2],
'Ticker': row[3],
'Issuer': row[4],
'Coupon': row[5],
'Final_Maturity': row[6],
'Workout_date': row[7],
'Expected_Remaining_Life': row[8],
'Time_to_Maturity': row[9],
'Coupon_Frequency': row[10],
'Notional_Amount_Unconstrained': row[11],
'Notional_Amount_Constrained': row[12],
'PIK_Original_Amount_Issued': row[13],
'PIK_Factor': row[14],
'Redemption_Factor': row[15],
'Bid_Price': row[16],
'Ask_Price': row[17],
'Accrued_Interest': row[18],
'Coupon_Payment': row[19],
'Coupon_Adjustment': row[20],
'Ex_Dividend_Flag': row[21],
'Dirty_Price': row[22],
'Market_Value_Unconstrained': row[23],
'Market_Value_Constrained': row[24],
'Cash_Payment_Unconstrained': row[25],
'Cash_Payment_Constrained': row[26],
'Street_Yield_to_Maturity': row[27],
'Annual_Yield_to_Maturity': row[28],
'Semi_Annual_Yield_to_Maturity': row[29],
'Street_Yield_to_Worst': row[30],
'Annual_Yield_to_Worst': row[31],
'Semi_Annual_Yield_to_Worst': row[32],
'OAS_Street_Yield': row[33],
'OAS_Annual_Yield': row[34],
'OAS_Semi_Annual_Yield': row[35],
'Annual_Benchmark_Spread': row[36],
'Semi_Annual_Benchmark_Spread': row[37],
'Z_Spread': row[38],
'OAS_Spread': row[39],
'Asset_Swap_Margin': row[40],
'Simple_Margin': row[41],
'Discount_Margin': row[42],
'Duration_to_Maturity': row[43],
'Street_Modified_Duration_to_Maturity': row[44],
'Annual_Modified_Duration_to_Maturity': row[45],
'Semi_Annual_Modified_Duration_to_Maturity': row[46],
'Duration_to_Worst': row[47],
'Street_Modified_Duration_to_Worst': row[48],
'Annual_Modified_Duration_to_Worst': row[49],
'Semi_Annual_Modified_Duration_to_Worst': row[50],
'OAS_Duration': row[51],
'OAS_Modified_Duration': row[52],
'OAS_Annual_Modified_Duration': row[53],
'OAS_Semi_Annual_Modified_Duration': row[54],
'Spread_Duration': row[55],
'Z_Spread_Duration': row[56],
'Street_Convexity_to_Maturity': row[57],
'Annual_Convexity_to_Maturity': row[58],
'Semi_Annual_Convexity_to_Maturity': row[59],
'Street_Convexity_to_Worst': row[60],
'Annual_Convexity_to_Worst': row[61],
'Semi_Annual_Convexity_to_Worst': row[62],
'OAS_Convexity': row[63],
'Benchmark_ISIN': row[64],
'Daily_Return': row[65],
'Month_to_Date_Return': row[66],
'Quarter_to_Date_Return': row[67],
'Year_to_Date_Return': row[68],
'Daily_Excess_Return': row[69],
'Month_to_date_Excess_Return': row[70],
'Level_0': row[71],
'Level_1': row[72],
'Level_2': row[73],
'Level_3': row[74],
'Level_4': row[75],
'Level_5': row[76],
'Debt': row[77],
'Rating': row[78],
'Is_Callable': row[79],
'Is_Core_index': row[80],
'Is_Crossover': row[81],
'Is_Fixed_to_Float': row[82],
'Is_FRN': row[83],
'Is_Hybrid': row[84],
'Is_Perpetual': row[85],
'Is_PIK': row[86],
'Is_Sinking': row[87],
'Is_Zero_Coupon': row[88],
'_1_3_years': row[89],
'_1_5_years': row[90],
'_1_10_years': row[91],
'_3_5_years': row[92],
'_5_7_years': row[93],
'_5_10_years': row[94],
'_7_10_years': row[95],
'_5_years': row[96],
'_10_years': row[97],
'Source':str(all_filenames[i]),
'File_Type': strType
                })

        session.bulk_insert_mappings(date_test_table, buffer)
        session.commit()

    except Exception as e:
        print(e)
        skipped.append('{0} {1}'.format(str(all_filenames[i]), '\n'))

    print('processed {0} out of {1} files'.format(str(i + 1), str(len(all_filenames))))

session.close()

if len(skipped) > 0:

    open(log_file, 'w').close()
    skip_log = open(log_file, 'w')
    skip_log.writelines(skipped)
    skip_log.close()

Bulk Load Data Conversion Error - Can't Find Answer

It's likely that your data has an error in it, that is, a character or value that can't be explicitly converted to NUMERIC or DECIMAL. One way to check this and fix it is to

  1. Change [Delta_SM_RR] numeric(10,5) to [Delta_SM_RR] nvarchar(256)
  2. Run the bulk insert
  3. Find your error row: select * from Example_Table where [Delta_SM_RR] like '%[^-.0-9]%'
  4. Fix the data at the source, or delete from Example_Table where [Delta_SM_RR] like '%[^-.0-9]%'

The statements in steps 3 and 4 return or delete rows containing something other than a digit, period, or hyphen.

For your date column you can follow the same logic: change the column to VARCHAR, run the bulk insert, and then use ISDATE() to find the values that can't be converted, as in the sketch below.
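
For example, assuming the dates were staged into a VARCHAR column named [OrderDate] (a placeholder name) in Example_Table, something like this would surface the unconvertible values:

-- rows whose staged date text cannot be converted to a date/time value
SELECT *
FROM Example_Table
WHERE [OrderDate] IS NOT NULL
  AND ISDATE([OrderDate]) = 0;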

Error: Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 1 (Date).

I configured the "date" column as "smalldatetime" instead of "datetime", tried it again, and everything imported 100%.

There are two potential methods to do this:

  • Method 1: Discard the formatting file; the data reads in fine without it (this definitely works).
  • Method 2: If LINQ-to-Entities insists on generating a "datetime" in this column, change the formatting file to specify "smalldatetime" so it reads in properly (I haven't tried this, but I imagine it would work).
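
If you go the column-change route described above, the change itself is a one-liner; the table and column names here are placeholders:

-- switch the destination column to SMALLDATETIME (hypothetical table/column names)
ALTER TABLE [dbo].[MyImportTable] ALTER COLUMN [Date] SMALLDATETIME NULL;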

