Import Fixed Width Text to SQL

Import fixed width text to SQL

When you feel more at home with SQL than importing tools, you could bulk import the file into a single VARCHAR(255) column in a staging table. Then process all the records with SQL and transform them to your destination table:

CREATE TABLE #DaTable(MyString VARCHAR(255)) 
INSERT INTO #DaTable(MyString) VALUES ('99 0882300 25 YATES ANTHONY V MAY 01 12 04 123456 12345678')

INSERT INTO FInalTable(Col1, Col2, Col3, Name)
SELECT CAST(SUBSTRINg(MyString, 1, 3) AS INT) as Col1,
CAST(SUBSTRING(MyString, 4, 7) AS INT) as Col2,
CAST(SUBSTRING(MyString, 12, 3) AS INT) as Col3,
SUBSTRING(MyString, 15, 6) as Name
FROM #DaTable

result: 99 882300 25 YATES

Fixed width file to sql server wide table

Here is a sample using PIVOT:

CREATE TABLE dbo.rawcompanyinfo_delimited(ID smallint NOT NULL, Data VARCHAR(MAX) NULL, State CHAR(2));

INSERT INTO dbo.rawcompanyinfo_delimited(id,data,State)values(100,'ABCINC    111  333.5 USD','PA')
INSERT INTO dbo.rawcompanyinfo_delimited(id,data,State)values(200,'APPLE 213 333.5 USD','PA')
INSERT INTO dbo.rawcompanyinfo_delimited(id,data,State)values(300,'BTEC 100 123.5 USD','PA')
INSERT INTO dbo.rawcompanyinfo_delimited(id,data,State)values(400,'S INC 123 333.0 USD','PA')
INSERT INTO dbo.rawcompanyinfo_delimited(id,data,State)values(500,'B INC 145 123.2 USD','PA')

CREATE TABLE CompaniesLenInfo(Startingposition int not null, Length int not null, columnnames varchar(100) not null)
insert into CompaniesLenInfo(Startingposition,Length,columnnames)VALUES(1,10,'CompanyName')
insert into CompaniesLenInfo(Startingposition,Length,columnnames)VALUES(11,3,'CompanyID')
insert into CompaniesLenInfo(Startingposition,Length,columnnames)VALUES(15,5,'TotalIncome')
insert into CompaniesLenInfo(Startingposition,Length,columnnames)VALUES(21,3,'Currency')

SELECT *
FROM (
SELECT r.ID,r.State,SUBSTRING(r.Data,ci.Startingposition,ci.Length) AS val,ci.columnnames
FROM rawcompanyinfo_delimited AS r,CompaniesLenInfo AS ci
) AS t PIVOT(MAX(val) FOR columnnames IN (CompanyName,CompanyID,TotalIncome,Currency) ) p

+-----+-------+-------------+-----------+-------------+----------+
| ID | State | CompanyName | CompanyID | TotalIncome | Currency |
+-----+-------+-------------+-----------+-------------+----------+
| 100 | PA | ABCINC | 111 | 333. | US |
| 200 | PA | APPLE | 213 | 333. | US |
| 300 | PA | BTEC | 100 | 123. | US |
| 400 | PA | S INC | 123 | 333. | US |
| 500 | PA | B INC | 145 | 123. | US |
+-----+-------+-------------+-----------+-------------+----------+

Import fixed width file with virtual decimals

When dealing with implied decimal places, there are really two approaches: string manipulation, as Hadi demonstrates, to put the decimal character into the string for an explicit value.

The second approach is to just math it out.

Either approach will result in you having to define the column in the flat file connection manager differently than the target type. Once the data is injected into the pipeline, then you will need to modify it to meet the destination type.

If you use Hadi's solution, save yourself the trouble of multiple casts (data type changes) and just import the value as a string. You perform string manipulation using a Derived Column Transformation to split the string apart at the location of the implicit decimal place. Logic approximately

SUBSTRING(MyColumn, 1, IntegerDigits) + "." + RIGHT(MyColumn, ScaleDigits)

where IntegerDigits and ScaleDigits are the number of digits before and after the decimal place.

I would call that column something like MyColumnWithDecimalPlace in the Derived Column Transformation.

I would then add a second Derived Column Transformation (or a Data Conversion Transformation) after the above Derived Column Transformation. In this step, we'll convert this string value to a numeric value. Why two steps? If something is weird with the data, I can put a data view (or data tap) between those two points and eye ball what is wrong. If I try to do it all in one fell swoop, then I can't see whether it's the adding a decimal place into a string that is failing or the string value I just built is broken and can't be turned into a number.

The other approach would be to define the source column as something like an Integer and then use a Derived Column Transformation to math it out. That would look something like

MyColumn / POWER(10, ScaleDigits)

again where ScaleDigits is the number of decimal places we expect to have. I tend to favor this approach as it's simpler - one operation and done.

How to import a fixed width flat file into database using SSIS?

Here is a sample package created using SSIS 2008 R2 that explains how to import a flat file into a database table.

  • Create a fixed-width flat file named Fixed_Width_File.txt with data as shown in the screenshot. The screenshot uses Notepad++ to display the file contents. It has the capability to show the special characters like carriage return and line feed. CR LF denotes the row delimiters Carriage return and Line feed.

Flat file data

  • In the SQL server database, create a table named dbo.FlatFile using the create script provided under SQL Scripts section.

  • Create a new SSIS package and add a new OLE DB Connection manager that would connect to the SQL Server database. Let's assume that the OLE DB Connection manager is named as SQLServer.

Connection manager

  • On the package's control flow tab, place a Data Flow Task.

Data flow task

  • Double-click on the data flow task and you will be taken to the data flow tab. On the data flow tab, place a Flat File Source. Double-click on the flat file source and the Flat File Source Editor will appear. Click the New button to open the Flat File Connection Manager Editor.

  • On the General section of the Flat File Source Editor, enter a value in Connection manager name (say Source) and browse to the flat file location and select the file. This example uses the sample file in the path C:\temp\Fixed_Width_File.txt If you have header rows in your file, you can enter a value 1 in the Header rows to skip textbox to skip the header row.

Flat file connection manager editor General

  • Click on the Columns section. Change the font according to your choice I chose Courier New so I could see more data with less scrolling. Enter the value 69 in the Row width text box. This value is the sum of width of all your columns + 2 for the row delimiter. Once you have set the correct row width, you should see the fixed width file data correctly on the Source data columns section. Now, you have to click at the appropriate locations to determine the column limits. Note the sections 4, 5, 6 and in the below screenshot.

Flat file connection manager editor Columns

  • Click on the Advanced section. You will notice 5 columns created for you automatically based on the column limits that we set on the Columns section in the previous step. The fifth column is for row delimiter.

Flat file connection manager editor Advanced

  • Rename the column names as FirstName, LastName, Id, Date and RowDelimiter

Flat file connection manager editor Advanced Renamed

  • By default, the columns will be set with DataType string [DT_STR]. If we are fairly certain, that a certain column will be of different data type, we can configure it in the Advanced section. We will change Id column to be of data type four-byte signed integer [DT_I4] and Date column to be of data type date [DT_DATE]

Flat file connection manager editor Advanced Id column

Flat file connection manager editor Advanced Date column

  • Click on the Preview section. The data will be shown as per the column configuration.

Flat file connection manager editor Preview

  • Click OK on the Flat file connection manager editor and the flat file connection will be assigned to the Flat File Source in the data flow task.

Flat file editor connection

  • On the Flat File Source Editor, click on the Columns section. You will notice the columns that were configured in the flat file connection manager. Uncheck the RowDelimiter because we won't need that.

Flat file editor columns

  • On the data flow task, place an OLE DB Destination. Connect the output from the Flat file source to the OLE DB Destination.

Data flow task

  • On the OLE DB Destination Editor, select the OLE DB Connection manager named SQLServer and set the Name of the table or the view drop down to [dbo].[FlatFile]

OLE DB Destination connection

  • On the OLE DB Destination Editor, click on the Mappings section. Since the column names in the flat file connection manager are same as the columns in the database, the mapping will take place automatically. If the names are different, you have to manually map the columns. Click OK.

OLE DB Destination columns

  • Now the package is ready. Execute the package to load the fixed-width flat file data into the database.

Package execution

  • If you query the table dbo.FlatFile in the database, you will notice the flat file data imported into the database.

Data imported into table

This sample should give you an idea about how to import fixed-width flat file into database. It doesn't explain how to handle error logging but this should get you started and help you discover other SSIS related features when you play with packages.

Hope that helps.

SQL Scripts:

CREATE TABLE [dbo].[FlatFile](
[Id] [int] NOT NULL,
[FirstName] [varchar](25) NOT NULL,
[LastName] [varchar](25) NOT NULL,
[Date] [datetime] NOT NULL
)

Import fixed-width text file into sqlite

The sqlite3 tools imports only CSV files.

There are third-party tools that can import fixed-width files, but this answer shows how to do this inside SQLite with string functions.

Bulk Insert with Format File (Fixed Width) - Unexpected end of file was encountered

Looking at your sample text file, it looks like you have a row terminator that is carriage return ({CR}) + linefeed ({LF}).

You can inspect this by opening the text file with a text editor that can show special symbols. I can recommend Notepad++ which is free and good for this purpose (Menu View>Show Symbol>Show All Characters).

If the row terminator is indeed {CR}{LF}, you should use xsi:type="CharTerm" along with a TERMINATOR="\r\n" attribute for the last <FIELD> in the <RECORD> element:

<RECORD>
...
<FIELD ID="5" xsi:type="CharTerm" TERMINATOR="\r\n" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
</RECORD>

You can find more information on fixed field import in the following link: XML Format Files (SQL Server) # Importing fixed-length or fixed-width fields

SQL Server 2005/2008 - Import a fixed width text file via the command line?

You do have similar functionality with SQL Server. I would encourage you to learn about format files. This page, from Microsoft, does a fairly good job of explaining it.

http://msdn.microsoft.com/en-us/library/ms178129.aspx

I would also encourage you to read this blog:

6 ways to import data into SQL Server



Related Topics



Leave a reply



Submit