Split a Single Column of Data with Comma Delimiters into Multiple Columns in Ssis

Split a single column of data with comma delimiters into multiple columns in SSIS

What you can do is to read the file as is.
And Split those values in a script task.

So from source go to a script task.
Then in the script task as input column, select the column containing those values (InputColumn1). Then specify the output columns (If I am right I see you have 5, so specify 5 (OutputColumn1 - 5)).

After that is done, go to the script itself (C#).

Under:

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
}

Put the following code in there:

var ColumnValue = Row.InputColumn1.Split(',');

Row.OutputColumn1 = ColumnValue[0];
Row.OutputColumn2 = ColumnValue[1];
Row.OutputColumn3 = ColumnValue[2];
Row.OutputColumn4 = ColumnValue[3];
Row.OutputColumn5 = ColumnValue[4];

After the script task all the columns from the Source as well as the OutputCoulmns1-5 will be available and you can do what you have to.

OUTPUT

 Data      | ID | Source |OutputColumn1 |OutputColumn2|  etc. 3-5
================================================================
a,b,c,d,e | 1 | a.csv | a | b

f,g,h,i,j | 2 | b.csv | f | g

Please ask if something is not clear.

Using SSIS OR T-SQL Split a column of quoted & unquoted comma separated values into multiple columns

As you were told already, TSQL is the wrong tool for this. Nevertheless this can be done (at least for the set given). If this is a one-time action you might give it a try. If this is a re-occurring task in a real-life scenario I'd try to get the data in an appropriate format.

However, this would work for the given lines:

DECLARE @t1 TABLE(ID INT IDENTITY, YourString NVARCHAR(1000));
INSERT INTO @t1 VALUES(N'"Pacey LLC.",213830ZZ,11/1/2017,11/1/2017,"297,311.74","2,371.40",0.00,"1,325.18",0.00,42.22,"123,986.56"')
,(N'Mike The Miker,9814140VCD,12/1/2018,12/1/2018,"3,917,751.99","419,743.54","36,642.66","344,090.43",0.00,10.00,"2,434,671.06"');

--Your data includes dates in a culture specific format (something really! bad)

--Better switch to ISO8601

--Setting the date format will help, but is NOT recommended

SET DATEFORMAT dmy;

--the first cte will use APPLY together with a computed TOP()

--This will allow to get each single character, one by one.

WITH singleChars AS
(
SELECT t.ID
,A.Pos
,SUBSTRING(t.YourString,A.POs,1) AS CharOnPos
FROM @t1 t
CROSS APPLY(SELECT TOP (LEN(t.YourString)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values) A(Pos) --master..spt_values can be any table with sufficient rows
)

--we continue with a recursive cte

--it will run through the string and find if we are within a quoted area or not

,recCTE AS
(
SELECT *
,CASE WHEN CharOnPos='"' THEN 1 ELSE 0 END AS QuoteIsOpen
,CAST(CharOnPos AS NVARCHAR(MAX)) AS GrowingString
FROM singleChars WHERE Pos=1

UNION ALL

SELECT sc.ID,sc.Pos,sc.CharOnPos
,A.QuoteIsStillOpen
,CONCAT(GrowingString,CASE WHEN sc.CharOnPos=N',' AND A.QuoteIsStillOpen=0 THEN N'$%&' ELSE sc.CharOnPos END)
FROM singleChars sc
INNER JOIN recCTE r ON sc.ID = r.ID AND sc.Pos=r.Pos+1
CROSS APPLY(VALUES(CASE WHEN sc.CharOnPos='"' THEN CASE WHEN r.QuoteIsOpen=1 THEN 0 ELSE 1 END ELSE r.QuoteIsOpen END )) A(QuoteIsStillOpen)
)

--this CTE performs a trick with TOP 1 WITH TIES together with ORDER BY a partitioned ROW_NUMBER()

--The result will include the final string of the recursion by ID

,newlySeparated AS
(
SELECT TOP 1 WITH TIES * FROM recCTE
ORDER BY ROW_NUMBER() OVER(PARTITION BY ID ORDER BY Pos DESC)
)

--The final SELECT uses a trick to split strings position- and type-safe

SELECT A.*
FROM newlySeparated ns
CROSS APPLY OPENJSON(CONCAT(N'[["',REPLACE(REPLACE(ns.GrowingString,'"',''),'$%&','","'),N'"]]'))
WITH(Company NVARCHAR(100) '$[0]'
,Code1 NVARCHAR(100) '$[1]'
,Date1 DATE '$[2]'
,Date2 DATE '$[3]'
,Decimal1 NVARCHAR(100) '$[4]' --Using a numbers type might work here, this depends on your machine
,Decimal2 NVARCHAR(100) '$[5]'
,Decimal3 NVARCHAR(100) '$[6]'
,Decimal4 NVARCHAR(100) '$[7]'
,Decimal5 NVARCHAR(100) '$[8]'
,Decimal6 NVARCHAR(100) '$[9]'
,Decimal7 NVARCHAR(100) '$[10]') A
OPTION(MAXRECURSION 0);

The result

+----------------+------------+------------+------------+--------------+------------+-----------+------------+------+-------+--------------+
| Pacey LLC. | 213830ZZ | 2017-01-11 | 2017-01-11 | 297,311.74 | 2,371.40 | 0.00 | 1,325.18 | 0.00 | 42.22 | 123,986.56 |
+----------------+------------+------------+------------+--------------+------------+-----------+------------+------+-------+--------------+
| Mike The Miker | 9814140VCD | 2018-01-12 | 2018-01-12 | 3,917,751.99 | 419,743.54 | 36,642.66 | 344,090.43 | 0.00 | 10.00 | 2,434,671.06 |
+----------------+------------+------------+------------+--------------+------------+-----------+------------+------+-------+--------------+

Split Comma Separated values into multiple column

Your sample data may not need any splitting. You want to move the data to a column based on the value it finds. You can do this a bit simpler than splitting the data. This works just fine for your sample data.

declare @Something table
(
Combined_Column varchar(10)
)

insert @Something values
('1,2,3')
, ('2')
, ('1,3')
, ('1,2,3,4')
, ('1,3,4')
, ('1')
, ('4')

select *
, col1 = case when charindex('1', s.Combined_Column) > 0 then 1 end
, col2 = case when charindex('2', s.Combined_Column) > 0 then 2 end
, col3 = case when charindex('3', s.Combined_Column) > 0 then 3 end
, col4 = case when charindex('4', s.Combined_Column) > 0 then 4 end
from @Something s

Parse one field into several, based on delimiter

You can do this using a script component

In this answer, I will assume that the data source is an SQL Server Table, with one column Column0

In the Integration Service Package, follow these steps:

  1. Add a Data Flow Task

  2. Create an OLEDB Connection manager, and configure it

  3. In the Data Flow Task add an OLEDB Source and choose the Table that contains the Column you need to split

  4. Add a Script Component (Set its type to Transformation)

  5. Connect the OLEDB Source to the script component

  6. Change that Script Language to Visual Basic

  7. In the Script Component Select Column0 as Input

  8. Go To the Inputs Outputs Tab

  9. Add 8 Columns to Output0 (ex: outColumn0, outColumn1 ... outColumn8) with DT-STR data type

  10. In the Script window write the following script:

    Imports System.Linq

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

    If Not Row.Column0_IsNull AndAlso
    Not String.IsNullOrEmpty(Row.Column0.Trim) Then

    Dim strColumn As String = Row.Column0
    Dim intVerticalBarCount As Integer

    intVerticalBarCount = strColumn.Count(Function(x) x = CChar("|"))

    If intVerticalBarCount <= 8 Then strColumn = strColumn & "".PadRight((8 - intVerticalBarCount), CChar("|"))

    Dim strOutputColumns As String() = strColumn.Split(CChar("|"))

    Row.outColumn0 = strOutputColumns(0)
    Row.outColumn1 = strOutputColumns(1)
    Row.outColumn2 = strOutputColumns(2)
    Row.outColumn3 = strOutputColumns(3)
    Row.outColumn4 = strOutputColumns(4)
    Row.outColumn5 = strOutputColumns(5)
    Row.outColumn6 = strOutputColumns(6)
    Row.outColumn7 = strOutputColumns(7)

    Else

    Row.outColumn0_IsNull = True
    Row.outColumn1_IsNull = True
    Row.outColumn2_IsNull = True
    Row.outColumn3_IsNull = True
    Row.outColumn4_IsNull = True
    Row.outColumn5_IsNull = True
    Row.outColumn6_IsNull = True
    Row.outColumn7_IsNull = True

    End If

    End Sub

In the code above, first we get the number of occurrence of vertical bars | in the column; if it is less than 8, we add the missing vertical bars, then we split the column and assign it to the output columns.

SSIS Flat File Source Not Splitting Column by Comma

I've found the solution, needed to remove a few rows to get the headers to split, using the "Header row to skip" entry. Thanks for all your help!

Sample Image



Related Topics



Leave a reply



Submit