How to Split a Comma-Separated Value to Columns

How to split a comma-separated value to columns

CREATE FUNCTION [dbo].[fn_split_string_to_column] (
    @string NVARCHAR(MAX),
    @delimiter CHAR(1)
    )
RETURNS @out_put TABLE (
    [column_id] INT IDENTITY(1, 1) NOT NULL,
    [value] NVARCHAR(MAX)
    )
AS
BEGIN
    DECLARE @value NVARCHAR(MAX),
        @pos INT = 0,
        @len INT = 0

    SET @string = CASE 
            WHEN RIGHT(@string, 1) != @delimiter
                THEN @string + @delimiter
            ELSE @string
            END

    WHILE CHARINDEX(@delimiter, @string, @pos + 1) > 0
    BEGIN
        SET @len = CHARINDEX(@delimiter, @string, @pos + 1) - @pos
        SET @value = SUBSTRING(@string, @pos, @len)

        INSERT INTO @out_put ([value])
        SELECT LTRIM(RTRIM(@value)) AS [column]

        SET @pos = CHARINDEX(@delimiter, @string, @pos + @len) + 1
    END

    RETURN
END

Split Comma Separated values into multiple column

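Your sample data may not need any splitting. You want to place each value in a column based on the value found, which is simpler than splitting the data. This works for your sample data because every value is a single digit; with multi-digit values, charindex would also match substrings (for example, '1' inside '11').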

declare @Something table
(
    Combined_Column varchar(10)
)

insert @Something values
('1,2,3')
, ('2')
, ('1,3')
, ('1,2,3,4')
, ('1,3,4')
, ('1')
, ('4')

select *
    , col1 = case when charindex('1', s.Combined_Column) > 0 then 1 end
    , col2 = case when charindex('2', s.Combined_Column) > 0 then 2 end
    , col3 = case when charindex('3', s.Combined_Column) > 0 then 3 end
    , col4 = case when charindex('4', s.Combined_Column) > 0 then 4 end
from @Something s
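
For comparison, the same flag-per-value idea can be sketched in Python. This is an illustration, not part of the original answer; the helper name flag_columns is hypothetical. It compares whole tokens rather than substrings, so a value such as 11 is not mistaken for 1.

```python
def flag_columns(combined, candidates=("1", "2", "3", "4")):
    """Return one entry per candidate: the value if present, else None."""
    # Split into whole tokens so that "11" does not count as "1".
    tokens = {t.strip() for t in combined.split(",")}
    return [int(c) if c in tokens else None for c in candidates]

rows = ["1,2,3", "2", "1,3", "1,2,3,4", "1,3,4", "1", "4"]
for row in rows:
    print(row, flag_columns(row))
```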

how to split the comma separated value into columns

First, create a function to split the values:

create function [dbo].[udf_splitstring] (@tokens    varchar(max),
                                   @delimiter varchar(5))
returns @split table (
  token varchar(200) not null )
as



  begin

      declare @list xml

      select @list = cast('<a>'
                          + replace(@tokens, @delimiter, '</a><a>')
                          + '</a>' as xml)

      insert into @split
                  (token)
      select ltrim(t.value('.', 'varchar(200)')) as data
      from   @list.nodes('/a') as x(t)

      return

  end

Then apply the function and pivot the tokens with conditional aggregation:

SELECT 
max(CASE WHEN TOKEN='CLAR'   THEN TOKEN END) 'NAME1' ,
max(CASE WHEN TOKEN='ALWIN'  THEN TOKEN END) 'NAME2', 
max(CASE WHEN TOKEN='ANTONY' THEN TOKEN END) 'NAME3', 
max(CASE WHEN TOKEN='RINU'   THEN TOKEN END) 'NAME4', 
max(CASE WHEN TOKEN='DAMI'   THEN TOKEN END) 'NAME5', 
max(CASE WHEN TOKEN='PRINCE' THEN TOKEN END) 'NAME6'
FROM #Table1 as t1
CROSS APPLY [dbo].UDF_SPLITSTRING(name,',') as t2

Output:

NAME1   NAME2   NAME3   NAME4   NAME5   NAME6
clar    alwin   antony  rinu    dami    prince

How to split comma separated text into columns on pandas dataframe?

Maybe you can try this without pivot.

Create the dataframe.

import pandas as pd
import io

s = '''Data
a,b,c
a,c,d
d,e
a,e
a,b,c,d,e'''

df = pd.read_csv(io.StringIO(s), sep=r"\s+")

We can use pandas.Series.str.split with expand=True to split each string into separate columns, then apply value_counts to each row with axis=1.

Finally, fill the missing values with zero using fillna(0) and convert the result to integers with astype(int).

df["Data"].str.split(pat = ",", expand=True).apply(lambda x : x.value_counts(), axis = 1).fillna(0).astype(int)

#
    a   b   c   d   e
0   1   1   1   0   0
1   1   0   1   1   0
2   0   0   0   1   1
3   1   0   0   0   1
4   1   1   1   1   1

Then concatenate it with the original column:

new = df["Data"].str.split(pat = ",", expand=True).apply(lambda x : x.value_counts(), axis = 1).fillna(0).astype(int)
pd.concat([df, new], axis = 1)

#
    Data        a   b   c   d   e
0   a,b,c       1   1   1   0   0
1   a,c,d       1   0   1   1   0
2   d,e         0   0   0   1   1
3   a,e         1   0   0   0   1
4   a,b,c,d,e   1   1   1   1   1
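
As a possible shortcut, pandas also provides Series.str.get_dummies, which splits on a separator and builds the indicator columns in one call. A minimal sketch, assuming the same sample data as above (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"Data": ["a,b,c", "a,c,d", "d,e", "a,e", "a,b,c,d,e"]})

# Split each string on "," and create one 0/1 indicator column per distinct value.
dummies = df["Data"].str.get_dummies(sep=",")

# Attach the indicator columns to the original column.
result = pd.concat([df, dummies], axis=1)
print(result)
```

Note that this marks presence rather than counting occurrences, so a value repeated within one row would still yield 1.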

How to split comma separated strings in a column into different columns if they're not of same length using python or pandas in jupyter notebook

We can use a regular expression to find all key-value pairs in each row of column_A, map each row's list of pairs to a dictionary to create records, and then construct a dataframe from these records:

pd.DataFrame(map(dict, df['column_A'].str.findall(r'\s*([^:,]+):\s*([^,]+)')))

        Garbage Organics          Recycle   Junk
0       Tissues     Milk       Cardboards    NaN
1  Paper Towels     Eggs            Glass  Feces
2          cups      NaN  Plastic bottles    NaN
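
To see what the pattern captures, here is a small standard-library sketch. The input rows are an assumption, reconstructed from the output table above, since the original column_A data is not shown.

```python
import re

# Hypothetical input rows, reconstructed from the output table above.
rows = [
    "Garbage: Tissues, Organics: Milk, Recycle: Cardboards",
    "Garbage: Paper Towels, Organics: Eggs, Recycle: Glass, Junk: Feces",
    "Garbage: cups, Recycle: Plastic bottles",
]

# Group 1 is the key (no ':' or ','), group 2 is the value (no ',').
pattern = re.compile(r"\s*([^:,]+):\s*([^,]+)")

# Each match is a (key, value) tuple; dict() turns one row's matches into a record.
records = [dict(pattern.findall(row)) for row in rows]
for record in records:
    print(record)
```

Passing a list of such records to pd.DataFrame is what produces NaN for keys that are missing in a row.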

Here is an alternative approach in case you don't want to use regular expressions:

df['column_A'].str.split(', ').explode()\
              .str.split(': ', expand=True)\
              .set_index(0, append=True)[1].unstack()

How to split a comma separated value to columns together other columns

You can try the following approach, using the LEFT(), RIGHT(), LEN() and CHARINDEX() functions:

Table:

CREATE TABLE Data (
   AccountID varchar(7),      
   GEO varchar(50)
)
INSERT INTO Data
   (AccountID, GEO)
VALUES
   ('CT-2000', '9.9582925,-84.19607')

Statement:

SELECT 
   AccountID,
   LEFT(GEO, CHARINDEX(',', GEO) - 1) AS Lat,
   RIGHT(GEO, LEN(GEO) - CHARINDEX(',', GEO)) AS Long
FROM Data

Result:

AccountID   Lat         Long
CT-2000     9.9582925   -84.19607
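
The same split at the first comma can be sketched in Python for comparison. This is an illustration rather than part of the original answer; split_geo is a hypothetical helper name.

```python
def split_geo(geo):
    """Split 'lat,long' at the first comma, like LEFT/RIGHT around CHARINDEX."""
    lat, sep, lon = geo.partition(",")
    if not sep:
        # No comma found. In the T-SQL statement, LEFT(GEO, CHARINDEX(',', GEO) - 1)
        # would receive a length of -1 and fail, so this case is rejected here too.
        raise ValueError(f"no comma in {geo!r}")
    return lat, lon

print(split_geo("9.9582925,-84.19607"))
```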

Split comma separated values into target table with fixed number of columns

It is typically bad design to store CSV values in a single column. If at all possible, use an array or a properly normalized design instead.

While stuck with your current situation ...

For known small maximum number of elements

A simple solution without trickery or recursion will do:

SELECT id, 1 AS rnk
     , split_part(csv, ', ', 1) AS c1
     , split_part(csv, ', ', 2) AS c2
     , split_part(csv, ', ', 3) AS c3
     , split_part(csv, ', ', 4) AS c4
     , split_part(csv, ', ', 5) AS c5
FROM   tbl
WHERE  split_part(csv, ', ', 1) <> '' -- skip empty rows

UNION ALL
SELECT id, 2
     , split_part(csv, ', ', 6)
     , split_part(csv, ', ', 7)
     , split_part(csv, ', ', 8)
     , split_part(csv, ', ', 9)
     , split_part(csv, ', ', 10)
FROM   tbl
WHERE  split_part(csv, ', ', 6) <> '' -- skip empty rows

-- three more blocks to cover a maximum "around 20"

ORDER  BY id, rnk;

id being the PK of the original table.

This assumes ', ' as separator, obviously.

You can adapt easily.
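
The chunking logic of the query can be sketched in Python. This is an illustration under assumptions: split_part below is a hand-written stand-in that mimics the PostgreSQL function for positive positions (1-based, empty string when the position is past the last element), and rows_of_five and its max_blocks parameter are hypothetical.

```python
def split_part(text, sep, n):
    """Mimic PostgreSQL split_part for n >= 1: 1-based, '' when out of range."""
    parts = text.split(sep)
    return parts[n - 1] if 1 <= n <= len(parts) else ""

def rows_of_five(csv, sep=", ", max_blocks=4):
    """Yield (rnk, c1..c5) tuples, skipping blocks whose first element is empty."""
    for rnk in range(1, max_blocks + 1):
        start = (rnk - 1) * 5
        cols = tuple(split_part(csv, sep, start + i) for i in range(1, 6))
        if cols[0] != "":  # same role as the WHERE ... <> '' filter
            yield (rnk, *cols)

for row in rows_of_five("a, b, c, d, e, f, g"):
    print(row)
```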

Split comma separated column data into additional columns

For unknown number of elements

There are various ways. One is to use regexp_replace() to replace every fifth separator before unnesting:

-- for any number of elements
SELECT t.id, c.rnk
     , split_part(c.csv5, ', ', 1) AS c1
     , split_part(c.csv5, ', ', 2) AS c2
     , split_part(c.csv5, ', ', 3) AS c3
     , split_part(c.csv5, ', ', 4) AS c4
     , split_part(c.csv5, ', ', 5) AS c5
FROM   tbl t
     , unnest(string_to_array(regexp_replace(csv, '((?:.*?,){4}.*?),', '\1;', 'g'), '; ')) WITH ORDINALITY c(csv5, rnk)
ORDER  BY t.id, c.rnk;

This assumes that the chosen separator ; never appears in your strings. (Just like , can never appear.)

The regular expression pattern is the key: '((?:.*?,){4}.*?),'

(?:) ... “non-capturing” set of parentheses

() ... “capturing” set of parentheses

*? ... non-greedy quantifier

{4} ... sequence of exactly 4 matches

The replacement '\1;' contains the back-reference \1.

'g' as fourth function parameter is required for repeated replacement.
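
The effect of the pattern can be checked outside the database. A minimal Python sketch, assuming the same ', ' separator and ';' as the replacement character; the sample string is made up for illustration. Python's re.sub replaces all matches by default, so no 'g' flag is needed there.

```python
import re

csv = "1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12"

# Replace every fifth comma with ';' so each group of five can be split off.
marked = re.sub(r"((?:.*?,){4}.*?),", r"\1;", csv)

# Split into groups of up to five, then into elements padded to 5 columns.
rows = []
for rnk, group in enumerate(marked.split("; "), start=1):
    cols = group.split(", ")
    cols += [""] * (5 - len(cols))
    rows.append((rnk, *cols))

for row in rows:
    print(row)
```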

Fill from right to left

(Like you added in How to put values starting from the right side into columns?)

Simply count down numbers like:

SELECT t.id, c.rnk
     , split_part(c.csv5, ', ', 5) AS c1
     , split_part(c.csv5, ', ', 4) AS c2
     , split_part(c.csv5, ', ', 3) AS c3
     , split_part(c.csv5, ', ', 2) AS c4
     , split_part(c.csv5, ', ', 1) AS c5
FROM ...
