Pandas Split Column into Multiple Columns by Comma

In case someone else wants to split a single column (delimited by some value) into multiple columns, try this:

series.str.split(',', expand=True)

This answered the question I came here looking for.
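
As a minimal sketch of what expand=True does, assuming a small made-up Series (the sample values are illustrative and not from the original question):

```python
import pandas as pd

# Hypothetical sample data: comma-delimited strings of unequal length
series = pd.Series(["a,b,c", "d,e", "f"])

# expand=True returns a DataFrame with one column per piece;
# rows with fewer pieces are padded with missing values
parts = series.str.split(",", expand=True)
print(parts)
```

Without expand=True the result would instead be a Series whose elements are lists.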

Credit to EdChum's code that includes adding the split columns back to the dataframe.

pd.concat([df[[0]], df[1].str.split(', ', expand=True)], axis=1)

Note: The first argument, df[[0]], is a DataFrame (the double brackets select a one-column DataFrame rather than a Series).

In the second argument, df[1] is the Series that you want to split.
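
A small runnable sketch of this pattern, assuming a made-up frame with integer column labels 0 and 1. The add_prefix call and the "part_" prefix are additions for illustration only; without them the split pieces would also be labeled 0 and 1, duplicating the label of the kept column:

```python
import pandas as pd

# Hypothetical frame with integer column labels 0 and 1
df = pd.DataFrame({0: ["x", "y"], 1: ["a, b", "c, d"]})

# Split column 1 and give the pieces distinct names (prefix chosen for illustration)
pieces = df[1].str.split(", ", expand=True).add_prefix("part_")

# Keep column 0 as a DataFrame and place the split pieces next to it
result = pd.concat([df[[0]], pieces], axis=1)
print(result)
```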

See the pandas documentation for Series.str.split and pandas.concat.

How to split a column value at commas into multiple columns and rename them with the column number as a suffix

You can use str.split to split the strings in the column and then attach the resulting DataFrame to the original DataFrame, numbering the new columns from 1 up to the number of columns the split produces.

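import numpy as np
# Split on commas and turn the None fillers into NaN (in pandas >= 2.1, applymap is deprecated in favor of DataFrame.map)
temp = df['List_of_Order_Id'].str.split(',', expand=True).applymap(lambda x: np.nan if x is None else x)
df[['Order_Id_' + str(i) for i in range(1, temp.shape[1] + 1)]] = temp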

           Mobile  ...               List_of_Order_Id Order_Id_1 Order_Id_2  \
0    9.163820e+08  ...                          21810      21810        NaN   
1    9.179049e+08  ...                          23387      23387        NaN   
2    9.183748e+08  ...                          21767      21767        NaN   
3    9.186110e+08  ...                          23457      23457        NaN   
4    9.187790e+08  ...                    23117,23163      23117      23163   
..            ...  ...                            ...        ...        NaN   
353  9.970647e+09  ...                          21549      21549        NaN   
354  9.971940e+09  ...                          22753      22753        NaN   
355  9.994742e+09  ...  21505,21836,22291,22539,22734      21505      21836   
356  9.994964e+09  ...                          22348      22348        NaN   
357  9.994997e+09  ...                    21100,21550      21100      21550   

    Order_Id_3 Order_Id_4 Order_Id_5  
0          NaN        NaN        NaN  
1          NaN        NaN        NaN  
2          NaN        NaN        NaN  
3          NaN        NaN        NaN  
4          NaN        NaN        NaN  
..         NaN        NaN        NaN  
353        NaN        NaN        NaN  
354        NaN        NaN        NaN  
355      22291      22539      22734  
356        NaN        NaN        NaN  
357        NaN        NaN        NaN
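
A self-contained sketch of the same idea, assuming a tiny frame that reuses a few of the order IDs shown above. It attaches the new columns with join rather than column assignment, which is a stylistic choice and not part of the original answer:

```python
import pandas as pd

# Small illustrative frame; the IDs are taken from the output above
df = pd.DataFrame({"List_of_Order_Id": ["21810", "23117,23163", "21505,21836,22291"]})

# One column per comma-separated piece; shorter rows are padded with missing values
temp = df["List_of_Order_Id"].str.split(",", expand=True)

# Name the pieces Order_Id_1 .. Order_Id_N, where N is the widest split
temp.columns = [f"Order_Id_{i}" for i in range(1, temp.shape[1] + 1)]

# Attach the new columns to the original frame by index
out = df.join(temp)
print(out)
```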

How to split a dataframe string column into two columns?

There might be a better way, but here's one approach:

                            row
    0       00000 UNITED STATES
    1             01000 ALABAMA
    2  01001 Autauga County, AL
    3  01003 Baldwin County, AL
    4  01005 Barbour County, AL

# n=1 splits only at the first space; n must be passed by keyword in pandas >= 2.0
df = pd.DataFrame(df['row'].str.split(' ', n=1).tolist(),
                  columns=['fips', 'row'])

   fips                 row
0  00000       UNITED STATES
1  01000             ALABAMA
2  01001  Autauga County, AL
3  01003  Baldwin County, AL
4  01005  Barbour County, AL
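
A possibly simpler variant, sketched here as an alternative rather than as the original answer's method, is to let str.split build the two columns directly with expand=True. The sample rows are copied from the example above:

```python
import pandas as pd

# Sample rows copied from the example above
df = pd.DataFrame({"row": ["00000 UNITED STATES", "01000 ALABAMA", "01001 Autauga County, AL"]})

# n=1 splits at the first space only; expand=True yields two columns,
# which are assigned positionally to 'fips' and 'row'
df[["fips", "row"]] = df["row"].str.split(" ", n=1, expand=True)
print(df)
```

Because the split stops after the first space, county names that contain spaces or commas stay intact in the second column.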

How to split comma separated text into columns on pandas dataframe?

Maybe you can try this without pivot.

Create the dataframe.

import pandas as pd
import io

s = '''Data
a,b,c
a,c,d
d,e
a,e
a,b,c,d,e'''

df = pd.read_csv(io.StringIO(s), sep=r"\s+")

We can use pandas.Series.str.split with expand=True, then apply value_counts to each row with axis=1.

Finally, fill the missing values with zero using fillna and convert the data to integers with astype(int).

df["Data"].str.split(pat = ",", expand=True).apply(lambda x : x.value_counts(), axis = 1).fillna(0).astype(int)

#
    a   b   c   d   e
0   1   1   1   0   0
1   1   0   1   1   0
2   0   0   0   1   1
3   1   0   0   0   1
4   1   1   1   1   1

Then concatenate it with the original column.

new = df["Data"].str.split(pat = ",", expand=True).apply(lambda x : x.value_counts(), axis = 1).fillna(0).astype(int)
pd.concat([df, new], axis = 1)

#
    Data        a   b   c   d   e
0   a,b,c       1   1   1   0   0
1   a,c,d       1   0   1   1   0
2   d,e         0   0   0   1   1
3   a,e         1   0   0   0   1
4   a,b,c,d,e   1   1   1   1   1

How to Split a column into two by comma delimiter, and put a value without comma in second column and not in first?

We can try using str.extract here:

df["Location"] = df["Origin"].str.extract(r'(.*),')
df["Country"] = df["Origin"].str.extract(r'(\w+(?: \w+)*)$')

Python or pandas split columns by comma and append into rows

A pandas DataFrame has an explode method that does exactly what you want (see the explode() documentation). It works on list-like objects, so if the column you want to explode holds strings, you first need to split each string into a list (see the str.split() documentation). You can also strip any surrounding whitespace with the pandas map function.

Full code example:

import pandas as pd

df = pd.DataFrame({
    "x": [1,2,3,4],
    "y": ["a, b, c, d", "e, f, g", "h, i", "j, k, l, m, n"]
})

# Convert string with commas into list of string and strip spaces
df['y'] = df['y'].str.split(',').map(lambda elements: [e.strip() for e in elements])

# Explode lists in the column 'y' into separate values
df.explode('y')

Output:

   x  y
0  1  a
0  1  b
0  1  c
0  1  d
1  2  e
1  2  f
1  2  g
2  3  h
2  3  i
3  4  j
3  4  k
3  4  l
3  4  m
3  4  n

Pandas: pivot comma delimited column into multiple columns

You could use str.get_dummies to get the dummy variables; then join back to df:

out = df[['id']].join(df['type'].str.get_dummies(sep=',').add_prefix('type_').replace(0, float('nan')))

Output:

   id  type_a  type_b  type_c  type_d  type_e
0   1     1.0     1.0     1.0     1.0     NaN
1   2     NaN     1.0     NaN     1.0     NaN
2   3     NaN     NaN     1.0     NaN     1.0
3   4     NaN     NaN     NaN     NaN     NaN
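
The input frame is not shown in the answer. The sketch below assumes one that is consistent with the output above (the values in "type" are a guess, with a missing value in the last row):

```python
import pandas as pd

# Hypothetical input chosen to be consistent with the output shown above
df = pd.DataFrame({"id": [1, 2, 3, 4], "type": ["a,b,c,d", "b,d", "c,e", None]})

# One 0/1 indicator column per distinct comma-separated value
dummies = df["type"].str.get_dummies(sep=",").add_prefix("type_")

# Replace 0 with NaN so only the present values show up, then join back on the index
out = df[["id"]].join(dummies.replace(0, float("nan")))
print(out)
```

Dropping the replace step keeps plain 0/1 integer indicators, which may be more convenient for further computation.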

How to split comma separated strings in a column into different columns if they're not of same length using python or pandas in jupyter notebook

We can use a regular expression to find all the key-value pairs in each row of column_A, map each row's list of pairs to a dictionary to create records, and then construct a DataFrame from these records:

pd.DataFrame(map(dict, df['column_A'].str.findall(r'\s*([^:,]+):\s*([^,]+)')))

        Garbage Organics          Recycle   Junk
0       Tissues     Milk       Cardboards    NaN
1  Paper Towels     Eggs            Glass  Feces
2          cups      NaN  Plastic bottles    NaN

Here is an alternative approach in case you don't want to use regular expressions:

df['column_A'].str.split(', ').explode()\
              .str.split(': ', expand=True)\
              .set_index(0, append=True)[1].unstack()
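
Both approaches can be tried side by side with the sketch below. The input strings are an assumption, reconstructed to be consistent with the output shown above, and the names by_regex and by_split are illustrative:

```python
import pandas as pd

# Hypothetical input reconstructed to be consistent with the output shown above
df = pd.DataFrame({"column_A": [
    "Garbage: Tissues, Organics: Milk, Recycle: Cardboards",
    "Garbage: Paper Towels, Organics: Eggs, Recycle: Glass, Junk: Feces",
    "Garbage: cups, Recycle: Plastic bottles",
]})

# Regex approach: findall yields (key, value) tuples per row; dict() turns them into records
by_regex = pd.DataFrame(map(dict, df["column_A"].str.findall(r"\s*([^:,]+):\s*([^,]+)")))

# Split approach: one "key: value" item per row, then pivot the keys into columns
by_split = (
    df["column_A"].str.split(", ").explode()
    .str.split(": ", expand=True)
    .set_index(0, append=True)[1].unstack()
)
print(by_regex)
print(by_split)
```

The split approach relies on each key appearing at most once per row, since unstack cannot pivot duplicate index entries. It also orders the columns alphabetically, whereas the regex approach keeps the order in which keys first appear.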
