Split (Explode) Pandas Dataframe String Entry to Separate Rows

How about something like this:

In [55]: pd.concat([pd.Series(row['var2'], row['var1'].split(','))
    ...:            for _, row in a.iterrows()]).reset_index()
Out[55]:
  index  0
0     a  1
1     b  1
2     c  1
3     d  2
4     e  2
5     f  2

Then you just have to rename the two resulting columns (index and 0).
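
For completeness, here is a rough sketch of the whole thing including the rename step. The frame a below is made-up sample data (the question's frame isn't shown here), and the target names var1/var2 are just taken from the column names used in the snippet above.

```python
import pandas as pd

# Hypothetical sample frame, shaped like the one the snippet above assumes.
a = pd.DataFrame({"var1": ["a,b,c", "d,e,f"], "var2": [1, 2]})

# One Series per row: the split pieces become the index,
# and the scalar var2 value is broadcast to every piece.
pieces = [
    pd.Series(row["var2"], index=row["var1"].split(","))
    for _, row in a.iterrows()
]

result = (
    pd.concat(pieces)
    .reset_index()                                   # columns are now "index" and 0
    .rename(columns={"index": "var1", 0: "var2"})    # give them meaningful names
)
print(result)
```

iterrows is slow on large frames, so this is probably best kept for small inputs.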

Explode rows pandas dataframe

We can use Series.str.split to parse the relevant information into a list before calling explode.

(
    df.assign(
        Letters=df["Letters"]
        .str.split(" : ", expand=True)[1]
        .str.split(",")
    )
    .explode("Letters")
)

  Letters  Date
0       a  2021
1       a  2019
1       b  2019
1       c  2019
2       a  2017
2       b  2017

Please note that the index is not reset in this answer; if you need a fresh index, call reset_index on the result.
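
If it helps, here is a small sketch of the same pipeline with the index reset at the end. The input frame is an assumption on my part: it is invented so that it produces the output shown above ("<label> : <comma-separated letters>"), and the labels x, y, z are placeholders.

```python
import pandas as pd

# Hypothetical input, made up to reproduce the output shown above.
df = pd.DataFrame({
    "Letters": ["x : a", "y : a,b,c", "z : a,b"],
    "Date": [2021, 2019, 2017],
})

exploded = (
    df.assign(
        # keep the part after " : ", then split it into a list of letters
        Letters=df["Letters"].str.split(" : ", expand=True)[1].str.split(",")
    )
    .explode("Letters")
    .reset_index(drop=True)  # replaces the repeated 0, 1, 1, 1, 2, 2 with 0..5
)
print(exploded)
```

drop=True discards the old index instead of keeping it as an extra column.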

Python Dataframe Explode Rows with multiple values

From the pandas docs for pandas.DataFrame.explode:

"specify a non-empty list with each element be str or tuple"

To use explode, your 'tags' column needs to hold list-like values, not plain strings. Apply a function that converts each comma-separated tag string to a list, then go with your option 1: df.explode('tags').
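
Something along these lines should work. The frame here is invented for illustration (the question's data isn't shown), and I'm assuming the tags are separated by commas with optional spaces around them.

```python
import pandas as pd

# Hypothetical data: "tags" holds comma-separated strings, not lists.
df = pd.DataFrame({"id": [1, 2], "tags": ["red, blue", "green"]})

# Convert each string to a list, trimming whitespace around every tag.
df["tags"] = df["tags"].apply(lambda s: [t.strip() for t in s.split(",")])

# Now that the column holds lists, explode gives one row per tag.
exploded = df.explode("tags").reset_index(drop=True)
print(exploded)
```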

Split cell into multiple rows in pandas dataframe

Here's one way using numpy.repeat and itertools.chain. Conceptually, this is exactly what you want to do: repeat some values, chain others. Recommended for small numbers of columns; otherwise, stack-based methods may fare better.

import numpy as np
import pandas as pd
from itertools import chain

# return list from series of comma-separated strings
def chainer(s):
    return list(chain.from_iterable(s.str.split(',')))

# calculate lengths of splits
lens = df['package'].str.split(',').map(len)

# create new dataframe, repeating or chaining as appropriate
res = pd.DataFrame({'order_id': np.repeat(df['order_id'], lens),
                    'order_date': np.repeat(df['order_date'], lens),
                    'package': chainer(df['package']),
                    'package_code': chainer(df['package_code'])})

print(res)

   order_id order_date package package_code
0         1  20/5/2018      p1         #111
0         1  20/5/2018      p2         #222
0         1  20/5/2018      p3         #333
1         3  22/5/2018      p4         #444
2         7  23/5/2018      p5         #555
2         7  23/5/2018      p6         #666
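
As an alternative (not the numpy.repeat approach above, but a different technique), newer pandas versions (1.3 and later) let explode take a list of columns, as long as the lists on each row have matching lengths. A minimal sketch, with the input frame reconstructed from the printed output above, so treat the exact data as an assumption:

```python
import pandas as pd

# Input reconstructed from the output shown above (an assumption, not the asker's exact frame).
df = pd.DataFrame({
    "order_id": [1, 3, 7],
    "order_date": ["20/5/2018", "22/5/2018", "23/5/2018"],
    "package": ["p1,p2,p3", "p4", "p5,p6"],
    "package_code": ["#111,#222,#333", "#444", "#555,#666"],
})

cols = ["package", "package_code"]

# Split both columns into lists, then explode them together (requires pandas >= 1.3).
res = df.assign(**{c: df[c].str.split(",") for c in cols}).explode(cols)
print(res)
```

The other columns are repeated automatically, so there is nothing to list by hand when the frame has many of them.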

Split words from dataframe by space to rows while duplicating the info from other columns (python, pandas)

You need to split and explode:

df2 = (df
       .assign(comments=df['comments'].str.split())
       .explode('comments')
      )

output:

   r_id       start   comments
0     1  2021-01-01          i
0     1  2021-01-01         am
0     1  2021-01-01        the
0     1  2021-01-01       text
0     1  2021-01-01       that
0     1  2021-01-01      needs
0     1  2021-01-01  splitting
0     1  2021-01-01         by
0     1  2021-01-01      space
0     1  2021-01-01         to
0     1  2021-01-01       rows
1     2  2021-01-02      hello
1     2  2021-01-02      hello

Splitting and Visualizing in Python

Hi and welcome to Stack Overflow. You mentioned countplot(), which is available in seaborn. Assuming that is what you plan to use, note that countplot counts the entries it is given, so the graph will show how many brands are present once, how many are present twice, and so on.
The updated code is below.

>>> df
  Gender  KnownBrands
0 Man     NIVEA MEN;GATSBY;
1 Man     GATSBY;GARNIER MEN;L’OREAL MEN EXPERT;
2 Woman   CLINIQUE FOR MEN;SK-II MEN;Neutrogena MEN;
3 Man     NIVEA MEN;GARNIER MEN;L’OREAL MEN EXPERT;GATSBY;
4 Woman   NIVEA MEN;GATSBY;

# one row per brand; the trailing ";" leaves an empty string in each row
brands = df["KnownBrands"].str.split(";").explode().astype(object).reset_index()
# one column per brand, then drop the empty-string column
# (drop(columns='') replaces the positional drop('', 1), which pandas 2.0 no longer accepts)
output = brands.pivot(index="index", columns="KnownBrands", values="KnownBrands").reset_index(drop=True).drop(columns='')

>>> output.count()
KnownBrands
CLINIQUE FOR MEN      1
GARNIER MEN           2
GATSBY                4
L’OREAL MEN EXPERT    2
NIVEA MEN             3
Neutrogena MEN        1
SK-II MEN             1
dtype: int64

import seaborn as sns
sns.countplot(x=output.count())
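
To see the numbers behind that plot without drawing it, the sketch below gets the same per-brand counts with value_counts (a different route than the pivot above) and then counts how many brands share each count, which is what the countplot bars represent. The frame is rebuilt from the df printed above; the variable names are mine.

```python
import pandas as pd

# Frame rebuilt from the df printed above.
df = pd.DataFrame({
    "Gender": ["Man", "Man", "Woman", "Man", "Woman"],
    "KnownBrands": [
        "NIVEA MEN;GATSBY;",
        "GATSBY;GARNIER MEN;L’OREAL MEN EXPERT;",
        "CLINIQUE FOR MEN;SK-II MEN;Neutrogena MEN;",
        "NIVEA MEN;GARNIER MEN;L’OREAL MEN EXPERT;GATSBY;",
        "NIVEA MEN;GATSBY;",
    ],
})

brands = df["KnownBrands"].str.split(";").explode()
brands = brands[brands != ""]          # the trailing ";" leaves an empty string

per_brand = brands.value_counts()      # how often each brand was mentioned
freq_of_counts = per_brand.value_counts().sort_index()  # how many brands share each count

print(per_brand)
print(freq_of_counts)
```

If you actually want one bar per brand, plotting per_brand directly (for example with per_brand.plot.bar()) may be closer to what you are after.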

Python: How to expand column with list of values to multiple rows?

I think this will solve your issue:

import pandas as pd
df = pd.DataFrame({"Item": ["a", "b"], "Match": ["bb,cc", "dd,ee"]})
df["Match"] = df["Match"].str.split(",")
df.explode("Match")

