Split Column to Multiple Rows

How to split text in a column into multiple rows

This splits the Seatblocks by space and gives each its own row.

In [43]: df
Out[43]:
CustNum CustomerName ItemQty Item Seatblocks ItemExt
0 32363 McCartney, Paul 3 F04 2:218:10:4,6 60
1 31316 Lennon, John 25 F01 1:13:36:1,12 1:13:37:1,13 300

In [44]: s = df['Seatblocks'].str.split(' ').apply(Series, 1).stack()

In [45]: s.index = s.index.droplevel(-1) # to line up with df's index

In [46]: s.name = 'Seatblocks' # needs a name to join

In [47]: s
Out[47]:
0 2:218:10:4,6
1 1:13:36:1,12
1 1:13:37:1,13
Name: Seatblocks, dtype: object

In [48]: del df['Seatblocks']

In [49]: df.join(s)
Out[49]:
CustNum CustomerName ItemQty Item ItemExt Seatblocks
0 32363 McCartney, Paul 3 F04 60 2:218:10:4,6
1 31316 Lennon, John 25 F01 300 1:13:36:1,12
1 31316 Lennon, John 25 F01 300 1:13:37:1,13

Or, to give each colon-separated string in its own column:

In [50]: df.join(s.apply(lambda x: Series(x.split(':'))))
Out[50]:
CustNum CustomerName ItemQty Item ItemExt 0 1 2 3
0 32363 McCartney, Paul 3 F04 60 2 218 10 4,6
1 31316 Lennon, John 25 F01 300 1 13 36 1,12
1 31316 Lennon, John 25 F01 300 1 13 37 1,13

This is a little ugly, but maybe someone will chime in with a prettier solution.

Splitting a column into multiple rows

You can first split Code column on comma , then explode it to get the desired output.

df['Code']=df['Code'].str.split(',')
df=df.explode('Code')

OUTPUT:

  ID  A  B  C  D Code
0 1 a z s m AB
0 1 a z s m BC
0 1 a z s m A
1 2 b x d j AD
1 2 b x d j KL
2 3 c y w j AD
2 3 c y w j KL
3 4 a x h AB
3 4 a x h BC
4 5 b y s m A
5 6 b z s h A
6 7 c x s h B

If needed, you can replace empty string by NaN

Split cell into multiple rows in pandas dataframe

Here's one way using numpy.repeat and itertools.chain. Conceptually, this is exactly what you want to do: repeat some values, chain others. Recommended for small numbers of columns, otherwise stack based methods may fare better.

import numpy as np
from itertools import chain

# return list from series of comma-separated strings
def chainer(s):
return list(chain.from_iterable(s.str.split(',')))

# calculate lengths of splits
lens = df['package'].str.split(',').map(len)

# create new dataframe, repeating or chaining as appropriate
res = pd.DataFrame({'order_id': np.repeat(df['order_id'], lens),
'order_date': np.repeat(df['order_date'], lens),
'package': chainer(df['package']),
'package_code': chainer(df['package_code'])})

print(res)

order_id order_date package package_code
0 1 20/5/2018 p1 #111
0 1 20/5/2018 p2 #222
0 1 20/5/2018 p3 #333
1 3 22/5/2018 p4 #444
2 7 23/5/2018 p5 #555
2 7 23/5/2018 p6 #666

Split delimited strings in multiple columns and separate them into rows

We may do this in an easier way if we make the delimiter same

library(dplyr)
library(tidyr)
library(stringr)
to_expand %>%
mutate(first = str_replace(first, "~", "|")) %>%
separate_rows(first, second, sep = "\\|")
# A tibble: 2 x 2
first second
<chr> <chr>
1 a 1~2~3
2 b 4~5~6

Split column into multiple columns when a row starts with a string

try this:

pd.concat([sub.reset_index(drop=True) for _, sub in df.groupby(
df.Group.str.contains(r'^Group\s+123').cumsum())], axis=1)
>>>

Group Group Group
0 Group 123 nv-1 Group 123 mt-d2 Group 123 id-01
1 a, v b, v n,m
2 s,b NaN x, y
3 y, i NaN z, m
4 NaN NaN l,b

Split (explode) pandas dataframe string entry to separate rows

How about something like this:

In [55]: pd.concat([Series(row['var2'], row['var1'].split(','))              
for _, row in a.iterrows()]).reset_index()
Out[55]:
index 0
0 a 1
1 b 1
2 c 1
3 d 2
4 e 2
5 f 2

Then you just have to rename the columns

Explode each row into multiple rows by splitting a column of a given computed range

try:

=INDEX(QUERY(SPLIT(FLATTEN(A1:A&"×"&SPLIT(B1:B, ", ", )&"×"&C1:C), "×"),
"where Col3 is not null"))

Sample Image



Related Topics



Leave a reply



Submit