Split Column At Delimiter in Data Frame

Splitting a pandas dataframe column by delimiter

Use vectorized str.split with expand=True:

In [42]:
df[['V','allele']] = df['V'].str.split('-',expand=True)
df

Out[42]:
      ID    Prob      V  allele
0   3009  1.0000  IGHV7    B*01
1    129  1.0000  IGHV7    B*01
2    119  0.8000  IGHV6    A*01
3    120  0.8056   GHV6    A*01
4    121  0.9000  IGHV6    A*01
5    122  0.8050  IGHV6    A*01
6    130  1.0000  IGHV4    L*03
7   3014  1.0000  IGHV4    L*03
8    266  0.9970  IGHV5    A*01
9    849  0.4010  IGHV5    A*04
10   174  1.0000  IGHV6    A*02
11   844  1.0000  IGHV6    A*02
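
For anyone who wants to run this end to end, here is a minimal sketch on a small made-up frame (the values are illustrative, not the asker's full data). Passing n=1 is an addition on my part: it limits the split to the first hyphen, so a value with more than one hyphen would still give exactly two columns.

```python
import pandas as pd

# Hypothetical sample data; column names follow the answer above.
df = pd.DataFrame({
    "ID": [3009, 119, 130],
    "V": ["IGHV7-B*01", "IGHV6-A*01", "IGHV4-L*03"],
})

# n=1 splits on the first '-' only; expand=True returns a DataFrame
# with one column per piece, which is assigned to the two target columns.
df[["V", "allele"]] = df["V"].str.split("-", n=1, expand=True)

print(df)
```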

Split column at delimiter in data frame

@Taesung Shin is right; it just takes a little more work to turn the result into a data.frame.
I added an "x|y" row to avoid ambiguities:

df <- data.frame(ID=11:13, FOO=c('a|b','b|c','x|y'))
foo <- data.frame(do.call('rbind', strsplit(as.character(df$FOO),'|',fixed=TRUE)))

Or, if you want to replace the columns in the existing data.frame:

within(df, FOO<-data.frame(do.call('rbind', strsplit(as.character(FOO), '|', fixed=TRUE))))

Which produces:

  ID FOO.X1 FOO.X2
1 11      a      b
2 12      b      c
3 13      x      y

Split column in several columns by delimiter '\' in pandas

It looks like your file is tab-delimited, because of the "\t". This may work:

pd.read_csv('file.txt', sep='\t', skiprows=8)
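
Since the asker's file isn't available here, the sketch below feeds read_csv an in-memory string instead. The contents and the two preamble lines are made up for illustration; the real file apparently needs skiprows=8.

```python
import io

import pandas as pd

# Hypothetical stand-in for 'file.txt': two preamble lines to skip,
# then a tab-separated header and two data rows.
raw = "generated by some tool\nversion 1\nname\tvalue\nalpha\t1\nbeta\t2\n"

# skiprows=2 matches this sample only; adjust it to the number of
# non-tabular lines at the top of the real file.
df = pd.read_csv(io.StringIO(raw), sep="\t", skiprows=2)

print(df)
```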

How to split a dataframe string column into two columns?

There might be a better way, but here's one approach:

                        row
0       00000 UNITED STATES
1             01000 ALABAMA
2  01001 Autauga County, AL
3  01003 Baldwin County, AL
4  01005 Barbour County, AL
df = pd.DataFrame(df.row.str.split(' ', n=1).tolist(),
                  columns=['fips', 'row'])
    fips                 row
0  00000       UNITED STATES
1  01000             ALABAMA
2  01001  Autauga County, AL
3  01003  Baldwin County, AL
4  01005  Barbour County, AL
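
A possible variation, not part of the answer above: str.split with expand=True returns the two pieces as a DataFrame directly, so the intermediate tolist() can be skipped. The sample rows are copied from the frame shown above.

```python
import pandas as pd

df = pd.DataFrame({
    "row": ["00000 UNITED STATES", "01000 ALABAMA", "01001 Autauga County, AL"],
})

# n=1 splits at the first space only, so multi-word names stay intact.
out = df["row"].str.split(" ", n=1, expand=True)
out.columns = ["fips", "row"]

print(out)
```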

Split dataframe column with second column as delimiter

Try apply.

bigdata[['title', 'location']]=bigdata.apply(func=lambda row: row['title_location'].split(row['delimiter']), axis=1, result_type="expand")
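
Here is a self-contained sketch of that apply call. The sample values are invented, and it assumes every row contains its delimiter at least once; the maxsplit of 1 is my addition so that a delimiter appearing twice still yields two pieces.

```python
import pandas as pd

# Hypothetical data: each row carries its own delimiter.
bigdata = pd.DataFrame({
    "title_location": ["Engineer|Berlin", "Analyst;Paris"],
    "delimiter": ["|", ";"],
})

# axis=1 passes each row to the lambda; result_type="expand" turns the
# returned two-item list into two columns.
bigdata[["title", "location"]] = bigdata.apply(
    lambda row: row["title_location"].split(row["delimiter"], 1),
    axis=1,
    result_type="expand",
)

print(bigdata)
```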

How to Split a column into two by comma delimiter, and put a value without comma in second column and not in first?

We can try using str.extract here:

df["Location"] = df["Origin"].str.extract(r'(.*),')
df["Country"] = df["Origin"].str.extract(r'(\w+(?: \w+)*)$')
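
To see what the two patterns do with a value that has no comma, here is a small sketch on made-up data. I pass expand=False so that each call returns a Series rather than a one-column DataFrame; that is a choice for this example, not something the answer requires.

```python
import pandas as pd

# Hypothetical sample: the middle row has no comma at all.
df = pd.DataFrame({"Origin": ["Paris, France", "Germany", "New York, United States"]})

# Everything before the last comma; rows without a comma get NaN.
df["Location"] = df["Origin"].str.extract(r"(.*),", expand=False)

# The trailing run of space-separated words: the part after the last comma,
# or the whole value when there is no comma.
df["Country"] = df["Origin"].str.extract(r"(\w+(?: \w+)*)$", expand=False)

print(df)
```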

Split column in a Pandas Dataframe into n number of columns

Let's try it with stack + str.split + unstack + join.

The idea is to split each column on ^ and expand the resulting pieces into separate columns. stack lets us run a single str.split on one Series, and unstack then rebuilds a DataFrame with the same index as the original.

tmp = df.stack().str.split('^', expand=True).unstack(level=1).sort_index(level=1, axis=1)
tmp.columns = [f'{y}_{x+1}' for x, y in tmp.columns]
out = df.join(tmp).dropna(how='all', axis=1).fillna('')

Output:

  column_name_1 column_name_2 column_name_1_1 column_name_1_2 column_name_1_3 column_name_1_4 column_name_2_1 column_name_2_2
0       a^b^c^d             j               a               b               c               d               j
1         e^f^g           k^l               e               f               g                               k               l
2           h^i             m               h               i                                               m

Split Dataframe column on delimiter when number of strings to split is not definite

You can convert each string to a list by calling .split() inside the .map() method:

df['B'] = df['B'].map(lambda x: x.split(';'))

And then use .explode():

df.explode('B').reset_index(drop=True)
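
Putting both steps together on a toy frame (the data is invented). This sketch uses the vectorized .str.split in place of map with a lambda, mainly because it leaves missing values as NaN instead of raising; treat it as one option rather than the answer's exact method.

```python
import pandas as pd

# Hypothetical data: column B holds a variable number of ';'-separated items.
df = pd.DataFrame({"A": [1, 2], "B": ["x;y;z", "w"]})

# Turn each string into a list of its pieces.
df["B"] = df["B"].str.split(";")

# explode emits one row per list element, repeating the other columns.
out = df.explode("B").reset_index(drop=True)

print(out)
```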

How to split a Pandas DataFrame column into multiple columns if the column is a string of varying length?

You can try using str.rsplit:

Splits string around given separator/delimiter, starting from the right.

df['Col_1'].str.rsplit(' ', n=2, expand=True)

Output:

             0  1  2
0        Hello  X  Y
1  Hello world  Q  R
2           Hi  S  T

As a full dataframe:

df['Col_1'].str.rsplit(' ', n=2, expand=True).add_prefix('nCol_').join(df)

Output:

        nCol_0 nCol_1 nCol_2            Col_1 Col_2
0        Hello      X      Y        Hello X Y     A
1  Hello world      Q      R  Hello world Q R     B
2           Hi      S      T           Hi S T     C

Splitting a Dataframe column on Delimiter and retaining all other columns

There is an easier way:

In [11]: df[['Work Package','Task']] = df['Summary'].str.split(':', n=1, expand=True)

In [12]: df
Out[12]:
Key Summary Status Description Updated \
0 XTBOW-310 Data Mgmt: Product Assesment and Selection In Analysis - To establish a provider for the... 2017-05-26 NaN
1 XTBOW-420 Data Mgmt: Vendor > CIBC Implemention NaN - Integrate with Vendor to fetch Corporat... 2017-05-19 NaN
2 XTBOW-421 Trade Migration: PVs and Greeks NaN - PVs and Greeks regression gap analysis ... 2017-05-19 NaN
3 XTBOW-422 Trade Migration: Reports (XTC vs XT2) NaN ... 2017-05-19 NaN

Work Package Task
0 Data Mgmt Product Assesment and Selection
1 Data Mgmt Vendor > CIBC Implemention
2 Trade Migration PVs and Greeks
3 Trade Migration Reports (XTC vs XT2)

