Split Column At Delimiter in Data Frame

Splitting a pandas dataframe column by delimiter

Use vectorized str.split with expand=True:

In [42]:
df[['V','allele']] = df['V'].str.split('-',expand=True)
df

Out[42]:
      ID    Prob      V allele
0   3009  1.0000  IGHV7   B*01
1    129  1.0000  IGHV7   B*01
2    119  0.8000  IGHV6   A*01
3    120  0.8056   GHV6   A*01
4    121  0.9000  IGHV6   A*01
5    122  0.8050  IGHV6   A*01
6    130  1.0000  IGHV4   L*03
7   3014  1.0000  IGHV4   L*03
8    266  0.9970  IGHV5   A*01
9    849  0.4010  IGHV5   A*04
10   174  1.0000  IGHV6   A*02
11   844  1.0000  IGHV6   A*02
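
For a version that can be run from scratch, here is a minimal sketch. The input frame is hypothetical (a few rows modeled on the output above, with the combined value stored in V before the split):

```python
import pandas as pd

# Hypothetical input modeled on the output above: V holds "gene-allele"
df = pd.DataFrame({
    "ID": [3009, 119, 130],
    "Prob": [1.0, 0.8, 1.0],
    "V": ["IGHV7-B*01", "IGHV6-A*01", "IGHV4-L*03"],
})

# expand=True returns a DataFrame with one column per piece,
# so it can be assigned to two column names at once
df[["V", "allele"]] = df["V"].str.split("-", expand=True)
print(df)
```

Note that V is overwritten with the part before the delimiter, while allele is created as a new column.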

Split column at delimiter in data frame

@Taesung Shin is right, but it takes a little more magic to turn the result into a data.frame.
I added a "x|y" line to avoid ambiguities:

df <- data.frame(ID=11:13, FOO=c('a|b','b|c','x|y'))
foo <- data.frame(do.call('rbind', strsplit(as.character(df$FOO),'|',fixed=TRUE)))

Or, if you want to replace the columns in the existing data.frame:

within(df, FOO<-data.frame(do.call('rbind', strsplit(as.character(FOO), '|', fixed=TRUE))))

Which produces:

  ID FOO.X1 FOO.X2
1 11      a      b
2 12      b      c
3 13      x      y
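
For comparison, the same split can be sketched in pandas. This is not part of the R answer; the column names X1 and X2 are chosen here only to mirror the R output. Passing regex=False plays the role of fixed=TRUE and makes it explicit that '|' is a literal character rather than regex alternation:

```python
import pandas as pd

# Same sample data as the R example
df = pd.DataFrame({"ID": [11, 12, 13], "FOO": ["a|b", "b|c", "x|y"]})

# regex=False treats '|' as a literal delimiter
parts = df["FOO"].str.split("|", regex=False, expand=True)
parts.columns = ["X1", "X2"]  # hypothetical names mirroring the R output
print(parts)
```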

Split column in several columns by delimiter '\' in pandas

It looks like your file is tab-delimited, because of the "\t". This may work:

pd.read_csv('file.txt', sep='\t', skiprows=8)
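
To see what sep and skiprows do without needing the original file, here is a small sketch that reads from an in-memory string. The contents are made up for illustration: two preamble lines followed by a tab-separated header and data, so skiprows is 2 here rather than 8:

```python
import io
import pandas as pd

# Hypothetical file contents: two preamble lines, then tab-separated data
raw = (
    "generated by some tool\n"
    "version 1\n"
    "name\tvalue\n"
    "alpha\t1\n"
    "beta\t2\n"
)

# skiprows=2 drops the preamble, so the third line becomes the header
df = pd.read_csv(io.StringIO(raw), sep="\t", skiprows=2)
print(df)
```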

How to split a dataframe string column into two columns?

There might be a better way, but here's one approach:

                            row
    0       00000 UNITED STATES
    1             01000 ALABAMA
    2  01001 Autauga County, AL
    3  01003 Baldwin County, AL
    4  01005 Barbour County, AL

df = pd.DataFrame(df.row.str.split(' ', n=1).tolist(),
                  columns=['fips', 'row'])

   fips                 row
0  00000       UNITED STATES
1  01000             ALABAMA
2  01001  Autauga County, AL
3  01003  Baldwin County, AL
4  01005  Barbour County, AL
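
An alternative sketch that skips the intermediate list and keeps any other columns in the frame uses expand=True. The new column name "name" is an assumption made for this example; n is passed as a keyword because recent pandas versions require it:

```python
import pandas as pd

df = pd.DataFrame({"row": [
    "00000 UNITED STATES",
    "01000 ALABAMA",
    "01001 Autauga County, AL",
]})

# n=1 splits only at the first space, so names containing spaces stay intact
df[["fips", "name"]] = df["row"].str.split(" ", n=1, expand=True)
print(df[["fips", "name"]])
```

The fips values stay strings, so the leading zeros are preserved.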

Split dataframe column with second column as delimiter

Try apply.

bigdata[['title', 'location']]=bigdata.apply(func=lambda row: row['title_location'].split(row['delimiter']), axis=1, result_type="expand")
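
Here is a runnable sketch of the same idea on made-up data. The sample rows are hypothetical, and the maxsplit argument of 1 is an addition here so that each row yields at most two pieces:

```python
import pandas as pd

# Hypothetical data: each row carries its own delimiter
bigdata = pd.DataFrame({
    "title_location": ["Engineer|Boston", "Analyst;Denver"],
    "delimiter": ["|", ";"],
})

# Plain Python str.split is applied row by row; result_type="expand"
# turns each returned list into separate columns
bigdata[["title", "location"]] = bigdata.apply(
    lambda row: row["title_location"].split(row["delimiter"], 1),
    axis=1,
    result_type="expand",
)
print(bigdata)
```

Because apply with axis=1 runs a Python function per row, this is slower than the vectorized str.split, but it is one of the simpler ways to handle a delimiter that varies by row.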

How to Split a column into two by comma delimiter, and put a value without comma in second column and not in first?

We can try using str.extract here:

df["Location"] = df["Origin"].str.extract(r'(.*),')
df["Country"] = df["Origin"].str.extract(r'(\w+(?: \w+)*)$')
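
A small sketch with invented values shows how the two patterns behave when the comma is missing. The sample data is hypothetical, and expand=False is used here so that each call returns a Series:

```python
import pandas as pd

# Hypothetical data: the second row has no comma
df = pd.DataFrame({"Origin": ["Paris, France", "New Zealand", "Lima, Peru"]})

# Everything before the last comma; NaN when there is no comma
df["Location"] = df["Origin"].str.extract(r"(.*),", expand=False)

# The trailing run of space-separated words, anchored at the end of the string
df["Country"] = df["Origin"].str.extract(r"(\w+(?: \w+)*)$", expand=False)
print(df)
```

The value without a comma lands in Country and leaves Location empty (NaN), which is the behavior the question asks for.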

Split column in a Pandas Dataframe into n number of columns

Let's try it with stack + str.split + unstack + join.

The idea is to split each column by ^ and expand the resulting pieces into separate columns. stack lets us do a single str.split on a Series object, and unstack turns the result back into a DataFrame with the same index as the original.

tmp = df.stack().str.split('^', expand=True).unstack(level=1).sort_index(level=1, axis=1)
tmp.columns = [f'{y}_{x+1}' for x, y in tmp.columns]
out = df.join(tmp).dropna(how='all', axis=1).fillna('')

Output:

  column_name_1 column_name_2 column_name_1_1 column_name_1_2 column_name_1_3 column_name_1_4 column_name_2_1 column_name_2_2  
0       a^b^c^d             j               a               b               c               d               j                  
1         e^f^g           k^l               e               f               g                               k               l  
2           h^i             m               h               i                                               m
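
To try this end to end, the input can be rebuilt from the first two columns of the output above. The sketch below is the same pipeline broken into steps, with the input frame reconstructed as an assumption:

```python
import pandas as pd

# Input reconstructed from the first two columns of the output above
df = pd.DataFrame({
    "column_name_1": ["a^b^c^d", "e^f^g", "h^i"],
    "column_name_2": ["j", "k^l", "m"],
})

# stack: one long Series indexed by (row, original column name)
stacked = df.stack()

# split: one column per piece; shorter rows are padded with missing values
pieces = stacked.str.split("^", expand=True)

# unstack: move the original column name back into the columns,
# then order by original column name first, piece number second
tmp = pieces.unstack(level=1).sort_index(level=1, axis=1)
tmp.columns = [f"{name}_{i + 1}" for i, name in tmp.columns]

# Drop piece columns that are entirely empty and blank out the rest
out = df.join(tmp).dropna(how="all", axis=1).fillna("")
print(out)
```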

Split Dataframe column on delimiter when number of strings to split is not definite

You can convert your string to a list with the string .split() method inside .map():

df['B'] = df['B'].map(lambda x: x.split(';'))

And then use .explode():

df.explode('B').reset_index(drop=True)
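
Put together on a small made-up frame, the two steps look like this. The sample data is hypothetical, and the vectorized .str.split is used here in place of .map with a lambda, which should give the same lists:

```python
import pandas as pd

# Hypothetical data: B holds a variable number of ';'-separated items
df = pd.DataFrame({"A": [1, 2], "B": ["x;y;z", "w"]})

# Each cell of B becomes a list of its pieces
df["B"] = df["B"].str.split(";")

# explode gives every list element its own row and repeats the other columns
out = df.explode("B").reset_index(drop=True)
print(out)
```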

How to split a Pandas DataFrame column into multiple columns if the column is a string of varying length?

You can try using str.rsplit:

Splits string around given separator/delimiter, starting from the
right.

df['Col_1'].str.rsplit(' ', n=2, expand=True)

Output:

             0  1  2
0        Hello  X  Y
1  Hello world  Q  R
2           Hi  S  T

As a full dataframe:

df['Col_1'].str.rsplit(' ', n=2, expand=True).add_prefix('nCol_').join(df)

Output:

        nCol_0 nCol_1 nCol_2            Col_1 Col_2
0        Hello      X      Y        Hello X Y     A
1  Hello world      Q      R  Hello world Q R     B
2           Hi      S      T           Hi S T     C
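
A self-contained sketch that rebuilds the frame from the output above and runs both steps:

```python
import pandas as pd

# Input reconstructed from the Col_1 and Col_2 columns shown above
df = pd.DataFrame({
    "Col_1": ["Hello X Y", "Hello world Q R", "Hi S T"],
    "Col_2": ["A", "B", "C"],
})

# Split from the right at most twice, so the leading text
# stays together however many words it contains
parts = df["Col_1"].str.rsplit(" ", n=2, expand=True).add_prefix("nCol_")

# Join the original columns back onto the new ones
out = parts.join(df)
print(out)
```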

Splitting a Dataframe column on Delimiter and retaining all other columns

There is an easier way:

In [11]: df[['Work Package','Task']] = df['Summary'].str.split(':', n=1, expand=True)

In [12]: df
Out[12]:
         Key                                     Summary                                             Status Description  Updated  \
0  XTBOW-310  Data Mgmt: Product Assesment and Selection  In Analysis  - To establish a provider for the...  2017-05-26      NaN
1  XTBOW-420       Data Mgmt: Vendor > CIBC Implemention  NaN  - Integrate with Vendor to fetch Corporat...  2017-05-19      NaN
2  XTBOW-421             Trade Migration: PVs and Greeks  NaN  - PVs and Greeks regression gap analysis ...  2017-05-19      NaN
3  XTBOW-422       Trade Migration: Reports (XTC vs XT2)  NaN                                           ...  2017-05-19      NaN

      Work Package                              Task
0        Data Mgmt   Product Assesment and Selection
1        Data Mgmt        Vendor > CIBC Implemention
2  Trade Migration                    PVs and Greeks
3  Trade Migration              Reports (XTC vs XT2)
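
Since the split is on ':' alone, each Task value keeps the space that followed the colon. A possible follow-up, sketched here on two rows taken from the output above, is to strip that whitespace afterwards:

```python
import pandas as pd

# Two rows taken from the example above; the other columns are omitted
df = pd.DataFrame({
    "Key": ["XTBOW-310", "XTBOW-421"],
    "Summary": [
        "Data Mgmt: Product Assesment and Selection",
        "Trade Migration: PVs and Greeks",
    ],
})

# n=1 splits at the first colon only; existing columns are left untouched
df[["Work Package", "Task"]] = df["Summary"].str.split(":", n=1, expand=True)

# Remove the leading space left over from ": "
df["Task"] = df["Task"].str.strip()
print(df[["Work Package", "Task"]])
```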
