Splitting a pandas dataframe column by delimiter
Use vectorized str.split with expand=True:
In [42]:
df[['V','allele']] = df['V'].str.split('-',expand=True)
df
Out[42]:
ID Prob V allele
0 3009 1.0000 IGHV7 B*01
1 129 1.0000 IGHV7 B*01
2 119 0.8000 IGHV6 A*01
3 120 0.8056 GHV6 A*01
4 121 0.9000 IGHV6 A*01
5 122 0.8050 IGHV6 A*01
6 130 1.0000 IGHV4 L*03
7 3014 1.0000 IGHV4 L*03
8 266 0.9970 IGHV5 A*01
9 849 0.4010 IGHV5 A*04
10 174 1.0000 IGHV6 A*02
11 844 1.0000 IGHV6 A*02
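For anyone who wants to run this end to end, here is a minimal sketch. The input frame is hypothetical, rebuilt from the output above on the assumption that the original V column held values such as 'IGHV7-B*01'. Passing n=1 is an optional extra that is not in the original answer.

```python
import pandas as pd

# Hypothetical sample data modelled on the output shown above.
df = pd.DataFrame({
    "ID": [3009, 119, 130],
    "Prob": [1.0, 0.8, 1.0],
    "V": ["IGHV7-B*01", "IGHV6-A*01", "IGHV4-L*03"],
})

# n=1 limits the split to the first '-', so no row yields more than two parts.
# expand=True returns a DataFrame, which can be assigned to two columns at once.
df[["V", "allele"]] = df["V"].str.split("-", n=1, expand=True)
print(df)
```

Without n=1, a value containing a second '-' would produce a third column and the two-column assignment would fail.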
Split column at delimiter in data frame
@Taesung Shin is right, but then just some more magic to make it into a data.frame.
I added a "x|y" line to avoid ambiguities:
df <- data.frame(ID=11:13, FOO=c('a|b','b|c','x|y'))
foo <- data.frame(do.call('rbind', strsplit(as.character(df$FOO),'|',fixed=TRUE)))
Or, if you want to replace the columns in the existing data.frame:
within(df, FOO<-data.frame(do.call('rbind', strsplit(as.character(FOO), '|', fixed=TRUE))))
Which produces:
ID FOO.X1 FOO.X2
1 11 a b
2 12 b c
3 13 x y
Split column in several columns by delimiter '\' in pandas
It looks like your file is tab-delimited, because of the "\t". This may work:
pd.read_csv('file.txt', sep='\t', skiprows=8)
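A small sketch of the same idea that runs without a file on disk. The contents of raw are made up, and the two preamble lines stand in for the eight lines the question's file apparently had, so skiprows is 2 here rather than 8.

```python
import io
import pandas as pd

# Hypothetical file contents: two preamble lines, then tab-separated data.
raw = (
    "# exported report\n"
    "# generated by some tool\n"
    "name\tvalue\n"
    "alpha\t1\n"
    "beta\t2\n"
)

# skiprows=2 drops the preamble; sep='\t' splits the remaining lines on tabs.
df = pd.read_csv(io.StringIO(raw), sep="\t", skiprows=2)
print(df)
```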
How to split a dataframe string column into two columns?
There might be a better way, but here's one approach:
row
0 00000 UNITED STATES
1 01000 ALABAMA
2 01001 Autauga County, AL
3 01003 Baldwin County, AL
4 01005 Barbour County, AL
df = pd.DataFrame(df.row.str.split(' ', n=1).tolist(),
                  columns=['fips', 'row'])
fips row
0 00000 UNITED STATES
1 01000 ALABAMA
2 01001 Autauga County, AL
3 01003 Baldwin County, AL
4 01005 Barbour County, AL
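As an alternative to building a new frame from .tolist(), the split can be expanded straight into columns. This is only a sketch using three of the rows above. It passes n as a keyword, which newer pandas releases require.

```python
import pandas as pd

df = pd.DataFrame({
    "row": ["00000 UNITED STATES", "01000 ALABAMA", "01001 Autauga County, AL"],
})

# n=1 splits at the first space only, so the names keep their inner spaces.
df[["fips", "row"]] = df["row"].str.split(" ", n=1, expand=True)
print(df)
```

The original row column is overwritten with the remainder of the string, and fips is appended as a new column.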
Split dataframe column with second column as delimiter
Try apply:
bigdata[['title', 'location']]=bigdata.apply(func=lambda row: row['title_location'].split(row['delimiter']), axis=1, result_type="expand")
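Here is a runnable sketch of that row-wise split. The sample rows and the split_row helper are hypothetical; only the column names come from the line above. The maxsplit of 1 is an addition, so that a delimiter appearing twice in a row cannot produce a third part.

```python
import pandas as pd

# Hypothetical data: each row carries its own delimiter.
bigdata = pd.DataFrame({
    "title_location": ["Engineer|Berlin", "Analyst;Paris"],
    "delimiter": ["|", ";"],
})

def split_row(row):
    # Plain Python str.split with maxsplit=1: at most two parts per row.
    return row["title_location"].split(row["delimiter"], 1)

# result_type="expand" turns each returned list into columns.
bigdata[["title", "location"]] = bigdata.apply(split_row, axis=1, result_type="expand")
print(bigdata)
```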
How to Split a column into two by comma delimiter, and put a value without comma in second column and not in first?
We can try using str.extract here:
df["Location"] = df["Origin"].str.extract(r'(.*),')
df["Country"] = df["Origin"].str.extract(r'(\w+(?: \w+)*)$')
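To see what the two patterns do, here is a small sketch on made-up Origin values. It passes expand=False so that each call returns a Series; that is a choice made here, not something the answer requires.

```python
import pandas as pd

# Hypothetical values: two with a comma, one without.
df = pd.DataFrame({"Origin": ["Paris, France", "New York, United States", "Brazil"]})

# Greedy (.*) captures everything before the last comma; NaN when there is no comma.
df["Location"] = df["Origin"].str.extract(r"(.*),", expand=False)

# Captures the run of space-separated words that ends the string.
df["Country"] = df["Origin"].str.extract(r"(\w+(?: \w+)*)$", expand=False)
print(df)
```

A value without a comma ends up in Country only, with Location left as NaN, which is what the question asks for.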
Split column in a Pandas Dataframe into n number of columns
Let's try it with stack + str.split + unstack + join.
The idea is to split each column by ^ and expand the split pieces into separate columns. stack lets us do a single str.split on a Series object, and unstack creates a DataFrame with the same index as the original.
tmp = df.stack().str.split('^', expand=True).unstack(level=1).sort_index(level=1, axis=1)
tmp.columns = [f'{y}_{x+1}' for x, y in tmp.columns]
out = df.join(tmp).dropna(how='all', axis=1).fillna('')
Output:
   column_name_1  column_name_2  column_name_1_1  column_name_1_2  column_name_1_3  column_name_1_4  column_name_2_1  column_name_2_2
0        a^b^c^d              j                a                b                c                d                j
1          e^f^g            k^l                e                f                g                                 k                l
2            h^i              m                h                i                                                  m
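The same pipeline as a self-contained sketch, with the input frame reconstructed from the first two columns of the output above (an assumption about what the original df looked like):

```python
import pandas as pd

# Input reconstructed from the output shown above.
df = pd.DataFrame({
    "column_name_1": ["a^b^c^d", "e^f^g", "h^i"],
    "column_name_2": ["j", "k^l", "m"],
})

# stack -> one long Series; split once; unstack -> back to one row per original row.
tmp = (
    df.stack()
      .str.split("^", expand=True)
      .unstack(level=1)
      .sort_index(level=1, axis=1)
)

# Columns are (piece_number, original_name) pairs; flatten them to name_1, name_2, ...
tmp.columns = [f"{name}_{piece + 1}" for piece, name in tmp.columns]

# Drop piece columns that are empty for every row, then blank out the remaining gaps.
out = df.join(tmp).dropna(how="all", axis=1).fillna("")
print(out)
```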
Split Dataframe column on delimiter when number of strings to split is not definite
You can convert your string to a list with the string .split() method inside .map():
df['B'] = df['B'].map(lambda x: x.split(';'))
And then use .explode():
df.explode('B').reset_index(drop=True)
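Put together on a made-up frame, the two steps look roughly like this. The sketch swaps the .map(lambda ...) call for the vectorized .str.split, which does the same job here and does not raise on missing values.

```python
import pandas as pd

# Hypothetical data: column B holds a varying number of ';'-separated items.
df = pd.DataFrame({"A": [1, 2], "B": ["x;y;z", "w"]})

# Step 1: turn each string into a list of pieces.
df["B"] = df["B"].str.split(";")

# Step 2: one output row per list element; the other columns are repeated.
out = df.explode("B").reset_index(drop=True)
print(out)
```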
How to split a Pandas DataFrame column into multiple columns if the column is a string of varying length?
You can try using str.rsplit:
Splits string around given separator/delimiter, starting from the right.
df['Col_1'].str.rsplit(' ', n=2, expand=True)
Output:
0 1 2
0 Hello X Y
1 Hello world Q R
2 Hi S T
As a full dataframe:
df['Col_1'].str.rsplit(' ', n=2, expand=True).add_prefix('nCol_').join(df)
Output:
nCol_0 nCol_1 nCol_2 Col_1 Col_2
0 Hello X Y Hello X Y A
1 Hello world Q R Hello world Q R B
2 Hi S T Hi S T C
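A self-contained sketch of the same steps, with the input frame rebuilt from the output above (Col_2 values included as shown there):

```python
import pandas as pd

df = pd.DataFrame({
    "Col_1": ["Hello X Y", "Hello world Q R", "Hi S T"],
    "Col_2": ["A", "B", "C"],
})

# n=2 makes at most two splits, counted from the right, so the last two
# tokens get their own columns and everything before them stays together.
parts = df["Col_1"].str.rsplit(" ", n=2, expand=True).add_prefix("nCol_")

# Attach the original columns on the shared index.
out = parts.join(df)
print(out)
```

Because the split is counted from the right, "Hello world" stays in one piece even though it contains a space.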
Splitting a Dataframe column on Delimiter and retaining all other columns
There is an easier way:
In [11]: df[['Work Package','Task']] = df['Summary'].str.split(':', n=1, expand=True)
In [12]: df
Out[12]:
Key Summary Status Description Updated \
0 XTBOW-310 Data Mgmt: Product Assesment and Selection In Analysis - To establish a provider for the... 2017-05-26 NaN
1 XTBOW-420 Data Mgmt: Vendor > CIBC Implemention NaN - Integrate with Vendor to fetch Corporat... 2017-05-19 NaN
2 XTBOW-421 Trade Migration: PVs and Greeks NaN - PVs and Greeks regression gap analysis ... 2017-05-19 NaN
3 XTBOW-422 Trade Migration: Reports (XTC vs XT2) NaN ... 2017-05-19 NaN
Work Package Task
0 Data Mgmt Product Assesment and Selection
1 Data Mgmt Vendor > CIBC Implemention
2 Trade Migration PVs and Greeks
3 Trade Migration Reports (XTC vs XT2)
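Splitting on ':' alone leaves the space that followed the colon at the start of each Task value. If that matters, one possible follow-up is to strip both pieces, sketched here on two of the rows above (the rest of the frame is omitted):

```python
import pandas as pd

df = pd.DataFrame({
    "Key": ["XTBOW-310", "XTBOW-421"],
    "Summary": [
        "Data Mgmt: Product Assesment and Selection",
        "Trade Migration: PVs and Greeks",
    ],
})

# n=1 splits at the first ':' only, so later colons stay inside Task.
parts = df["Summary"].str.split(":", n=1, expand=True)

# Strip surrounding whitespace from each piece before assigning.
df["Work Package"] = parts[0].str.strip()
df["Task"] = parts[1].str.strip()
print(df)
```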