How to split a dataframe string column into two columns?
There might be a better way, but here's one approach:
row
0 00000 UNITED STATES
1 01000 ALABAMA
2 01001 Autauga County, AL
3 01003 Baldwin County, AL
4 01005 Barbour County, AL
df = pd.DataFrame(df['row'].str.split(' ', n=1).tolist(),
                  columns=['fips', 'row'])
fips row
0 00000 UNITED STATES
1 01000 ALABAMA
2 01001 Autauga County, AL
3 01003 Baldwin County, AL
4 01005 Barbour County, AL
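As a possible alternative to rebuilding the frame from a list, the same split can be sketched with expand=True. The sample rows below are copied from the data above; the column name "name" is only an illustrative choice.

```python
import pandas as pd

# A few sample rows mirroring the FIPS data shown above.
df = pd.DataFrame({"row": ["00000 UNITED STATES",
                           "01000 ALABAMA",
                           "01001 Autauga County, AL"]})

# n=1 limits the split to the first space, so multi-word names stay intact.
# expand=True returns a DataFrame with one column per piece.
df[["fips", "name"]] = df["row"].str.split(" ", n=1, expand=True)

print(df[["fips", "name"]])
```

Passing n as a keyword matters on recent pandas versions, where only the pattern may be given positionally.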
How to split two strings into different columns in Python with Pandas?
The key here is to include the parameter expand=True in your str.split() call, which expands the split strings into separate columns.
Type it like this:
df[['First String','Second String']] = df['Full String'].str.split(expand=True)
Output:
Full String First String Second String
0 Orange Juice Orange Juice
1 Pink Bird Pink Bird
2 Blue Ball Blue Ball
3 Green Tea Green Tea
4 Yellow Sun Yellow Sun
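A small sketch of the same call on a frame built from scratch. It assumes every value contains exactly two words; the irregular spacing and the tab are added here only to show that calling str.split() without a pattern splits on any run of whitespace.

```python
import pandas as pd

# Hypothetical input: two words per row, separated by varying whitespace.
df = pd.DataFrame({"Full String": ["Orange Juice", "Pink   Bird", "Blue\tBall"]})

# With no pattern, str.split() splits on runs of whitespace (spaces, tabs).
df[["First String", "Second String"]] = df["Full String"].str.split(expand=True)

print(df)
```

If some rows held three or more words, the split would produce more than two columns and the assignment would fail, so n=1 would be needed in that case.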
how to separate string into different columns?
Instead of using a split function, you can use ParseName, which returns the specified part of an object name and splits the string on the . delimiter. Note that ParseName handles at most four parts.
The ParseName documentation helped me write this query:
Declare @Sample Table
(MachineName varchar(max))
Insert into @Sample
values
('Ab bb zecos'),('a Zeng')
SELECT
Reverse(ParseName(Replace(Reverse(MachineName), ' ', '.'), 1)) As [M1]
, Reverse(ParseName(Replace(Reverse(MachineName), ' ', '.'), 2)) As [M2]
, Reverse(ParseName(Replace(Reverse(MachineName), ' ', '.'), 3)) As [M3]
FROM (Select MachineName from @Sample
) As [x]
How to split a string into different columns in python pandas
Use str.split with expand:
df.lineup.str.split('FLEX|CPT',expand=True)
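A hedged sketch of what that call might look like end to end. The lineup strings and the player_N column names are invented for illustration; the real data is not shown in the answer. Because each string starts with a delimiter, the first resulting column is empty and the pieces carry surrounding spaces, so the sketch cleans both up.

```python
import pandas as pd

# Hypothetical lineup strings; the real column contents are an assumption.
df = pd.DataFrame({"lineup": ["CPT Ann Lee FLEX Bob Ray FLEX Cy Dunn",
                              "CPT Dee Fox FLEX Eli Gray FLEX Flo Hart"]})

# A multi-character pattern is treated as a regular expression by default,
# so "FLEX|CPT" splits on either keyword.
parts = df["lineup"].str.split("FLEX|CPT", expand=True)

# Each string begins with a delimiter, so column 0 holds empty strings.
parts = parts.drop(columns=0)

# Remove the spaces left around each name.
parts = parts.apply(lambda col: col.str.strip())
parts.columns = ["player_1", "player_2", "player_3"]

print(parts)
```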
How to split a string column into two columns with a 'variable' delimiter?
Use Series.str.split with the regex \s+\.+\s+, which splits on one or more spaces, one or more periods, and one or more spaces:
df = pd.DataFrame({'A': ['Mayor ............... Paul Jones', 'Senator ................. Billy Twister', 'Congress Rep. .......... Chris Rock', 'Chief of Staff ....... Tony Allen']})
df[['Title', 'Name']] = df['A'].str.split(r'\s+\.+\s+', expand=True)
# A Title Name
# 0 Mayor ............... Paul Jones Mayor Paul Jones
# 1 Senator ................. Billy Twister Senator Billy Twister
# 2 Congress Rep. .......... Chris Rock Congress Rep. Chris Rock
# 3 Chief of Staff ....... Tony Allen Chief of Staff Tony Allen
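To check the pattern outside pandas, a minimal sketch with the standard re module may help. It shows that the period in "Rep." is not treated as a delimiter, because it is not preceded by whitespace. The two test strings are taken from the example above.

```python
import re

# Same pattern as above: spaces, then a run of periods, then spaces.
PATTERN = re.compile(r"\s+\.+\s+")

lines = ["Mayor ............... Paul Jones",
         "Congress Rep. .......... Chris Rock"]

# maxsplit=1 guarantees at most two pieces per line.
pairs = [PATTERN.split(line, maxsplit=1) for line in lines]

for title, name in pairs:
    print(repr(title), repr(name))
```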
How do I split a string into several columns in a dataframe with pandas Python?
The str.split method has an expand argument:
>>> df['string'].str.split(',', expand=True)
0 1 2
0 astring isa string
1 another string la
2 123 232 another
>>>
With column names:
>>> df['string'].str.split(',', expand=True).rename(columns = lambda x: "string"+str(x+1))
string1 string2 string3
0 astring isa string
1 another string la
2 123 232 another
This is neater with f-strings (Python >= 3.6):
>>> (df['string'].str.split(',', expand=True)
... .rename(columns=lambda x: f"string_{x+1}"))
string_1 string_2 string_3
0 astring isa string
1 another string la
2 123 232 another
How to split a column into multiple (non equal) columns in R
We could use cSplit from the splitstackshape package:
library(splitstackshape)
cSplit(DF, "Col1",",")
Output:
Col1_1 Col1_2 Col1_3 Col1_4
1: a b c <NA>
2: a b <NA> <NA>
3: a b c d
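For comparison, a rough pandas sketch of the same ragged split. This is not cSplit itself, only an analogous approach; the input values are inferred from the output above, and the Col1_N names simply imitate cSplit's naming.

```python
import pandas as pd

# Input inferred from the output shown above.
DF = pd.DataFrame({"Col1": ["a,b,c", "a,b", "a,b,c,d"]})

# expand=True pads shorter rows with missing values, so rows with
# different numbers of pieces still fit into one rectangular frame.
wide = DF["Col1"].str.split(",", expand=True)

# Rename 0, 1, 2, ... to Col1_1, Col1_2, Col1_3, ...
wide.columns = [f"Col1_{i + 1}" for i in range(wide.shape[1])]

print(wide)
```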
Split string into multiple columns
Transforming these strings into jsonb objects is relatively straightforward:
select
split_part(id, ':', 1) as id,
date,
jsonb_object_agg(split_part(param, '=', 1), split_part(param, '=', 2)) as params
from my_table
cross join unnest(string_to_array(split_part(id, ':', 2), '&')) as param
group by id, date;
Now you can use the solution described in Flatten aggregated key/value pairs from a JSONB field?
Alternatively, if you know the number and names of the parameters, this query is simpler and works well:
select
id,
date,
params->>'type' as type,
params->>'country' as country,
params->>'quality' as quality
from (
select
split_part(id, ':', 1) as id,
date,
jsonb_object_agg(split_part(param, '=', 1), split_part(param, '=', 2)) as params
from my_table
cross join unnest(string_to_array(split_part(id, ':', 2), '&')) as param
group by id, date
) s;
Test it in Db<>fiddle.
In Postgres 14+ you can replace unnest(string_to_array(...)) with string_to_table(...).
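The parsing logic in the queries above can also be sketched in plain Python. The sketch assumes each id looks like name:key=value&key=value, which is inferred from the split_part calls; the helper name parse_id and the sample value are hypothetical.

```python
def parse_id(raw):
    """Split 'name:key=value&key=value' into a flat dict (assumed format)."""
    # Everything before the first ':' is the id; the rest holds the parameters.
    ident, _, query = raw.partition(":")
    # Each 'key=value' pair becomes one dict entry; malformed pairs are skipped.
    params = dict(pair.split("=", 1) for pair in query.split("&") if "=" in pair)
    return {"id": ident, **params}


# Hypothetical sample value, not taken from the original table.
record = parse_id("a1:type=video&country=us&quality=hd")
print(record)
```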
Splitting a string column with unequal size into multiple columns using R
This is a good occasion to use the extra = 'merge' argument of tidyr's separate:
library(dplyr)
library(tidyr)
df %>%
separate(str, c('A', 'B', 'C'), sep= ";", extra = 'merge')
no A B C
1 1 M 12 M 13 <NA>
2 2 M 24 <NA> <NA>
3 3 <NA> <NA> <NA>
4 4 C 12 C 50 C 78
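A comparable effect in pandas can be sketched by capping the number of splits with n, so that any surplus pieces stay merged in the last column. The input strings below are hypothetical, since the original str column is not shown; the fourth row has an extra piece to make the merging visible.

```python
import pandas as pd

# Hypothetical input; the original str column is not shown in the answer.
df = pd.DataFrame({"no": [1, 2, 3, 4],
                   "str": ["M 12;M 13", "M 24", None, "C 12;C 50;C 78;C 90"]})

# n=2 allows at most two splits, i.e. three columns; anything after the
# second ';' remains in the last column, similar to extra = 'merge'.
out = df["str"].str.split(";", n=2, expand=True)
out.columns = ["A", "B", "C"]

print(out)
```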
How to split a string into multiple columns by a given pattern?
If the strings are always in that same format, the following regular expression should work well:
library(stringr)
x <- "\r\n \r\n How to get a confirm ticket?\r\n \r\n I want to get a tatkal ticket confirm ..."
str_split(x, "(\r\n\\s*)+", simplify = TRUE)[, -1, drop = FALSE]
[,1] [,2]
[1,] "How to get a confirm ticket?" "I want to get a tatkal ticket confirm ..."
If your data actually comes from a table in a text file or from a web page, there are probably more convenient options.
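The same idea can be sketched in Python with the re module, using the string from the R example. One difference is worth noting: re.split includes capturing groups in its result, so the group is written as non-capturing here.

```python
import re

x = ("\r\n \r\n How to get a confirm ticket?"
     "\r\n \r\n I want to get a tatkal ticket confirm ...")

# (?:...) is non-capturing; a capturing group would add the matched
# delimiters to the list returned by re.split.
pieces = re.split(r"(?:\r\n\s*)+", x)

# The string starts with a delimiter, so the first piece is empty; drop it.
columns = [p for p in pieces if p]

print(columns)
```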