How to Separate String into Different Columns

How to split a dataframe string column into two columns?

There may be a better way, but here is one approach:

                        row
0       00000 UNITED STATES
1             01000 ALABAMA
2  01001 Autauga County, AL
3  01003 Baldwin County, AL
4  01005 Barbour County, AL

# n=1 splits on the first space only (pass n by keyword; newer pandas versions require it)
df = pd.DataFrame(df.row.str.split(' ', n=1).tolist(),
                  columns=['fips', 'row'])

    fips                 row
0  00000       UNITED STATES
1  01000             ALABAMA
2  01001  Autauga County, AL
3  01003  Baldwin County, AL
4  01005  Barbour County, AL

How to split two strings into different columns in Python with Pandas?

The key here is to include the parameter expand=True in your str.split() to expand the split strings into separate columns.

Type it like this:

df[['First String','Second String']] = df['Full String'].str.split(expand=True)

Output:

    Full String First String Second String
0  Orange Juice       Orange         Juice
1     Pink Bird         Pink          Bird
2     Blue Ball         Blue          Ball
3     Green Tea        Green           Tea
4    Yellow Sun       Yellow           Sun
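
If a value can contain more than one space, a plain split produces more parts than the two column names on the left-hand side, and the assignment fails. A minimal sketch, assuming you want to split on the first space only (the sample values below are invented for illustration):

```python
import pandas as pd

# Hypothetical sample data; the second value contains two spaces.
df = pd.DataFrame({"Full String": ["Orange Juice", "Pink Flamingo Bird", "Green Tea"]})

# n=1 limits the split to the first run of whitespace, so each value yields
# at most two parts; expand=True spreads the parts across columns.
df[["First String", "Second String"]] = df["Full String"].str.split(n=1, expand=True)

print(df)
```

With n=1, everything after the first space stays together in the second column.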

How to separate a string into different columns?

Instead of a split function, you can use PARSENAME, which returns the specified part of an object name. It splits a string delimited by periods (.), so replace the spaces with periods first. Note that PARSENAME handles at most four parts.
The PARSENAME documentation helped me write this query:

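Declare @Sample Table
(MachineName varchar(max))

Insert into @Sample
values
('Ab bb zecos'),('a Zeng')

-- PARSENAME numbers parts from the right, so reverse the string first,
-- pick parts 1, 2 and 3, then reverse each result back.
-- Rows with fewer than three words return NULL for the missing parts.
SELECT
  Reverse(ParseName(Replace(Reverse(MachineName), ' ', '.'), 1)) As [M1]
, Reverse(ParseName(Replace(Reverse(MachineName), ' ', '.'), 2)) As [M2]
, Reverse(ParseName(Replace(Reverse(MachineName), ' ', '.'), 3)) As [M3]
FROM (Select MachineName from @Sample) As [x]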
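
For comparison, here is a rough Python sketch of the same idea: split on spaces and pad short rows with None so that every row has three parts. The helper name split_fixed and its width parameter are made up for this illustration; the sample values are the ones from the query above.

```python
# Sample values taken from the query above.
names = ["Ab bb zecos", "a Zeng"]

def split_fixed(text, width=3):
    """Split on single spaces and return exactly `width` parts, padding with None."""
    parts = text.split(" ")
    # Truncate extra parts, then pad if there were too few.
    return parts[:width] + [None] * (width - len(parts))

rows = [split_fixed(name) for name in names]
print(rows)
```

As in the SQL version, the row with only two words gets an empty third column.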

How to split a string into different columns in python pandas

Use str.split with expand=True:

df.lineup.str.split('FLEX|CPT',expand=True)
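
The answer does not show the data, so here is a small sketch of what that call does. The lineup column name comes from the line above, but the values are invented, and the cleanup steps (dropping the empty first column, stripping whitespace) are an assumption about what you would want next. The regex keyword assumes pandas 1.4 or later.

```python
import pandas as pd

# Invented values; only the column name 'lineup' comes from the answer above.
df = pd.DataFrame({"lineup": ["CPT Alice FLEX Bob FLEX Carol", "CPT Dave FLEX Erin"]})

# regex=True makes the alternation explicit: every FLEX or CPT is a cut point.
parts = df["lineup"].str.split(r"FLEX|CPT", expand=True, regex=True)

# The text before the first marker is an empty string, so drop that column
# and trim the whitespace left around each name.
parts = parts.drop(columns=0).apply(lambda col: col.str.strip())

print(parts)
```

Rows with fewer markers are padded with missing values on the right.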

How to split a string column into two columns with a 'variable' delimiter?

Use Series.str.split with the regex \s+\.+\s+, which splits by 1+ spaces, 1+ periods, 1+ spaces:

df = pd.DataFrame({'A': ['Mayor ............... Paul Jones', 'Senator ................. Billy Twister', 'Congress Rep. .......... Chris Rock', 'Chief of Staff ....... Tony Allen']})

# Use a raw string so the backslashes reach the regex engine unchanged
df[['Title', 'Name']] = df['A'].str.split(r'\s+\.+\s+', expand=True)

#                                         A           Title           Name
# 0        Mayor ............... Paul Jones           Mayor     Paul Jones
# 1  Senator ................. Billy Twister         Senator  Billy Twister
# 2     Congress Rep. .......... Chris Rock   Congress Rep.     Chris Rock
# 3       Chief of Staff ....... Tony Allen  Chief of Staff     Tony Allen

How do I split a string into several columns in a dataframe with pandas Python?

The str.split method has an expand argument:

>>> df['string'].str.split(',', expand=True)
         0       1        2
0  astring     isa   string
1  another  string       la
2      123     232  another

With column names:

>>> df['string'].str.split(',', expand=True).rename(columns = lambda x: "string"+str(x+1))
   string1 string2  string3
0  astring     isa   string
1  another  string       la
2      123     232  another

Much neater with Python >= 3.6 f-strings:

>>> (df['string'].str.split(',', expand=True)
... .rename(columns=lambda x: f"string_{x+1}"))
  string_1 string_2 string_3
0  astring      isa   string
1  another   string       la
2      123      232  another
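
The split result is a separate DataFrame. A short sketch of attaching it back to the original frame with join, assuming input values reconstructed from the output shown above (the original frame is not given in the answer):

```python
import pandas as pd

# Reconstructed from the output above; the exact original frame is an assumption.
df = pd.DataFrame({"string": ["astring,isa,string", "another,string,la", "123,232,another"]})

# Split on commas, name the new columns, then join them to the original by index.
parts = (df["string"].str.split(",", expand=True)
         .rename(columns=lambda x: f"string_{x + 1}"))
result = df.join(parts)

print(result)
```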

How to split a column into multiple (non equal) columns in R

We could use cSplit from the splitstackshape package:

library(splitstackshape)
cSplit(DF, "Col1",",")

Output:

   Col1_1 Col1_2 Col1_3 Col1_4
1:      a      b      c   <NA>
2:      a      b   <NA>   <NA>
3:      a      b      c      d

Split string into multiple columns

Transforming these strings into jsonb objects is relatively straightforward:

select
    split_part(id, ':', 1) as id,
    date,
    jsonb_object_agg(split_part(param, '=', 1), split_part(param, '=', 2)) as params
from my_table
cross join unnest(string_to_array(split_part(id, ':', 2), '&')) as param
group by id, date;

Now you can use the solution described in Flatten aggregated key/value pairs from a JSONB field?

Alternatively, if you know the number and names of the parameters, this query is simpler and works well:

select
    id,
    date,
    params->>'type' as type,
    params->>'country' as country,
    params->>'quality' as quality
from (
    select
        split_part(id, ':', 1) as id,
        date,
        jsonb_object_agg(split_part(param, '=', 1), split_part(param, '=', 2)) as params
    from my_table
    cross join unnest(string_to_array(split_part(id, ':', 2), '&')) as param
    group by id, date
) s;

Test it in Db<>fiddle.

In Postgres 14+ you can replace unnest(string_to_array(...)) with string_to_table(...).
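
For comparison, the same reshaping can be sketched outside the database in Python. The input format "<id>:<key>=<value>&..." is inferred from the queries above, and the sample rows and the helper name split_row are invented for this illustration. Note that parse_qsl also URL-decodes values, which may or may not be what you want.

```python
from urllib.parse import parse_qsl

# Invented rows in the format the queries above appear to assume.
rows = ["1:type=A&country=US&quality=high", "2:type=B&country=DE"]

def split_row(raw):
    """Turn '<id>:<k>=<v>&<k>=<v>' into a flat dict with one key per parameter."""
    ident, _, params = raw.partition(":")
    record = {"id": ident}
    record.update(parse_qsl(params))  # parse_qsl returns a list of (key, value) pairs
    return record

records = [split_row(r) for r in rows]
print(records)
```

Parameters missing from a row are simply absent from its dict, which corresponds to NULL in the SQL result.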

Splitting a string column with unequal size into multiple columns using R

This is a good occasion to use the extra = 'merge' argument of separate (from the tidyr package):

library(dplyr)
library(tidyr)  # separate() lives in tidyr
df %>%
  separate(str, c('A', 'B', 'C'), sep = ";", extra = 'merge')
  no    A    B    C
1  1 M 12 M 13 <NA>
2  2 M 24 <NA> <NA>
3  3 <NA> <NA> <NA>
4  4 C 12 C 50 C 78
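
A rough pandas counterpart, for readers working in Python: limiting the number of splits with n keeps any leftover pieces together in the last column, which is similar in spirit to extra = 'merge'. The data below is invented for illustration and is not the frame from the question.

```python
import pandas as pd

# Invented data; the last row has four ';'-separated parts.
df = pd.DataFrame({"no": [1, 2, 3],
                   "str": ["M 12;M 13", "M 24", "C 12;C 50;C 78;C 90"]})

# n=2 allows at most two splits, i.e. three parts; anything left over
# stays in the third part instead of creating a fourth column.
parts = df["str"].str.split(";", n=2, expand=True)
parts.columns = ["A", "B", "C"]

result = df[["no"]].join(parts)
print(result)
```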

How to split a string into multiple columns by a given pattern?

If the strings are always in that same format, the following regular expression should work well:

library(stringr)
x <- "\r\n \r\n How to get a confirm ticket?\r\n \r\n I want to get a tatkal ticket confirm ..."
str_split(x, "(\r\n\\s*)+", simplify = TRUE)[, -1, drop = FALSE]
     [,1]                           [,2]
[1,] "How to get a confirm ticket?" "I want to get a tatkal ticket confirm ..."

If your data actually comes from a table in a text file or from a web page, there are probably more convenient options.
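
The same pattern carries over to other regex engines. A minimal Python sketch using the standard re module and the sample string from above; filtering out empty strings stands in for the [, -1] step in the R code:

```python
import re

x = "\r\n \r\n How to get a confirm ticket?\r\n \r\n I want to get a tatkal ticket confirm ..."

# Split on one or more CRLFs, each followed by optional whitespace, then
# drop the empty string produced before the leading separator.
parts = [p for p in re.split(r"(?:\r\n\s*)+", x) if p]

print(parts)
```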


