Separate a Row of Strings into Separate Rows

Split (explode) pandas dataframe string entry to separate rows

How about something like this:

In [55]: pd.concat([pd.Series(row['var2'], row['var1'].split(','))
                    for _, row in a.iterrows()]).reset_index()
Out[55]: 
  index  0
0     a  1
1     b  1
2     c  1
3     d  2
4     e  2
5     f  2

The result has the default column names index and 0, so you then just have to rename the columns.
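As a rough, self-contained sketch of the whole idea: the frame a below is a hypothetical input reconstructed from the output shown above (it is not given in the original answer), and the names out and alt are made up for illustration. On pandas 0.25 or later, explode gives the same result with less code.

```python
import pandas as pd

# Hypothetical input, reconstructed from the output above
a = pd.DataFrame({"var1": ["a,b,c", "d,e,f"], "var2": [1, 2]})

# One small Series per input row: the split strings become the index,
# and the scalar var2 is broadcast across them
out = pd.concat(
    [pd.Series(row["var2"], index=row["var1"].split(","))
     for _, row in a.iterrows()]
).reset_index()

# reset_index() leaves the default names "index" and 0, so rename them
out.columns = ["var1", "var2"]

# Alternative on newer pandas: split into lists, then explode into rows
alt = (a.assign(var1=a["var1"].str.split(","))
        .explode("var1")
        .reset_index(drop=True))
```

Both out and alt should hold one row per letter, with var2 repeated for each letter that came from the same original row.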

Split delimited strings in a column and insert as new rows

Here is another way of doing it, using strsplit in base R:

df <- read.table(textConnection("1|a,b,c\n2|a,c\n3|b,d\n4|e,f"), header = FALSE, sep = "|", stringsAsFactors = FALSE)

df
##   V1    V2
## 1  1 a,b,c
## 2  2   a,c
## 3  3   b,d
## 4  4   e,f

s <- strsplit(df$V2, split = ",")
data.frame(V1 = rep(df$V1, sapply(s, length)), V2 = unlist(s))
##   V1 V2
## 1  1  a
## 2  1  b
## 3  1  c
## 4  2  a
## 5  2  c
## 6  3  b
## 7  3  d
## 8  4  e
## 9  4  f
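For comparison, the same "repeat the key by the number of pieces, then flatten the pieces" idea can be sketched in pandas. This is only an illustration, not part of the original R answer; the column names V1 and V2 are copied from the R example, and the names pieces and out are assumptions.

```python
from itertools import chain

import pandas as pd

# Same data as the R example above
df = pd.DataFrame({"V1": [1, 2, 3, 4],
                   "V2": ["a,b,c", "a,c", "b,d", "e,f"]})

# Like strsplit: one list of pieces per row
pieces = df["V2"].str.split(",")

out = pd.DataFrame({
    # Like rep(df$V1, sapply(s, length)): repeat each key once per piece
    "V1": df["V1"].repeat(pieces.str.len()).to_numpy(),
    # Like unlist(s): flatten the lists into one long column
    "V2": list(chain.from_iterable(pieces)),
})
```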

Split pandas dataframe string into separate rows

Try explode: split var2 on '/' into lists, then explode the lists into one row per element.

df=df_input.assign(var2=df_input.var2.str.split('/')).explode('var2')
  var1 var2  var3
0    A    x  abc1
0    A    y  abc1
0    A    z  abc1
1    B   xx  abc2
1    B   yy  abc2
2    c   zz  abcd

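Then use groupby + shift so that, within each original row, var1 takes the previous value of var2 (the first row of each group keeps its original var1):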

df.var1=df.groupby(level=0).var2.shift().fillna(df.var1)
df
  var1 var2  var3
0    A    x  abc1
0    x    y  abc1
0    y    z  abc1
1    B   xx  abc2
1   xx   yy  abc2
2    c   zz  abcd
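Put together as a runnable sketch: df_input below is a hypothetical frame inferred from the printed output (the original post does not show it), and prev is a name made up for illustration.

```python
import pandas as pd

# Hypothetical input, inferred from the output above
df_input = pd.DataFrame({
    "var1": ["A", "B", "c"],
    "var2": ["x/y/z", "xx/yy", "zz"],
    "var3": ["abc1", "abc2", "abcd"],
})

# Step 1: split var2 on "/" and explode to one row per piece.
# explode keeps the original index, so pieces from one row share a label.
df = df_input.assign(var2=df_input["var2"].str.split("/")).explode("var2")

# Step 2: within each original row (index level 0), take the previous var2.
# The first piece of each group has no predecessor, so it falls back to var1.
prev = df.groupby(level=0)["var2"].shift()
df["var1"] = prev.fillna(df["var1"])
```

The repeated index labels are what make groupby(level=0) work here, so the index should not be reset before the shift.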

Split delimited strings in multiple columns and separate them into rows

This is easier if we first make the delimiters the same:

library(dplyr)
library(tidyr)
library(stringr)
to_expand %>% 
    mutate(first = str_replace(first, "~", "|")) %>% 
    separate_rows(first, second, sep = "\\|")
# A tibble: 2 x 2
  first second
  <chr> <chr> 
1 a     1~2~3 
2 b     4~5~6

Split strings into separate rows excluding some pattern matches

We could do this in base R with strsplit by splitting the 'IV' column at each comma while SKIPping the characters inside parentheses, and then replicating the rows of the data by the lengths of the list created with strsplit.

lst1 <-  strsplit(df1$IV, "\\([^)]+(*SKIP)(*FAIL)|,\\s*", perl = TRUE)
df2 <- transform(df1[setdiff(names(df1), "IV")][rep(seq_len(nrow(df1)), 
        lengths(lst1)),], IV = unlist(lst1))[names(df1)]

-output

> df2
  Article.Title             Sample                                                                         IV Moderator Mediator          DV
1  Random title Sample information                                                                Union voice      <NA>     <NA> Performance
2  Random title Sample information HRM practices (participation, teams, incentives, development, recruitment)      <NA>     <NA> Performance
3  Random title Sample information                                                          implict contracts      <NA>     <NA> Performance
4  Random title Sample information                                                              Crisis impact      <NA>     <NA> Performance
5  Random title Sample information                                        dominant individual or family owner      <NA>     <NA> Performance
6  Random title Sample information                                     no dominant individual or family owner      <NA>     <NA> Performance
7  Random title Sample information                                                              market growth      <NA>     <NA> Performance
8  Random title Sample information                                                           no market growth      <NA>     <NA> Performance

Or use the same regex in separate_rows (as in the comments)

library(tidyr)
separate_rows(df1, IV, sep = "\\([^)]+(*SKIP)(*FAIL)|,\\s*")

-output

# A tibble: 9 × 6
  Article.Title Sample             IV                                                                           Moderator Mediator DV         
  <chr>         <chr>              <chr>                                                                        <chr>     <chr>    <chr>      
1 Random title  Sample information "Union voice"                                                                <NA>      <NA>     Performance
2 Random title  Sample information "HRM practices (participation, teams, incentives, development, recruitment)" <NA>      <NA>     Performance
3 Random title  Sample information "implict contracts"                                                          <NA>      <NA>     Performance
4 Random title  Sample information "Crisis impact"                                                              <NA>      <NA>     Performance
5 Random title  Sample information "dominant individual or family owner"                                        <NA>      <NA>     Performance
6 Random title  Sample information "no dominant individual or family owner"                                     <NA>      <NA>     Performance
7 Random title  Sample information "market growth"                                                              <NA>      <NA>     Performance
8 Random title  Sample information "no market growth"                                                           <NA>      <NA>     Performance
9 Random title  Sample information ""                                                                           <NA>      <NA>     Performance
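For readers working in Python, here is a hedged sketch of the same "split on commas, but not inside parentheses" idea. Python's built-in re module does not support the (*SKIP)(*FAIL) verbs, so this swaps in a negative lookahead instead. It assumes parentheses are balanced and not nested. The sample string and the names outside_parens, parts and df1 are made up for illustration.

```python
import re

import pandas as pd

# Split at a comma (plus trailing whitespace) unless a ")" follows
# before any other parenthesis, i.e. unless the comma sits inside "(...)".
# Assumption: parentheses are balanced and never nested.
outside_parens = re.compile(r",\s*(?![^()]*\))")

text = "Union voice, HRM practices (participation, teams), Crisis impact"
parts = outside_parens.split(text)

# Hypothetical one-row frame, exploded on the split column
df1 = pd.DataFrame({"Article.Title": ["Random title"], "IV": [text]})
df2 = (df1.assign(IV=df1["IV"].map(outside_parens.split))
          .explode("IV")
          .reset_index(drop=True))
```

Here parts keeps "HRM practices (participation, teams)" as a single element, and df2 has one row per element.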

Split pandas dataframe column string with multiple values into separate rows

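Here is one way: join var1 and var2 into a single '/'-delimited string, split and explode it, then pair each element with the next one using shift.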

df_input['New']=df_input[['var1','var2']].agg('/'.join,1).str.split('/')
df=df_input.explode('New')
df['New2']=df.groupby(level=0).New.shift(-1)
df=df.dropna(subset=['New2'],axis=0)
df
   var1   var2  var3 New New2
0  A/A1  x/y/z  abc1   A   A1
0  A/A1  x/y/z  abc1  A1    x
0  A/A1  x/y/z  abc1   x    y
0  A/A1  x/y/z  abc1   y    z
1     B  xx/yy  abc2   B   xx
1     B  xx/yy  abc2  xx   yy
2     c     zz  abcd   c   zz
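The same steps as a self-contained sketch: df_input is a hypothetical frame inferred from the printed output, and chain is a name made up for illustration. Plain string concatenation is used here in place of agg('/'.join, 1); for string columns the two should give the same joined value.

```python
import pandas as pd

# Hypothetical input, inferred from the output above
df_input = pd.DataFrame({
    "var1": ["A/A1", "B", "c"],
    "var2": ["x/y/z", "xx/yy", "zz"],
    "var3": ["abc1", "abc2", "abcd"],
})

# Join the two columns into one "/"-delimited chain, then split it into a list
chain = (df_input["var1"] + "/" + df_input["var2"]).str.split("/")

# One row per element of the chain; the original index is preserved
df = df_input.assign(New=chain).explode("New")

# Pair each element with its successor within the same original row
df["New2"] = df.groupby(level=0)["New"].shift(-1)

# The last element of each chain has no successor, so drop it
df = df.dropna(subset=["New2"])
```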

Splitting a string into new rows in R

Try the cSplit function (as you are already using @Ananda's package). Note that it will return a data.table object, so make sure you have that package installed. You can convert back to a data.frame (if you want to) with something like setDF(df2).

library(splitstackshape)
df2 <- cSplit(df1, "Item.Code", sep = "/", direction = "long")
df2
#     Country Region Molecule      Item.Code
#  1:     IND     NA    PB102    FR206985511
#  2:    THAI     AP    PB103      BA-107603 
#  3:    THAI     AP    PB103     F000113361 
#  4:    THAI     AP    PB103         107603
#  5:    LUXE     NA    PB105        1012701 
#  6:    LUXE     NA    PB105    SGP-1012701 
#  7:    LUXE     NA    PB105     F041701000
#  8:     IND     AP    PB106    AU206985211 
#  9:     IND     AP    PB106  CA-F206985211
# 10:    THAI     HP    PB107     F034702000 
# 11:    THAI     HP    PB107        1010701 
# 12:    THAI     HP    PB107    SGP-1010701
# 13:    BANG     NA    PB108     F000007970
# 14:    BANG     NA    PB108          25781
# 15:    BANG     NA    PB108       20009021

Split a string in R into rows and columns

We could use separate_rows to split the column at the whitespace before each run of digits, then separate each piece into two columns at the first space.

library(dplyr)
library(tidyr)
tibble(col1 = rows) %>%
     separate_rows(col1, sep="\\s+(?=[0-9])") %>%
     separate(col1, into = c("Code", "Item"), extra = 'merge')
# A tibble: 4 x 2
#  Code  Item                     
#  <chr> <chr>                    
#1 70150 Markers, Times, Places   
#2 72588 Times, Places, Things    
#3 51256 Items, Shelves, Cats     
#4 99201 Widget, Places, Locations

Split one row into multiple rows based on comma-separated string column

Use unnest on the array returned by split.

SELECT a,split_b 
FROM tbl
CROSS JOIN UNNEST(SPLIT(b,',')) AS t (split_b)