Split data frame string column into multiple columns
Use stringr::str_split_fixed
library(stringr)
str_split_fixed(before$type, "_and_", 2)
How to split a dataframe column into two columns
read.table(text=df$X1, sep=':', fill=T, h=F, dec = '/')
V1 V2
1 NA
2 1.0 0.82
3 1.1 1.995
4 0.1 1.146
5 NA
6 1.1 1.995
If you want columns in respective data.types:
type.convert(read.table(text=df$X1, sep=':', fill=T, h=F, dec = '/'), as.is = TRUE)
V1 V2
1 NA NA
2 1.0 0.820
3 1.1 1.995
4 0.1 1.146
5 NA NA
6 1.1 1.995
df <- structure(list(X1 = c(NA, "1/0:0.82", "1/1:1.995", "0/1:1.146", NA,
"1/1:1.995")), class = "data.frame", row.names = c(NA, -6L))
How to split a column into multiple (non equal) columns in R
We could use cSplit
from splitstackshape
library(splitstackshape)
cSplit(DF, "Col1",",")
-output
cSplit(DF, "Col1",",")
Col1_1 Col1_2 Col1_3 Col1_4
1: a b c <NA>
2: a b <NA> <NA>
3: a b c d
split character column into multiple columns
If doing in tidyverse/dplyr pipe kinda syntax, you may use separate
from tidyr
in conjunction with stringr::str_count
which does exactly as you require.
df <- data.frame(a = c(1:4), b = c("bla", "word", "otherword", "nice"), c = c("one \n two \n three", "bla \n why \n morebla \n helpme", "bla \n bla", "bla"))
library(tidyverse)
df %>% separate(c, into = paste0('c', seq_len(max(str_count(df$c, '\n')+1))), sep = '\n', fill = 'right')
a b c1 c2 c3 c4
1 1 bla one two three <NA>
2 2 word bla why morebla helpme
3 3 otherword bla bla <NA> <NA>
4 4 nice bla <NA> <NA> <NA>
For doing it on list of data.frames, do it like this
df1 <- data.frame(a = c(1:4), b = c("bla", "word", "otherword", "nice"), c = c("one \n two \n three", "bla \n why \n morebla \n helpme", "bla \n bla", "bla"))
df2 <- data.frame(a = c(1:4), b = c("bla", "word", "otherword", "nice"), c = c("one \n two \n three", "bla \n why \n morebla \n helpme", "bla \n bla \n ghfdghf \n hdhdh \n hjgfj \n td", "bla"))
map(list(df1, df2), ~.x %>% separate(c, into = paste0('c', seq_len(max(str_count(.x$c, '\n')+1))), sep = '\n', fill = 'right'))
[[1]]
a b c1 c2 c3 c4
1 1 bla one two three <NA>
2 2 word bla why morebla helpme
3 3 otherword bla bla <NA> <NA>
4 4 nice bla <NA> <NA> <NA>
[[2]]
a b c1 c2 c3 c4 c5 c6
1 1 bla one two three <NA> <NA> <NA>
2 2 word bla why morebla helpme <NA> <NA>
3 3 otherword bla bla ghfdghf hdhdh hjgfj td
4 4 nice bla <NA> <NA> <NA> <NA> <NA>
Further Edit in view of revised question
- Use
map_dfr
instead
map_dfr(list(df1, df2), ~.x %>% separate(c, into = paste0('c', seq_len(max(str_count(.x$c, '\n')+1))), sep = '\n', fill = 'right'))
a b c1 c2 c3 c4 c5 c6
1 1 bla one two three <NA> <NA> <NA>
2 2 word bla why morebla helpme <NA> <NA>
3 3 otherword bla bla <NA> <NA> <NA> <NA>
4 4 nice bla <NA> <NA> <NA> <NA> <NA>
5 1 bla one two three <NA> <NA> <NA>
6 2 word bla why morebla helpme <NA> <NA>
7 3 otherword bla bla ghfdghf hdhdh hjgfj td
8 4 nice bla <NA> <NA> <NA> <NA> <NA>
- But I cannot see a reason why doing it on separate items of list and then r-binding instead of first r-binding and then simply doing it without
map*
df1 %>% rbind(df2) %>% separate(c, into = paste0('c', seq_len(max(str_count(.$c, '\n')+1))), sep = '\n', fill = 'right')
a b c1 c2 c3 c4 c5 c6
1 1 bla one two three <NA> <NA> <NA>
2 2 word bla why morebla helpme <NA> <NA>
3 3 otherword bla bla <NA> <NA> <NA> <NA>
4 4 nice bla <NA> <NA> <NA> <NA> <NA>
5 1 bla one two three <NA> <NA> <NA>
6 2 word bla why morebla helpme <NA> <NA>
7 3 otherword bla bla ghfdghf hdhdh hjgfj td
8 4 nice bla <NA> <NA> <NA> <NA> <NA>
Splitting a single column into multiple columns in R
A possible solution, based on tidyverse
:
library(tidyverse)
df %>%
filter(table != "_________________________________________________" ) %>%
mutate(table = str_trim(table)) %>%
separate(table, sep = "\\s+(?=\\d+)",
into = c("Characteristic", "Urban", "Rural", "Total"), fill = "right") %>%
filter(Characteristic != "") %>%
slice(-1)
#> # A tibble: 54 × 4
#> Characteristic Urban Rural Total
#> <chr> <chr> <chr> <chr>
#> 1 Electricity <NA> <NA> <NA>
#> 2 Yes 99.8 94.4 98.9
#> 3 No 0.2 5.6 1.1
#> 4 Total 100.0 100.0 100.0
#> 5 Source of drinking water <NA> <NA> <NA>
#> 6 Piped into residence 97.1 81.4 94.4
#> 7 Public tap 0.0 0.3 0.1
#> 8 Well in residence 1.1 3.7 1.6
#> 9 Public well 0.0 0.4 0.1
#> 10 Spring 0.0 2.3 0.4
#> # … with 44 more rows
R - Split one column into two when the divider is a dot
something like this?
df1 %>% separate(Sequence.Name, into = c("Col1", "Col2"))
Col1 Col2 var1
1 2 1 1
2 2 1 1
3 2 1 0
Split one column into two, retain original value if there aren't two values
This is a good use case for dplyr::coalesce
, which (akin to the SQL function it's named for) returns the first non-NA element from a set of vectors.
library(dplyr)
library(tidyr)
data %>%
separate(GeneLocation, c('Start_Position', 'Stop_Position')) %>%
mutate(Stop_Position = coalesce(Stop_Position, Start_Position))
Split a column into 2 columns with . separator R
We can get the data in long format and then use separate
library(dplyr)
library(tidyr)
pivot_longer(the_data, cols = everything()) %>%
separate(value, into = c('alpha', 'beta'), sep = "\\.") %>%
select(-name)
# A tibble: 4 x 2
# alpha beta
# <chr> <chr>
#1 a 1
#2 b 2
#3 c 3
#4 d 4
Using base R, we can split the unlisted string on "."
, convert it into two column dataframe and add names to it.
setNames(do.call(rbind.data.frame, strsplit(unlist(the_data), '\\.')),
c('alpha', 'beta'))
Related Topics
Print R-Squared for All of the Models Fit with Lmlist
Pivot_Wider, Count Number of Occurrences
How to Reverse Legend (Labels and Color) So High Value Starts at Bottom
How to Transpose a Tibble() in R
Draw a Trend Line Using Ggplot
Produce a Table Spanning Multiple Pages Using Kable()
How to Create an Infix %Between% Operator
Change a Column from Birth Date to Age in R
Extract Name of Data.Frame in R as Character
Replace a Subset of a Data Frame with Dplyr Join Operations
Intersecting Points and Polygons in R
R Plots: How to Draw a Border, Shadow or Buffer Around Text Labels
R: Further Subset a Selection Using the Pipe %>% and Placeholder
Calculate Row Sum and Product in Data.Frame