Separating a Column in R

Split data frame string column into multiple columns

Use stringr::str_split_fixed, which splits each string into a fixed number of pieces and returns a character matrix:

library(stringr)
str_split_fixed(before$type, "_and_", 2)
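As a minimal sketch of how the resulting matrix might be attached back to the data frame: the sample `before` data and the column names `type_1` and `type_2` below are illustrative assumptions, not taken from the original question.

```r
library(stringr)

# Hypothetical sample data; only the "_and_" separator comes from the answer above
before <- data.frame(
  attr = c(1, 30, 4, 6),
  type = c("foo_and_bar", "foo_and_bar_2", "foo_and_bar", "foo_and_bar_2"),
  stringsAsFactors = FALSE
)

# str_split_fixed() returns a character matrix with exactly n = 2 columns
parts <- str_split_fixed(before$type, "_and_", 2)
colnames(parts) <- c("type_1", "type_2")  # assumed names for the new columns

# Bind the new columns next to the remaining original column
after <- cbind(before["attr"], parts, stringsAsFactors = FALSE)
```

Because the number of pieces is fixed at 2, any further `_and_` in a value would stay inside the second piece rather than produce a third column.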

Splitting a single column into multiple columns in R

A possible solution, based on tidyverse:

library(tidyverse)

df %>%
  filter(table != "_________________________________________________" ) %>%
  mutate(table = str_trim(table)) %>%
  separate(table, sep = "\\s+(?=\\d+)",
           into = c("Characteristic", "Urban", "Rural", "Total"), fill = "right") %>%
  filter(Characteristic != "") %>%
  slice(-1)

#> # A tibble: 54 × 4
#>    Characteristic           Urban Rural Total
#>    <chr>                    <chr> <chr> <chr>
#>  1 Electricity              <NA>  <NA>  <NA>
#>  2 Yes                      99.8  94.4  98.9
#>  3 No                       0.2   5.6   1.1
#>  4 Total                    100.0 100.0 100.0
#>  5 Source of drinking water <NA>  <NA>  <NA>
#>  6 Piped into residence     97.1  81.4  94.4
#>  7 Public tap               0.0   0.3   0.1
#>  8 Well in residence        1.1   3.7   1.6
#>  9 Public well              0.0   0.4   0.1
#> 10 Spring                   0.0   2.3   0.4
#> # … with 44 more rows
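The key piece of this answer is the separator "\\s+(?=\\d+)": it matches whitespace only when a digit follows, so multi-word labels stay in one piece while the numbers are split off. A small sketch of just that step, using a few made-up rows that imitate the table layout (the sample `df` here is an assumption, not the original data):

```r
library(tidyr)

# Hypothetical rows mimicking the layout of the scraped table
df <- data.frame(
  table = c("Electricity",
            "Yes 99.8 94.4 98.9",
            "Piped into residence 97.1 81.4 94.4"),
  stringsAsFactors = FALSE
)

# Split only at whitespace that is followed by a digit (lookahead);
# fill = "right" pads rows that have fewer pieces with NA
out <- separate(
  df, table,
  sep  = "\\s+(?=\\d+)",
  into = c("Characteristic", "Urban", "Rural", "Total"),
  fill = "right"
)
```

Header-like rows such as "Electricity" contain no digits, so they end up with NA in the three numeric columns, as in the output above.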

How to split a column into multiple (non equal) columns in R

We could use cSplit from splitstackshape

library(splitstackshape)
cSplit(DF, "Col1",",")

Output

   Col1_1 Col1_2 Col1_3 Col1_4
1:      a      b      c   <NA>
2:      a      b   <NA>   <NA>
3:      a      b      c      d
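If adding a package is not an option, roughly the same result can be sketched in base R. This is a swapped-in alternative to cSplit, not what cSplit does internally, and the sample `DF` below is an assumption reconstructed from the output shown above.

```r
# Hypothetical input matching the output above
DF <- data.frame(Col1 = c("a,b,c", "a,b", "a,b,c,d"), stringsAsFactors = FALSE)

# Split every string on the comma; pieces have unequal lengths
parts <- strsplit(DF$Col1, ",", fixed = TRUE)
width <- max(lengths(parts))

# Pad each piece vector with NA up to the longest one
padded <- lapply(parts, function(p) {
  length(p) <- width
  p
})

# Stack the padded vectors row-wise and name the columns like cSplit does
out <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
names(out) <- paste0("Col1_", seq_len(width))
```

Unlike cSplit, this sketch returns a plain data frame of character columns and performs no type conversion.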

How to split up a column of a dataframe into new columns in R?

With tidyverse, we could create a new group every time c appears in the x column, and then pivot the data wide. Duplicate column names are generally discouraged, so I created sequentially numbered column names (c_1, c_2, ...) instead.

library(tidyverse)

results <- df %>%
  group_by(idx = cumsum(x == "c")) %>%
  filter(x != "c") %>%
  mutate(rn = row_number()) %>%
  pivot_wider(names_from = idx, values_from = x, names_prefix = "c_") %>%
  select(-rn)

Output

  c_1   c_2   c_3
  <chr> <chr> <chr>
1 a     b     d
2 a     b     d
3 a     b     d
4 a     b     d
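To see what the cumsum grouping does, here is a small base R sketch on made-up data (the sample `df` is an assumption; the original input is not shown in the answer). Each "c" row acts as a header, and cumsum counts how many headers have been seen so far, which gives every row a block number.

```r
# Hypothetical input: a "c" header row followed by the values of that block
df <- data.frame(
  x = c("c", "a", "a", "c", "b", "b", "c", "d", "d"),
  stringsAsFactors = FALSE
)

# Running count of header rows: 1 1 1 2 2 2 3 3 3
idx <- cumsum(df$x == "c")

# Drop the header rows, then split the remaining values by block number
keep   <- df$x != "c"
blocks <- split(df$x[keep], idx[keep])

# Name the blocks c_1, c_2, ... and place them side by side
names(blocks) <- paste0("c_", names(blocks))
wide <- as.data.frame(blocks, stringsAsFactors = FALSE)
```

This sketch assumes every block has the same number of values; blocks of unequal length would need padding before they can be combined into columns.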

However, if you really want duplicate names, then we could add on set_names:

purrr::set_names(results, "c")

  c     c     c
  <chr> <chr> <chr>
1 a     b     d
2 a     b     d
3 a     b     d
4 a     b     d

Or in base R, we could create the grouping with cumsum, then split those groups, then bind back together with cbind. Then, we remove the first row that contains the c characters.

names(df) <- "c"
do.call(cbind, split(df, cumsum(df$c == "c")))[-1,]

# c c c
#2 a b d
#3 a b d
#4 a b d
#5 a b d

Split the string in the rows to separate columns in R

You could use separate_rows and pivot_wider:

library(tidyverse)

M %>%
  separate_rows(mapped) %>%
  pivot_wider(names_from = mapped, values_from = mapped) %>%
  relocate(order(colnames(.)))

# A tibble: 3 x 5
  name  X1    X2    X3    X4
  <chr> <chr> <chr> <chr> <chr>
1 A     X1    NA    X3    X4
2 B     NA    X2    NA    X4
3 C     NA    X2    X3    X4

Then, to count the non-missing values in each column, store the result of the pipeline above (here in a hypothetical object named res) and use:

colSums(!is.na(res[, -1]))
# X1 X2 X3 X4
# 1 2 2 3
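Putting the two steps together, a self-contained sketch might look as follows. The sample `M`, the comma separator and the object names `long`, `wide` and `counts` are assumptions chosen to reproduce the output above; the original input is not shown in the answer.

```r
library(tidyr)

# Hypothetical input consistent with the wide table shown above
M <- data.frame(
  name   = c("A", "B", "C"),
  mapped = c("X1,X3,X4", "X2,X4", "X2,X3,X4"),
  stringsAsFactors = FALSE
)

# One row per (name, value) pair
long <- separate_rows(M, mapped, sep = ",")

# One column per distinct value; cells hold the value itself or NA
wide <- pivot_wider(long, names_from = mapped, values_from = mapped)

# Keep "name" first and sort the value columns alphabetically
wide <- wide[, c("name", sort(setdiff(names(wide), "name")))]

# Count the non-missing entries in every value column
counts <- colSums(!is.na(wide[, -1]))
```

Ordering the columns explicitly, with `name` first, avoids depending on how the current locale sorts lower-case and upper-case names.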

How to split a dataframe column into two columns

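In base R, read.table can split on the colon. Setting dec = "/" makes it read the slash as a decimal mark, so "1/0" becomes 1.0:

read.table(text = df$X1, sep = ":", fill = TRUE, header = FALSE, dec = "/")
   V1    V2
1  NA
2 1.0  0.82
3 1.1 1.995
4 0.1 1.146
5  NA
6 1.1 1.995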

To convert each column to its appropriate data type:

type.convert(read.table(text = df$X1, sep = ":", fill = TRUE, header = FALSE, dec = "/"), as.is = TRUE)
   V1    V2
1  NA    NA
2 1.0 0.820
3 1.1 1.995
4 0.1 1.146
5  NA    NA
6 1.1 1.995


Data

df <- structure(list(X1 = c(NA, "1/0:0.82", "1/1:1.995", "0/1:1.146", NA,
"1/1:1.995")), class = "data.frame", row.names = c(NA, -6L))
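For convenience, the two read.table steps can be run end to end as in the sketch below. The names `raw` and `out` are illustrative. Note that V2 only becomes numeric after type.convert, because with dec = "/" a value such as "0.82" is not recognised as a number during the initial read.

```r
# Same sample data as in the answer
df <- data.frame(
  X1 = c(NA, "1/0:0.82", "1/1:1.995", "0/1:1.146", NA, "1/1:1.995"),
  stringsAsFactors = FALSE
)

# Split on ":"; dec = "/" turns "1/0" into the number 1.0
raw <- read.table(text = df$X1, sep = ":", fill = TRUE,
                  header = FALSE, dec = "/")

# Second pass with the default decimal mark converts V2 to numeric
out <- type.convert(raw, as.is = TRUE)
```

Keep in mind that V1 is then a numeric encoding of the slash-separated pair (for example "1/1" becomes 1.1), which may or may not be what is wanted downstream.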

