Split data frame string column into multiple columns
Use stringr::str_split_fixed
library(stringr)
str_split_fixed(before$type, "_and_", 2)
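As a minimal sketch of how the resulting matrix can be attached back to the data frame: the contents of before below are invented for illustration (the original data is not shown here), and the column names type_1 and type_2 are hypothetical.

```r
library(stringr)

# Hypothetical input; only the column name `type` comes from the answer above.
before <- data.frame(
  attr = c(1, 30, 4, 6),
  type = c("foo_and_bar", "foo_and_bar_2", "foo_and_bar", "foo_and_bar_2"),
  stringsAsFactors = FALSE
)

# str_split_fixed() returns a character matrix with exactly 2 columns here.
parts <- str_split_fixed(before$type, "_and_", 2)

# Build a new data frame from the original column plus the two pieces.
after <- data.frame(
  attr   = before$attr,
  type_1 = parts[, 1],
  type_2 = parts[, 2],
  stringsAsFactors = FALSE
)
after
```

Because the number of pieces is fixed at 2, a value with fewer separators would yield an empty string in the second column rather than NA.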
Splitting a single column into multiple columns in R
A possible solution, based on tidyverse:
library(tidyverse)
df %>%
  filter(table != "_________________________________________________") %>%
  mutate(table = str_trim(table)) %>%
  separate(table, sep = "\\s+(?=\\d+)",
           into = c("Characteristic", "Urban", "Rural", "Total"), fill = "right") %>%
  filter(Characteristic != "") %>%
  slice(-1)
#> # A tibble: 54 × 4
#>    Characteristic           Urban Rural Total
#>    <chr>                    <chr> <chr> <chr>
#>  1 Electricity              <NA>  <NA>  <NA>
#>  2 Yes                      99.8  94.4  98.9
#>  3 No                       0.2   5.6   1.1
#>  4 Total                    100.0 100.0 100.0
#>  5 Source of drinking water <NA>  <NA>  <NA>
#>  6 Piped into residence     97.1  81.4  94.4
#>  7 Public tap               0.0   0.3   0.1
#>  8 Well in residence        1.1   3.7   1.6
#>  9 Public well              0.0   0.4   0.1
#> 10 Spring                   0.0   2.3   0.4
#> # … with 44 more rows
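The separator "\\s+(?=\\d+)" used in the pipeline above splits on whitespace only when it is followed by a digit, which is why multi-word labels stay in one piece. A small sketch of that behavior on a single line, using base R's strsplit with perl = TRUE instead of separate (the sample strings are taken from the output above):

```r
# Whitespace followed by a digit is a split point; whitespace between words is not.
pattern <- "\\s+(?=\\d+)"

pieces <- strsplit("Piped into residence 97.1 81.4 94.4", pattern, perl = TRUE)[[1]]
pieces
# "Piped into residence" "97.1" "81.4" "94.4"

# A section header has no numbers, so it comes back as a single piece.
header <- strsplit("Electricity", pattern, perl = TRUE)[[1]]
header
# "Electricity"
```

One caveat of this pattern: a label that itself contains a number preceded by whitespace would also be split at that point.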
How to split a column into multiple (non equal) columns in R
We could use cSplit from splitstackshape:
library(splitstackshape)
cSplit(DF, "Col1",",")
Output:
   Col1_1 Col1_2 Col1_3 Col1_4
1:      a      b      c   <NA>
2:      a      b   <NA>   <NA>
3:      a      b      c      d
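For readers who prefer not to add a package, the same ragged split can be sketched in base R. This is an illustrative alternative, not how cSplit is implemented; the contents of DF are reconstructed from the output above and are an assumption.

```r
# Assumed input, reconstructed from the printed result.
DF <- data.frame(Col1 = c("a,b,c", "a,b", "a,b,c,d"), stringsAsFactors = FALSE)

# Split each string, then pad the shorter results with NA up to the longest one.
pieces <- strsplit(DF$Col1, ",", fixed = TRUE)
n_cols <- max(lengths(pieces))
padded <- t(vapply(
  pieces,
  function(p) c(p, rep(NA_character_, n_cols - length(p))),
  character(n_cols)
))

# Name the columns in the same Col1_<n> style as the output above.
colnames(padded) <- paste0("Col1_", seq_len(n_cols))
out <- as.data.frame(padded, stringsAsFactors = FALSE)
out
```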
How to split up a column of a dataframe into new columns in R?
With tidyverse, we could create a new group every time c appears in the x column, and then pivot the data to wide format. Duplicate column names are generally discouraged, so I created sequentially numbered column names (c_1, c_2, ...) instead.
library(tidyverse)
results <- df %>%
  group_by(idx = cumsum(x == "c")) %>%
  filter(x != "c") %>%
  mutate(rn = row_number()) %>%
  pivot_wider(names_from = idx, values_from = x, names_prefix = "c_") %>%
  select(-rn)
Output
  c_1   c_2   c_3
  <chr> <chr> <chr>
1 a     b     d
2 a     b     d
3 a     b     d
4 a     b     d
However, if you really want duplicate names, then we could add on set_names:
purrr::set_names(results, "c")
  c     c     c
  <chr> <chr> <chr>
1 a     b     d
2 a     b     d
3 a     b     d
4 a     b     d
Or in base R, we could create the grouping with cumsum, then split those groups, then bind them back together with cbind. Finally, we remove the first row, which contains the c characters.
names(df) <- "c"
do.call(cbind, split(df, cumsum(df$c == "c")))[-1,]
# c c c
#2 a b d
#3 a b d
#4 a b d
#5 a b d
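The grouping step that both versions rely on can be inspected on its own. In this sketch the input df is an assumption, reconstructed from the printed results (a header row "c" followed by four values, repeated three times); the c_ prefix mirrors the tidyverse version.

```r
# Assumed input: each "c" marks the start of a new block of values.
df <- data.frame(
  x = c("c", rep("a", 4), "c", rep("b", 4), "c", rep("d", 4)),
  stringsAsFactors = FALSE
)

# cumsum() over the logical test increments by one at every "c",
# so each block gets its own running group number.
idx <- cumsum(df$x == "c")
idx
# 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3

# Drop the marker rows, then split the remaining values by group
# and bind the groups side by side as columns.
keep <- df$x != "c"
wide <- as.data.frame(
  split(df$x[keep], paste0("c_", idx[keep])),
  stringsAsFactors = FALSE
)
wide
```

This only produces a rectangular result when every block has the same number of values, as in the example data.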
Split the string in the rows to separate columns in R
You could use separate_rows and pivot_wider:
library(tidyverse)
M %>%
  separate_rows(mapped) %>%
  pivot_wider(names_from = mapped, values_from = mapped) %>%
  relocate(order(colnames(.)))
# A tibble: 3 x 5
  name  X1    X2    X3    X4
  <chr> <chr> <chr> <chr> <chr>
1 A     X1    NA    X3    X4
2 B     NA    X2    NA    X4
3 C     NA    X2    X3    X4
Then, to count the number of non-missing values per column (with the pivoted result assigned back to M), use:
colSums(!is.na(M[,-1]))
# X1 X2 X3 X4
# 1 2 2 3
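The same reshaping and count can be sketched in base R, which makes each step explicit. The contents of M and the comma separator are assumptions reconstructed from the printed result; the original input is not shown.

```r
# Assumed input, reconstructed from the output above.
M <- data.frame(
  name   = c("A", "B", "C"),
  mapped = c("X1,X3,X4", "X2,X4", "X2,X3,X4"),
  stringsAsFactors = FALSE
)

# Split each row's string, then collect the distinct values as column names.
pieces <- strsplit(M$mapped, ",", fixed = TRUE)
values <- sort(unique(unlist(pieces)))

# For every value, record it where the row contains it and NA otherwise.
wide <- vapply(
  values,
  function(v) vapply(pieces, function(p) if (v %in% p) v else NA_character_, character(1)),
  character(nrow(M))
)
result <- data.frame(name = M$name, wide, stringsAsFactors = FALSE)
result

# Count the non-missing entries per value column.
counts <- colSums(!is.na(wide))
counts
```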
How to split a dataframe column into two columns
read.table(text = df$X1, sep = ":", fill = TRUE, header = FALSE, dec = "/")
   V1    V2
1  NA
2 1.0  0.82
3 1.1 1.995
4 0.1 1.146
5  NA
6 1.1 1.995
If you want the columns converted to their respective data types:
type.convert(read.table(text = df$X1, sep = ":", fill = TRUE, header = FALSE, dec = "/"), as.is = TRUE)
   V1    V2
1  NA    NA
2 1.0 0.820
3 1.1 1.995
4 0.1 1.146
5  NA    NA
6 1.1 1.995
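Note that dec = "/" makes R read a value such as "1/0" as the number 1.0. If the part before the colon should stay as text instead, a base R sketch along the following lines may be closer to what is wanted. This is an alternative under that assumption, not the approach used above; the column names V1 and V2 simply mirror the read.table defaults, and df is the sample data from this answer.

```r
df <- structure(list(X1 = c(NA, "1/0:0.82", "1/1:1.995", "0/1:1.146", NA,
                            "1/1:1.995")), class = "data.frame", row.names = c(NA, -6L))

# Split on the colon; an NA input stays a single NA element.
parts <- strsplit(df$X1, ":", fixed = TRUE)

out <- data.frame(
  # First piece kept as a string, e.g. "1/0".
  V1 = vapply(parts, `[`, character(1), 1),
  # Second piece converted to numeric; missing pieces become NA.
  V2 = as.numeric(vapply(parts, `[`, character(1), 2)),
  stringsAsFactors = FALSE
)
out
```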
Data:
df <- structure(list(X1 = c(NA, "1/0:0.82", "1/1:1.995", "0/1:1.146", NA,
                            "1/1:1.995")), class = "data.frame", row.names = c(NA, -6L))