How to split column data with no separator in R
We can do this with extract from tidyr, which splits one column into several using one regex capture group per new column.
library(tidyr)
extract(df1, date, into = c("Year", "Month", "Day"), "(.{4})(.{2})(.{2})")
Another option is base R: insert commas at the fixed positions with sub, then parse the result with read.csv.
cbind(df1, read.csv(text = sub("(.{4})(.{2})(.{2})", "\\1,\\2,\\3", df1$date),
                    header = FALSE, col.names = c("Year", "Month", "Day")))
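If a regex feels heavier than needed, the fixed positions can also be cut directly with base R's substr. This is only a minimal sketch: df1 below is a made-up stand-in, since the asker's data is not shown. The pieces are kept as character so leading zeros survive, whereas read.csv would read "01" as 1 unless colClasses = "character" is passed.

```r
# Hypothetical sample data; the real df1 is not shown in the question.
df1 <- data.frame(date = c("19810101", "19811105"), value = c(20, 5),
                  stringsAsFactors = FALSE)

# as.character() guards against the column being stored as a number.
d <- as.character(df1$date)

# substr(x, start, stop) takes 1-based, inclusive character positions.
df1$Year  <- substr(d, 1, 4)
df1$Month <- substr(d, 5, 6)
df1$Day   <- substr(d, 7, 8)

df1
```

Wrap the new columns in as.integer() if numbers are wanted rather than strings.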
String split based on index
One option is separate from tidyr (loaded as part of the tidyverse), passing the split positions as a numeric vector.
library(tidyverse)
tibble(col1 = str) %>%
  separate(col1, into = paste0("col", 0:7), sep = c(4, 8, 16, 20, 26, 30, 32)) %>%
  select(-1)
# A tibble: 6 x 7
# col1 col2 col3 col4 col5 col6 col7
# <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#1 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
#2 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
#3 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
#4 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
#5 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
#6 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
Another option, without any packages, is base R: insert a delimiter at each position with sub, then read the result with read.csv.
read.csv(text = sub("^.{4}(.{4})(.{8})(.{4})(.{6})(.{4})(.{2})(.*)",
                    "\\1,\\2,\\3,\\4,\\5,\\6,\\7", str),
         header = FALSE, stringsAsFactors = FALSE)
# V1 V2 V3 V4 V5 V6 V7
#1 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
#2 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
#3 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
#4 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
#5 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
#6 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
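A third way to express the same index-based split is base R's substring, which is vectorised over start and stop positions. The sketch below is an illustration, not the original answer: the input str is not shown in the question, so the string is rebuilt from the printed output with "XXXX" as a placeholder for the unknown four-character prefix.

```r
# Hypothetical input: "XXXX" stands in for the unknown 4-character prefix
# that both answers above discard.
str <- rep("XXXXAA00100BC300AA01111000AA0299F40400F4053DF40C0000F4030000F40680F4077", 2)

# Cut points from the separate() call: each field ends at one of these positions.
ends   <- c(4, 8, 16, 20, 26, 30, 32)
starts <- c(1, ends + 1)              # 1, 5, 9, 17, 21, 27, 31, 33
stops  <- c(ends, max(nchar(str)))    # last field runs to the end of the string

# substring() is vectorised over first/last, so each string yields 8 pieces.
pieces <- t(sapply(str, substring, first = starts, last = stops, USE.NAMES = FALSE))

# Drop the prefix column and name the rest like the tidyr output.
out <- as.data.frame(pieces[, -1, drop = FALSE], stringsAsFactors = FALSE)
names(out) <- paste0("col", 1:7)
out
```

Unlike the read.csv route, every column stays character here, so fields such as 111000 and 99 are not converted to numbers.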
Split a string in a text file
Input
$ cat f
19810101 20
19810102 31
19810103 1
19810701 1
19811105 5
Output
$ awk '{print substr($1,1,4),substr($1,5,2),substr($1,7),$2}' f
1981 01 01 20
1981 01 02 31
1981 01 03 1
1981 07 01 1
1981 11 05 5
For CSV
$ awk '{print substr($1,1,4),substr($1,5,2),substr($1,7),$2}' OFS=, f
1981,01,01,20
1981,01,02,31
1981,01,03,1
1981,07,01,1
1981,11,05,5
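For readers who want to stay in R, here is a rough counterpart of the awk command, as a sketch only. The input is inlined with text = so the block runs on its own; with a real file, pass the file name (f in the example above) as the first argument instead. colClasses = "character" is added so that "01" is not read as 1.

```r
# Same five lines as the sample file f, inlined so the sketch is self-contained.
txt <- "19810101 20
19810102 31
19810103 1
19810701 1
19811105 5"

# Read both fields as character so nothing is converted to a number.
x <- read.table(text = txt, colClasses = "character",
                col.names = c("date", "value"))

# Same positions as the awk substr() calls: 1-4, 5-6, 7 to the end.
res <- data.frame(Year  = substr(x$date, 1, 4),
                  Month = substr(x$date, 5, 6),
                  Day   = substring(x$date, 7),
                  value = x$value,
                  stringsAsFactors = FALSE)

# Comma-separated output without header or quotes, like awk with OFS=,
write.table(res, sep = ",", quote = FALSE, row.names = FALSE, col.names = FALSE)
```

Changing sep to " " gives the space-separated variant.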