tidyr separate only first n instances
You need the extra argument with the "merge" option. It limits the number of splits to the number of new columns you define, so anything left over stays together in the last column.
separate(df, V1, c("V1", "V2", "V3", "V4"), extra = "merge")
V1 V2 V3 V4
1 Value is the best_one
2 This is the prettiest_thing_I've_ever_seen
3 Here is the next_example_of_what_I_want
tidyr separate only last n instances
While looking through similar, already answered questions, I came across tidyr::extract, which can be used to do the job:
tmp2 %>% extract(
  "varTreatName", c("varName", "treatment", "canopyPosition"),
  regex = "(.*)_([^_]+)_([^_]+)$")
yielding the expected result:
varName treatment canopyPosition
1 resp Nadd belowCanopy
2 resp NPadd belowCanopy
3 resp_sd Nadd belowCanopy
4 resp_sd NPadd belowCanopy
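To see why this keeps resp_sd together: the leading (.*) is greedy, so it takes everything up to the second-to-last underscore, while each ([^_]+) can only match a single piece without underscores. A minimal sketch, assuming input strings reconstructed from the result above (the original tmp2 is not shown, so this data is hypothetical):

```r
library(tidyr)

# Hypothetical input, reconstructed from the result shown above
tmp2 <- data.frame(
  varTreatName = c("resp_Nadd_belowCanopy", "resp_sd_NPadd_belowCanopy")
)

# The last two groups each take one underscore-free piece;
# the greedy first group keeps whatever is left, underscores included
res <- tidyr::extract(
  tmp2, varTreatName, c("varName", "treatment", "canopyPosition"),
  regex = "(.*)_([^_]+)_([^_]+)$"
)

res$varName  # "resp" "resp_sd"
```

Calling it as tidyr::extract avoids picking up magrittr::extract by accident if both packages are attached.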
Applying tidyr separate only to specific rows
Another approach:
cols_to_split <- c('here_do')
clean_df <- df %>%
  filter(text %in% cols_to_split) %>%
  tidyr::separate(text, into = c("first", "sec"), sep = "_", remove = FALSE) %>%
  bind_rows(filter(df, !text %in% cols_to_split))
# var_a var_b text first sec
#1 b 7 here_do here do
#2 a 26 foo_bla <NA> <NA>
#3 c 23 oh_yes <NA> <NA>
#4 d 2 baa <NA> <NA>
#5 e 67 land <NA> <NA>
If you want the 'first' column to keep the original text for the rows that were not split, you can use:
clean_df <- df %>%
  filter(text %in% cols_to_split) %>%
  tidyr::separate(text, into = c("first", "sec"), sep = "_", remove = FALSE) %>%
  bind_rows(filter(df, !text %in% cols_to_split)) %>%
  mutate(first = ifelse(is.na(first), as.character(text), first))
# var_a var_b text first sec
#1 b 7 here_do here do
#2 a 26 foo_bla foo_bla <NA>
#3 c 23 oh_yes oh_yes <NA>
#4 d 2 baa baa <NA>
#5 e 67 land land <NA>
Specify separator character in separate function from package tidyr
Here is a way to solve the problem.
d %>% separate(var, into = c("newcol1", "newcol2"), sep = "_(?=.*_)")
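Here, the regex _(?=.*_) means: an _ followed by a string that includes another _. The lookahead does not consume anything, so only the _ itself is removed, and the last _ in each value is never used as a separator.
The result: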
# A tibble: 5 x 2
newcol1 newcol2
<chr> <chr>
1 A 1_a
2 B 2_b
3 C 3_c
4 D 4_d
5 E 5_e
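A self-contained version of the same call, as a sketch. The input d is not shown in the answer, so the values below are an assumption reconstructed from the result:

```r
library(tidyr)

# Hypothetical input, reconstructed from the result shown above
d <- data.frame(var = c("A_1_a", "B_2_b"))

# Split only at an underscore that still has another underscore after it
res <- separate(d, var, into = c("newcol1", "newcol2"), sep = "_(?=.*_)")

res$newcol1  # "A" "B"
res$newcol2  # "1_a" "2_b"
```

Note that a value with three or more underscores would be split at every underscore except the last, so this pattern fits best when each value has exactly two.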
Using regex and tidyr in R to split column variable on first instance of match
You need to set the extra parameter to "merge":
library(tidyr)
df %>% separate(date, c("day", "date"), extra = "merge")
# game day date
#1 1 Monday Apr 3
#2 2 Tuesday Apr 4
#3 3 Wednesday Apr 5
#4 4 Thursday Apr 6
#5 5 Friday Apr 7
#6 6 Saturday Apr 8
How to split a dataframe column by the first instance of a character in its values
Another option might be to use tidyr::separate:
separate(x, a, into = c("b", "c"), sep = "_", remove = FALSE, extra = "merge")
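Since the answer does not show the data, here is a minimal sketch of what that call does. The data frame x and its column a are hypothetical stand-ins:

```r
library(tidyr)

# Hypothetical data: column `a` holds underscore-delimited strings
x <- data.frame(a = c("one_two_three", "alpha_beta"))

# Split at the first "_" only; remove = FALSE keeps the original column
res <- separate(x, a, into = c("b", "c"), sep = "_",
                remove = FALSE, extra = "merge")

names(res)  # "a" "b" "c"
res$b       # "one" "alpha"
res$c       # "two_three" "beta"
```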
Separate column into three columns with grouping
Use the extra argument:
# dummy data
df1 <- data.frame(x = c(
"some name1",
"justOneName",
"some three name",
"Abdullaeva Mehseti Nuraddin Kyzy"))
library(tidyr)
library(dplyr)
df1 %>%
separate(x, c("a1", "a2", "a3"), extra = "merge")
# a1 a2 a3
# 1 some name1 <NA>
# 2 justOneName <NA> <NA>
# 3 some three name
# 4 Abdullaeva Mehseti Nuraddin Kyzy
# Warning message:
# Too few values at 2 locations: 1, 2
From manual:
extra
If sep is a character vector, this controls what happens when
there are too many pieces. There are three valid options:
- "warn" (the default): emit a warning and drop extra values.
- "drop": drop any extra values without a warning.
- "merge": only splits at most length(into) times
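The difference between "drop" and "merge" is easiest to see side by side. A small sketch on two of the names from the dummy data above, with an explicit sep = " " added here for clarity (the original call relies on the default separator):

```r
library(tidyr)

# Two of the values from the dummy data above
df1 <- data.frame(x = c("some three name",
                        "Abdullaeva Mehseti Nuraddin Kyzy"))

# "drop" discards the fourth piece, "merge" keeps it in the last column
dropped <- separate(df1, x, c("a1", "a2", "a3"), sep = " ", extra = "drop")
merged  <- separate(df1, x, c("a1", "a2", "a3"), sep = " ", extra = "merge")

dropped$a3  # "name" "Nuraddin"
merged$a3   # "name" "Nuraddin Kyzy"
```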