R: gsub, pattern = vector and replacement = vector
Lots of solutions already; here is one more:
The qdap package:
library(qdap)
names(x1) <- mgsub(a,b,names(x1))
Using gsub with pattern and x as vector
If sub should be applied to corresponding elements of 'X' and 'Y', then Map can be used:
db[!is.na(Y), Z := unlist(Map(sub, pattern = Y, X, replacement = ""))]
db
#              X    Y       Z
#1:     Joe Snow  Joe    Snow
#2: Sony Ericson Sony Ericson
#3:    JP Morgan   JP  Morgan
#4:     KATAKURI   NA      NA
Or another option is the map/pmap functions from purrr:
library(purrr)
library(dplyr)
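db %>%
  set_names(c('x', 'pattern')) %>%        # rename columns to match sub()'s argument names
  pmap_chr(., sub, replacement = '') %>%  # call sub() once per row
  trimws %>%                              # drop the leftover whitespace
  bind_cols(db, z = .)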
#              X    Y       z
#1:     Joe Snow  Joe    Snow
#2: Sony Ericson Sony Ericson
#3:    JP Morgan   JP  Morgan
#4:     KATAKURI   NA      NA
R how to replace/gsub a vector of values by another vector of values in a datatable
You can try lubridate::parse_date_time(), which takes a vector of candidate formats to attempt in the conversion:
library(lubridate)
library(data.table)
MWE[, date := parse_date_time(date, orders = c("bY","qY", "Y"))]
date value
1: 2020-01-01 -0.4948354
2: 2020-02-01 1.0227036
3: 2020-01-01 2.6285688
4: 2020-01-01 1.9158595
replace string in R giving a vector of patterns and vector of replacements
Try
library(qdap)
mgsub(c('[%VAR1%]' , '[%VAR2%]'), c('val-1', 'val-2'), tt_ori)
#[1] "I have val-1 and val-2"
data
tt_ori <- 'I have [%VAR1%] and [%VAR2%]'
R: Using gsub to replace a digit matched by pattern (n) with (n-1) in character vector
We can do this easily with gsubfn
library(gsubfn)
gsubfn("([0-9]+)", ~as.numeric(x)-1, chrvector)
#[1] "str97" "v197exdf"
Or for the last digit
gsubfn("([0-9])([^0-9]*)$", ~paste0(as.numeric(x)-1, y), chrvector2)
#[1] "str97" "v197exdf" "v33chr138d"
data
chrvector <- c("str98", "v198exdf")
chrvector2 <- c("str98", "v198exdf", "v33chr139d")
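If installing gsubfn is not an option, the same "replace n with n-1" idea can be sketched in base R with gregexpr and regmatches<-. This is an alternative technique, not the gsubfn approach above; the variable name decremented is only illustrative.

```r
# Same data as in the answer above
chrvector <- c("str98", "v198exdf")

# Locate every run of digits in each string
m <- gregexpr("[0-9]+", chrvector)

# Replace each matched run with its value minus one
decremented <- chrvector
regmatches(decremented, m) <- lapply(
  regmatches(decremented, m),
  function(d) as.character(as.numeric(d) - 1)
)

print(decremented)
```

Strings without any digits are left unchanged, because regmatches returns an empty match vector for them.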
gsub() not working if I reference a column using a character vector?
gsub is being given a vector of strings, and it does what it knows: it works on the strings. It doesn't know that they should be an indirect reference. (Nothing will know that it should be indirect.)
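A small base R sketch of the difference, using a made-up data frame (df, on_name and on_column are illustrative names, not from the question):

```r
# Hypothetical data, much smaller than the question's table
df <- data.frame(
  Pref_1 = c("UN1", "Human Rights Council (HRC)", "UN1"),
  stringsAsFactors = FALSE
)
cols <- "Pref_1"

# gsub() sees only the string "Pref_1"; nothing matches, so it comes back unchanged
on_name <- gsub("UN1", "A", cols)

# Looking the column up explicitly hands gsub() the column contents
on_column <- gsub("UN1", "A", df[[cols]])

print(on_name)
print(on_column)
```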
You have two options:
The canonical way in data.table for this is likely to use .SDcols.
preferences[, (cols) := lapply(.SD, gsub, pattern = "UN1", replacement = "A"), .SDcols = cols]
preferences
# Pref_1
# <char>
# 1: A
# 2: Food and Agriculture Organization (F...
# 3: United Nations Educational, Scientif...
# 4: United Nations Development Programme...
# 5: Commission on Narcotic Drugs (CND)
# 6: Commission on Narcotic Drugs (CND)
# 7: Human Rights Council (HRC)
# 8: A
# 9: Human Rights Council (HRC)
# 10: A
This does two things: (i) the use of .SDcols for iterating over a dynamic set of columns is preferred and faster, and allows programmatic determination of those columns (what you need); (ii) using lapply allows you to do this to one or more columns. If you know you'll always do just one column, this still works well with very little overhead.
You can get/mget the data. This is the way to tell something to grab the contents of a variable identified in a string vector.
If you know that you will always have exactly one column, then you can use get:
preferences[, (cols) := gsub(get(cols), pattern = "UN1", replacement = "A")]
If there is even a chance that you'll have more than one, I strongly recommend mget. (Even if you think you'll always have one, this is still safe.)
preferences[, (cols) := lapply(mget(cols), gsub, pattern = "UN1", replacement = "A")]
Data
preferences <- setDT(structure(list(Pref_1 = c("UN1", "Food and Agriculture Organization (FAO)", "United Nations Educational, Scientific and Cultural Organization (UNESCO)", "United Nations Development Programme (UNDP)", "Commission on Narcotic Drugs (CND)", "Commission on Narcotic Drugs (CND)", "Human Rights Council (HRC)", "UN1", "Human Rights Council (HRC)", "UN1")), class = c("data.table", "data.frame"), row.names = c(NA, -10L)))
cols <- "Pref_1"
R, str_replace, gsub, how to replace a vector of characters for another vector of characters?
An option would be to paste the individual characters into a pattern string wrapped in a negated character class ("[^...]"), so that they are evaluated literally (in case there are metacharacters), and then replace everything outside that set with blank ("") in gsub:
pat <- paste0("[^", gsub("\\s{2,}", " ", paste(permitted_seq_chars, collapse="")), "]")
gsub(pat, "", test_col$sequence)
#[1] "ATGCRYSW" "ATGCRYSW" "ATGCRYSW"
#[4] "ATGCRYSWATGCRYSW" "ATGCRYSW"
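Since the question's data is not shown here, the following is a minimal sketch with made-up inputs (permitted, sequences and cleaned are assumed names) showing how the negated character class is built and applied:

```r
# Hypothetical set of permitted characters and some test sequences
permitted <- c("A", "T", "G", "C", "R", "Y", "S", "W")
sequences <- c("ATGC-RYSW", "atgcATGC", "A T G C")

# Collapse the characters into a negated class: "[^ATGCRYSW]"
pat <- paste0("[^", paste(permitted, collapse = ""), "]")

# Remove every character that is not in the permitted set
cleaned <- gsub(pat, "", sequences)

print(pat)
print(cleaned)
```

Most regex metacharacters lose their special meaning inside square brackets, but a few (such as ^, ], - and \) still need care if they can appear in the permitted set.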
Match and replace multiple strings in a vector of text without looping in R
1) gsubfn The gsubfn function in the gsubfn package is like gsub except the replacement can be a character string, list, function or proto object. If it's a list, it will replace each matched string with the component of the list whose name equals the matched string.
library(gsubfn)
gsubfn("\\S+", setNames(as.list(b), a), c)
giving:
[1] "i am going to the party" "he would go too"
2) gsub For a solution with no packages try this loop:
cc <- c
for(i in seq_along(a)) cc <- gsub(a[i], b[i], cc, fixed = TRUE)
giving:
> cc
[1] "i am going to the party" "he would go too"
R: pass a vector of strings to replace all instances within a string
We can use gsubfn if we need to replace with numbers.
library(gsubfn)
gsubfn("\\w+", as.list(setNames(1:3, numlist)), mystring)
#[1] "I have 1 cat, 2 dogs and 3 rabbits"
EDIT: I thought that we needed to replace with the numbers that correspond to the words in 'numlist'. But if we need to replace with the ##NUMBER## flag, one option is mgsub:
library(qdap)
mgsub(numlist, "##NUMBER##", mystring)
#[1] "I have ##NUMBER## cat, ##NUMBER## dogs and ##NUMBER## rabbits"
Cumulative application of a gsub sequence in R
The issue with mapply is that it is looking at a fresh copy of the FEN string for each replacement, which is not what you need. I think you can use a Reduce mindset:
(BTW, your pattern for "5" has 6 ones; this fixes that.)
pattern <- c("11111111","1111111","111111","11111","1111","111","11")
Reduce(function(txt, ptn) gsub(ptn, as.character(nchar(ptn)), txt), pattern, init=FENCodeToBeChanged)
# [1] "rnbq1rk1/pppp1ppp/1b2pn2/8/2PP4/5NP1/PP2PPBP/RNBQ1RK1 w KQkq c6 0 2"
To be able to reduce over multiple arguments takes a little bit of work, usually iterating along a list of pairs or such. With this problem, it's easy enough to replace a pattern with its length instead of including another vector of strings, ergo nchar(ptn). (Technically as.character(.) is not required as gsub will implicitly convert it, but I wanted to be a bit "declarative" in that that's what I want. There are many tools in R that are less deterministic in this way (e.g., ifelse). Style.)
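For the more general case where the replacements cannot be derived from the patterns, one way to "iterate along pairs" is to Reduce over the indices of two parallel vectors. The helper name replace_all and the sample data below are assumptions for illustration only:

```r
# Hypothetical helper: apply each pattern/replacement pair in turn,
# feeding the result of one gsub() call into the next
replace_all <- function(x, patterns, replacements) {
  stopifnot(length(patterns) == length(replacements))
  Reduce(
    function(acc, i) gsub(patterns[i], replacements[i], acc, fixed = TRUE),
    seq_along(patterns),
    init = x
  )
}

# Illustrative inputs
txt <- "I have [%VAR1%] and [%VAR2%]"
result <- replace_all(txt, c("[%VAR1%]", "[%VAR2%]"), c("val-1", "val-2"))
print(result)
```

fixed = TRUE treats the patterns as literal strings; drop it if the patterns are meant to be regular expressions.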