R: Gsub, Pattern = Vector and Replacement = Vector

Lots of solutions already; here is one more:

The qdap package:

library(qdap)
names(x1) <- mgsub(a,b,names(x1))

Using gsub with pattern and x as vector

If each element of 'Y' should be matched against the corresponding element of 'X' in sub, then Map can be used:

db[!is.na(Y), Z := unlist(Map(sub, pattern = Y, X, replacement = ""))]
db
#              X    Y        Z
#1:     Joe Snow  Joe     Snow
#2: Sony Ericson Sony  Ericson
#3:    JP Morgan   JP   Morgan
#4:     KATAKURI   NA       NA

Another option is the map/pmap functions from purrr:

library(purrr)
library(dplyr)
db %>%
  set_names(c('x', 'pattern')) %>%
  pmap_chr(., sub, replacement = '') %>%
  trimws %>%
  bind_cols(db, z = .)
#              X    Y       z
#1:     Joe Snow  Joe    Snow
#2: Sony Ericson Sony Ericson
#3:    JP Morgan   JP  Morgan
#4:     KATAKURI   NA      NA
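
For a version with no packages at all, mapply does the same element-wise pairing of pattern and string. This is only a sketch: the data frame below is a hypothetical stand-in for db, reconstructed from the printed output above, and the column name Z is just carried over from the first example.

```r
# Hypothetical data mirroring the X/Y columns shown above
db <- data.frame(
  X = c("Joe Snow", "Sony Ericson", "JP Morgan", "KATAKURI"),
  Y = c("Joe", "Sony", "JP", NA),
  stringsAsFactors = FALSE
)

# Only touch rows that actually have a pattern
ok <- !is.na(db$Y)
db$Z <- NA_character_

# sub() is called once per row: pattern Y[i] is removed from X[i]
db$Z[ok] <- trimws(mapply(sub,
                          pattern = db$Y[ok],
                          x = db$X[ok],
                          MoreArgs = list(replacement = ""),
                          USE.NAMES = FALSE))
db$Z
#[1] "Snow"    "Ericson" "Morgan"  NA
```

Note that the values in Y are treated as regular expressions here; if they may contain metacharacters, adding fixed = TRUE to MoreArgs is probably what you want.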

R how to replace/gsub a vector of values by another vector of values in a datatable

You can try lubridate::parse_date_time(), which takes a vector of candidate formats to attempt in the conversion:

library(lubridate)
library(data.table)

MWE[, date := parse_date_time(date, orders = c("bY","qY", "Y"))]

date value
1: 2020-01-01 -0.4948354
2: 2020-02-01 1.0227036
3: 2020-01-01 2.6285688
4: 2020-01-01 1.9158595

replace string in R giving a vector of patterns and vector of replacements

Try

library(qdap)
mgsub(c('[%VAR1%]' , '[%VAR2%]'), c('val-1', 'val-2'), tt_ori)
#[1] "I have val-1 and val-2"

data

 tt_ori <- 'I have [%VAR1%] and [%VAR2%]'

R: Using gsub to replace a digit matched by pattern (n) with (n-1) in character vector

We can do this easily with gsubfn

library(gsubfn)
gsubfn("([0-9]+)", ~as.numeric(x)-1, chrvector)
#[1] "str97" "v197exdf"

Or for the last digit

gsubfn("([0-9])([^0-9]*)$", ~paste0(as.numeric(x)-1, y), chrvector2)
#[1] "str97" "v197exdf" "v33chr138d"

data

chrvector <- c("str98", "v198exdf")
chrvector2 <- c("str98", "v198exdf", "v33chr139d")
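
If you would rather avoid the gsubfn dependency, something similar can be done in base R with gregexpr and the replacement form of regmatches. This is a sketch of the "decrement every number" case only; the name result is mine, not from the question.

```r
chrvector <- c("str98", "v198exdf")

# Locate every run of digits in each string
m <- gregexpr("[0-9]+", chrvector)

# Work on a copy, and replace each match with (match - 1)
result <- chrvector
regmatches(result, m) <- lapply(
  regmatches(result, m),
  function(d) as.character(as.numeric(d) - 1)
)
result
#[1] "str97"    "v197exdf"
```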

gsub() not working if I reference a column using a character vector?

gsub is being given a vector of strings, and it does what it knows: works on the strings. It doesn't know that they should be an indirect reference. (Nothing will know that it should be indirect.)

You have two options:

  1. The canonical way in data.table for this is likely to use .SDcols.

    preferences[, (cols) := lapply(.SD, gsub, pattern = "UN1", replacement = "A"), .SDcols = cols]
    preferences
    # Pref_1
    # <char>
    # 1: A
    # 2: Food and Agriculture Organization (F...
    # 3: United Nations Educational, Scientif...
    # 4: United Nations Development Programme...
    # 5: Commission on Narcotic Drugs (CND)
    # 6: Commission on Narcotic Drugs (CND)
    # 7: Human Rights Council (HRC)
    # 8: A
    # 9: Human Rights Council (HRC)
    # 10: A

    This does two things: (i) the use of .SDcols for iterating over a dynamic set of columns is preferred and faster, and allows programmatic determination of those columns (what you need); (ii) using lapply allows you to do this to one or more columns. If you know you'll always do just one column, this still works well with very little overhead.

  2. You can get/mget the data. This is the way to tell something to grab the contents of a variable identified in a string vector.

    If you know that you will always have exactly one column, then you can use get:

    preferences[, (cols) := gsub(get(cols), pattern = "UN1", replacement = "A")]

    If there is even a chance that you'll have more than one, I strongly recommend mget. (Even if you think you'll always have one, this is still safe.)

    preferences[, (cols) := lapply(mget(cols), gsub, pattern = "UN1", replacement = "A")]

Data

preferences <- setDT(structure(list(Pref_1 = c("UN1", "Food and Agriculture Organization (FAO)", "United Nations Educational, Scientific and Cultural Organization (UNESCO)", "United Nations Development Programme (UNDP)", "Commission on Narcotic Drugs (CND)", "Commission on Narcotic Drugs (CND)", "Human Rights Council (HRC)", "UN1", "Human Rights Council (HRC)", "UN1")), class = c("data.table", "data.frame"), row.names = c(NA, -10L)))
cols <- "Pref_1"
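
For comparison, the same "columns named in a character vector" idea in base R with a plain data.frame looks like the sketch below. The data frame and its second column Pref_2 are made up for illustration; they are not from the question.

```r
# Hypothetical, shortened data with two columns to process
preferences_df <- data.frame(
  Pref_1 = c("UN1", "Human Rights Council (HRC)", "UN1"),
  Pref_2 = c("x UN1", "UN1", "y"),
  stringsAsFactors = FALSE
)
cols <- c("Pref_1", "Pref_2")

# df[cols] selects the columns by name; lapply passes each column as
# the first unnamed argument of gsub, which is x
preferences_df[cols] <- lapply(preferences_df[cols], gsub,
                               pattern = "UN1", replacement = "A")
preferences_df$Pref_1
#[1] "A"                          "Human Rights Council (HRC)" "A"
```

The point is the same as with .SDcols: the character vector is used to select the columns, and gsub only ever sees the column contents.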

R, str_replace, gsub, how to replace a vector of characters for another vector of characters?

One option is to paste the permitted characters together into a single negated character class, i.e. wrapped in square brackets with a leading ^ (inside brackets most metacharacters are taken literally), and then replace everything that does not belong to that class with blank ("") in gsub:

pat <- paste0("[^", gsub("\\s{2,}", " ", paste(permitted_seq_chars, collapse="")), "]")
gsub(pat, "", test_col$sequence)
#[1] "ATGCRYSW" "ATGCRYSW" "ATGCRYSW"
#[4] "ATGCRYSWATGCRYSW" "ATGCRYSW"
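
Since the question's data is not shown here, this is a small self-contained sketch of the same technique. Both permitted_seq_chars and sequences below are invented for illustration, and the whitespace clean-up step from the snippet above is left out because the made-up vector has no stray spaces.

```r
# Hypothetical inputs
permitted_seq_chars <- c("A", "T", "G", "C", "R", "Y", "S", "W")
sequences <- c("ATGC-RYSW", "atgcATGCRYSW", "ATGC RYSW!")

# Build "[^ATGCRYSW]": matches any single character NOT in the permitted set
pat <- paste0("[^", paste(permitted_seq_chars, collapse = ""), "]")

# Drop every non-permitted character
cleaned <- gsub(pat, "", sequences)
cleaned
#[1] "ATGCRYSW" "ATGCRYSW" "ATGCRYSW"
```

A caveat: ^, -, ] and \ can still be special inside a bracket expression, so this needs extra care if the permitted set contains any of them.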

Match and replace multiple strings in a vector of text without looping in R

1) gsubfn. gsubfn in the gsubfn package is like gsub except that the replacement can be a character string, list, function or proto object. If it's a list, each matched string is replaced with the component of the list whose name equals the matched string.

library(gsubfn)
gsubfn("\\S+", setNames(as.list(b), a), c)

giving:

[1] "i am going to the party" "he would go too"    

2) gsub. For a solution with no packages, try this loop:

cc <- c
for(i in seq_along(a)) cc <- gsub(a[i], b[i], cc, fixed = TRUE)

giving:

> cc
[1] "i am going to the party" "he would go too"

R: pass a vector of strings to replace all instances within a string

We can use gsubfn if we need to replace with numbers.

 library(gsubfn)
gsubfn("\\w+", as.list(setNames(1:3, numlist)), mystring)
#[1] "I have 1 cat, 2 dogs and 3 rabbits"

EDIT: I thought that we needed to replace with the numbers that correspond to the words in 'numlist'. But if we need to replace with the ##NUMBER## flag, one option is mgsub:

 library(qdap)
mgsub(numlist, "##NUMBER##", mystring)
#[1] "I have ##NUMBER## cat, ##NUMBER## dogs and ##NUMBER## rabbits"

Cumulative application of a gsub sequence in R

The issue with mapply is that it is looking at a fresh copy of the FEN string for each replacement, which is not what you need. I think you can use a Reduce mindset:

(BTW, your pattern for "5" has 6 ones; the code below fixes that.)

pattern <- c("11111111","1111111","111111","11111","1111","111","11")
Reduce(function(txt, ptn) gsub(ptn, as.character(nchar(ptn)), txt), pattern, init=FENCodeToBeChanged)
# [1] "rnbq1rk1/pppp1ppp/1b2pn2/8/2PP4/5NP1/PP2PPBP/RNBQ1RK1 w KQkq c6 0 2"

To be able to reduce over multiple arguments takes a little bit of work, usually iterating along a list of pairs or such. With this problem, it's easy enough to replace a pattern with its length instead of including another vector of strings, ergo nchar(ptn). (Technically as.character(.) is not required, as gsub will implicitly convert it, but I wanted to be a bit "declarative" about the fact that a string is what I want. There are many tools in R that are less deterministic in this way (e.g., ifelse). Style.)
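
When the replacements cannot be derived from the patterns, one way to "reduce over pairs" is to reduce over the indices and look both vectors up inside the function. The helper name mgsub_reduce and the shortened FEN-like string are my own, purely for illustration; treat this as a sketch rather than a drop-in replacement.

```r
# Apply gsub(patterns[i], replacements[i], ...) cumulatively, in order
mgsub_reduce <- function(patterns, replacements, text) {
  stopifnot(length(patterns) == length(replacements))
  Reduce(
    function(txt, i) gsub(patterns[i], replacements[i], txt, fixed = TRUE),
    seq_along(patterns),
    init = text
  )
}

pattern <- c("11111111", "1111111", "111111", "11111", "1111", "111", "11")
replacement <- as.character(8:2)

# Hypothetical board-only input with the empty squares spelled out as 1s
fen <- "rnbq1rk1/pppp1ppp/1b2pn2/11111111/11PP1111/11111NP1/PP11PPBP/RNBQ1RK1"
fen_short <- mgsub_reduce(pattern, replacement, fen)
fen_short
#[1] "rnbq1rk1/pppp1ppp/1b2pn2/8/2PP4/5NP1/PP2PPBP/RNBQ1RK1"
```

Order matters here: the longest patterns have to come first, otherwise "11" would eat into the longer runs before they get a chance to match.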


