gsub() in R is not replacing '.' (dot)
You may need to escape the `.`, which is a special character that means "any character" (from @Mr Flick's comment).
gsub('\\.', '-', x)
#[1] "2014-06-09"
Or
gsub('[.]', '-', x)
#[1] "2014-06-09"
Or, as @Moix mentioned in the comments, we can also use `fixed = TRUE` instead of escaping the characters.
gsub(".", "-", x, fixed = TRUE)
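The three variants above can be compared side by side. This is a minimal sketch that assumes `x` holds the date string implied by the output shown (the original answer does not define `x`):

```r
# Assumed input, inferred from the "2014-06-09" output above
x <- "2014.06.09"

# Unescaped: "." matches every character, so all ten are replaced
unescaped <- gsub(".", "-", x)

# Three ways to match a literal dot
escaped  <- gsub("\\.", "-", x)
bracket  <- gsub("[.]", "-", x)
literal  <- gsub(".", "-", x, fixed = TRUE)

print(unescaped)
print(c(escaped, bracket, literal))
```

The unescaped call returns a string of dashes as long as the input, which is the symptom described in the question title.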
Replace dots using `gsub`
My recommendation would be to escape the "." character:
spy$Identifier <- gsub("\\.", "/", spy$Identifier)
In regular expressions, a period is a special character that matches any character. "Escaping" it tells the search to look for an actual period. In R this takes two backslashes ("\\."), because the backslash must itself be escaped inside an R string literal. In other languages, it's often just one backslash.
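To see why two backslashes are needed, it can help to look at what the string actually contains. A small sketch; the `ids` vector is hypothetical example data, and the raw-string syntax assumes R 4.0.0 or later:

```r
pattern <- "\\."

# The R literal "\\." is a two-character string: a backslash and a dot
n_chars <- nchar(pattern)
writeLines(pattern)

# Raw strings (R >= 4.0.0) let you write the regex without doubling
same_as_raw <- identical(pattern, r"(\.)")

# Hypothetical identifiers, only for illustration
ids <- c("BRK.B", "BF.B")
replaced <- gsub(pattern, "/", ids)
print(replaced)
```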
Unexpected outcome, not replacing, in R out of a gsub function
The `sub` function doesn't work this way. One viable approach would be to capture the quantity you want, then use this capture group as the replacement:
x <- "r_con[C3-C3,Intercept]"
term <- sub("^r_con\\[([^,]+),Intercept\\]", "\\1", x)
term
[1] "C3-C3"
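One caveat with the capture-group approach: when the pattern does not match, `sub` returns the input unchanged rather than an empty or missing value. A hedged sketch of guarding against that; the second and third elements of `x` are made-up inputs for illustration:

```r
pattern <- "^r_con\\[([^,]+),Intercept\\]"

# First element is from the answer above; the other two are hypothetical
x <- c("r_con[C3-C3,Intercept]", "r_con[A1-B2,Intercept]", "sd_con")

# Non-matching strings pass through untouched
plain <- sub(pattern, "\\1", x)

# Return NA for strings that do not match the pattern
hits  <- grepl(pattern, x)
terms <- ifelse(hits, sub(pattern, "\\1", x), NA_character_)

print(plain)
print(terms)
```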
Replace the dot at the end of a string in R
Try:
x <- "DEL.Xp22.11..ZFX."
x <- gsub("..", " (", x, fixed = TRUE)
x <- gsub("\\.$", ")", x)
The first call uses fixed = TRUE, so ".." is matched literally. In the second call I use the regex anchor '$' to signify the end of the string, and '\\' to escape the '.', which is a regex special character.
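The two steps can be wrapped in a small helper and applied to a whole vector. This is only a sketch: `close_paren` is a hypothetical name, and the second input string is invented for illustration.

```r
# Hypothetical helper combining the two replacements above
close_paren <- function(x) {
  x <- gsub("..", " (", x, fixed = TRUE)  # literal double dot -> " ("
  sub("\\.$", ")", x)                     # trailing dot -> ")"
}

# First value is from the answer; the second is made up
labels <- c("DEL.Xp22.11..ZFX.", "DUP.Xq28..MECP2.")
cleaned <- close_paren(labels)
print(cleaned)
```

Because `$` can match at most once per string, `sub` and `gsub` behave the same for the second step.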
Why does gsub/sub not work to replace .. ?
We can use `fixed = TRUE`, as `.` can match any character in the default regex mode if it is not escaped (`\\.`) or placed inside square brackets (`[.]`). The faster option is `fixed = TRUE`:
gsub("..", " ", rownames(df), fixed = TRUE)
#[1] "Saint.Petersburg Russia" "Istanbul Turkey"
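For comparison, here is what the unescaped pattern does versus the literal alternatives. A sketch that assumes row names like those implied by the output above:

```r
# Assumed row names, inferred from the output shown above
rn <- c("Saint.Petersburg..Russia", "Istanbul..Turkey")

# As a regex, ".." matches ANY two characters, so every pair is replaced
as_regex <- gsub("..", " ", rn[2])

# Three equivalent ways to match two literal dots
fixed_res   <- gsub("..", " ", rn, fixed = TRUE)
escaped_res <- gsub("\\.\\.", " ", rn)
bracket_res <- gsub("[.]{2}", " ", rn)

print(nchar(as_regex))
print(fixed_res)
```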
Replacing a special character does not work with gsub
You have to escape the `+` symbol, as it is a regex quantifier (it means "one or more of the preceding item").
> gsub("Ã<U\\+009F>", "REPLACED", "Testing string Ã<U+009F> ")
[1] "Testing string REPLACED "
> gsub("â<U\\+0080><U\\+0093>", "REPLACED", "Testing string â<U+0080><U+0093> ")
[1] "Testing string REPLACED "
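The effect of `+` as a quantifier is easier to see on a short ASCII string. A minimal sketch with made-up inputs; `fixed = TRUE` is shown as an alternative to escaping:

```r
s <- "caaat a+"  # hypothetical test string

# Unescaped: "a+" means "one or more a", so the literal "+" is left alone
quantifier <- gsub("a+", "X", s)

# Escaped: matches the two-character sequence "a+" literally
escaped <- gsub("a\\+", "X", s)

# fixed = TRUE treats the whole pattern as a literal string
literal <- gsub("a+", "X", s, fixed = TRUE)

print(c(quantifier, escaped, literal))
```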
gsub() not working if I reference a column using a character vector?
`gsub` is being given a vector of strings, and it does what it knows: works on the strings. It doesn't know that they should be an indirect reference. (Nothing will know that it should be indirect.)
You have two options:
The canonical way in data.table for this is likely to use `.SDcols`.
preferences[, (cols) := lapply(.SD, gsub, pattern = "UN1", replacement = "A"), .SDcols = cols]
preferences
# Pref_1
# <char>
# 1: A
# 2: Food and Agriculture Organization (F...
# 3: United Nations Educational, Scientif...
# 4: United Nations Development Programme...
# 5: Commission on Narcotic Drugs (CND)
# 6: Commission on Narcotic Drugs (CND)
# 7: Human Rights Council (HRC)
# 8: A
# 9: Human Rights Council (HRC)
# 10: A
This does two things: (i) the use of `.SDcols` for iterating over a dynamic set of columns is preferred and faster, and allows programmatic determination of those columns (what you need); (ii) using `lapply` allows you to do this to one or more columns. If you know you'll always do just one column, this still works well with very little overhead.
You can `get`/`mget` the data. This is the way to tell something to grab the contents of a variable identified in a string vector.
If you know that you will always have exactly one column, then you can use `get`:
preferences[, (cols) := gsub(get(cols), pattern = "UN1", replacement = "A")]
If there is even a chance that you'll have more than one, I strongly recommend `mget`. (Even if you think you'll always have one, this is still safe.)
preferences[, (cols) := lapply(mget(cols), gsub, pattern = "UN1", replacement = "A")]
Data
preferences <- setDT(structure(list(Pref_1 = c("UN1", "Food and Agriculture Organization (FAO)", "United Nations Educational, Scientific and Cultural Organization (UNESCO)", "United Nations Development Programme (UNDP)", "Commission on Narcotic Drugs (CND)", "Commission on Narcotic Drugs (CND)", "Human Rights Council (HRC)", "UN1", "Human Rights Council (HRC)", "UN1")), class = c("data.table", "data.frame"), row.names = c(NA, -10L)))
cols <- "Pref_1"
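The difference between operating on a name and operating on what the name refers to can be shown in base R, without data.table. This is only an illustration using plain variables (`Pref_2` and its values are hypothetical); inside `preferences[...]` the same lookup happens against the table's columns.

```r
# Plain vectors standing in for columns; Pref_2 is made up
Pref_1 <- c("UN1", "Human Rights Council (HRC)")
Pref_2 <- c("UN1 and UN1", "other")
cols <- "Pref_1"

# gsub sees only the string "Pref_1", which contains no "UN1"
on_name <- gsub("UN1", "A", cols)

# get() fetches the object that the string names
on_value <- gsub("UN1", "A", get(cols))

# mget() does the same for several names and returns a named list
on_many <- lapply(mget(c("Pref_1", "Pref_2")), gsub,
                  pattern = "UN1", replacement = "A")

print(on_name)
print(on_value)
print(on_many)
```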
gsub() not recognizing and replacing certain accented characters
Use `stringi::stri_trans_general`:
library(stringi)
df<-data.frame(Name=c("Stipe Miočić","Duško Todorović","Michał Oleksiejczuk","Jiři Prochazka","Bartosz Fabiński","Damir Hadžović","Ľudovit Klein","Diana Belbiţă","Joanna Jędrzejczyk" ))
stri_trans_general(df$Name, "Latin-ASCII")
Results:
[1] "Stipe Miocic" "Dusko Todorovic" "Michal Oleksiejczuk"
[4] "Jiri Prochazka" "Bartosz Fabinski" "Damir Hadzovic"
[7] "Ludovit Klein" "Diana Belbita" "Joanna Jedrzejczyk"
Replacing dots with underscores, when using make.names or renaming objects in the working environment
The issue is probably that you use "." which in a regex matches every character. If you want to match a literal `.` in a string, you have to escape it using "\\.".
Personally, I don't like cramming all the code into one line when a simple function would make it cleaner and more understandable.
# Example data
write.csv(mtcars, "mt cars.csv")
write.csv(mtcars, "mt car s.csv")
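# list.files() takes a regular expression, not a glob, so escape the dot
temp <- list.files(pattern = "\\.csv$")
make_names <- function(x) {
  # strip the extension, make syntactic names, then turn dots into underscores
  gsub("\\.", "_", make.names(sub("\\.csv$", "", x)))
}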
names(temp) <- make_names(temp)
list2env(lapply(temp, read.csv), envir = .GlobalEnv)
#> <environment: R_GlobalEnv>
ls()
#> [1] "make_names" "mt_car_s" "mt_cars" "temp"
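The naming helper can be checked on its own, without writing any files. A sketch with the extension pattern written as an escaped regex; the third file name is a hypothetical case added to show how `make.names` treats a leading digit:

```r
make_names <- function(x) {
  base <- sub("\\.csv$", "", x)        # drop the ".csv" extension
  gsub("\\.", "_", make.names(base))   # syntactic name, dots -> underscores
}

# First two names are from the example above; the third is made up
files <- c("mt cars.csv", "mt car s.csv", "2020.sales.csv")
object_names <- make_names(files)
print(object_names)
```

`make.names` replaces the spaces with dots and prefixes an `X` to names that start with a digit; the outer `gsub` then converts every dot to an underscore.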