How to remove + (plus sign) from string in R?
Try
test<- "sandwich=bread-mustard+ketchup"
test<-gsub("\\+","_",test)
test
[1] "sandwich=bread-mustard_ketchup"
+ is a special character in regular expressions, so you need to escape it, just as you would, for instance, a period (.). If you search for "regex" or "regular expressions", you will find lists of the special characters; there, + is described as matching one or more occurrences of the preceding expression. More about special characters and regular expressions in R can be found on the ?regex help page.
On a more general note, your code could be written more compactly by using a character class:
test<- "sandwich=bread-mustard+ketchup"
test<-gsub("[-=+]","_",test)
test
[1] "sandwich_bread_mustard_ketchup"
Here I have used a bracket expression, which matches any one of the characters listed inside [ ]. No | is needed between them (inside brackets, | is just another literal character), and + does not need escaping there.
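As a quick check of how bracket expressions treat their contents, here is a minimal sketch. The sample string s is made up for illustration. Every character inside [ ] is matched literally, including |, whereas alternation with | only applies outside brackets.

```r
s <- "a-b=c+d|e"   # hypothetical sample string

# Character class: any one of -, = or + ("-" is listed first so it stays literal)
r1 <- gsub("[-=+]", "_", s)
r1
# [1] "a_b_c_d|e"

# Putting | inside the brackets adds it as one more literal character to match
r2 <- gsub("[-|=|+]", "_", s)
r2
# [1] "a_b_c_d_e"

# Alternation outside brackets gives the same result as the first call
r3 <- gsub("-|=|\\+", "_", s)
r3
# [1] "a_b_c_d|e"
```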
Remove all special characters from a string in R?
You need to use regular expressions to identify the unwanted characters. For the most easily readable code, you want str_replace_all from the stringr package, though gsub from base R works just as well.
The exact regular expression depends upon what you are trying to do. You could just remove those specific characters that you gave in the question, but it's much easier to remove all punctuation characters.
library(stringr)
x <- "a1~!@#$%^&*(){}_+:\"<>?,./;'[]-=" #or whatever
str_replace_all(x, "[[:punct:]]", " ")
(The base R equivalent is gsub("[[:punct:]]", " ", x).)
An alternative is to swap out all non-alphanumeric characters.
str_replace_all(x, "[^[:alnum:]]", " ")
Note that the definition of what constitutes a letter, a number, or a punctuation mark varies slightly depending upon your locale, so you may need to experiment a little to get exactly what you want.
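To see the difference between the two character classes on a small input, here is a sketch using only base R. The sample string is invented, and the outputs shown assume an ordinary ASCII input where locale differences do not come into play.

```r
x <- "Hello, World! #1"   # hypothetical sample string

no_punct <- gsub("[[:punct:]]", "", x)       # drop punctuation only
no_punct
# [1] "Hello World 1"

only_alnum <- gsub("[^[:alnum:]]", "", x)    # drop everything that is not a letter or digit
only_alnum
# [1] "HelloWorld1"

alnum_space <- gsub("[^[:alnum:] ]", "", x)  # same, but keep spaces as well
alnum_space
# [1] "Hello World 1"
```

Replacing with "" deletes the characters outright, while replacing with " " (as in the answer above) keeps word boundaries visible.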
Remove plus sign (+) from string
Although the original answer to this question does achieve the intended effect, it is not the most efficient way to do this simple task. As noted in the comments, str_replace() is preferred in this case.
$variation = str_replace("+", "", $variation);
ORIGINAL ANSWER:
This works to remove only a plus sign:
$variation = preg_replace("/[+]/", "", $variation);
You can see it work here: http://www.phpliveregex.com/p/1Fb (be sure you select the preg_replace function)
how can I remove two consecutive pluses (+) from a formula/string?
Something like
as.formula(gsub("\\+\\s*\\+", "+", deparse(f)))
where f is your formula.
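A small sketch of the same substitution applied to formula text held in a string. The text in txt is a made-up example; the pattern matches a plus, optional whitespace, and a second plus, and collapses them into a single plus.

```r
txt <- "y ~ a ++ b + + c"   # hypothetical formula text with doubled pluses

clean <- gsub("\\+\\s*\\+", "+", txt)
clean
# [1] "y ~ a + b + c"

f <- as.formula(clean)      # the cleaned text parses as an ordinary formula
vars <- all.vars(f)
vars
# [1] "y" "a" "b" "c"
```

Note that this handles pairs of pluses; a run of three or more would need the substitution applied again or a broader pattern.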
How to replace '+' using gsub() function in R
Simply call gsub with fixed = TRUE (no need to use a regular expression), but you have to do the replacement for each column of the data.frame by specifying the column name:
txtdf <- data.frame(job = c("government", "poli+tician", "parliament"))
txtdf
gives
          job
1  government
2 poli+tician
3  parliament
Now replace the "+":
txtdf$job <- gsub("+", "", txtdf$job, fixed = TRUE)
txtdf
The result is:
         job
1 government
2 politician
3 parliament
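If several columns need the same treatment, one possible approach is to loop over the character columns with lapply instead of naming each one. This is only a sketch: the second column, place, is invented here for illustration.

```r
txtdf <- data.frame(
  job   = c("government", "poli+tician"),
  place = c("par+is", "rome"),          # hypothetical extra column
  stringsAsFactors = FALSE
)

# Find the character columns, then strip the literal "+" from each of them
chr_cols <- vapply(txtdf, is.character, logical(1))
txtdf[chr_cols] <- lapply(txtdf[chr_cols], gsub,
                          pattern = "+", replacement = "", fixed = TRUE)

txtdf$job
# [1] "government" "politician"
txtdf$place
# [1] "paris" "rome"
```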
Remove part of string after .
You just need to escape the period:
a <- c("NM_020506.1","NM_020519.1","NM_001030297.2","NM_010281.2","NM_011419.3", "NM_053155.2")
gsub("\\..*","",a)
[1] "NM_020506" "NM_020519" "NM_001030297" "NM_010281" "NM_011419" "NM_053155"
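To illustrate why the escape matters, and how the pattern behaves when a string has more than one period, here is a brief sketch. The second element of the vector is a made-up example.

```r
a <- c("NM_020506.1", "file.tar.gz")   # second element is hypothetical

first_cut <- sub("\\..*", "", a)       # cut at the first period
first_cut
# [1] "NM_020506" "file"

last_cut <- sub("\\.[^.]*$", "", a)    # cut at the last period only
last_cut
# [1] "NM_020506" "file.tar"

unescaped <- sub("..*", "", a)         # "." matches any character, so everything is removed
unescaped
# [1] "" ""
```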
Split a string by a plus sign (+) character
Use
strsplit("(1)+(2)", "\\+")
or
strsplit("(1)+(2)", "+", fixed = TRUE)
Using strsplit("(1)+(2)", "+") doesn't work because, unless specified otherwise, the split argument is a regular expression, and the + character is special in regex. Other characters that also need extra care are
?
*
.
^
$
\
|
{
}
[
]
(
)
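The same two approaches carry over to the other special characters. A minimal sketch with invented inputs, splitting on a period and on a pipe:

```r
# fixed = TRUE treats the split argument as a literal string
p1 <- strsplit("a.b.c", ".", fixed = TRUE)[[1]]
p1
# [1] "a" "b" "c"

p2 <- strsplit("a|b|c", "|", fixed = TRUE)[[1]]
p2
# [1] "a" "b" "c"

# As a regular expression, escape the character or wrap it in a character class
p3 <- strsplit("a.b.c", "\\.")[[1]]
p4 <- strsplit("a.b.c", "[.]")[[1]]
p3
# [1] "a" "b" "c"
```

When the delimiter is a plain string, fixed = TRUE is usually the less error-prone choice, since nothing has to be escaped.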
Split string in parts by minus and plus in R
We can provide a regular expression in strsplit, where we use ?= to look ahead for the plus or minus sign and split just before that character. This allows the character itself to be retained rather than dropped in the split.
x <- "-1x^2+3x^3-x^8+1-x"  # input reconstructed from the output below
strsplit(x, "(?<=.)(?=[+-])", perl = TRUE)
# [1] "-1x^2" "+3x^3" "-x^8" "+1" "-x"
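As an alternative to splitting, the terms can be extracted directly with regmatches and gregexpr. This is a different technique from the lookahead split above, shown only as a sketch; the input string is assumed from the output in the answer.

```r
x <- "-1x^2+3x^3-x^8+1-x"   # assumed input, inferred from the output above

# Each term is an optional sign followed by a run of non-sign characters
terms <- regmatches(x, gregexpr("[+-]?[^+-]+", x))[[1]]
terms
# [1] "-1x^2" "+3x^3" "-x^8"  "+1"    "-x"
```

This avoids relying on how strsplit handles zero-length matches, at the cost of describing what a term looks like rather than where to cut.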
Is there a way to keep only defined characters in a string from a whitelist?
What about this:
string <- "opiqr8929348t89hr289r01++r42+3525"
gsub("[^0-9+]", "", string)
# [1] "89293488928901++42+3525"
This replaces everything that's not a 0-9 or plus with "".
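The same negated character class works for other whitelists. A short sketch with invented inputs: + needs no escaping inside brackets, and a literal - is safest placed last (or first) so it is not read as a range.

```r
# Keep digits, plus and minus
kept1 <- gsub("[^0-9+-]", "", "a1-b2+c3")
kept1
# [1] "1-2+3"

# Keep letters and spaces only
kept2 <- gsub("[^[:alpha:] ]", "", "R2-D2 and C-3PO")
kept2
# [1] "RD and CPO"
```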