Remove parentheses and text within from strings in R
A gsub should work here:
gsub("\\s*\\([^\\)]+\\)","",as.character(companies$Name))
# or using "raw" strings as of R 4.0
gsub(r"{\s*\([^\)]+\)}","",as.character(companies$Name))
# [1] "Company A Inc" "Company B" "Company C Inc."
# [4] "Company D Inc." "Company E"
Here we just replace occurrences of "(...)" with nothing (also removing any leading space). R makes it look worse than it is with all the escaping we have to do for the parentheses, since they are special characters in regular expressions.
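As a small self-contained check of the above, here is a sketch that runs both spellings of the pattern on a made-up vector. The `name` vector is a hypothetical stand-in for `companies$Name`, which is not shown in the answer, and the raw-string form assumes R 4.0 or later.

```r
# Hypothetical stand-in for companies$Name (the real data is not shown above)
name <- c("Company A Inc (AB)", "Company B (BC)", "Company E")

# Classic form: every backslash is doubled inside the R string literal
escaped <- gsub("\\s*\\([^\\)]+\\)", "", name)

# Raw-string form (R >= 4.0): the regex is written as-is
raw <- gsub(r"{\s*\([^\)]+\)}", "", name)

print(escaped)
# [1] "Company A Inc" "Company B"     "Company E"
print(identical(escaped, raw))
# [1] TRUE
```

Names without parentheses pass through unchanged, so the call is safe to apply to a whole column.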
Removing parenthesis in R
Parentheses are metacharacters, so they either need to be escaped (with \\) or placed inside square brackets so that they are read as literal characters.
gsub("[()]", "", x)
#[1] "40.703707008, -73.943257966"
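To make the two options concrete, the sketch below compares escaping, a bracket expression, and (as a third option not mentioned in the answer) literal matching with `fixed = TRUE`. The input `x` is an assumption reconstructed from the output shown above.

```r
# Assumed input; the original x is not shown in the answer
x <- "(40.703707008, -73.943257966)"

# Option 1: escape each parenthesis and join them with |
escaped <- gsub("\\(|\\)", "", x)

# Option 2: put both inside a bracket expression, where they are literal
bracketed <- gsub("[()]", "", x)

# Option 3: skip regex entirely and match literal strings
literal <- gsub(")", "", gsub("(", "", x, fixed = TRUE), fixed = TRUE)

print(bracketed)
# [1] "40.703707008, -73.943257966"
```

All three give the same result here; the bracket expression is usually the shortest to read.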
r - How can I remove a single pair of parentheses with text in a character?
Use sub:
sub('\\(.*?\\)\\s', '', value)
#[1] "This is a (keep) test sentence."
( and ) are metacharacters and need to be escaped with \\. The .*? part matches as few characters as possible until a closing parenthesis ) is encountered.
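The difference between sub and gsub is what keeps the second pair of parentheses in place. A minimal sketch, assuming an input string that would produce the output shown above (the original `value` is not given):

```r
# Hypothetical input chosen to match the output in the answer
value <- "This (remove) is a (keep) test sentence."

# sub() replaces only the first match
first_only <- sub('\\(.*?\\)\\s', '', value)
print(first_only)
# [1] "This is a (keep) test sentence."

# gsub() replaces every match
all_pairs <- gsub('\\(.*?\\)\\s', '', value)
print(all_pairs)
# [1] "This is a test sentence."
```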
How to remove square parentheses and text within from strings in R
I would use:
input <- c("6.77[9]", "5.92[10]", "2.98[103]")
gsub("\\[.*?\\]", "", input)
[1] "6.77" "5.92" "2.98"
The regex pattern \[.*?\] should match any text enclosed in square brackets, and using gsub tells R to replace all such matches.
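As an alternative sketch (not part of the answer above), the same result can be had without a lazy quantifier by matching "anything that is not a closing bracket". Inside a bracket expression, a ] placed directly after ^ is taken literally.

```r
input <- c("6.77[9]", "5.92[10]", "2.98[103]")

# Lazy quantifier, as in the answer
lazy <- gsub("\\[.*?\\]", "", input)

# Negated bracket expression: [^]]* means "zero or more characters that are not ]"
negated <- gsub("\\[[^]]*\\]", "", input)

print(negated)
# [1] "6.77" "5.92" "2.98"
```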
How to remove parenthesis and inside text in r?
The following two regular expressions solve the two problems in the question.
s <- "species name(2) V1"
sub("(^[^(]*)\\(.*$", "\\1", s)
#[1] "species name"
sub("\\([^)]*\\)", "", s)
#[1] "species name V1"
Now apply them to the column of interest.
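Applying them to a column could look like the sketch below. The data frame `df` and its column names are hypothetical, since the question's data is not shown.

```r
# Hypothetical data frame; column names are illustrative only
df <- data.frame(species = c("species name(2) V1", "other name(10) V2"),
                 stringsAsFactors = FALSE)

# Keep only what comes before the first opening parenthesis
df$before_paren <- sub("(^[^(]*)\\(.*$", "\\1", df$species)

# Drop just the parenthesised part and keep the rest
df$without_paren <- sub("\\([^)]*\\)", "", df$species)

print(df)
```

Both calls are vectorised, so no loop is needed.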
Mutate to remove all parenthesis (and contents) from string in R
Another trick is:
library(dplyr)    # mutate(), %>%
library(purrr)    # map_chr()
library(stringr)  # str_c()

my_order <- c("CD68", "PD-1", "FoxP3", "CD8", "PD-L1", "PanCK")
test %>%
  mutate(prototype = gsub('\\s*[(][^)]+[)]', '', Class),
         ordered = map_chr(strsplit(prototype, '\\s*:\\s*'),
                           ~ str_c(sort(ordered(.x, my_order), decreasing = TRUE), collapse = ":")))
  Class                                                                 prototype                 ordered
1 FoxP3 (Opal 570): PanCK (Opal 690): PD-1 (Opal 620): CD68 (Opal 780)  FoxP3: PanCK: PD-1: CD68  PanCK:FoxP3:PD-1:CD68
2 CD8 (Opal 480): PanCK (Opal 690): CD68 (Opal 780): PD-L1 (Opal 520)   CD8: PanCK: CD68: PD-L1   PanCK:PD-L1:CD8:CD68
3 PanCK (Opal 690): CD68 (Opal 780)                                     PanCK: CD68               PanCK:CD68
4 FoxP3 (Opal 570): PanCK (Opal 690)                                    FoxP3: PanCK              PanCK:FoxP3
Removing text contained in brackets/parentheses from corpus (R)
You can remove all text in brackets using gsub(). As you plan to remove the punctuation in a next step, you can replace the bracketed text with a period ".", just to indicate where something was taken out (if you need to debug the pipeline), or you can replace it with an empty string "".
Your regex would not work. You need to escape the brackets with double backslashes, and you will want to match multiple, but as few as possible, characters. You'll need the lazy pattern .*? for the contents of the brackets:
corp = c("This is an example (or demonstration) of replacing things in brackets",
"Just use gsub (a function in base) to remove (or better replace) these elements")
corp = gsub("\\(.*?\\)",".",corp)
The example above would result in the vector:
> corp
[1] "This is an example . of replacing things in brackets"
[2] "Just use gsub . to remove . these elements"
Depending on the package you use for your corpus, you can do this with the character vector before converting it to a corpus, or you can use specific mapping functions (e.g. tm_map() in tm) to apply it to all texts.
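If you go with the empty-string replacement on the plain character vector, the removed text leaves double spaces behind. A minimal base R sketch of one way to tidy that up (the whitespace squeeze and trimws() are additions here, not part of the answer above):

```r
corp <- c("This is an example (or demonstration) of replacing things in brackets",
          "Just use gsub (a function in base) to remove (or better replace) these elements")

# Remove bracketed text, replacing it with nothing
cleaned <- gsub("\\(.*?\\)", "", corp)

# Collapse runs of whitespace left behind and trim the ends
cleaned <- trimws(gsub("\\s+", " ", cleaned))

print(cleaned)
# [1] "This is an example of replacing things in brackets"
# [2] "Just use gsub to remove these elements"
```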
stringr: Removing Parentheses and Brackets from string
We can use | to match either closing bracket:
gsub("\\)|\\]", "", Test)
#[1] "-0.158" "0.426" "1.01" "1.6" "2.18" "2.77"
Or, instead of escaping, place the brackets inside a bracket expression []:
gsub("[][()]", "", Test)
#[1] "-0.158" "0.426" "1.01" "1.6" "2.18" "2.77"
If we want to extract instead of remove, use either gregexpr/regmatches from base R or str_extract from stringr, with a pattern for numbers that may start with - and contain a decimal point (.):
library(stringr)
str_extract(Test, "-?[0-9.]+")
#[1] "-0.158" "0.426" "1.01" "1.6" "2.18" "2.77"
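The base R route mentioned above is not shown in the answer, so here is a rough sketch of it. The `Test` vector is hypothetical (the original is not given) and simply mixes stray parentheses and brackets around numbers.

```r
# Hypothetical input; the original Test vector is not shown in the answer
Test <- c("(-0.158", "0.426]", "[1.01)", "1.6", "(2.18)", "2.77]")

# gregexpr() finds all matches per element; regmatches() pulls them out as a list
matches <- regmatches(Test, gregexpr("-?[0-9.]+", Test))

# One match per element here, so unlist() gives a plain character vector
extracted <- unlist(matches)
print(extracted)
# [1] "-0.158" "0.426"  "1.01"   "1.6"    "2.18"   "2.77"

# Convert to numeric if that is the end goal
values <- as.numeric(extracted)
```

Note that unlist() would change the length of the result if an element had zero or several matches, so this shortcut only suits inputs like the one assumed here.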
Difficulty to remove several parentheses in a string, using stringr, in R
We can use str_remove_all instead of str_remove, since str_remove removes only the first match:
library(stringr)
str_remove_all(x, "[()]")
#[1] "example"
replace text within parenthesis in R
Yes, use gsub() to replace all the text you don't want with an empty string.
x <- "Keep me (Remove Me 1). Again keep me (Remove Me 2). Again again keep me (Remove Me 3)."
Here is the regex you want:
gsub( " *\\(.*?\\) *", "", x)
[1] "Keep me. Again keep me. Again again keep me."
It works like this:
- The " *" finds 0 or more spaces before (and after) the parentheses.
- Since ( and ) are special symbols in a regex, you need to escape them, i.e. \\( and \\).
- The .*? is a wildcard that matches any characters, where the ? means to match in a non-greedy way. This is necessary because regex is greedy by default. In other words, by default the regex would start the match at the first opening parenthesis and end it at the last closing parenthesis.
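To see why the non-greedy form matters, this short sketch runs the same pattern with and without the ? on the example string from above:

```r
x <- "Keep me (Remove Me 1). Again keep me (Remove Me 2). Again again keep me (Remove Me 3)."

# Non-greedy: each pair of parentheses is matched separately
lazy <- gsub(" *\\(.*?\\) *", "", x)
print(lazy)
# [1] "Keep me. Again keep me. Again again keep me."

# Greedy: one match runs from the first "(" to the last ")"
greedy <- gsub(" *\\(.*\\) *", "", x)
print(greedy)
# [1] "Keep me."
```

The greedy version swallows the text between the parenthesised parts as well, which is rarely what is wanted.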