Remove Parentheses from a Character String

Remove parentheses and text within from strings in R

A gsub should work here

gsub("\\s*\\([^\\)]+\\)","",as.character(companies$Name))
# or using "raw" strings as of R 4.0
gsub(r"{\s*\([^\)]+\)}","",as.character(companies$Name))

# [1] "Company A Inc" "Company B" "Company C Inc."
# [4] "Company D Inc." "Company E"

Here we just replace occurrences of "(...)" with nothing (also removing any leading whitespace). R makes it look worse than it is because of all the escaping: parentheses are special characters in regular expressions, and each backslash must itself be doubled inside an ordinary R string.

Removing parentheses in R

Parentheses are regex metacharacters, so they either need to be escaped (with \\) or placed inside a character class (square brackets), where they are read as literal characters.

gsub("[()]", "", x)
#[1] "40.703707008, -73.943257966"

Removing parentheses from string with regex

You've specified two characters to match:

  1. Any character that isn't a letter ([^a-zA-Z]), immediately followed by
  2. Any character at all (.)

The first place in the string where both criteria are met is 04.

You may wish to match strings at least one character long that do not contain letters, in which case you want + instead of .:

>>> re.search('[^a-zA-Z]+', word)
<re.Match object; span=(7, 10), match='04)'>

The * character would be used instead of + if you wanted to match zero or more occurrences, instead of one or more occurrences. In this case, using * instead of + produces an empty string, as it matches at the very beginning.
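To see the three variants side by side, here is a minimal sketch. The value of `word` is an assumption chosen for illustration (the original input is not shown here); it is picked so that the `+` match lands on the same span as above.

```python
import re

# Hypothetical input: the original `word` is not shown in the text.
word = "example04)"

# One non-letter followed by any single character: exactly two characters.
two_chars = re.search(r"[^a-zA-Z].", word).group()

# One or more non-letters: the whole run of non-letter characters.
one_or_more = re.search(r"[^a-zA-Z]+", word).group()

# Zero or more non-letters: succeeds immediately at index 0 with an
# empty match, because the first character is a letter.
zero_or_more = re.search(r"[^a-zA-Z]*", word).group()

print(repr(two_chars))     # '04'
print(repr(one_or_more))   # '04)'
print(repr(zero_or_more))  # ''
```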

stringr: Removing Parentheses and Brackets from string

We can use | (alternation) to match either closing bracket:

gsub("\\)|\\]", "", Test)
#[1] "-0.158" "0.426" "1.01" "1.6" "2.18" "2.77"

Or, instead of escaping, place the brackets inside a character class ([]):

gsub("[][()]", "", Test)
#[1] "-0.158" "0.426" "1.01" "1.6" "2.18" "2.77"

If we want to extract the numbers instead of removing the brackets, we can use either gregexpr/regmatches from base R or str_extract from stringr, with a pattern for a number that may start with - and may contain a .

library(stringr)
str_extract(Test, "-?[0-9.]+")
#[1] "-0.158" "0.426" "1.01" "1.6" "2.18" "2.77"

Difficulty to remove several parentheses in a string, using stringr, in R

Use str_remove_all instead of str_remove, because str_remove removes only the first match.

library(stringr)
str_remove_all(x, "[()]")
#[1] "example"

Regex to remove parentheses and inner contents only if contents have certain words

Supposing you're using a regex to replace the substrings that match it, you can use: \([^)]*word[^)]*\)

And replace matches with an empty string.

That regex finds a parenthesized block that contains the word, with any characters before or after it. "Any characters" here means anything except a closing parenthesis, since a closing parenthesis would mean the block has already ended.
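As a rough sketch of that replacement in Python's `re` module: the function name `remove_parens_containing` and the sample text are hypothetical, introduced only for illustration. `re.escape` is used so the word is treated literally even if it contains regex metacharacters.

```python
import re

def remove_parens_containing(text, word):
    """Remove each "(...)" block whose contents include `word`.

    Hypothetical helper for illustration; it builds the pattern
    \\([^)]*word[^)]*\\) described above.
    """
    pattern = r"\([^)]*" + re.escape(word) + r"[^)]*\)"
    return re.sub(pattern, "", text)

# Hypothetical sample text.
sample = "Alpha (draft copy) Beta (final) Gamma (old draft)"
cleaned = remove_parens_containing(sample, "draft")

# Blocks containing "draft" are removed; "(final)" is left alone.
print(repr(cleaned))
```

Note that this matches the word as a substring, so "draft" would also match inside "drafted", and any whitespace around a removed block is left in place.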


