R Grep Pattern Regex with Brackets

Regex in R to match strings in square brackets

Though it might be working in certain cases, your pattern looks off to me. I think it should be this:

pattern <- "(\\[.*?\\])"
matches <- gregexpr(pattern, ovl)
overlap <- regmatches(ovl, matches)
overlap_clean <- unlist(overlap)
overlap_clean

[1] "[yes right]" "[ we::ll]" "[°well right° ]"


This matches and captures a bracketed term. The Perl-style lazy dot (.*?) makes the match stop at the first closing bracket rather than running on to the last one in the string.
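To make the difference between the lazy and the greedy dot concrete, here is a minimal sketch. The `ovl` vector below is hypothetical sample data (the original one is not shown), and `perl = TRUE` is added only so that the lazy quantifier is handled by the PCRE engine.

```r
# Hypothetical sample data; the original `ovl` vector is not shown in the answer.
ovl <- c("so [yes right] then", "no brackets here", "[a] and [b]")

# Lazy: each match stops at the first closing bracket.
lazy <- unlist(regmatches(ovl, gregexpr("\\[.*?\\]", ovl, perl = TRUE)))

# Greedy: each match runs to the last closing bracket in the string.
greedy <- unlist(regmatches(ovl, gregexpr("\\[.*\\]", ovl, perl = TRUE)))

print(lazy)    # "[yes right]" "[a]" "[b]"
print(greedy)  # "[yes right]" "[a] and [b]"
```

Strings without any bracketed term contribute nothing to the result, because `unlist` drops the empty matches.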

How to search for strings with parentheses in R

As noted by @joran in the comments, the pattern should look like so:

patterns<-c("dog","cat","\\(fish\\)")

The double backslashes (\\) tell R to read the parentheses literally when searching for the pattern.

If you don't want to make the change manually, the easiest way to achieve this is:

patterns <- gsub("([()])","\\\\\\1", patterns)

Which will result in:

[1] "dog" "cat" "\\(fish\\)"

If you're not very familiar with regular expressions, what happens here is that [()] matches any one character listed within the square brackets, i.e. either ( or ). The round brackets around it tell gsub to save whatever it matched. In the second argument, the first four backslashes produce one literal backslash in the result: R's string parser turns each pair into a single backslash, and gsub then reads the remaining pair as one literal backslash (it is shown as \\ when the result is printed). The \\1 that follows inserts whatever was saved from the first argument, i.e. either ( or ).
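A minimal sketch of the escaping step followed by an actual search may help here. The `animals` vector is made up for illustration, and the `fixed = TRUE` variant is an alternative that is not part of the original answer: it treats the pattern as a plain string, so no escaping is needed.

```r
# Unescaped search terms; "(fish)" contains regex metacharacters.
patterns <- c("dog", "cat", "(fish)")

# Put a literal backslash in front of every ( or ).
escaped <- gsub("([()])", "\\\\\\1", patterns)
print(escaped)  # "dog" "cat" "\\(fish\\)"

# Hypothetical data to search in.
animals <- c("big dog", "a (fish) here", "fish")

# The escaped pattern only matches the literal text "(fish)".
hits_regex <- grepl(escaped[3], animals)

# Alternative: skip escaping and match the string literally.
hits_fixed <- grepl("(fish)", animals, fixed = TRUE)

print(hits_regex)  # FALSE TRUE FALSE
print(hits_fixed)  # FALSE TRUE FALSE
```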

R grep final outer set of brackets

Use the stringr library and its str_extract function:

library(stringr)
str_extract(vec,paste(c("important1","important2","important3"),collapse="|"))

Resulting in

"important1" "important2" NA           "important3"

If you expect more terms later ("important4", "important5" and so on), build the alternation programmatically:

n<-10
to_match <- paste0("important", seq_len(n), collapse = "|")
str_extract(vec,to_match)
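For readers without stringr, the same idea can be sketched in base R. The `vec` below is hypothetical (the original data is not shown); `regexpr` returns -1 where nothing matches, and those positions are left as NA to mirror what str_extract returns.

```r
# Hypothetical input; the original `vec` is not shown in the answer.
vec <- c("x important1 y", "important2", "nothing here", "z important3")

# Build "important1|important2|important3" from its parts.
to_match <- paste0("important", 1:3, collapse = "|")

# First match per element; -1 means no match.
m <- regexpr(to_match, vec)

# Start with all NA and fill in the elements that did match.
out <- rep(NA_character_, length(vec))
out[m != -1] <- regmatches(vec, m)

print(out)  # "important1" "important2" NA "important3"
```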

How to extract string match within brackets in R?

We can use str_replace which would directly extract the elements

library(stringr)    
str_replace(str2, "\\[([^]]+)\\].*", "\\1")
#[1] "\"a\", \"b\""

Or with str_match

str_match(str2, "\\[([^]]+)")[,2]
#[1] "\"a\", \"b\""

data

str2 <- '["a", "b"]'
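As a rough base R counterpart to the two stringr calls, the sketch below uses `sub` with a backreference and `regmatches` with `regexec`. It relies only on the `str2` value given above; the variable names `via_sub` and `via_regexec` are made up for illustration.

```r
str2 <- '["a", "b"]'

# Replace the whole bracketed string with the captured inner part.
via_sub <- sub("\\[([^]]+)\\].*", "\\1", str2)

# regexec keeps capture groups: element 1 is the full match, element 2 is group 1.
via_regexec <- regmatches(str2, regexec("\\[([^]]+)", str2))[[1]][2]

print(via_sub)      # "\"a\", \"b\""
print(via_regexec)  # "\"a\", \"b\""
```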

R stringr regex to extract characters within brackets

You can use

library(stringr)
test <- "asdf asiodjfojewl kjwnkjwnefkjnkf [asdf] fasdfads fewrw [keyword<1] keyword [keyword>1]"
## If the word is right after "[":
str_extract_all(test, "(?<=\\[)keyword[^\\]\\[]*(?=])")
## If the word is anywhere between "[" and "]":
str_extract_all(test, "(?<=\\[)[^\\]\\[]*?keyword[^\\]\\[]*(?=])")
## =>
# [[1]]
# [1] "keyword<1" "keyword>1"

See the R demo online.

The regexps match:

  • (?<=\[) - a positive lookbehind that requires a [ char to appear immediately to the left of the current location
  • keyword - a literal string
  • [^\]\[]* - zero or more chars other than [ and ]
  • (?=]) - a positive lookahead that requires a ] char to appear immediately to the right of the current location.

See the online regex demo.
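The lookaround pattern described above can also be tried without stringr. This is a sketch under the assumption that PCRE is used via `perl = TRUE`, since base R's default engine does not support lookbehind; the closing bracket in the lookahead is escaped here only for readability.

```r
test <- "asdf asiodjfojewl kjwnkjwnefkjnkf [asdf] fasdfads fewrw [keyword<1] keyword [keyword>1]"

# "[" must precede the match, "]" must follow it, and neither is part of the result.
pat <- "(?<=\\[)keyword[^\\]\\[]*(?=\\])"

found <- regmatches(test, gregexpr(pat, test, perl = TRUE))[[1]]
print(found)  # "keyword<1" "keyword>1"
```

The bare word keyword outside any brackets is skipped because no [ stands immediately to its left.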

Extract strings in round brackets using regex in R

We can use str_extract with a pattern that reads: an opening parenthesis, an optional run of digits, a decimal point, an optional digit, and a closing parenthesis. The digits are made optional (with "?") so that the empty value "(.)" is matched as well.

library(stringr)
vec <- str_extract(yy, "(\\((\\d+)?(\\.(\\d)?\\)))")
vec
#[1] "(.)" NA "(0.5)" NA "(3.2)"

and then use is.na to remove NA elements

vec[!is.na(vec)]
#[1] "(.)" "(0.5)" "(3.2)"

Or using the same regular expression with base R regmatches saves a step to remove NA values.

regmatches(yy, regexpr("(\\((\\d+)?(\\.(\\d)?\\)))", yy))
#[1] "(.)" "(0.5)" "(3.2)"
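Since `yy` is not shown in the answer, the sketch below uses a hypothetical vector that would produce the same output, with the redundant outer groups removed from the pattern. It also shows why the base R route needs no NA cleanup: `regmatches` simply omits elements with no match.

```r
# Hypothetical input; the original `yy` is not shown.
yy <- c("a (.)", "b", "c (0.5)", "d (x)", "e (3.2)")

# "(" + optional digits + "." + optional digit + ")"
pat <- "\\((\\d+)?\\.(\\d)?\\)"

# Non-matching elements ("b" and "d (x)") are dropped from the result.
res <- regmatches(yy, regexpr(pat, yy))
print(res)  # "(.)" "(0.5)" "(3.2)"
```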

bracket expressions in grep patterns

I would take a look at the grep documentation, which explains the available options and regular expression flavours.

Grep understands three types of regular expressions: basic, extended and PCRE.

With basic regexp in grep, quantifiers such as ? and + have to be escaped with backslashes.

The repetition operators (or quantifiers) are as follows:

? The preceding item is optional and matched at most once.
* The preceding item will be matched zero or more times.
+ The preceding item will be matched one or more times.

..

grep -e '^[[:digit:]]\+[[:space:]]\+foo' foo

The -E option interprets the pattern as an extended regular expression.

grep -E '^[0-9]+\s+foo' foo

Perl one-liner without using grep:

perl -ne '/^[\d ]+foo/ and print' foo
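For comparison, the same two patterns can be tried from R. This is only a sketch with made-up input lines: R's default engine already behaves like an extended regex, so + needs no backslash, and `perl = TRUE` gives the Perl-style character class.

```r
# Hypothetical lines standing in for the contents of the file "foo".
lines <- c("12  foo bar", "foo 12", "7 foo", "x 3 foo")

# POSIX classes, as in the grep examples: leading digits, whitespace, then "foo".
posix_hits <- grepl("^[[:digit:]]+[[:space:]]+foo", lines)

# Perl-style class, as in the perl one-liner: leading digits and/or spaces, then "foo".
perl_hits <- grepl("^[\\d ]+foo", lines, perl = TRUE)

print(posix_hits)  # TRUE FALSE TRUE FALSE
print(perl_hits)   # TRUE FALSE TRUE FALSE
```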

