Regex in R to match strings in square brackets
Though it might be working in certain cases, your pattern looks off to me. I think it should be this:
pattern <- "(\\[.*?\\])"
matches <- gregexpr(pattern, ovl)
overlap <- regmatches(ovl, matches)
overlap_clean <- unlist(overlap)
overlap_clean
[1] "[yes right]" "[ we::ll]" "[°well right° ]"
This matches and captures a bracketed term, using the Perl-style lazy dot (.*?) to make sure we stop at the first closing bracket.
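To see why the lazy quantifier matters, here is a minimal sketch comparing it with the greedy form and with a negated character class. The ovl value below is made up for illustration, since the original input is not shown.

```r
# Hypothetical input; the real `ovl` from the question is not shown.
ovl <- "a [yes right] b [ we::ll] c"

# Lazy: stops at the first closing bracket.
lazy <- unlist(regmatches(ovl, gregexpr("\\[.*?\\]", ovl)))

# Greedy: runs on to the last closing bracket in the string.
greedy <- unlist(regmatches(ovl, gregexpr("\\[.*\\]", ovl)))

# Negated class: same result as lazy, without relying on laziness.
negated <- unlist(regmatches(ovl, gregexpr("\\[[^]]*\\]", ovl)))

print(lazy)
print(greedy)
print(negated)
```

The negated class [^]]* is an alternative that behaves the same way in any regex flavour, which can be handy if the pattern is later moved to a tool without lazy quantifiers.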
How to search for strings with parentheses in R
As noted by @joran in the comments, the pattern should look like so:
patterns<-c("dog","cat","\\(fish\\)")
The \\ sequences tell R to read the parentheses literally when searching for the pattern.
The easiest way to achieve this, if you don't want to make the change manually:
patterns <- gsub("([()])","\\\\\\1", patterns)
Which will result in:
[1] "dog" "cat" "\\(fish\\)"
If you're not very familiar with regular expressions, here is what happens: the pattern looks for any one character listed within the square brackets, i.e. either ( or ). The round brackets around that tell it to save whatever matched. In the second argument, the first four backslashes put a literal backslash in front of the match (R's string parser reduces the four to two, and gsub reads those two as one literal backslash; print() then displays it doubled, as above), and the \\1 tells it to add whatever it saved from the first argument, i.e. either ( or ).
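As a rough check of the escaping step, the sketch below applies the escaped pattern with grepl and compares it with fixed = TRUE, which skips regex interpretation altogether. The vector x is a hypothetical example and not part of the original question.

```r
patterns <- c("dog", "cat", "(fish)")

# Put a backslash in front of every "(" and ")".
escaped <- gsub("([()])", "\\\\\\1", patterns)

# Hypothetical strings to search in.
x <- c("a dog", "one (fish)", "fish")

# Regex search with the escaped pattern: parentheses are literal.
hits_regex <- grepl(escaped[3], x)

# Alternative: treat the unescaped pattern as a plain string.
hits_fixed <- grepl(patterns[3], x, fixed = TRUE)

cat(escaped[3], "\n")   # cat shows the actual characters, without print's doubling
print(hits_regex)
print(hits_fixed)
```

If every pattern is meant literally, fixed = TRUE is probably the simpler route; the gsub approach is useful when literal and regex patterns are mixed in one vector.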
R grep final outer set of brackets
Use the stringr library and its str_extract function:
library(stringr)
str_extract(vec,paste(c("important1","important2","important3"),collapse="|"))
Resulting in
"important1" "important2" NA "important3"
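If you expect further terms such as "important4" or "important5", build the pattern programmatically:
n <- 10
# Descending order, so that "important10" is tried before its prefix "important1"
to_match <- paste0("important", n:1, collapse = "|")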
str_extract(vec,to_match)
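For anyone who prefers to stay in base R, here is a minimal sketch of the same idea. The vec contents are hypothetical (the original vector is not shown), and the word boundaries are my own addition so that the order of the alternatives no longer matters.

```r
# Hypothetical input vector.
vec <- c("x important1", "important2 y", "nothing", "important10")

# Build "\\b(?:important1|...|important10)\\b"; the \\b anchors prevent
# "important1" from matching inside "important10".
to_match <- paste0("\\b(?:", paste0("important", 1:10, collapse = "|"), ")\\b")

m <- regexpr(to_match, vec, perl = TRUE)

# regmatches() drops non-matches, so fill a vector of NAs by position
# to mimic what str_extract returns.
out <- rep(NA_character_, length(vec))
out[m > 0] <- regmatches(vec, m)

print(out)
```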
How to extract string match within brackets in R?
We can use str_replace, which directly extracts the elements:
library(stringr)
str_replace(str2, "\\[([^]]+)\\].*", "\\1")
#[1] "\"a\", \"b\""
Or with str_match
str_match(str2, "\\[([^]]+)")[,2]
#[1] "\"a\", \"b\""
data
str2 <- '["a", "b"]'
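A base R sketch of the same two approaches, in case stringr is not available: sub plays the role of str_replace, and regexec with regmatches plays the role of str_match. This is only one possible translation.

```r
str2 <- '["a", "b"]'

# Replace the whole string with the first capture group.
inner_sub <- sub("\\[([^]]+)\\].*", "\\1", str2)

# regexec returns the full match followed by the capture groups.
m <- regmatches(str2, regexec("\\[([^]]+)\\]", str2))
inner_regexec <- m[[1]][2]

print(inner_sub)
print(inner_regexec)
```

One difference worth keeping in mind: sub (like str_replace) returns the input unchanged when nothing matches, whereas the regexec route gives an empty match for that element, so the second form makes a failed match easier to detect.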
R stringr regex to extract characters within brackets
You can use
library(stringr)
test <- "asdf asiodjfojewl kjwnkjwnefkjnkf [asdf] fasdfads fewrw [keyword<1] keyword [keyword>1]"
## If the word is right after "[":
str_extract_all(test, "(?<=\\[)keyword[^\\]\\[]*(?=])")
## If the word is anywhere between "[" and "]":
str_extract_all(test, "(?<=\\[)[^\\]\\[]*?keyword[^\\]\\[]*(?=])")
## =>
# [[1]]
# [1] "keyword<1" "keyword>1"
The regexps match:
(?<=\[) - a positive lookbehind that requires a [ char to appear immediately to the left of the current location
keyword - a literal string
[^\]\[]* - zero or more chars other than [ and ]
(?=]) - a positive lookahead that requires a ] char to appear immediately to the right of the current location.
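The lookaround pattern described above also works in base R, provided perl = TRUE is set (the default engine does not support lookbehind). A small sketch, with a test string adapted from the one in the answer plus one extra hypothetical item, "[my keyword]":

```r
test <- "x [asdf] y [keyword<1] keyword [keyword>1] [my keyword]"

# "keyword" anywhere between "[" and "]", brackets excluded from the match.
pat <- "(?<=\\[)[^\\]\\[]*?keyword[^\\]\\[]*(?=\\])"

found <- unlist(regmatches(test, gregexpr(pat, test, perl = TRUE)))
print(found)
```

The bare keyword between the two bracketed items is skipped because no [ stands immediately to the left of any position where a match could start.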
Extract strings in round brackets using regex in R
We can use str_extract to extract the pattern, which describes an optional number followed by a decimal point and then another optional digit, all inside parentheses. The parts are made optional with "?" so that the empty value "(.)" is matched as well.
library(stringr)
vec <- str_extract(yy, "(\\((\\d+)?(\\.(\\d)?\\)))")
vec
#[1] "(.)" NA "(0.5)" NA "(3.2)"
and then use is.na to remove the NA elements
vec[!is.na(vec)]
#[1] "(.)" "(0.5)" "(3.2)"
Or, using the same regular expression with base R, regmatches saves the step of removing NA values.
regmatches(yy, regexpr("(\\((\\d+)?(\\.(\\d)?\\)))", yy))
#[1] "(.)" "(0.5)" "(3.2)"
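Since yy is not shown in the answer, here is a self-contained sketch with a made-up vector. It uses a slightly simplified form of the pattern (the outer group is dropped, the logic is the same) and also strips the parentheses afterwards.

```r
# Hypothetical input; the original `yy` is not shown.
yy <- c("a (.)", "b", "c (0.5)", "d (12)", "e (3.2)")

# "(" + optional digits + "." + at most one digit + ")"
pat <- "\\((\\d+)?\\.(\\d)?\\)"

hits <- regmatches(yy, regexpr(pat, yy))

# Drop the surrounding parentheses if only the contents are needed.
inner <- gsub("[()]", "", hits)

print(hits)
print(inner)
```

Note that "(12)" is not returned because the decimal point is mandatory, and a value such as "(3.25)" would not match either, since only one digit is allowed after the point. Changing (\\d)? to \\d* should relax that if needed.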
bracket expressions in grep patterns
I would take a look at the grep documentation, which explains how these modifiers are used.
Grep understands three types of regular expressions: basic, extended and PCRE.
With basic regexp in grep, quantifiers such as ? and + have to be escaped with backslashes.
The repetition operators (or quantifiers) are as follows:
? The preceding item is optional and matched at most once.
* The preceding item will be matched zero or more times.
+ The preceding item will be matched one or more times.
..
grep -e '^[[:digit:]]\+[[:space:]]\+foo' foo
The -E modifier interprets the pattern as an extended regular expression.
grep -E '^[0-9]+\s+foo' foo
Perl one-liner without using grep:
perl -ne '/^[\d ]+foo/ and print' foo
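For comparison, a rough R counterpart of the commands above. R's grepl uses extended regular expressions by default and switches to PCRE with perl = TRUE; it has no basic mode, so the backslash-escaped quantifiers are not needed. The lines vector is a hypothetical stand-in for the contents of the file foo.

```r
# Hypothetical stand-in for the lines of the file `foo`.
lines <- c("12 foo", "foo", "7\tfoo bar", "x 3 foo")

# Extended syntax with POSIX classes (R's default engine).
ere <- grepl("^[[:digit:]]+[[:space:]]+foo", lines)

# PCRE syntax with shorthand classes.
pcre <- grepl("^\\d+\\s+foo", lines, perl = TRUE)

print(lines[ere])
print(lines[pcre])
```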