Filter a Vector of Strings Based on String Matching

Filter a vector of strings based on string matching

You can use grepl() with a regular expression:

X[grepl("^m.*\\.log", X)]
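As a minimal sketch of what this pattern keeps and drops, the vector X below is made up for illustration (the original does not show it). It also shows that the pattern only requires ".log" to appear somewhere after the leading "m"; adding "$" is one way to require that the string ends in ".log", if that is what you want.

```r
# Hypothetical file names for illustration; X is not defined in the original.
X <- c("model.log", "main.log", "m1.log.bak", "test.log", "m.txt")

# Starts with "m" and contains ".log" somewhere after it.
loose <- X[grepl("^m.*\\.log", X)]

# Adding "$" additionally requires the string to end in ".log".
strict <- X[grepl("^m.*\\.log$", X)]

print(loose)
print(strict)
```

Here "m1.log.bak" is kept by the first pattern but not by the second.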

Filter data.table based on string match from another vector

You can create a pattern dynamically from y.

library(data.table)
pat <- sprintf('^(%s)', paste0(y, collapse = '|'))
pat
#[1] "^(a|b)"

and use it to subset the data.

dt[grepl(pat, s)]

#   x   s
#1: 1   a
#2: 2  ab
#3: 3 b.c
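The pattern-building step can be tried on its own in base R. In this sketch, y and s are hypothetical stand-ins (the original shows neither in full), and the startsWith() comparison is an added cross-check rather than part of the answer above. It may be worth preferring when y could contain regex metacharacters such as "." or "|", since startsWith() compares literal text.

```r
# Hypothetical inputs; the original does not show y or the full s column.
y <- c("a", "b")
s <- c("a", "ab", "b.c", "ca", "d")

# Build "^(a|b)": match strings that start with any element of y.
pat <- sprintf("^(%s)", paste0(y, collapse = "|"))
hits <- grepl(pat, s)
print(s[hits])

# Regex-free cross-check: does each string start with any element of y?
hits_fixed <- Reduce(`|`, lapply(y, function(p) startsWith(s, p)))
print(identical(hits, hits_fixed))
```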

R - Filter rows that contain a string from a vector

We can use grep():

df1[grep(paste(v1, collapse="|"), df1$animal),]

Or using dplyr

library(dplyr)
df1 %>%
  filter(grepl(paste(v1, collapse = "|"), animal))
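A small base R sketch of the same idea follows, with df1 and v1 invented for illustration. It also shows one caveat: the alternation matches substrings, so "cat" also hits "caterpillar". Wrapping the pattern in word boundaries ("\\b") is one possible way to restrict it to whole words, assuming that is the behaviour you need.

```r
# Hypothetical data; df1 and v1 are not shown in the original.
df1 <- data.frame(
  id = 1:4,
  animal = c("dog", "house cat", "caterpillar", "bird"),
  stringsAsFactors = FALSE
)
v1 <- c("cat", "dog")

# Substring match: any row whose animal contains "cat" or "dog".
pat <- paste(v1, collapse = "|")
sub_match <- df1[grepl(pat, df1$animal), ]
print(sub_match$animal)

# Whole-word match: "\\b" marks a word boundary on each side.
pat_word <- paste0("\\b(", pat, ")\\b")
word_match <- df1[grepl(pat_word, df1$animal), ]
print(word_match$animal)
```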

Filter vector elements containing and not containing multiple strings

"foo" OR "bar" without "cpp" and "quux":

filenames[grepl("foo|bar", filenames) & !grepl("cpp|quux", filenames)]
[1] "foo.txt" "bar.R" "foo_bar"

"foo" AND "bar" without "cpp" and "quux":

filenames[grepl("(?=.*foo)(?=.*bar)", filenames, perl = TRUE) & !grepl("cpp|quux", filenames)]
[1] "foo_bar"
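Both filters can be reproduced with a made-up filenames vector (the original does not list its input, so the values below are assumptions chosen to give the outputs shown above). The sketch also checks that the lookahead version of AND agrees with simply combining two grepl() calls, which avoids perl = TRUE.

```r
# Hypothetical input chosen to reproduce the outputs shown above.
filenames <- c("foo.txt", "bar.R", "foo_bar", "foo.cpp", "bar_quux", "baz.R")

excluded <- grepl("cpp|quux", filenames)

# "foo" OR "bar", without "cpp" and "quux".
either <- filenames[grepl("foo|bar", filenames) & !excluded]

# "foo" AND "bar" via lookaheads (requires perl = TRUE).
both_lookahead <- filenames[
  grepl("(?=.*foo)(?=.*bar)", filenames, perl = TRUE) & !excluded
]

# Same AND condition as two plain grepl() calls.
both_plain <- filenames[
  grepl("foo", filenames) & grepl("bar", filenames) & !excluded
]

print(either)
print(both_lookahead)
```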

Filter character vector based on first two elements

You can use a regular expression (regex) to find strings that start with "01" or "02".

The base R approach is to use grep(), which returns the indices of the strings that match a pattern. Here's an example. Note that I've changed the 2nd and 4th data elements to show how searching for just "01" or "02" leads to an incorrect answer:

d <- c("0115", "0102", "0256", "0201")

grep("01", d)
#> [1] 1 2 4

d[grep("01", d)]
#> [1] "0115" "0102" "0201"

Because this searches for "01" anywhere in the string, you get "0201" in the mix. To avoid this, add "^" to the pattern to anchor the match at the start of the string:

grep("^01", d)
#> [1] 1 2

d[grep("^01", d)]
#> [1] "0115" "0102"

If you use the stringr package, you can also use str_detect() in the same way:

library(stringr)

d[str_detect(d, "^01")]
#> [1] "0115" "0102"
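The examples above only test for "01". To cover both prefixes at once, one option is an alternation inside the anchored pattern; another is to compare the first two characters directly with substr(). This is a sketch: the two extra elements appended to d are hypothetical, added so that something is actually filtered out.

```r
# d from the example above, plus two hypothetical non-matching elements.
d <- c("0115", "0102", "0256", "0201", "1002", "3301")

# Anchored alternation: starts with "01" or "02".
by_regex <- d[grepl("^(01|02)", d)]

# Regex-free alternative: compare the first two characters.
by_substr <- d[substr(d, 1, 2) %in% c("01", "02")]

print(by_regex)
print(identical(by_regex, by_substr))
```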

