str_detect for multiple patterns
Separate the alternatives with | (regex OR) inside a single pair of quotes, so that the whole pattern is one string.
> words <- c("quantity", "single", "double", "triple", "awful")
> set.seed(1234)
> df = tibble(col = sample(words,10, replace = TRUE))
> df
# A tibble: 10 x 1
col
<chr>
1 triple
2 single
3 awful
4 triple
5 quantity
6 awful
7 triple
8 single
9 single
10 triple
> df %>% filter(str_detect(col, "quantity|single"))
# A tibble: 4 x 1
col
<chr>
1 single
2 quantity
3 single
4 single
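One detail that is easy to miss: spaces inside the pattern are literal characters. The sketch below reuses the words vector from above and compares the pattern with and without spaces around the |; the variable names tight and spaced are only illustrative.

```r
library(stringr)

words <- c("quantity", "single", "double", "triple", "awful")

# No spaces around `|`: each alternative is matched exactly as written.
tight <- str_detect(words, "quantity|single")

# Spaces are part of the regex, so this searches for
# "quantity " (trailing space) or " single" (leading space).
spaced <- str_detect(words, "quantity | single")

print(tight)
print(spaced)
```

With this input the spaced pattern matches nothing, because none of the words contains a space.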
Ignore case with multiple strings using str_detect in R
A possible solution would be to bring everything to lower case and match that against ag|field.
dat %>%
  mutate(Class_2 = case_when(
    str_detect(string = str_to_lower(Class),
               pattern = "ag|field") ~ "Agricultural",
    TRUE ~ Class
  ))
# A tibble: 3 × 2
Class Class_2
<chr> <chr>
1 ag Agricultural
2 Agricultural--misc Agricultural
3 old field Agricultural
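As an alternative to lower-casing the column, the pattern itself can be made case-insensitive with regex(..., ignore_case = TRUE). The following is a minimal sketch on a plain character vector rather than the original dat; the first three values are copied from the output above and "Forest" is a hypothetical non-matching value added for contrast.

```r
library(stringr)

# First three values come from the Class column shown above;
# "Forest" is a made-up value that should be left unchanged.
Class <- c("ag", "Agricultural--misc", "old field", "Forest")

# regex(..., ignore_case = TRUE) ignores case in the match
# without modifying the strings being searched.
is_ag <- str_detect(Class, regex("ag|field", ignore_case = TRUE))

Class_2 <- ifelse(is_ag, "Agricultural", Class)
print(Class_2)
```

Keep in mind that "ag" matches anywhere in the string, so with either approach a value such as "Cottage" would also be relabelled.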
how to use str_detect within across when searching multiple columns for several search strings
Combine across with Reduce to select the rows that have any occurrence of the pattern.
library(dplyr)
library(stringr)
pat <- paste(search_string, collapse = "|")
raw_df %>%
  filter(Reduce(`|`, across(c(cust_name, other_desc),
                            ~str_detect(., regex(pat, ignore_case = TRUE)))))
However, I think if_any is more suitable here, as it was built to handle such cases:
raw_df %>%
  filter(if_any(c(cust_name, other_desc),
                ~str_detect(., regex(pat, ignore_case = TRUE))))
# cust_name other_desc trans val
# <chr> <chr> <chr> <int>
#1 Cisco nothing a 100
#2 bad_cs cisCo s 101
#3 Ibm nothing d 102
#4 bad_ib ibM f 102
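Neither raw_df nor search_string is defined in the answer, so here is a self-contained sketch with hypothetical stand-ins shaped like the output above. The "Dell" row and the exact search terms are assumptions added so that there is one row to filter out.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-ins for the original raw_df and search_string.
search_string <- c("cisco", "ibm")
raw_df <- tibble(
  cust_name  = c("Cisco", "bad_cs", "Ibm", "bad_ib", "Dell"),
  other_desc = c("nothing", "cisCo", "nothing", "ibM", "nothing"),
  trans      = c("a", "s", "d", "f", "g"),
  val        = c(100L, 101L, 102L, 102L, 103L)
)

# One case-insensitive regex covering all search terms.
pat <- paste(search_string, collapse = "|")

# Keep a row if any of the two columns matches the pattern.
hits <- raw_df %>%
  filter(if_any(c(cust_name, other_desc),
                ~ str_detect(.x, regex(pat, ignore_case = TRUE))))

print(hits)
```

Swapping if_any for if_all would instead keep only the rows where both columns match.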
str_detect with multiple strings (and not or) of the same kind using R
To require two separate whole-word occurrences of 'detect', you may use:
library(stringr)
str_detect(find.variable, '\\bdetect\\b.*\\bdetect\\b')
#[1] TRUE FALSE FALSE TRUE FALSE
If you also want to allow consecutive occurrences of 'detect' with nothing in between, drop the word boundaries and use
str_detect(find.variable, 'detect.*detect')
You can also use str_count to count the number of occurrences of 'detect' in each string.
str_count(find.variable, 'detect') == 2
#[1] TRUE FALSE FALSE TRUE TRUE
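Note that the last value is TRUE with str_count, because this pattern has no word boundaries and so also counts 'detect' where it is part of a longer word.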
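The original find.variable is not shown, so the sketch below uses a hypothetical vector x to illustrate how the choice between == 2 and >= 2, and between a bare and a word-bounded pattern, changes the result of str_count.

```r
library(stringr)

# Hypothetical input; not the data from the question.
x <- c("detect and detect", "detect once",
       "detect detect detect", "detected detect")

# Exactly two occurrences of the substring "detect".
exactly_two <- str_count(x, "detect") == 2

# Two or more occurrences of the substring.
at_least_two <- str_count(x, "detect") >= 2

# Two or more occurrences as a whole word only.
whole_words <- str_count(x, "\\bdetect\\b") >= 2

print(exactly_two)
print(at_least_two)
print(whole_words)
```

The third string has three occurrences, so it fails the == 2 test, and "detected" in the fourth string counts only when word boundaries are left out.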
Detect multiple strings with dplyr and stringr
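str_detect is vectorised over pattern: a pattern vector longer than 1 is matched element-wise against the strings (with recycling), not every pattern against every string. Either combine the patterns into one regex using paste(..., collapse = '|') or loop over the patterns and use any: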
sapply(test.data$item, function(x) any(sapply(fruit, str_detect, string = x)))
# Apple Bear Orange Pear Two Apples
# TRUE FALSE TRUE TRUE TRUE
str_detect(test.data$item, paste(fruit, collapse = '|'))
# [1] TRUE FALSE TRUE TRUE TRUE
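To see why a pattern vector does not behave like "match any of these", here is a small sketch. The item and fruit vectors are hypothetical, modelled on the names in the output above, since the original test.data and fruit are not shown.

```r
library(stringr)

# Hypothetical data modelled on the output above.
item  <- c("Apple", "Bear", "Orange", "Pear", "Two Apples")
fruit <- c("Apple", "Orange", "Pear")

# With two vectors of equal length, str_detect pairs them up:
# "Apple" is tested against "Pear", and "Pear" against "Apple".
pairwise <- str_detect(c("Apple", "Pear"), c("Pear", "Apple"))

# Collapsing to a single regex tests every item against every fruit.
any_fruit <- str_detect(item, paste(fruit, collapse = "|"))

print(pairwise)
print(any_fruit)
```

Both strings in the pairwise call are fruit names, yet neither matches, because each one is only compared with the pattern in the same position.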
Filter by multiple patterns with filter() and str_detect()
The correct syntax to accomplish this with filter() and str_detect() would be
df %>%
  filter(
    str_detect(letters, "a|f|o")
  )
# numbers letters
#1 1 a
#2 6 f
#3 15 o
#4 27 a
#5 32 f
#6 41 o
R exact match for multiple patterns
We could use the word boundary (\\b) to avoid unwanted partial matches
str_detect(myfile,paste0("\\b(", paste(toMatch, collapse="|"), ")\\b"))
[1] TRUE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
For the elements used here, a plain %in% comparison gives the same result, since it tests whether each element equals one of the search terms in full
myfile %in% toMatch
[1] TRUE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
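The two approaches only agree when each element is a single word. The sketch below uses hypothetical values for myfile and toMatch (the originals are not shown) to illustrate where they differ, and adds an anchored regex as a pattern-based equivalent of %in%.

```r
library(stringr)

# Hypothetical data; not the vectors from the question.
myfile  <- c("apple", "pineapple", "apple pie", "pear")
toMatch <- c("apple", "pear")

alternatives <- paste(toMatch, collapse = "|")

# Word boundaries: a term must appear as a whole word somewhere in the string.
word_match <- str_detect(myfile, paste0("\\b(", alternatives, ")\\b"))

# %in%: the whole element must equal one of the terms.
whole_match <- myfile %in% toMatch

# Anchors ^ and $ make the regex behave like %in% for these plain terms.
anchored <- str_detect(myfile, paste0("^(", alternatives, ")$"))

print(word_match)
print(whole_match)
print(anchored)
```

Here "apple pie" passes the word-boundary test but not %in%, while "pineapple" fails both.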
summarise to sum cells containing multiple strings
Call str_detect once per string and combine the calls with & to get an AND condition.
library(dplyr)
library(stringr)
summarise("Total" = n(),
          "CD8:PD-1" = sum(str_detect(ordered, "PD-1") &
                           str_detect(ordered, "CD8"), na.rm = TRUE),
          "CD8:PD:1:FoxP3" = sum(str_detect(ordered, "PD-1") &
                                 str_detect(ordered, "CD8") &
                                 str_detect(ordered, "FoxP3"), na.rm = TRUE))
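When the list of required strings grows, the repeated str_detect calls can be folded into one helper. This is only a sketch: has_all is a hypothetical helper name, and the ordered vector below is made-up data standing in for the original column.

```r
library(stringr)

# Made-up values standing in for the original `ordered` column.
ordered <- c("CD8:PD-1", "CD8:FoxP3", "CD8:FoxP3:PD-1", NA, "PD-1")

# Hypothetical helper: TRUE where x contains every term.
# fixed() treats each term as a literal string, not a regex.
has_all <- function(x, terms) {
  Reduce(`&`, lapply(terms, function(term) str_detect(x, fixed(term))))
}

# na.rm = TRUE drops the NA produced by the missing value.
both_count  <- sum(has_all(ordered, c("PD-1", "CD8")), na.rm = TRUE)
three_count <- sum(has_all(ordered, c("PD-1", "CD8", "FoxP3")), na.rm = TRUE)

print(both_count)
print(three_count)
```

Inside summarise, the same helper could be called in place of the chained & expressions.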