How to ignore case when using str_detect?
Wrap the search string in fixed(), which accepts an ignore_case argument:
str_detect('TOYOTA subaru', fixed('toyota', ignore_case=TRUE))
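As a small sketch of how fixed() and regex() differ when case is ignored (the vector `cars` is made up for illustration and is not from the original question): fixed() compares literal text, so a `|` in the pattern is just a character, while regex() treats it as alternation.

```r
library(stringr)

# Hypothetical example data, not from the original question
cars <- c("TOYOTA subaru", "Honda", "toyota|subaru")

# fixed(): literal match with case ignored
lit <- str_detect(cars, fixed("toyota", ignore_case = TRUE))

# fixed() treats "|" as a literal character,
# so only the element containing the exact text "toyota|subaru" matches
lit_pipe <- str_detect(cars, fixed("toyota|subaru", ignore_case = TRUE))

# regex(): "|" means "either side", still with case ignored
alt <- str_detect(cars, regex("honda|subaru", ignore_case = TRUE))
```

If the pattern is a plain word, either wrapper should work; once the pattern contains regex syntax, regex() is the one to reach for.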
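How to make str_detect case insensitive within case_when in R
You can also use regex() instead of fixed(). This is needed when the pattern uses alternation (|), which fixed() would match literally: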
dfm %>%
  mutate(source = case_when(
    str_detect(names, regex('email|facebook|instagram', ignore_case = TRUE)) ~ 'media',
    str_detect(names, 'walmart|costco|target') ~ 'store'))
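Note that in the snippet above only the first condition ignores case; the second pattern is still matched case-sensitively. A minimal sketch that wraps both patterns in regex() and adds a fallback branch might look like this (the contents of `dfm` and the "other" label are assumptions for illustration):

```r
library(dplyr)
library(stringr)

# Hypothetical data standing in for dfm
dfm <- tibble(names = c("Email", "FACEBOOK ad", "Walmart", "costco", "radio"))

out <- dfm %>%
  mutate(source = case_when(
    # both conditions ignore case
    str_detect(names, regex("email|facebook|instagram", ignore_case = TRUE)) ~ "media",
    str_detect(names, regex("walmart|costco|target", ignore_case = TRUE)) ~ "store",
    # anything unmatched gets a label instead of NA (assumed label)
    TRUE ~ "other"
  ))
```

Without the final `TRUE ~ ...` branch, rows that match neither pattern would get NA.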
Ignore case with multiple strings using str_detect in R
A possible solution would be to bring everything to lower case and match that against ag|field.
dat %>%
mutate(Class_2 = case_when(
str_detect(string = str_to_lower(Class),
pattern = "ag|field") ~ "Agricultural",
TRUE ~ Class
))
# A tibble: 3 × 2
Class Class_2
<chr> <chr>
1 ag Agricultural
2 Agricultural--misc Agricultural
3 old field Agricultural
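An alternative sketch that leaves the data untouched is to switch on case-insensitive matching inside the pattern with the inline flag `(?i)`, which stringr's regex engine (ICU) supports. The vector `classes` below is hypothetical, and the alternation is grouped so the flag clearly applies to both options:

```r
library(stringr)

# Hypothetical values, mixed case on purpose
classes <- c("AG", "Agricultural--misc", "Old Field", "forest")

# (?i) turns on case-insensitive matching for the group that follows
is_ag <- str_detect(classes, "(?i)(ag|field)")
```

This should be equivalent to wrapping the pattern in regex(..., ignore_case = TRUE); which one reads better is mostly a matter of taste.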
How to properly ignore case in regex
Consider this example:
df <- data.frame(a = c('This is not applicable', 'But this is applicable',
'This is still NOT aPPLicable'))
You need to use regex() inside one of the stringr functions, such as str_detect, here:
library(dplyr)
library(stringr)
df %>% filter(str_detect(a, regex('not applicable',
ignore_case = TRUE), negate = TRUE))
# a
#1 But this is applicable
Or, in base R, use subset with grepl:
subset(df, !grepl('not applicable', a, ignore.case = TRUE))
Using str_detect on data frame column values
You should first create the regex-pattern to be applied to the column, e.g.
string_detect <- paste(c('denied', 'attach', 'bulk'), collapse = "|")
Then you need to adjust your case_when code:
folno <- fol_all %>%
mutate(folno_flag = case_when(
str_detect(folno, string_detect) ~ 1,
TRUE ~ 0 ))
Result:
folno folno_flag
1 123denied 1
2 attached_as_test 1
3 dept_224 0
4 bulked_up 1
5 wsd2273 0
You do not need str_detect(folno, string_detect) == T, because str_detect already returns a logical value: the statement basically says that if str_detect(folno, string_detect) evaluates to TRUE, the result is 1, and 0 otherwise.
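To make that point concrete, here is a short sketch using the same sample values. It checks that the comparison with TRUE changes nothing, and shows that a logical vector can also be converted to 0/1 directly (the variable names `hits`, `same`, and `flag` are just for illustration):

```r
library(stringr)

folno <- c("123denied", "attached_as_test", "dept_224", "bulked_up", "wsd2273")
string_detect <- paste(c("denied", "attach", "bulk"), collapse = "|")

# str_detect already returns TRUE/FALSE for each element
hits <- str_detect(folno, string_detect)

# Comparing with TRUE gives back the same logical vector
same <- identical(hits, hits == TRUE)

# A logical vector converts directly to 0/1 (integer, not double)
flag <- as.integer(hits)
```

Note that as.integer() yields integers, whereas the case_when version with `~ 1` and `~ 0` yields doubles; for a simple flag column either is usually fine.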
Whole code:
library(dplyr)
library(stringr)
folno <- c('123denied', 'attached_as_test', 'dept_224', 'bulked_up', 'wsd2273')
fol_all <- data.frame(folno)
string_detect <- paste(c('denied', 'attach', 'bulk'), collapse = "|")
fol_all %>%
mutate(folno_flag = case_when(
str_detect(folno, string_detect) ~ 1,
TRUE ~ 0 ))
stringr package using str_detect - Search for one word and exclude word
You are probably looking for a word boundary here (\\b). Wrap the desired pattern between two word boundaries to match just the word, but not parts of longer words.
library(dplyr)
library(stringr)
df %>% mutate(shop_YN = str_detect(remarks, '\\bshop\\b'))
# A tibble: 15 × 3
price remarks shop_YN
<dbl> <chr> <lgl>
1 195000 large home with a 1200 sf shop. great location close to shopping. TRUE
2 213000 updated home close to shopping & schools. FALSE
3 215000 nice location. 2br home with updating. FALSE
4 240000 huge shop on property! TRUE
5 241000 close to shopping. FALSE
6 250000 updated, clean, great location, garage. FALSE
7 255000 close to shopping and massive shop on property. TRUE
8 256500 updated home near shopping, schools, restaurants. FALSE
9 260000 large home with updated interior. FALSE
10 263500 close to schools, updated, stick-built shop 1500sf. TRUE
11 265000 home and shop. TRUE
12 277000 near schools, shopping, restaurants. partially updated home. FALSE
13 280000 located close to shopping. high quality home with shop in backyard. TRUE
14 280000 brick 2-story. lots of shopping near by. detached garage and large shop in back… TRUE
15 150000 fixer! needs work. FALSE
If you want Yes or No instead of the logical shop_YN, just pipe the output of str_detect into ifelse:
df %>% mutate(shop_YN = str_detect(remarks, '\\bshop\\b') %>% ifelse('Yes', 'No'))
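The word-boundary pattern can be combined with case-insensitive matching by wrapping it in regex(). A minimal sketch, assuming remarks that are not all lower case (the `remarks` vector here is invented for the example):

```r
library(stringr)

# Hypothetical remarks with mixed case
remarks <- c("Huge SHOP on property!", "Close to Shopping.", "home and Shop.")

# Whole word "shop", any capitalisation
pat <- regex("\\bshop\\b", ignore_case = TRUE)

shop_YN <- ifelse(str_detect(remarks, pat), "Yes", "No")
```

"Shopping" is still excluded because the second \\b requires a word boundary right after "shop".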
Case-insensitive search of a list in R
Assuming that there are no variable names which differ only in case, you can search for your all-lowercase variable name in tolower(names(myDataFrame)):
match("b", tolower(c("A","B","C")))
[1] 2
This will produce only exact matches, but that is probably desirable in this case.
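Applied to an actual data frame, the idea could be sketched as follows (the columns of `myDataFrame` are made up for illustration). match() returns the position of the column, which can then be used to extract it, and NA when nothing matches:

```r
# Hypothetical data frame with inconsistently capitalised names
myDataFrame <- data.frame(Alpha = 1:2, BETA = 3:4, gamma = 5:6)

# Position of the column whose lower-cased name is "beta"
idx <- match("beta", tolower(names(myDataFrame)))

# Pull the column out by position
col <- myDataFrame[[idx]]

# A name that does not exist gives NA rather than an error
missing_idx <- match("delta", tolower(names(myDataFrame)))
```

It is worth checking for NA before indexing with the result, since `myDataFrame[[NA]]` does not return a column.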
R match ignore case and special characters
I see that @Allan Cameron posted a very similar solution right before me, but I am going to leave this here anyway because it is different enough.
list1 <- c('a', 'b', 'c')
list2 <- c('A', 'B', 'C')
list3 <- c('a-', 'B_', '- c')
Use a regex to replace every character that is not alphabetic with an empty string, then convert the result to lower case:
f <- function(x) {
return(tolower(gsub("[^[:alpha:]]", "", x)))
}
match(f(list1), f(list2))
match(f(list1), f(list3))
match(f(list2), f(list3))
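For reference, here is a self-contained sketch of what the normalising function does to each list and what the match() calls return. The extra input "d!" is a hypothetical value added only to show the no-match case:

```r
# Strip non-alphabetic characters, then lower-case
f <- function(x) tolower(gsub("[^[:alpha:]]", "", x))

list1 <- c("a", "b", "c")
list2 <- c("A", "B", "C")
list3 <- c("a-", "B_", "- c")

# After normalising, all three lists become "a", "b", "c"
norm3 <- f(list3)

m12 <- match(f(list1), f(list2))
m13 <- match(f(list1), f(list3))

# A value with no counterpart gives NA
no_hit <- match(f("d!"), f(list3))
```

Because punctuation and spaces are dropped entirely, values such as "a-b" and "ab" would be treated as the same key, which may or may not be what you want.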