in R, check if string appears in row of dataframe (in any column)
Here is an option: use apply with %in% (a thin wrapper around match) to test each row for an exact match.
apply(temp, 1, function(x) any(x %in% "Matt"))
[1] TRUE TRUE FALSE TRUE FALSE
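One detail worth spelling out: %in% only matches whole cell values, so a cell containing "Matthew" does not count as a hit for "Matt". The sketch below contrasts that with a substring test using grepl. The data frame is hypothetical, because the original `temp` is not shown in the answer.

```r
# Hypothetical data; the `temp` used in the answer above is not shown.
temp <- data.frame(
  a = c("Matt", "Ann", "Bob"),
  b = c("Sue", "Matthew", "Matt"),
  stringsAsFactors = FALSE
)

# Exact match: TRUE only when some cell in the row equals "Matt".
exact <- apply(temp, 1, function(x) any(x %in% "Matt"))

# Substring match: TRUE when "Matt" occurs anywhere inside some cell.
partial <- apply(temp, 1, function(x) any(grepl("Matt", x, fixed = TRUE)))

print(exact)
print(partial)
```

With this data the second row is a hit only for the substring version, since "Matthew" contains "Matt" but is not equal to it.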
return TRUE when character string present in any column of data.frame
You can do:
library(tidyverse)
df %>%
  rowwise() %>%
  mutate(col4 = any(str_detect(c_across(c(col1, col2, col3)), 'chicks')),
         col5 = any(str_detect(c_across(c(col1, col2, col3)), 'chicks|pair'))) %>%
  ungroup()
# A tibble: 5 x 5
  col1  col2  col3   col4  col5
  <chr> <chr> <chr>  <lgl> <lgl>
1 no    no    chicks TRUE  TRUE
2 no    no    no     FALSE FALSE
3 no    pair  pair   FALSE TRUE
4 no    no    no     FALSE FALSE
5 pair  no    no     FALSE TRUE
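If you would rather avoid the rowwise step, the same flags can probably be built in base R by testing each column once and combining the results with a logical OR. This is a sketch, not the answer's method: the data frame is reconstructed from the printed output above, and `any_match` is a hypothetical helper name.

```r
# Data reconstructed from the printed tibble above.
df <- data.frame(
  col1 = c("no", "no", "no", "no", "pair"),
  col2 = c("no", "no", "pair", "no", "no"),
  col3 = c("chicks", "no", "pair", "no", "no"),
  stringsAsFactors = FALSE
)
cols <- c("col1", "col2", "col3")

# Hypothetical helper: run grepl() on every column, then OR the
# per-column logical vectors together element by element.
any_match <- function(data, pattern) {
  Reduce(`|`, lapply(data, function(col) grepl(pattern, col)))
}

df$col4 <- any_match(df[cols], "chicks")
df$col5 <- any_match(df[cols], "chicks|pair")
print(df)
```

Because grepl is vectorised over a whole column, this does one pass per column instead of one call per row.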
If Column Contains String then enter value for that row
We can use grepl to return a logical index by matching 'D' in the 'A' column, and then use ifelse to turn that logical vector into 'yes' and 'no':
df$C <- ifelse(grepl("D", df$A), "yes", "no")
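A small runnable version of the line above, on made-up data (the question's `df` is not shown, so the values here are assumptions). It also shows that grepl is case-sensitive by default, which matters if the column may contain a lowercase "d".

```r
# Hypothetical data for illustration.
df <- data.frame(A = c("Dog", "cat", "ADD", "bird"), stringsAsFactors = FALSE)

# Case-sensitive: only an uppercase "D" counts.
df$C <- ifelse(grepl("D", df$A), "yes", "no")

# Case-insensitive variant: "bird" now matches as well.
df$C_any_case <- ifelse(grepl("d", df$A, ignore.case = TRUE), "yes", "no")

print(df)
```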
R function that detects if a dataframe column contains string values from another dataframe column and adds a column that contains the detected str
First you could add a column for each of the possible categories to the dataframe with the names, as placeholders (just filled with NA). Then, for each of those columns, check whether the column name (that is, the category) appears in the name. Pivot the result into a long dataframe, and then remove the FALSE rows, which are the ones where the category was not detected in the name.
library(tidyverse)
df1 <- tribble(
  ~name,
  "Apple page",
  "Mango page",
  "Lychee juice",
  "Cranberry club"
)
df2 <- tribble(
  ~fruit,
  "Apple",
  "Grapes",
  "Strawberry",
  "Mango",
  "lychee",
  "cranberry"
)
fruits <- df2$fruit %>%
  str_to_lower() %>%
  set_names(rep(NA_character_, length(.)), .)
df1 %>%
  add_column(!!!fruits) %>%
  mutate(across(-name, ~str_detect(str_to_lower(name), cur_column()))) %>%
  pivot_longer(-name, names_to = "category") %>%
  filter(value) %>%
  select(-value)
#> # A tibble: 4 × 2
#> name category
#> <chr> <chr>
#> 1 Apple page apple
#> 2 Mango page mango
#> 3 Lychee juice lychee
#> 4 Cranberry club cranberry
R - If column contains a string from vector, append flag into another column
Update: if a list is preferred, use str_extract_all (pattern is defined in the original answer below):
df %>%
  transmute(across(-id, ~case_when(str_detect(., pattern) ~ str_extract_all(., pattern)), .names = "new_col{col}"))
gives:
new_colonetext new_colcop new_coltext3
<list> <list> <list>
1 <chr [1]> <NULL> <chr [2]>
2 <chr [2]> <chr [2]> <NULL>
3 <chr [2]> <chr [4]> <chr [5]>
Here is how you could achieve the result:
- create a pattern from the vector
- use mutate with across to check the needed columns; if the desired string is detected, extract it to a new column
myvec <- c("cat", "dog", "bird")
pattern <- paste(myvec, collapse="|")
library(dplyr)
library(tidyr)
library(stringr)  # provides str_detect() and str_extract_all()
df %>%
  mutate(across(-id, ~case_when(str_detect(., pattern) ~ str_extract_all(., pattern)), .names = "new_col{col}")) %>%
  unite(topic, starts_with('new'), na.rm = TRUE, sep = ',')
id onetext cop text3 topic
<dbl> <chr> <chr> <chr> <chr>
1 1 cat furry pink british Little Grey Cat is the nickname given to a kitten of the British Shorthai~ On October 4th the first single topic blog devoted to the little grey cat was lau~ "cat,NULL,c(\"cat\", \"cat\")"
2 2 dog cat fight Dogs have soft fur and tails so do cats Do cats like to chase their tails there are many fights going on and this is just an example text "c(\"dog\", \"cat\"),c(\"cat\", \"cat\"),~
3 3 bird cat issues A cat and bird can coexist in a home but you will have to take certain me~ Some cats will not care about a pet bird at all while others will make it its lif~ "c(\"bird\", \"cat\"),c(\"cat\", \"bird\"~
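The topic column above ends up holding deparsed lists such as c("cat", "cat"). If a plain comma-separated string per row is what you are after, one possible base R sketch is shown here. It only uses the onetext column, whose values are taken from the output above, and `extract_topics` is a hypothetical helper name.

```r
myvec <- c("cat", "dog", "bird")
pattern <- paste(myvec, collapse = "|")

# Only the onetext column is reproduced; values are from the output above.
df <- data.frame(
  id = 1:3,
  onetext = c("cat furry pink british", "dog cat fight", "bird cat issues"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pull out every match per string, drop repeats,
# and collapse what is left into a single comma-separated string.
extract_topics <- function(x, pattern) {
  hits <- regmatches(x, gregexpr(pattern, x))
  vapply(hits, function(m) paste(unique(m), collapse = ","), character(1))
}

df$topic <- extract_topics(df$onetext, pattern)
print(df)
```

Strings with no match come back as an empty string rather than NULL, which keeps the column an ordinary character vector.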
If any string values in a character vector are in a column of a data frame, return the string that matches in a new column
library(tidyverse)
df <-
  tibble(
    Category = c("Sales","Marketing"),
    Comment = c("i have to use my email everyday and they dont work the poor communication is not acceptable",
                "i think the tools are not adequate for the tasks we want to achieve"
    )
  )
keywords <- c('poor communication', 'email', 'tools', 'hardware', 'software')
df %>%
  # Apply to each row
  rowwise() %>%
  mutate(
    Topic =
      # Extract each keyword from the string (NA where a keyword is absent)
      str_extract(Comment, keywords) %>%
      # Remove the NAs
      na.omit() %>%
      # Paste the matched keywords together
      paste0(collapse = ", ")
  )
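For comparison, here is a base R sketch of the same idea, reusing the comments and keywords from the code above. It is an alternative under assumptions, not the answer's method: `find_keywords` is a hypothetical helper, and it matches the keywords as fixed strings rather than regular expressions.

```r
comments <- c(
  "i have to use my email everyday and they dont work the poor communication is not acceptable",
  "i think the tools are not adequate for the tasks we want to achieve"
)
keywords <- c("poor communication", "email", "tools", "hardware", "software")

# Hypothetical helper: keep the keywords that occur in one comment and
# join them, in the order they appear in `keywords`.
find_keywords <- function(comment, keywords) {
  present <- vapply(
    keywords,
    function(k) grepl(k, comment, fixed = TRUE),
    logical(1)
  )
  paste(keywords[present], collapse = ", ")
}

topic <- vapply(
  comments,
  function(cm) find_keywords(cm, keywords),
  character(1),
  USE.NAMES = FALSE
)
print(topic)
```

The first comment yields "poor communication, email" and the second yields "tools".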
check which columns in list have exact string value and extract the column and row
I am not sure whether you want something like the following:
a <- transform(
  as.data.frame(
    which(matrix(grepl("^11$", as.matrix(df)), nrow = nrow(df)),
      arr.ind = TRUE
    )),
  col = names(df)[col]
)
which gives
> a
row col
1 1 B
2 1 C
3 2 C
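Since the regex "^11$" is really an exact-equality test, a plain comparison on the matrix may be easier to read. The sketch below assumes a small character data frame chosen so that the hits land where the output above shows them; the question's real `df` is not shown.

```r
# Hypothetical data chosen to reproduce the positions shown above.
df <- data.frame(
  A = c("1", "2"),
  B = c("11", "3"),
  C = c("11", "11"),
  stringsAsFactors = FALSE
)

m <- as.matrix(df)

# arr.ind = TRUE returns one (row, col) index pair per matching cell.
hits <- which(m == "11", arr.ind = TRUE)

# Translate the column index into the column name.
a <- data.frame(
  row = hits[, "row"],
  col = colnames(m)[hits[, "col"]],
  stringsAsFactors = FALSE
)
print(a)
```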
Finding rows containing a value (or values) in any column
How about
apply(df, 1, function(r) any(r %in% c("M017", "M018")))
The ith element will be TRUE if the ith row contains one of the values, and FALSE otherwise. Or, if you want just the row numbers, enclose the above statement in which(...).
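A short sketch of both uses, the logical vector and the row numbers, on a made-up data frame (the column names and values are assumptions, since the question's `df` is not shown):

```r
# Hypothetical data for illustration.
df <- data.frame(
  id  = c("M001", "M017", "M002", "M005"),
  ref = c("M018", "M003", "M004", "M017"),
  stringsAsFactors = FALSE
)
targets <- c("M017", "M018")

# One logical per row: does any cell equal one of the targets?
hit <- apply(df, 1, function(r) any(r %in% targets))

# Row numbers of the matching rows.
rows <- which(hit)

# The logical vector can also subset the data frame directly.
matching <- df[hit, ]

print(rows)
print(matching)
```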