How to Loop Through Columns, Check If a Particular Value Exists in Any of the Columns, Mutate a New Column and Enter 1 If It Exists, 0 If Not

How to loop through columns, check if a particular value exists in any of the columns, mutate a new column and enter 1 if it exists, 0 if not?

1) Assuming DF is as shown reproducibly in the Note at the end use sapply to create an matrix of indicators and then cbind it to the ID column. Finally make the names nicer. No packages are used.

ques <- function(i) paste0("Q", 1:25) %in% unlist(DF[i, -1])
DFout <- cbind(DF[1], +t(sapply(1:nrow(DF), ques)))
names(DFout)[-1] <- paste0("Q", names(DFout[-1]))

The first 5 columns are:

> DFout[1:5]

ID Q1 Q2 Q3 Q4
1 100 1 0 0 0
2 101 0 0 1 1
3 102 0 1 1 1
4 103 0 0 0 0
5 104 0 0 0 1
6 105 1 1 1 1

2) Another possibility is to convert the input to long form and then use xtabs to create a table from it.

library(dplyr)
library(tidyr)

tab <- DF %>%
gather(key, Question, -ID) %>%
filter(nzchar(Question)) %>%
mutate(Question = factor(Question, paste0("Q", 1:25))) %>%
xtabs(~ ID + Question, .)

giving this table. We show the first 5 columns:

> tab[, 1:5]

Question
ID Q1 Q2 Q3 Q4
100 1 0 0 0
101 0 0 1 1
102 0 1 1 1
104 0 0 0 1
105 1 1 1 1

If it is important that the result be a data frame then add:

library(tibble)

tab %>%
as.data.frame.matrix %>%
rownames_to_column(var = "ID")

Note

sample <- "rows ID   Col1  Col2  Col3  Col4
1 100 Q1
2 101 Q3 Q4
3 102 Q2 Q3 Q4
4 103
5 104 Q4
6 105 Q1 Q2 Q3 Q4"
DF <- read.table(text = sample, header = TRUE, fill = TRUE, as.is = TRUE,
strip.white = TRUE)[-1]

How to iterate through string of one column to mutate/modify other columns value?

Adapting this answer to your example. You can use separate_rows then pivot_wider.

library(tidyverse)

example_df %>%
select(x0, x1) %>%
separate_rows(x1) %>%
pivot_wider(names_from = x1,
values_from = x1,
values_fn = list(x1 = is.character),
values_fill = list(x1 = FALSE))


#----------
# A tibble: 5 x 4
x0 Dog Cat Bird
<dbl> <lgl> <lgl> <lgl>
1 1 TRUE TRUE FALSE
2 2 FALSE TRUE FALSE
3 3 TRUE FALSE TRUE
4 4 TRUE TRUE TRUE
5 5 FALSE TRUE TRUE

How do you evaluate a set of conditions and create a new column based on a list of columns?

We can just use rowSums to make this efficient

i1 <- startsWith(names(df), 'col')
c( "Failed", "Passed")[(rowSums(df[i1] == 1) == 3) + 1]
#[1] "Failed" "Passed" "Failed"

Or another base R efficient option is Reduce

c("Failed", "Passed")[Reduce(`&`, df[i1]) +1]
#[1] "Failed" "Passed" "Failed"

NOTE: Both base R solutions are compact and are very efficient


Or with &

library(dplyr)
df %>%
mutate(evaluation = c('Failed', 'Passed')[1 + (col1 & col2 & col3)])
# ignore1 ignore2 col1 col2 col3 evaluation
#1 1 1 0 0 0 Failed
#2 0 0 1 1 1 Passed
#3 0 1 0 1 0 Failed

Or we can have rowSums within dplyr

df %>%
mutate(evaluation = c("Failed", "Passed")[(rowSums(.[i1] == 1) == 3) + 1])

NOTE: Both the solutions are very efficient and doesn't use any packages that are not really needed

Or if we need some packages, then use magrittr with purrr

library(magrittr)
library(purrr)
df %>%
mutate(evaluation = select(., starts_with('col')) %>%
reduce(`&`) %>%
add(1) %>%
c("Failed", "Passed")[.])
# ignore1 ignore2 col1 col2 col3 evaluation
#1 1 1 0 0 0 Failed
#2 0 0 1 1 1 Passed
#3 0 1 0 1 0 Failed

NOTE: Here also, we are not looping over rows, so it should be efficient

create a column where each value is the result of an if statement in R

replace semicolon ";" with "}".
and for else open a new "{"
here is your code it works well:

mtcars %>%
rowwise() %>%
mutate(strand =
if (drat < wt ){
print("-")} else{
print("+")
})

Perform mutate function only if variable exists

Since functions are unique to each variable and you want to return remaining values if one of the columns fail can't really come up with better solution than to use tryCatch on individual columns.

library(dplyr)

convert_columns <- function(df) {
df %>%
mutate(
a = tryCatch(convert_a(a),error = function(z) return(NA)),
b = tryCatch(convert_b(b),error = function(z) return(NA)),
c = tryCatch(convert_c(c),error = function(z) return(NA)),
#...
#...
)
}

This can be tested using the following mtcars example :

This works -

mtcars %>%
mutate(a = n_distinct(cyl),
b = mean(mpg),
c = sd(am))

Now if we remove one of the column, the above fails :

mtcars %>%
select(-am) %>%
mutate(a = n_distinct(cyl),
b = mean(mpg),
c = sd(am))

Error: Problem with mutate() input c.
x cannot coerce type 'closure' to vector of type 'double'
ℹ Input c is sd(am).

Now using tryCatch

mtcars %>%
select(-am) %>%
mutate(a = tryCatch(n_distinct(cyl), error = function(e) return(NA)),
b = tryCatch(mean(mpg), error = function(e) return(NA)),
c = tryCatch(sd(am), error = function(e) return(NA)))

# mpg cyl disp hp drat wt qsec vs gear carb a b c
#1 21 6 160 110 3.9 2.6 16 0 4 4 3 20 NA
#2 21 6 160 110 3.9 2.9 17 0 4 4 3 20 NA
#3 23 4 108 93 3.9 2.3 19 1 4 1 3 20 NA
#4 21 6 258 110 3.1 3.2 19 1 3 1 3 20 NA
#....

Dynamically determine if a dataframe column exists and mutate if it does

You can use mutate_at with one_of, both from dplyr. This will mutate column only if it matches with one of c("first_name", "last_name"). If no match, it will generate a simple warning but you can ignore or suppress it.

library(dplyr)

d %>%
mutate_at(vars(one_of(c("first_name", "last_name")), toupper)

id col_name last_name
1 19 7461 V
2 52 9651 H
3 56 1901 P
4 13 7866 Z
5 25 9527 U

# example with no match
b %>%
mutate_at(vars(one_of(c("first_name", "last_name"))), toupper)

id col_name another_col
1 34 9315 8686
2 26 5598 4124
3 17 3318 2182
4 32 1418 4369
5 49 4759 6680
Warning message:
Unknown variables: `first_name`, `last_name`

Here are a bunch of other ?select_helpers in dplyr -

These functions allow you to select variables based on their names.

starts_with(): starts with a prefix

ends_with(): ends with a prefix

contains(): contains a literal string

matches(): matches a regular expression

num_range(): a numerical range like x01, x02, x03.

one_of(): variables in character vector.

everything(): all variables.



Related Topics



Leave a reply



Submit