How to Ignore NA in Ifelse Statement

How to ignore NA in ifelse statement

This syntax is easier to read:

x <- c(NA, 1, 0, -1)

(x > 0) & (!is.na(x))
# [1] FALSE TRUE FALSE FALSE

(The outer parentheses aren't necessary, but will make the statement easier to read for almost anyone other than the machine.)


Edit:

## If you want 0s and 1s
((x > 0) & (!is.na(x))) * 1
# [1] 0 1 0 0

Finally, you can make the whole thing into a function:

isPos <- function(x) {
  # Wrap the whole logical test before multiplying; * binds tighter than &
  ((x > 0) & (!is.na(x))) * 1
}

isPos(x)
# [1] 0 1 0 0
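
To see why the guard matters inside ifelse() itself, here is a minimal sketch comparing a bare test with a guarded one on the same vector. The names bare and guarded are only illustrative.

```r
x <- c(NA, 1, 0, -1)

# A bare comparison propagates NA, so ifelse() returns NA for that element.
bare <- ifelse(x > 0, 1, 0)

# NA & FALSE is FALSE, so the guarded test sends the NA element to the "no" branch.
guarded <- ifelse(x > 0 & !is.na(x), 1, 0)

print(bare)     # NA  1  0  0
print(guarded)  #  0  1  0  0
```

The guard works because NA & FALSE evaluates to FALSE in R, so the condition is never missing.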

Direct way of telling ifelse to ignore NA

You can use %in% instead of == to sort of ignore NAs: %in% returns FALSE, never NA, when the left-hand value is missing.

ifelse(df$a %in% 1, "a==1", 
ifelse(df$b %in% 1, "b==1",
ifelse(df$c %in% 1, "c==1", NA)))

Unfortunately, this gives no performance gain over the original, whereas @akrun's solution is about 3 times faster.

solution_original <- function() {
  ifelse(df$a == 1 & !is.na(df$a), "a==1",
  ifelse(df$b == 1 & !is.na(df$b), "b==1",
  ifelse(df$c == 1 & !is.na(df$c), "c==1", NA)))
}

solution_akrun <- function() {
  v1 <- names(df)[max.col(!is.na(df)) * NA^!rowSums(!is.na(df))]
  i1 <- !is.na(v1)
  v1[i1] <- paste0(v1[i1], "==1")
  v1  # return the labelled vector rather than the value of the last assignment
}

solution_mine <- function() {
  ifelse(df$a %in% 1, "a==1",
  ifelse(df$b %in% 1, "b==1",
  ifelse(df$c %in% 1, "c==1", NA)))
}

set.seed(1)
df <- data.frame(a = sample(c(1, rep(NA, 4)), 1e6, TRUE),
                 b = sample(c(1, rep(NA, 4)), 1e6, TRUE),
                 c = sample(c(1, rep(NA, 4)), 1e6, TRUE))
microbenchmark::microbenchmark(
  solution_original(),
  solution_akrun(),
  solution_mine()
)
## Unit: milliseconds
##                 expr      min       lq     mean   median       uq       max neval
##  solution_original() 701.9413 839.3715 845.0720 853.1960 875.6151 1051.6659   100
##     solution_akrun() 217.4129 242.5113 293.2987 253.2144 387.1598  564.3981   100
##      solution_mine() 698.7628 845.0822 848.6717 858.7892 877.9676 1006.2872   100

This was inspired by: R: Dealing with TRUE, FALSE, NA and NaN

Edit

Following the comment by @akrun, I redid the benchmark and revised the statement.
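
As a small illustration of why %in% sidesteps the NA problem in the nested ifelse(), here is a sketch on a hypothetical three-row data frame. The name toy and its values are made up for this example and are not from the original question.

```r
# Hypothetical data: each row has at most one non-missing column.
toy <- data.frame(a = c(1, NA, NA),
                  b = c(NA, 1, NA),
                  c = c(NA_real_, NA, NA))

# With ==, an NA in the first test yields NA straight away,
# so the nested ifelse() calls are never used for that row.
with_eq <- ifelse(toy$a == 1, "a==1",
           ifelse(toy$b == 1, "b==1",
           ifelse(toy$c == 1, "c==1", NA)))

# With %in%, a missing value tests as FALSE and falls through to the next column.
with_in <- ifelse(toy$a %in% 1, "a==1",
           ifelse(toy$b %in% 1, "b==1",
           ifelse(toy$c %in% 1, "c==1", NA)))

print(with_eq)  # "a==1" NA     NA
print(with_in)  # "a==1" "b==1" NA
```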

Ignoring NAs in an Ifelse statement R to be applied over a list of dataframes R

I think you should use if(any(...)) as the condition, since you want to check whether any of the values in ZScore is greater than ZMax. NA values can be ignored by passing na.rm = TRUE to any().

ZMax <- 3.5

FinalStats <- function(x, ...) {
  unlistdata <- unlist(x[-1])
  GrandMean <- mean(unlistdata, na.rm = TRUE)
  GrandSD <- sd(unlistdata, na.rm = TRUE)
  ZScore <- abs((x[-1] - GrandMean) / GrandSD)
  if (any(ZScore > ZMax, na.rm = TRUE)) {
    LabMean <- mapply(mean, x[-1], na.rm = TRUE)  # Calculate mean by columns
    SD.All <- unlist(x[-1])
    ConsensusValue <- mean(LabMean)
    Uncertainty <- sd(SD.All, na.rm = TRUE)
  } else {
    LabMedian <- mapply(median, x[-1], na.rm = TRUE)  # Calculate median by columns
    LabMedian[is.infinite(LabMedian)] <- NA  # Convert any Inf values to NA
    SD.All <- unlist(x[-1])
    ConsensusValue <- LabMedian
    Uncertainty <- sd(SD.All, na.rm = TRUE)
  }

  FinalValues <- cbind(ConsensusValue, Uncertainty)  # Combine the desired info

  return(FinalValues)
}

This returns:

CatergoreisStats <- lapply(df,FinalStats)
CatergoreisStats
#$Al2O3
# ConsensusValue Uncertainty
#[1,] 2.088453 0.03880474

#$As
# ConsensusValue Uncertainty
#2 0.0010 0.001475832
#3 0.0020 0.001475832
#4 0.0020 0.001475832
#5 0.0010 0.001475832
#7 NA 0.001475832
#8 0.0010 0.001475832
#10 NA 0.001475832
#12 NA 0.001475832
#36 0.0053 0.001475832

#$Ba
# ConsensusValue Uncertainty
#2 NA 0.001303559
#3 0.00100000 0.001303559
#4 0.00300000 0.001303559
#5 0.00300000 0.001303559
#7 NA 0.001303559
#8 0.00200000 0.001303559
#10 NA 0.001303559
#12 NA 0.001303559
#36 0.00089566 0.001303559
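
For reference, here is a minimal sketch of how any() behaves with missing values, which is why na.rm = TRUE is needed before the result can be used in if(). The vectors z_low and z_high are made-up z-scores, not values from the question's data.

```r
ZMax <- 3.5
z_low  <- c(1.2, NA, 0.4)  # hypothetical scores, none above ZMax
z_high <- c(1.2, NA, 4.1)  # hypothetical scores, one above ZMax

# Without na.rm, an NA that could change the answer makes the result NA,
# and if (NA) stops with an error.
res_low_raw <- any(z_low > ZMax)

# With na.rm = TRUE the missing value is dropped before testing.
res_low <- any(z_low > ZMax, na.rm = TRUE)

# A single TRUE settles the answer regardless of the NA.
res_high_raw <- any(z_high > ZMax)

print(res_low_raw)   # NA
print(res_low)       # FALSE
print(res_high_raw)  # TRUE
```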

Ignore NA in Ifelse statement- R

You could solve this in two ways I think:

1) Another ifelse before this to check for NAs - something like:

ww.LIG = ifelse(is.na(Accel2$wk.VWD) | is.na(Accel2$we.VWD), NA,
         ifelse((Accel2$wk.VWD >= 3 & Accel2$we.VWD >= 0)
                | (Accel2$wk.VWD >= 2 & Accel2$we.VWD >= 1)
                | (Accel2$wk.VWD >= 1 & Accel2$we.VWD >= 2),
                (Accel2$wk.LIG + Accel2$we.LIG) / 2, NA))

2) Remove the NA rows to start with - something like:

# na.omit() drops rows containing NA; complete.cases() alone only returns a logical vector
df = na.omit(data.frame(wkVWD = Accel2$wk.VWD, weVWD = Accel2$we.VWD, wkLIG = Accel2$wk.LIG, weLIG = Accel2$we.LIG))
df$wwLIG = ifelse((df$wkVWD >= 3 & df$weVWD >= 0)
                  | (df$wkVWD >= 2 & df$weVWD >= 1)
                  | (df$wkVWD >= 1 & df$weVWD >= 2),
                  (df$wkLIG + df$weLIG) / 2, NA)

Does that work for you?
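
If you prefer complete.cases() for the second approach, note that it returns a logical vector, so it is normally used as a row index. Below is a rough sketch on a made-up stand-in for Accel2; the name accel_toy and all of its values are assumptions for illustration only.

```r
# Hypothetical stand-in for Accel2 with the four columns used above.
accel_toy <- data.frame(wk.VWD = c(3, 2, NA, 1, 1),
                        we.VWD = c(0, 1, 2, NA, 0),
                        wk.LIG = c(10, 20, 30, 40, 50),
                        we.LIG = c(30, 40, 50, 60, 70))

# TRUE for rows where both VWD columns are present.
keep <- complete.cases(accel_toy[, c("wk.VWD", "we.VWD")])

# Index the rows with the logical vector, then apply the rule without NA worries.
complete <- accel_toy[keep, ]
complete$ww.LIG <- ifelse((complete$wk.VWD >= 3 & complete$we.VWD >= 0)
                          | (complete$wk.VWD >= 2 & complete$we.VWD >= 1)
                          | (complete$wk.VWD >= 1 & complete$we.VWD >= 2),
                          (complete$wk.LIG + complete$we.LIG) / 2, NA)

print(complete)
```

Here rows 3 and 4 are dropped, and the last remaining row gets NA because it does not meet any of the three conditions.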

R handling NA values while doing a comparison ifelse

Please see the following SO post: How to ignore NA in ifelse statement

With respect to your question:

df$counting <- ifelse(df$age > 5 & df$age < 8 & !is.na(df$age), 1, 0) + ifelse(df$marks > 60 & df$marks < 70, 1, 0)
> df
  sex occupation age marks counting
1   M    Student  NA    34        0
2   F    Analyst   6    65        2
3   M    Analyst   9    21        0
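
Note that only the age test is guarded here, so an NA in marks would still give NA. One way to guard both tests is a small helper; the sketch below rebuilds the data from the printed output, and in_open_range is a hypothetical name, not an existing function.

```r
# Data frame rebuilt from the printed output above.
df <- data.frame(sex = c("M", "F", "M"),
                 occupation = c("Student", "Analyst", "Analyst"),
                 age = c(NA, 6, 9),
                 marks = c(34, 65, 21))

# Hypothetical helper: TRUE only when x is present and strictly between lo and hi.
in_open_range <- function(x, lo, hi) {
  !is.na(x) & x > lo & x < hi
}

# Adding two logical vectors counts how many conditions each row meets.
df$counting <- in_open_range(df$age, 5, 8) + in_open_range(df$marks, 60, 70)

print(df$counting)  # 0 2 0
```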

How to make an ifelse statement ignore NAs?

First solution looks like:

df %>%
  mutate(prev_PC = case_when(changed_PC == "No" &
                               is.na(prev_PC) ~ new_PC,
                             TRUE ~ prev_PC))

but this is probably better:

df %>%
  mutate(prev_PC = if_else(is.na(prev_PC) &
                             changed_PC == "No", new_PC, prev_PC))

Resulting in:

> df %>% 
+ print() %>%
+ mutate(prev_PC = if_else(is.na(prev_PC) &
+ changed_PC == "No", new_PC, prev_PC))
  prev_PC new_PC changed_PC
1    5039   5039         No
2    1402   1402       <NA>
3    3050   3050         No
4      NA   3021        Yes
5      NA   2154       <NA>
6      NA   4853       <NA>
7      NA   1252         No
8      NA   2954         No
  prev_PC new_PC changed_PC
1    5039   5039         No
2    1402   1402       <NA>
3    3050   3050         No
4      NA   3021        Yes
5      NA   2154       <NA>
6      NA   4853       <NA>
7    1252   1252         No
8    2954   2954         No

(The first table is the data before the change, as printed by print(); the second is the result.)
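
If you would rather avoid dplyr, a base R sketch of the same idea might look like the following. The data frame is rebuilt from the printed output above, and the name fill is just illustrative.

```r
# Data rebuilt from the "before" table above.
df <- data.frame(
  prev_PC    = c(5039, 1402, 3050, NA, NA, NA, NA, NA),
  new_PC     = c(5039, 1402, 3050, 3021, 2154, 4853, 1252, 2954),
  changed_PC = c("No", NA, "No", "Yes", NA, NA, "No", "No"),
  stringsAsFactors = FALSE
)

# %in% treats a missing changed_PC as "not No", so the index contains no NA.
fill <- is.na(df$prev_PC) & df$changed_PC %in% "No"

# Copy new_PC into prev_PC only for the selected rows.
df$prev_PC[fill] <- df$new_PC[fill]

print(df$prev_PC)  # 5039 1402 3050 NA NA NA 1252 2954
```

Using %in% rather than == matters here because a logical index that contains NA cannot be used in a subscripted assignment of this kind.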

How to include NA in ifelse?

You can't really compare NA with another value, so using == would not work. Consider the following:

NA == NA
# [1] NA

You can just change your comparison from == to %in%:

ifelse(is.na(test$time) | test$type %in% "A", NA, "1")
# [1] NA "1" NA "1"

Regarding your other question,

I could get this to work with my existing code if I could somehow change the result of is.na(test$type) to return FALSE instead of TRUE, but I'm not sure how to do that.

just use ! to negate the results:

!is.na(test$time)
# [1] TRUE TRUE FALSE TRUE
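
To make the difference concrete, here is a sketch with a made-up test data frame chosen so that it reproduces the outputs shown above. The actual data from the question is not shown, so these values are an assumption.

```r
# Hypothetical data: one missing time (row 3) and one missing type (row 4).
test <- data.frame(time = c(1, 2, NA, 4),
                   type = c("A", "B", "B", NA),
                   stringsAsFactors = FALSE)

# With ==, the missing type in row 4 makes the condition NA, so the result is NA.
with_eq <- ifelse(is.na(test$time) | test$type == "A", NA, "1")

# With %in%, the missing type counts as "not A", so row 4 gets "1".
with_in <- ifelse(is.na(test$time) | test$type %in% "A", NA, "1")

print(with_eq)  # NA "1" NA NA
print(with_in)  # NA "1" NA "1"
```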

How to handle or ignore NAs when using ifelse to mutate a new column with multiple conditions (solved)

You can use rowMeans() in place of if_else(); it also handles rows that are entirely NA, which come out as NA.

z %>%
  mutate(age_event = +(rowMeans(. < 18, na.rm = TRUE) > 0))

   j6 j7 j8 age_event
1   6 27  8         1
2  19 20 22         0
3  NA NA NA        NA
4  NA  7 20         1
5  NA 19 NA         0
6  NA NA  8         1
7  NA NA 30         0
8   8 20 NA         1
9  20 30 NA         0
10 20  9 NA         1
11 NA NA  3         1
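
Here is a base R sketch of the same computation, split into steps to show where the NA for the all-missing row comes from. The data is rebuilt from the table above, and share_under_18 is an illustrative name.

```r
# Data rebuilt from the printed table above (columns j6, j7, j8).
z <- data.frame(
  j6 = c(6, 19, NA, NA, NA, NA, NA, 8, 20, 20, NA),
  j7 = c(27, 20, NA, 7, 19, NA, NA, 20, 30, 9, NA),
  j8 = c(8, 22, NA, 20, NA, 8, 30, NA, NA, NA, 3)
)

# Share of non-missing values below 18 in each row; an all-NA row gives NaN.
share_under_18 <- rowMeans(z < 18, na.rm = TRUE)

# NaN > 0 is NA, and unary + converts TRUE/FALSE/NA to 1/0/NA.
age_event <- +(share_under_18 > 0)

print(unname(age_event))  # 1 0 NA 1 0 1 0 1 0 1 1
```

A row counts as an event as soon as any observed value is below 18, because any positive share makes the comparison TRUE.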

