if_else() `false` must be type double, not integer - in R

if_else from dplyr is type-stable: it checks that the `true` and `false` arguments have the same type, and throws an error if they don't. ifelse in base R makes no such check.

When writing:

mutate(n = if_else(FiscalYear == "FY2018" & Candy == "SNICKERS", n - 3, n))

I assume n was originally an integer, so `false` is an integer. In `true`, n - 3 is coerced to a double because the literal 3 is a double. `true` and `false` therefore have different types, and if_else throws an error.
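A minimal sketch of one way around this, assuming a hypothetical data frame in which n is stored as an integer (the original data is not shown): subtracting the integer literal 3L keeps both branches integer. As far as I know, dplyr 1.1.0 and later cast `true` and `false` to a common type, so the original call may no longer fail there. Matching the types explicitly should work on any version.

```r
library(dplyr)

# Hypothetical data: n is stored as integer, as assumed above
candy <- data.frame(
  FiscalYear = c("FY2018", "FY2018", "FY2019"),
  Candy      = c("SNICKERS", "MARS", "SNICKERS"),
  n          = c(10L, 7L, 5L)
)

# 3L is an integer literal, so n - 3L stays integer, like the false branch
fixed <- candy %>%
  mutate(n = if_else(FiscalYear == "FY2018" & Candy == "SNICKERS", n - 3L, n))

typeof(fixed$n)
```

Going the other way, converting the false branch with as.numeric(n), would work as well if a double column is acceptable.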

When writing:

mutate(qty = if_else(name == "Bob" & fruit == "apple", qty / 2, qty))

qty is likely already a double, so dividing a double by 2 (a double) still yields a double. "true" and "false" are the same type. Hence no error.

This is easy to check with typeof():

> typeof(6)
[1] "double"

> typeof(6L)
[1] "integer"

> typeof(6L-3)
[1] "double"

> typeof(6L-3L)
[1] "integer"

> typeof(6/2)
[1] "double"

ifelse from base R coerces implicitly, converting both branches to a common type, so it doesn't throw an error when `true` and `false` have different types. That is more convenient but also more dangerous, because the coerced result may not be what you expected.
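Two small base R illustrations of what that coercion can look like in practice; the values are made up for the example:

```r
# Mixed types: the integer branch is silently converted to character
mixed <- ifelse(c(TRUE, FALSE), 1L, "a")
typeof(mixed)

# Attributes are dropped: a Date comes back as a bare number of days
d <- as.Date("2019-01-02")
stripped <- ifelse(TRUE, d, NA)
class(stripped)
```

Neither call raises an error or a warning, which is why this kind of mismatch can go unnoticed.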

I recommend ifelse for one-off, ad hoc scripts, and if_else when you want the built-in type check to catch mismatches for you.

Convert integer to numeric/double for dplyr::if_else()

But you didn't convert the data frame to numeric, at least not in any of the code you provided. I'll do it for you:

# Read in sample data
chromSizes <- read.table(header = TRUE, text = '
Length
chrIV 1531933
chrXV 1091291
chrVII 1090940
chrXII 1078177
chrXVI 948066
chrXIII 924431
chrII 813184
chrXIV 784333
chrX 745751
chrXI 666816
chrV 576874
chrVIII 562643
chrIX 439888
chrIII 316620
chrVI 270161
chrI 230218
chrM 85779')

# Convert from integer to double
chromSizes$Length <- as.numeric(chromSizes$Length)

# Check that it is now a double (is.numeric() is also TRUE for integers)
is.double(chromSizes$Length)
# [1] TRUE

Then the if_else() should work:

library(dplyr)

# Sample data
df.b <- read.table(header = TRUE, text = '
Chromosome_Strand Chromosome
x chrIV
y chrXV
z chrVII
- chrXII
a chrXVI
b chrXIII
c chrII
- chrXIV')

# Look up lengths by chromosome name; use a numeric 0 for the false branch
leading4 <- if_else(df.b$Chromosome_Strand == "-", chromSizes[as.character(df.b$Chromosome), "Length"], 0)

# View results
leading4
# [1] 0 0 0 1078177 0 0 0 784333

Using if_else, I can't return the column used as the conditional if the conditional is false

You seem to be mixing dplyr and base R frameworks.

In base R, you would use

mydata$new.vary <- ifelse(mydata$color == 'E', mydata$position, mydata$color)

This works, but note that mydata$position is character while mydata$color is a factor, which is stored internally as integer codes, so the false branch returns those codes (as text) rather than the factor labels. If you run the same code with if_else from the dplyr package, you get the following:

mydata$new.vary <- if_else(mydata$color == 'E', mydata$position, mydata$color)
Error: `false` must be type character, not integer

if_else is a little more strict than ifelse, requiring that both the true and false arguments have the same type.

If you want to use the dplyr approach, you can use

mydata %>%
  mutate(new.vary = ifelse(color == 'E', position, color))

or, if you want to use dplyr's if_else, try

mydata %>%
  mutate(color = as.character(color),
         new.vary = if_else(color == 'E', position, color))
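Because a factor is stored as integer codes, passing one to ifelse() returns the codes rather than the labels. A base R sketch with a hypothetical stand-in for mydata (the real data is not shown) makes the difference visible:

```r
# Hypothetical stand-in for mydata: color is a factor, position is character
mydata <- data.frame(
  color    = factor(c("E", "F", "G")),
  position = c("left", "mid", "right"),
  stringsAsFactors = FALSE
)

# The false branch yields the factor's integer codes, coerced to text
with_codes <- ifelse(mydata$color == "E", mydata$position, mydata$color)

# Converting the factor first keeps the labels
with_labels <- ifelse(mydata$color == "E", mydata$position,
                      as.character(mydata$color))
```

This is the same conversion that the if_else() version performs with as.character(color), and it is usually what you want with ifelse() too.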

How to mutate some values of a dataframe based on values from another dataframe column with R

if_else does type checks. According to ?if_else:

Compared to the base ifelse(), this function is more strict. It checks that true and false are the same type. This strictness makes the output type more predictable, and makes it somewhat faster.

and a bare NA is of type logical:

typeof(NA)
#[1] "logical"

According to ?NA

NA is a logical constant of length 1 which contains a missing value indicator. NA can be coerced to any other vector type except raw. There are also constants NA_integer_, NA_real_, NA_complex_ and NA_character_ of the other atomic vector types which support missing values: all of these are reserved words in the R language.

We need NA_character_ specifically, because if_else does not coerce NA to the appropriate type (base R ifelse would):

typeof(NA_character_)
#[1] "character"

Therefore, it is better to use the type-matched NA:

library(dplyr)
library(stringr)  # provides str_sub()

df1 %>%
  mutate(x = if_else(str_sub(x, 3, 4) %in% df2$x & year == 2020,
                     NA_character_, x))

ifelse doesn't have that issue, because the NA is automatically converted to NA_character_:

df1 %>%
  mutate(x = ifelse(str_sub(x, 3, 4) %in% df2$x & year == 2020, NA, x))

if_else does not return NA as expected (returns false condition instead)

The %in% operator never returns NA; for an NA element it returns FALSE (unless NA is itself in the lookup set):

test_vector %in% c("1dose", "2dose", "yes")
[1] TRUE TRUE TRUE FALSE FALSE FALSE

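I believe str_detect from stringr is going to give you the behavior you're looking for, since it returns NA for a missing element and if_else passes that through. Note that a vector of patterns is matched element by element rather than as a set, so combine the alternatives into a single regular expression (which matches substrings, unlike %in%):

> if_else(str_detect(test_vector, "1dose|2dose|yes"), "yes", "no")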
[1] "yes" "yes" "yes" "no" "no" NA
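If exact matching is preferred over a regular expression, one option is to keep %in% and restore the missing values in the condition yourself. A sketch, where test_vector is a hypothetical reconstruction because the original vector is not shown:

```r
library(dplyr)

# Hypothetical values; the original test_vector is not shown
test_vector <- c("1dose", "2dose", "yes", "no", "unknown", NA)

# %in% never returns NA, so the missing element would land in the "no" branch
matched <- test_vector %in% c("1dose", "2dose", "yes")

# Put NA back into the condition so if_else() can propagate it
matched[is.na(test_vector)] <- NA

result <- if_else(matched, "yes", "no")
result
```

When the condition is NA, if_else() returns a missing value of the same type as the other branches unless its `missing` argument says otherwise.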

dplyr::if_else - check for condition and insert NA as part of the evaluation

You can also coerce the NA to a Date, e.g. with ymd() from lubridate:

df %>% mutate(sus_date = if_else(status == "Suspended", date, ymd(NA)))
        date    status   sus_date
1 2019-01-01    Active       <NA>
2 2019-01-02 Suspended 2019-01-02
3 2019-01-03    Active       <NA>
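The same idea should work without lubridate, since base R's as.Date(NA) is also a missing value of class Date. A self-contained sketch, with df rebuilt by hand from the output shown above:

```r
library(dplyr)

# Hypothetical reconstruction of the data shown above
df <- data.frame(
  date   = as.Date(c("2019-01-01", "2019-01-02", "2019-01-03")),
  status = c("Active", "Suspended", "Active")
)

# as.Date(NA) is a missing value that already has class Date
out <- df %>%
  mutate(sus_date = if_else(status == "Suspended", date, as.Date(NA)))

class(out$sus_date)
```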

Specify class of NA in R (for if_else, dplyr)

You can use NA_real_, the double version of NA:

if_else(mtcars$cyl > 5, NA_real_, 1)
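For reference, a quick base R check of which type each NA constant carries, so the one matching the other branch can be picked:

```r
# Each typed NA constant reports its own storage type
na_types <- vapply(
  list(NA, NA_integer_, NA_real_, NA_character_),
  typeof,
  character(1)
)
na_types

# The literal 1 in the call above is a double, which is why NA_real_ fits
typeof(1)
```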

Avoiding type conflicts with dplyr::case_when

As said in ?case_when:

All RHSs must evaluate to the same type of vector.

You actually have two possibilities:

1) Create new as a numeric vector

df <- df %>% mutate(new = case_when(old == 1 ~ 5,
                                    old == 2 ~ NA_real_,
                                    TRUE ~ as.numeric(old)))

Note that NA_real_ is the numeric version of NA, and that you must convert old to numeric because you created it as an integer in your original dataframe.

You get:

str(df)
# 'data.frame': 3 obs. of 2 variables:
# $ old: int 1 2 3
# $ new: num 5 NA 3

2) Create new as an integer vector

df <- df %>% mutate(new = case_when(old == 1 ~ 5L,
                                    old == 2 ~ NA_integer_,
                                    TRUE ~ old))

Here, 5L forces 5 into the integer type, and NA_integer_ is the integer version of NA.

So this time new is integer:

str(df)
# 'data.frame': 3 obs. of 2 variables:
# $ old: int 1 2 3
# $ new: int 5 NA 3
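To try both options side by side, here is a self-contained sketch. The data frame is a hypothetical reconstruction based on the str() output above, and the result names as_double and as_integer are only for illustration:

```r
library(dplyr)

# Hypothetical data frame; 1:3 is stored as integer
df <- data.frame(old = 1:3)

# Option 1: every right-hand side is a double
as_double <- df %>%
  mutate(new = case_when(old == 1 ~ 5,
                         old == 2 ~ NA_real_,
                         TRUE ~ as.numeric(old)))

# Option 2: every right-hand side is an integer
as_integer <- df %>%
  mutate(new = case_when(old == 1 ~ 5L,
                         old == 2 ~ NA_integer_,
                         TRUE ~ old))

c(typeof(as_double$new), typeof(as_integer$new))
```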

dplyr if_else() vs base R ifelse()

if_else is more strict. It checks that both alternatives are of the same type and otherwise throws an error, while ifelse will promote types as necessary. This may be a benefit in some circumstances, but may otherwise break scripts if you don't check for errors or explicitly force type conversion. For example:

ifelse(c(TRUE,TRUE,FALSE),"a",3)
[1] "a" "a" "3"
if_else(c(TRUE,TRUE,FALSE),"a",3)
Error: `false` must be type character, not double
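A sketch of the two remedies mentioned above, checking for the error and forcing the conversion explicitly. The fallback string "type mismatch" is just a placeholder for whatever handling a script needs:

```r
library(dplyr)

cond <- c(TRUE, TRUE, FALSE)

# Catch the type error instead of letting it stop the script
res <- tryCatch(
  if_else(cond, "a", 3),
  error = function(e) "type mismatch"
)

# Convert explicitly so that both branches are character
ok <- if_else(cond, "a", as.character(3))
ok
```

Making the conversion explicit also documents the intended output type, which ifelse() leaves implicit.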

