Dealing with TRUE, FALSE, NA and NaN

To answer your questions in order:

1) The == operator does indeed not treat NAs as you might expect: any comparison involving NA returns NA rather than TRUE or FALSE. A very useful workaround is this compareNA function from r-cookbook.com:

compareNA <- function(v1, v2) {
  # This function returns TRUE wherever elements are the same, including NA's,
  # and FALSE everywhere else.
  same <- (v1 == v2) | (is.na(v1) & is.na(v2))
  same[is.na(same)] <- FALSE
  return(same)
}

2) NA stands for "Not Available" and is not the same as the general NaN ("Not a Number"). NA is generally used as a placeholder for missing data; NaNs are normally generated by an undefined numerical operation (taking the log of -1 or similar).

3) I'm not really sure what you mean by "logical things"--many different data types, including numeric vectors, can be used as input to logical operators. You might want to try reading the R logical operators page: http://stat.ethz.ch/R-manual/R-patched/library/base/html/Logic.html.

Hope this helps!
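
As a cross-language aside, the idea behind compareNA carries over to Python, where missing numeric data is usually represented by NaN rather than by a separate NA value. The following is a minimal sketch, not a port from any library: the names is_missing and compare_na, and the choice to treat float NaN as the missing-value marker, are assumptions made for illustration.

```python
import math


def is_missing(x):
    """Return True if x is a float NaN (used here as the missing-value marker)."""
    return isinstance(x, float) and math.isnan(x)


def compare_na(v1, v2):
    """Element-wise equality that treats two missing values as equal.

    Hypothetical helper mirroring the R compareNA function above.
    """
    if len(v1) != len(v2):
        raise ValueError("inputs must have the same length")
    return [
        (is_missing(a) and is_missing(b)) or (a == b)
        for a, b in zip(v1, v2)
    ]


nan = float("nan")
result = compare_na([1.0, nan, 3.0, nan], [1.0, nan, 4.0, 2.0])
print(result)  # [True, True, False, False]
```

Unlike the R version, this sketch refuses inputs of different lengths instead of recycling the shorter one.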

How to process NA as False in R

I think the most useful approach is to use dplyr's case_when function and explicitly state how the NA cases you mention should be handled.

Replicating your example (notice that I'm explicitly setting the NAs here; your NAs were the result of R not being able to handle a character string ("NA") within a numeric vector):

col1 = as.numeric(c(10, 2, 15, 2, NA_real_, 15))
col2 = as.numeric(c(15, 15, 2, 2, 15, NA_real_))
test <- data.frame(col1, col2)

Both mutate and case_when come from dplyr, so I'm loading it. If you're not familiar with case_when, it's like an ifelse with multiple conditions. Each condition is followed by a tilde ("~"), and what comes after the tilde is the value assigned when the condition is met. Conditions are checked in order, and the first one that evaluates to TRUE wins. To set "everything else" to some value X, you write TRUE ~ "X", which is always true and therefore catches every case not matched by an earlier condition.

This should do what you want:

library(dplyr)

test <- mutate(.data = test,
               G5 = case_when(col1 > 5 & col2 > 5 ~ "Yes",  # Original
                              (is.na(col1) & col2 > 5) | (col1 > 5 & is.na(col2)) ~ "Yes",
                              TRUE ~ "No"))  # Everything else gets the value "No"


test
#>   col1 col2  G5
#> 1   10   15 Yes
#> 2    2   15  No
#> 3   15    2  No
#> 4    2    2  No
#> 5   NA   15 Yes
#> 6   15   NA Yes

Is NaN falsy? Why NaN === false returns false

  1. Falsy and being strictly equal to false are very different things, that's why one has a y instead of an e. ;)
  2. NaN is specified to never be equal to anything, including itself. The second part of your question compares false === false, which is, funnily enough, true :)

If you really want to know if something is NaN, you can use Object.is(). Running Object.is(NaN, NaN) returns true.

Preserve NaN values in pandas boolean comparisons

Let's use np.logical_and:

import numpy as np
import pandas as pd
df = pd.DataFrame({'A': [True, True, False, True, np.nan, np.nan],
                   'B': [True, False, True, np.nan, np.nan, False]})

s = np.logical_and(df['A'],df['B'])
print(s)

Output:

0     True
1    False
2    False
3      NaN
4      NaN
5    False
Name: A, dtype: object
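
The values in that output are the same ones plain Python `and` would produce for each pair, which is one way to read the result. The sketch below uses only the standard library and illustrates that reading; it is not a description of NumPy's internals, and the helper name py_and is made up for this example.

```python
import math

nan = float("nan")

# The same six (A, B) pairs as in the DataFrame above.
pairs = [
    (True, True),
    (True, False),
    (False, True),
    (True, nan),
    (nan, nan),
    (nan, False),
]


def py_and(a, b):
    """Python's `and`: returns a if a is falsy, otherwise b."""
    return a and b


results = [py_and(a, b) for a, b in pairs]
print(results)  # [True, False, False, nan, nan, False]

# NaN is truthy, so a (NaN, True) pair would yield True rather than NaN.
print(py_and(nan, True))  # True
```

Under this reading, NaN survives only when it is the right-hand value or when both values are NaN. If a (NaN, True) pair should also stay NaN, that case probably needs to be handled explicitly.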

Why do Not a Number values equal True when cast as boolean in Python/Numpy?

This is in no way NumPy-specific, but is consistent with how Python treats NaNs:

In [1]: bool(float('nan'))
Out[1]: True

The rules are spelled out in the documentation.

I think it could be reasonably argued that the truth value of NaN should be False. However, this is not how the language works right now.
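
A short standard-library sketch of what this means in practice: NaN is truthy, it never compares equal (even to itself), and math.isnan is the reliable test. The variable names are only illustrative.

```python
import math

nan = float("nan")

print(bool(nan))        # True: any non-zero float is truthy
print(bool(0.0))        # False: zero is the only falsy float
print(nan == nan)       # False: NaN never compares equal, even to itself
print(nan != nan)       # True
print(math.isnan(nan))  # True: the reliable way to detect NaN

# Treat NaN as missing explicitly instead of relying on truthiness.
values = [1.5, nan, 0.0, 2.0]
present = [v for v in values if not math.isnan(v)]
print(present)  # [1.5, 0.0, 2.0]
```

Filtering with math.isnan keeps the legitimate 0.0, which a plain truthiness check would have dropped while letting NaN through.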

Why is NAN unequal to everything except true, in PHP?

NAN (quiet NAN or signaling NAN) is a non-zero floating point value. That is why it compares equal to true.

sqrt(-1.0) -> NAN

There are -NAN and +NAN, although since about the 80286 both are usually just reported as NAN when tested.

Check your FPU floating point instruction set if you need to.

+INF and -INF are also non-zero floating point values:

-log(0.0) -> +INF
 log(0.0) -> -INF

Here is a dump of the Intel floating point stack. I'll just list the few values I was talking about (don't forget, internally, an FPU register is 10 bytes):

      <exp>  <mantissa>
 0.0  00 00  00 00 00 00 00 00 00 00
-INF  FF FF  80 00 00 00 00 00 00 00
+INF  7F FF  80 00 00 00 00 00 00 00
-NAN  FF FF  C0 00 00 00 00 00 00 00
+NAN  7F FF  C0 00 00 00 00 00 00 00

So as you can see, only 0.0 is ZERO!
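
The same point can be checked from Python, as a rough sketch. Note that Python floats are 64-bit IEEE 754 doubles, not the 80-bit x87 format shown above, so the layout differs (11 exponent bits, 52 mantissa bits, no explicit integer bit). The helper name fields is made up for this example.

```python
import struct


def fields(x):
    """Split a 64-bit double into (sign, exponent, mantissa) bit fields."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    return sign, exponent, mantissa


for name, value in [("0.0", 0.0), ("+INF", float("inf")),
                    ("-INF", float("-inf")), ("NAN", float("nan"))]:
    sign, exponent, mantissa = fields(value)
    # INF and NAN both have an all-ones exponent; NAN has a non-zero mantissa.
    print(f"{name:>4} sign={sign} exp={exponent:#05x} "
          f"mantissa_nonzero={mantissa != 0} truthy={bool(value)}")
```

Only 0.0 has every field equal to zero, and it is the only one of the four values that is falsy when cast to a boolean.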


