Comparing NumPy Arrays Containing NaN

Comparing numpy arrays containing NaN

Alternatively, you can wrap numpy.testing.assert_equal or numpy.testing.assert_array_equal in a try/except:

In : import numpy as np

In : def nan_equal(a, b):
...:     try:
...:         np.testing.assert_equal(a, b)
...:     except AssertionError:
...:         return False
...:     return True

In : a = np.array([1, 2, np.nan])

In : b = np.array([1, 2, np.nan])

In : nan_equal(a,b)
Out: True

In : a = np.array([1, 2, np.nan])

In : b = np.array([3, 2, np.nan])

In : nan_equal(a,b)
Out: False

Edit

Since you are using this for unit testing, a bare assertion (instead of wrapping it to get True/False) might be more natural.
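In a test suite, the bare assertion might look like the following minimal sketch. The class and method names are hypothetical, chosen only for illustration; np.testing.assert_array_equal treats NaNs in matching positions as equal.

```python
import unittest

import numpy as np


class TestNanArrays(unittest.TestCase):  # hypothetical test case name
    def test_arrays_with_matching_nans_are_equal(self):
        a = np.array([1, 2, np.nan])
        b = np.array([1, 2, np.nan])
        # Passes: NaNs in the same positions compare as equal here.
        np.testing.assert_array_equal(a, b)

    def test_arrays_with_different_values_fail(self):
        a = np.array([1, 2, np.nan])
        b = np.array([3, 2, np.nan])
        # The assertion raises AssertionError because 1 != 3.
        with self.assertRaises(AssertionError):
            np.testing.assert_array_equal(a, b)
```

Run it with python -m unittest as usual; a failing assert_array_equal prints the mismatching elements, which a True/False wrapper would hide.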

Python\Numpy: Comparing arrays with NAN

Since a and b are lists, a == b isn't returning an array, and so your numpy-like logic won't work:

>>> a == b
False

The command you've quoted only works if they're arrays:

>>> a,b = np.asarray(a), np.asarray(b)
>>> a == b
array([ True, False], dtype=bool)
>>> (a == b) | (np.isnan(a) & np.isnan(b))
array([ True, True], dtype=bool)
>>> ((a == b) | (np.isnan(a) & np.isnan(b))).all()
True

which should work to compare two arrays (either they're both equal or they're both NaN).
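As a sketch of the same idea with made-up sample arrays: the explicit expression can be checked against np.array_equal with equal_nan=True. The equal_nan parameter is an assumption about your environment, since it needs NumPy 1.19 or later.

```python
import numpy as np

# Hypothetical sample data with a NaN in the same position of both arrays.
a = np.array([1.0, np.nan])
b = np.array([1.0, np.nan])

# Element-wise: values are equal, or both entries are NaN.
same = (a == b) | (np.isnan(a) & np.isnan(b))
explicit_result = bool(same.all())

# Built-in equivalent (assumes NumPy >= 1.19, where equal_nan was added).
builtin_result = bool(np.array_equal(a, b, equal_nan=True))

# Without equal_nan the NaN entries make the arrays compare unequal.
plain_result = bool(np.array_equal(a, b))
```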

How to compare two numpy arrays with some NaN values?

You can use masked arrays, which have the behaviour you're asking for when combined with np.all:

zm = np.ma.masked_where(np.isnan(z), z)

np.all(x == zm) # returns True
np.all(y == zm) # returns False
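A self-contained version of the masked-array approach might look like this. The arrays x, y and z are hypothetical stand-ins for the ones in the question, which are not shown here.

```python
import numpy as np

# Hypothetical sample data: z has a NaN where x and y hold ordinary values.
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 4.0])
z = np.array([1.0, np.nan, 3.0])

# Mask the NaN entries of z so they are skipped by the reduction.
zm = np.ma.masked_where(np.isnan(z), z)

x_matches = bool(np.all(x == zm))  # only unmasked positions are compared
y_matches = bool(np.all(y == zm))  # the last element differs (4.0 vs 3.0)
```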

Or you could just write out your logic explicitly, noting that numpy has to use | instead of or, and the difference in operator precedence that results:

def func(a, b):
    return np.all((a == b) | np.isnan(a) | np.isnan(b))
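A runnable sketch of the explicit version follows; the function name equal_ignoring_nan and the sample values are illustrative assumptions. Because | binds more tightly than ==, the parentheses around a == b are required.

```python
import numpy as np


def equal_ignoring_nan(a, b):
    """Return True if a and b match wherever neither of them is NaN."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Without the parentheses, `a == b | np.isnan(a)` would be parsed as
    # `a == (b | np.isnan(a))`, which is not the intended logic.
    return bool(np.all((a == b) | np.isnan(a) | np.isnan(b)))


# A NaN in either array makes that position count as a match.
ignored = equal_ignoring_nan([1.0, np.nan, 3.0], [1.0, 2.0, np.nan])
# An ordinary mismatch is still detected.
mismatch = equal_ignoring_nan([1.0, 2.0], [1.0, 3.0])
```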

How to compare numpy arrays ignoring nans?

Use np.allclose and np.isnan:

mask = ~(np.isnan(a) | np.isnan(b))
np.allclose(a[mask], b[mask])

This correctly handles +/- inf and allows for small differences. Absolute and relative tolerances can be specified as parameters to allclose.
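For illustration, here is the same approach on made-up data that includes a tiny rounding difference, an infinity, and NaNs in different positions. The tolerance values are arbitrary choices for the example.

```python
import numpy as np

# Hypothetical sample data.
a = np.array([1.0, np.nan, np.inf, 2.0])
b = np.array([1.0 + 1e-9, 5.0, np.inf, np.nan])

# Keep only the positions where neither array has a NaN.
mask = ~(np.isnan(a) | np.isnan(b))

# Default tolerances accept the 1e-9 difference; inf matches inf.
close_default = bool(np.allclose(a[mask], b[mask]))

# Tighter tolerances reject the same difference.
close_strict = bool(np.allclose(a[mask], b[mask], rtol=0.0, atol=1e-12))
```

Note that allclose also has an equal_nan parameter, but that makes NaNs compare equal to each other rather than ignoring them, so it answers a different question.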

Comparing NumPy arrays so that NaNs compare equal

If you really care about memory use (e.g. have very large arrays), then you should use numexpr and the following expression will work for you:

np.all(numexpr.evaluate('(a==b)|((a!=a)&(b!=b))'))

I've tested it on very big arrays with length of 3e8, and the code has the same performance on my machine as

np.all(a==b)

and uses the same amount of memory.
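The a!=a part of that expression relies on NaN being the only value that is unequal to itself. A plain-NumPy sketch of the same logic on made-up data is shown below; unlike numexpr, NumPy allocates a temporary array for each sub-expression, so this only illustrates the logic, not the memory behaviour.

```python
import numpy as np

# Hypothetical sample data.
a = np.array([1.0, np.nan, 3.0])
b = np.array([1.0, np.nan, 4.0])

# NaN != NaN, so (a != a) marks exactly the NaN positions.
nan_in_a = a != a
matches_isnan = bool(np.array_equal(nan_in_a, np.isnan(a)))

# Same logic as the numexpr string '(a==b)|((a!=a)&(b!=b))'.
same = (a == b) | ((a != a) & (b != b))
all_same = bool(np.all(same))
```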

inequality comparison of numpy array with nan to a scalar

Any comparison (other than !=) of a NaN to a non-NaN value will always return False:

>>> x < -1000
array([False, False, False, True, False, False], dtype=bool)

So you can simply ignore the fact that there are NaNs already in your array and do:

>>> x[x < -1000] = np.nan
>>> x
array([ nan, 1., 2., nan, nan, 5.])
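A self-contained sketch with a hypothetical x that reproduces the output above. The np.errstate context is an optional precaution: it suppresses any "invalid value" floating-point warning, in case your NumPy version emits one for comparisons involving NaN.

```python
import numpy as np

# Hypothetical input: two NaNs already present, one value below -1000.
x = np.array([np.nan, 1.0, 2.0, -2000.0, np.nan, 5.0])

# Comparisons with NaN are False, so existing NaNs are simply not selected.
with np.errstate(invalid="ignore"):
    below = x < -1000

x[below] = np.nan
nan_positions = np.isnan(x).tolist()
```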

EDIT: I didn't see any warning when I ran the above, but if you really need to keep the NaNs out of the comparison, you can do something like:

mask = ~np.isnan(x)
mask[mask] &= x[mask] < -1000
x[mask] = np.nan
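Wrapped in a function, the masked variant could look like this sketch. The name nan_below and its threshold parameter are hypothetical; the function works on a copy so the caller's array is left unchanged.

```python
import numpy as np


def nan_below(x, threshold):
    """Return a copy of x with values below threshold replaced by NaN."""
    x = np.array(x, dtype=float)  # np.array copies the input
    mask = ~np.isnan(x)  # True where x holds a real number
    # Only the non-NaN entries take part in the comparison.
    mask[mask] &= x[mask] < threshold
    x[mask] = np.nan
    return x


cleaned = nan_below([np.nan, 1.0, 2.0, -2000.0, np.nan, 5.0], -1000)
```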

Compare two unequal size numpy arrays and fill the exclusion elements with nan

Here is my solution, assuming the first array is always bigger than the second (see the code comments for the general solution, e.g. when the second array is bigger along some dimension):

import numpy as np

a = np.arange(18).reshape(6, 3) # 6x3 array
b = np.arange(4).reshape(2, 2) # 2x2 array

# create a result array filled with `nan` values
# in the general case, the desired shape is
# np.max([a.shape, b.shape], axis=0)
result = np.full(a.shape, np.nan)

# the selection has the shape of the smaller array
# in the general case:
# tuple(map(slice, np.min([a.shape, b.shape], axis=0)))
selection = (slice(b.shape[0]), slice(b.shape[1]))

# compare values within the selection
result[selection] = a[selection] == b[selection]
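Putting the general-case comments together gives roughly the following sketch. The function name compare_padded is hypothetical, and it assumes both arrays have the same number of dimensions.

```python
import numpy as np


def compare_padded(a, b):
    """Compare the overlapping region of a and b; fill the rest with NaN.

    Assumes a and b have the same number of dimensions.
    """
    # The result is as large as the larger array along every axis.
    out_shape = tuple(int(n) for n in np.max([a.shape, b.shape], axis=0))
    result = np.full(out_shape, np.nan)

    # The overlap is as large as the smaller array along every axis.
    overlap = np.min([a.shape, b.shape], axis=0)
    selection = tuple(slice(int(n)) for n in overlap)

    # True/False are stored as 1.0/0.0 in the float result.
    result[selection] = a[selection] == b[selection]
    return result


a = np.arange(6).reshape(2, 3)  # wider than b
b = np.arange(6).reshape(3, 2)  # taller than a
result = compare_padded(a, b)
```

Here the overlap is the top-left 2x2 block, and the third row and third column of the 3x3 result stay NaN.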

NaNs comparing equal in Numpy

On newer versions of numpy you get this warning:

FutureWarning: numpy equal will not check object identity in the future. The comparison did not return the same result as suggested by the identity (`is`)) and will change.

My guess is that NumPy uses an identity test as a shortcut for object types before falling back to the __eq__ test, and since

>>> id(np.nan) == id(np.nan)
True

it returns True.

If you use float('nan') instead of np.nan, the result is different:

>>> a = np.array([np.nan], dtype=object)
>>> b = np.array([float('nan')], dtype=object)
>>> a == b
array([False], dtype=bool)
>>> id(np.nan) == id(float('nan'))
False
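Python's own lists apply a similar identity shortcut before calling __eq__, which gives a version-independent way to see the effect described above. This sketch only illustrates that shortcut with plain Python containers; it does not assert anything about how a particular NumPy version compares object arrays.

```python
import numpy as np

nan_a = float("nan")
nan_b = float("nan")  # a second, distinct NaN object

# np.nan is a single module-level object, so it is identical to itself.
same_object = np.nan is np.nan
distinct_objects = nan_a is nan_b

# By value, NaN never equals NaN ...
value_equal = np.nan == np.nan

# ... but list comparison checks identity first, so the results differ.
list_same_object = [np.nan] == [np.nan]
list_distinct_objects = [nan_a] == [nan_b]
```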

