NumPy: Checking If a Value Is NaT

How to check whether an arbitrary Pandas/NumPy object contains, or is, NaT/NaN/Null

Wrap x in a list, pass it to pd.notna, and chain .any(). This works because pd.notna returns a NumPy ndarray, so the any is actually ndarray.any. When ndarray.any is called without an axis argument, it checks across all dimensions, so the same expression works whether x is a list or a single value.

x = [1,2,3,pd.NaT]

In [369]: pd.notna([x])
Out[369]: array([[ True,  True,  True, False]]) # it is a 2-D array

In [370]: type(pd.notna([x]))
Out[370]: numpy.ndarray

In [373]: pd.notna([x]).any() # `ndarray.any` checks across all dimensions of this 2-D array
Out[373]: True

In [374]: pd.notna([x]).all() # `ndarray.all` checks across all dimensions of this 2-D array
Out[374]: False

When x is a single pd.NaT:

x = pd.NaT

In [377]: pd.notna([x])
Out[377]: array([False]) # it is a 1-D array

In [378]: pd.notna([x]).any()
Out[378]: False

In [379]: pd.notna([x]).all()
Out[379]: False
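To package the idea, a small helper along these lines (my own sketch, not a pandas function) covers both the scalar and the list case:

import pandas as pd

def contains_missing(x):
    # True if x is, or contains, NaT/NaN/None -- works for a scalar or a list-like x.
    return not pd.notna([x]).all()

contains_missing(pd.NaT)              # True
contains_missing([1, 2, 3, pd.NaT])   # True
contains_missing([1, 2, 3])           # False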

NumPy - Testing equality including np.nan, np.nat, np.NZERO and np.PZERO in a vectorized way

It seems comparing the underlying view does exactly what I want:

import numpy as np

def compare(x, y):
    # Compare element-wise on the raw bytes, so NaN matches NaN, NaT matches NaT,
    # and +0.0 does not match -0.0.
    x, y = np.broadcast_arrays(x, y)
    dtx = x.dtype
    dty = y.dtype
    if dtx != dty:
        return np.zeros(x.shape, dtype=bool)
    xv = x.view((np.uint8, x.itemsize))
    yv = y.view((np.uint8, y.itemsize))
    return np.all(xv == yv, axis=-1)
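A quick usage sketch (same-shape inputs, so the broadcast views stay contiguous; the expected results are noted in comments):

import numpy as np

a = np.array([np.nan, 0.0, 1.5])
b = np.array([np.nan, -0.0, 1.5])
compare(a, b)    # array([ True, False,  True]) -- NaN matches NaN, but +0.0 != -0.0

t1 = np.array(['2013-01-01', 'NaT'], dtype='datetime64[ns]')
t2 = np.array(['2013-01-01', 'NaT'], dtype='datetime64[ns]')
compare(t1, t2)  # array([ True,  True]) -- NaT matches NaT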

Efficiently checking if arbitrary object is NaN in Python / numpy / pandas?

pandas.isnull() (also available as pd.isna() in newer versions) checks for missing values in both numeric and string/object arrays. From the documentation, it checks for:

NaN in numeric arrays, None/NaN in object arrays

Quick example:

import pandas as pd
import numpy as np
s = pd.Series(['apple', np.nan, 'banana'])
pd.isnull(s)
Out[9]:
0    False
1     True
2    False
dtype: bool

The idea of using numpy.nan to represent missing values is something that pandas introduced, which is why pandas has the tools to deal with it.
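Note that pd.isnull / pd.isna also accept a single scalar, which is handy when you only have one value to test:

pd.isnull(pd.NaT)    # True
pd.isnull(np.nan)    # True
pd.isnull(None)      # True
pd.isnull('apple')   # False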

Datetimes too (if you use pd.NaT you won't need to specify the dtype)

In [24]: s = pd.Series([pd.Timestamp('20130101'), np.nan, pd.Timestamp('20130102 9:30')], dtype='M8[ns]')

In [25]: s
Out[25]:
0   2013-01-01 00:00:00
1                   NaT
2   2013-01-02 09:30:00
dtype: datetime64[ns]

In [26]: pd.isnull(s)
Out[26]:
0    False
1     True
2    False
dtype: bool
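For the variant hinted at above, building the Series with pd.NaT directly lets pandas infer the datetime64[ns] dtype (a small sketch, not from the original answer):

s = pd.Series([pd.Timestamp('20130101'), pd.NaT, pd.Timestamp('20130102 9:30')])
s.dtype        # dtype('<M8[ns]'), i.e. datetime64[ns]
pd.isnull(s)   # same False / True / False pattern as above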

Check if NaT changes to datetime and update value

Use np.where after coercing the dates to datetime.

import numpy as np
import pandas as pd

df_1['date'] = pd.to_datetime(df_1['date'])
df_2['date'] = pd.to_datetime(df_2['date'])
df = pd.merge(df_2, df_1, how='left', on='order_id', suffixes=('_left', ''))
# Take date_left when date is NaT or older than date_left.
keep_left = df['date'].isna() | df['date_left'].sub(df['date']).dt.days.gt(0)
df = df.assign(date=np.where(keep_left, df['date_left'], df['date'])).drop(columns='date_left')



   order_id       date
0       123 2020-01-02
1       456 2021-01-01
2       789 2020-10-11
3       135 2020-06-01
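The answer does not show the input frames; hypothetical df_1/df_2 along these lines (illustrative data only, not the asker's original) will run through the snippet above:

import pandas as pd

df_1 = pd.DataFrame({'order_id': [123, 456],
                     'date': ['2020-01-02', None]})          # dates to prefer when valid
df_2 = pd.DataFrame({'order_id': [123, 456],
                     'date': ['2020-01-01', '2021-01-01']})   # fallback ("left") dates

# After the merge/np.where steps above, order 123 keeps 2020-01-02 (not NaT and
# not older than date_left) and order 456 falls back to 2021-01-01 (date is NaT).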

Checking for both NaT or pandas timestamp

You can use the isna or fillna method on it:

import pandas as pd
import numpy as np

time = pd.Series(['2017-12-02 20:40:30','2017-12-02 00:00:00',np.nan])
time = time.apply(lambda x: pd.Timestamp(x))
print(time)
0   2017-12-02 20:40:30
1   2017-12-02 00:00:00
2                   NaT


time.isna()

0    False
1    False
2     True


time.fillna("missing")

0    2017-12-02 20:40:30
1    2017-12-02 00:00:00
2                missing
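As a side note, pd.to_datetime does the same conversion in a single vectorized call (an equivalent alternative to the apply above):

time = pd.to_datetime(pd.Series(['2017-12-02 20:40:30', '2017-12-02 00:00:00', np.nan]))
time.isna()   # same False / False / True result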

How to properly declare 'NaT' in a python function to be applied on a pandas dataframe?

The code below does what you want. Also, I made some changes to your code.

def some_fun(x):
    if pd.isnull(x):
        return 'something else'
    else:
        return 'something'

df['new_col'] = [some_fun(x) for x in df['date']]

Unfortunately, np.isnat() failed in my code, so I used pd.isnull() instead, according to this answer. If you think np.isnat() will work for you, use it.
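For context, np.isnat only accepts np.datetime64 / np.timedelta64 values and raises a TypeError on anything else, which is the usual reason it fails on mixed or object data (a quick sketch, not from the original answer):

import numpy as np

np.isnat(np.datetime64('NaT'))          # True
np.isnat(np.datetime64('2020-01-01'))   # False
# np.isnat(None)                        # TypeError -- hence the pd.isnull() fallback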
