Most Efficient Way to Forward-Fill NaN Values in NumPy Array

Most efficient way to forward-fill NaN values in numpy array

Here's one approach -

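import numpy as np

mask = np.isnan(arr)
# Column index of each valid entry, 0 where the entry is NaN
idx = np.where(~mask, np.arange(mask.shape[1]), 0)
# Running maximum = index of the most recent valid entry in each row
np.maximum.accumulate(idx, axis=1, out=idx)
out = arr[np.arange(idx.shape[0])[:, None], idx]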

If you don't want to create another array and just fill the NaNs in arr itself, replace the last step with this -

arr[mask] = arr[np.nonzero(mask)[0], idx[mask]]

Sample input, output -

In [179]: arr
Out[179]:
array([[ 5., nan, nan, 7., 2., 6., 5.],
       [ 3., nan, 1., 8., nan, 5., nan],
       [ 4., 9., 6., nan, nan, nan, 7.]])

In [180]: out
Out[180]:
array([[ 5., 5., 5., 7., 2., 6., 5.],
       [ 3., 3., 1., 8., 8., 5., 5.],
       [ 4., 9., 6., 6., 6., 6., 7.]])
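
For reference, here is the same approach wrapped in a function and run on the sample input. This is only a sketch: the name `ffill_rows` is made up for illustration, and it assumes a 2D float array. Note that a NaN at the start of a row has nothing to its left, so it stays NaN.

```python
import numpy as np

def ffill_rows(arr):
    """Forward-fill NaNs along each row of a 2D float array; returns a new array."""
    mask = np.isnan(arr)
    # Column index of each valid entry, 0 where the entry is NaN
    idx = np.where(~mask, np.arange(mask.shape[1]), 0)
    # Running maximum = index of the most recent valid entry in each row
    np.maximum.accumulate(idx, axis=1, out=idx)
    return arr[np.arange(idx.shape[0])[:, None], idx]

nan = np.nan
arr = np.array([[5, nan, nan, 7, 2, 6, 5],
                [3, nan, 1, 8, nan, 5, nan],
                [4, 9, 6, nan, nan, nan, 7]], dtype=float)
out = ffill_rows(arr)

# A leading NaN has no earlier value to copy, so it is left as NaN
lead = ffill_rows(np.array([[nan, 1.0, nan]]))
```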

Propagate/forward-fill nan values in numpy array

NumPy has no built-in function for this. The simple loop below produces the desired result using only NumPy arrays. It fills each NaN from the row above, i.e. it forward-fills down each column.

row, col = arr.shape
mask = np.isnan(arr)
for i in range(1, row):
    for j in range(col):
        if mask[i, j]:
            arr[i, j] = arr[i - 1, j]
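
If the Python loop is too slow on large arrays, the index-accumulation trick can be applied along axis 0 instead. The sketch below is not part of the original answer; `ffill_cols` and `ffill_cols_loop` are illustrative names, and the loop version is included only to check that both give the same result on a small input.

```python
import numpy as np

def ffill_cols_loop(arr):
    """Loop version: fill each NaN from the row above (works on a copy)."""
    arr = arr.copy()
    row, col = arr.shape
    mask = np.isnan(arr)
    for i in range(1, row):
        for j in range(col):
            if mask[i, j]:
                arr[i, j] = arr[i - 1, j]
    return arr

def ffill_cols(arr):
    """Vectorized version: forward-fill NaNs down each column."""
    mask = np.isnan(arr)
    # Row index of each valid entry, 0 where the entry is NaN
    idx = np.where(~mask, np.arange(mask.shape[0])[:, None], 0)
    np.maximum.accumulate(idx, axis=0, out=idx)
    return arr[idx, np.arange(idx.shape[1])]

nan = np.nan
arr = np.array([[1, nan, 3],
                [nan, 5, nan],
                [nan, nan, 9]], dtype=float)
filled = ffill_cols(arr)
```

In both versions a NaN in the first row is left untouched, since there is no row above it to copy from.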

numpy forward fill with condition

Here's one approach that takes the forward-fill window as a parameter, to handle generic cases -

# https://stackoverflow.com/a/33893692/ @Divakar
def numpy_binary_closing(mask, W):
    # Define kernel
    K = np.ones(W)

    # Perform dilation and threshold at 1
    dil = np.convolve(mask, K) >= 1

    # Perform erosion on the dilated mask array and threshold at given threshold
    dil_erd = np.convolve(dil, K) >= W
    return dil_erd[W-1:-W+1]

def ffill_windowed(a, W):
    mask = a != 0
    mask_ext = numpy_binary_closing(mask, W)

    p = mask_ext & ~mask
    idx = np.maximum.accumulate(mask * np.arange(len(mask)))
    out = a.copy()
    out[p] = out[idx[p]]
    return out

Explanation: The first part performs a binary closing, an operation that is well explored in the image-processing domain. In our case, we start with a mask of non-zeros and close it based on the window parameter. Next, we get the forward-filled indices (a technique explored in this post) at all the places that need filling. Finally, we put in new values at the positions that the closed mask added. That's all there is!

Sample runs -

In [142]: a
Out[142]: array([2, 0, 3, 0, 0, 4, 0, 0, 0, 5, 0])

In [143]: ffill_windowed(a, W=2)
Out[143]: array([2, 2, 3, 0, 0, 4, 0, 0, 0, 5, 0])

In [144]: ffill_windowed(a, W=3)
Out[144]: array([2, 2, 3, 3, 3, 4, 0, 0, 0, 5, 0])

In [146]: ffill_windowed(a, W=4)
Out[146]: array([2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 0])
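
Going by the sample runs, the effect is that a run of zeros between two non-zero values is filled only when the run is shorter than W, and trailing zeros are never filled. The loop below is a plain reading of that behaviour, not the original author's code; `ffill_short_gaps` is a hypothetical name, and it has only been checked against the three sample runs above, not proven equivalent to the closing-based version in general.

```python
import numpy as np

def ffill_short_gaps(a, W):
    """Fill zero-runs lying between two non-zeros when the run is shorter than W."""
    out = a.copy()
    nz = np.flatnonzero(a != 0)          # positions of the non-zero entries
    for start, stop in zip(nz[:-1], nz[1:]):
        gap = stop - start - 1           # number of zeros between the two non-zeros
        if gap < W:
            out[start + 1:stop] = a[start]
    return out

a = np.array([2, 0, 3, 0, 0, 4, 0, 0, 0, 5, 0])
r2 = ffill_short_gaps(a, W=2)
r3 = ffill_short_gaps(a, W=3)
r4 = ffill_short_gaps(a, W=4)
```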

Filling parts of a list without a loop

Array a is the result of a forward fill, and array b holds the position of each element within the range that starts at the most recent non-zero element.

pandas has a forward fill function, but it should be easy enough to compute with numpy and there are many sources on how to do this.

import numpy as np

ll = [7.2, 0, 0, 0, 0, 0, 6.5, 0, 0, -8.1, 0, 0, 0, 0]
a = np.array(ll)

# find zero elements and associated index
mask = a == 0
idx = np.where(~mask, np.arange(mask.size), 0)

# do the fill
a[np.maximum.accumulate(idx)]

output:

array([ 7.2,  7.2,  7.2,  7.2,  7.2,  7.2,  6.5,  6.5,  6.5, -8.1, -8.1,
       -8.1, -8.1, -8.1])

More information about forward fill is found here:

  • Most efficient way to forward-fill NaN values in numpy array
  • Finding the consecutive zeros in a numpy array

To compute array b, you could take the forward-filled indices and subtract them from a single np.arange:

fill_mask = np.maximum.accumulate(idx)
np.arange(len(fill_mask)) - fill_mask

output:

array([0, 1, 2, 3, 4, 5, 0, 1, 2, 0, 1, 2, 3, 4])
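
Putting both steps together, a small helper could return a and b in one call. This is a sketch only; `ffill_with_offsets` is a made-up name, and it assumes the first element is non-zero (leading zeros would stay zero, with offsets counted from index 0).

```python
import numpy as np

def ffill_with_offsets(values):
    """Return (forward-filled array, distance to the most recent non-zero)."""
    a = np.asarray(values, dtype=float)
    mask = a == 0
    # Index of each non-zero entry, 0 where the entry is zero
    idx = np.where(~mask, np.arange(a.size), 0)
    # Index of the most recent non-zero entry at every position
    last = np.maximum.accumulate(idx)
    return a[last], np.arange(a.size) - last

ll = [7.2, 0, 0, 0, 0, 0, 6.5, 0, 0, -8.1, 0, 0, 0, 0]
filled, offsets = ffill_with_offsets(ll)
```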

Numpy: nanargmin select index 0 if row contains all NaN

Fill the NaNs with infinity using numpy.nan_to_num, then get the argmin:

np.argmin(np.nan_to_num(matrix, nan=float('inf')), axis=1)

output: array([0, 0, 0])
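
The matrix from the question is not shown here, so the example below uses a made-up one with an all-NaN row and two mixed rows. It is meant only to illustrate the behaviour: np.nanargmin raises a ValueError on an all-NaN row, while the nan_to_num version returns 0 for that row and the usual NaN-ignoring argmin for the others. The `nan` keyword of np.nan_to_num needs NumPy 1.17 or later.

```python
import numpy as np

# Hypothetical input: first row is all NaN, the others are mixed
matrix = np.array([[np.nan, np.nan, np.nan],
                   [3.0, np.nan, 1.0],
                   [np.nan, 2.0, 5.0]])

# NaN -> +inf, so NaNs can never win the argmin unless the whole row is NaN
result = np.argmin(np.nan_to_num(matrix, nan=np.inf), axis=1)
```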

Better way to forward fill a DataFrame/array with calculations?

setup

import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.ones((5, 10), dtype=int) * 29,
    index=pd.date_range('2016-09-19', periods=5, freq='H'),
    columns=range(10))

df

solution

constant = 2
p = np.power(constant, np.arange(1, df.values.shape[1]))
df.iloc[:, 1:] = p * (1 + df.values[:, [0]]) + np.append(0, p[:-1].cumsum())

df
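
The original question is not reproduced here, so the following is an inference from the formula rather than something the source states: algebraically, the closed form matches the column-by-column recurrence x[j] = constant * (x[j-1] + 1), starting from the first column. The sketch below checks that with plain NumPy on the same 5x10 setup (all first-column values equal to 29); the variable names are illustrative.

```python
import numpy as np

constant = 2
n_cols = 10
first = np.full(5, 29)   # first column of the setup frame

# Closed form from the solution, giving columns 1..n_cols-1
p = np.power(constant, np.arange(1, n_cols))
closed = p * (1 + first[:, None]) + np.append(0, p[:-1].cumsum())

# Inferred recurrence, computed one column at a time
loop = np.empty((first.size, n_cols), dtype=np.int64)
loop[:, 0] = first
for j in range(1, n_cols):
    loop[:, j] = constant * (loop[:, j - 1] + 1)

same = np.array_equal(closed, loop[:, 1:])
```

The closed form avoids the Python loop over columns, which is the point of the vectorized solution.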
