Forward Fill Specific Columns in Pandas Dataframe

tl;dr:

cols = ['X', 'Y']
df.loc[:,cols] = df.loc[:,cols].ffill()

Here is a self-contained example:

>>> import pandas as pd
>>> import numpy as np
>>> 
>>> ## create dataframe
... ts1 = [0, 1, np.nan, np.nan, np.nan, np.nan]
>>> ts2 = [0, 2, np.nan, 3, np.nan, np.nan]
>>> d =  {'X': ts1, 'Y': ts2, 'Z': ts2}
>>> df = pd.DataFrame(data=d)
>>> print(df.head())
    X   Y   Z
0   0   0   0
1   1   2   2
2 NaN NaN NaN
3 NaN   3   3
4 NaN NaN NaN
>>> 
>>> ## apply forward fill
... cols = ['X', 'Y']
>>> df.loc[:,cols] = df.loc[:,cols].ffill()
>>> print(df.head())
   X  Y   Z
0  0  0   0
1  1  2   2
2  1  2 NaN
3  1  3   3
4  1  3 NaN

Forward fill on specific column for specific rows

df = df.replace('na', np.nan)
df['num2'] = df.groupby('Color')['num2'].ffill()

Output:

>>> df
    Color  num1 num2
0     red     1    2
1     red     1    2
2    blue     2  NaN
3    blue     2    3
4  yellow     1    4
5  yellow     1    4
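
The snippet above does not show its input frame. Here is a minimal runnable sketch, assuming a hypothetical input reconstructed to match the output shown. It swaps df.replace('na', np.nan) for pd.to_numeric(..., errors='coerce') so that num2 ends up numeric; that swap is an assumption, not part of the original answer.

```python
import pandas as pd

# Hypothetical input, reconstructed to be consistent with the output above.
df = pd.DataFrame({
    "Color": ["red", "red", "blue", "blue", "yellow", "yellow"],
    "num1": [1, 1, 2, 2, 1, 1],
    "num2": ["2", "na", "na", "3", "4", "na"],
})

# Turn the 'na' placeholder (and anything else non-numeric) into NaN.
df["num2"] = pd.to_numeric(df["num2"], errors="coerce")

# Forward fill inside each Color group only; values never cross a group boundary,
# so the first blue row stays NaN.
df["num2"] = df.groupby("Color")["num2"].ffill()
print(df)
```

The first blue row has no earlier blue value to copy, which is why it remains NaN.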

Forward fill blocks of above values pandas

You can label each consecutive run of missing and non-missing values, then build a counter within each run per column, and forward fill the missing values grouped by that counter:

df = pd.DataFrame([[1, 2, 3], [4, None, 8], [None, 5, 9], [None,None,10],
                   [0, 2, None], [5, None, None], [None, 5, None], [None,None,None]])

print (df)
     0    1     2
0  1.0  2.0   3.0
1  4.0  NaN   8.0
2  NaN  5.0   9.0
3  NaN  NaN  10.0
4  0.0  2.0   NaN
5  5.0  NaN   NaN
6  NaN  5.0   NaN
7  NaN  NaN   NaN

m = df.isna()
g = m.ne(m.shift()).cumsum()
for c in df.columns:
    df[c] = df.groupby(g.groupby(c).cumcount())[c].ffill()

print (df)
     0    1     2
0  1.0  2.0   3.0
1  4.0  2.0   8.0
2  1.0  5.0   9.0
3  4.0  5.0  10.0
4  0.0  2.0   3.0
5  5.0  2.0   8.0
6  0.0  5.0   9.0
7  5.0  5.0  10.0
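
To see why this works, it can help to look at the intermediate values for a single column. The sketch below uses column 0 of the example; the variable names run_id and position are made up for illustration.

```python
import pandas as pd

col = pd.Series([1, 4, None, None, 0, 5, None, None])

is_missing = col.isna()
# A new run starts whenever the missing/non-missing state changes.
run_id = is_missing.ne(is_missing.shift()).cumsum()
# Position of each row inside its run: 0 for the first row of a run, 1 for the second, ...
position = col.groupby(run_id).cumcount()
# Rows sharing a position form one group, so the k-th NaN of a run
# is filled from the k-th value of the preceding run.
filled = col.groupby(position).ffill()

print(pd.DataFrame({"col": col, "run_id": run_id, "position": position, "filled": filled}))
```

Note that this relies on each missing run being no longer than the run of values before it; otherwise some positions have nothing to copy from.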

EDIT: New solution. Groups start at the first non-missing value after a gap, and within each group the non-missing values are repeated over the missing values that follow. numpy.tile is used to generate the repeated sequences:

df = pd.DataFrame([[1, 2, 3], [4, None, 8], [None, 5, 9], [7,None,10],
                   [0, 2, None], [5, None, None], [None, 6, None], [None,8,None]
                   , [None,None,None], [None,None,None]])
print (df)
     0    1     2
0  1.0  2.0   3.0
1  4.0  NaN   8.0
2  NaN  5.0   9.0
3  7.0  NaN  10.0
4  0.0  2.0   NaN
5  5.0  NaN   NaN
6  NaN  6.0   NaN
7  NaN  8.0   NaN
8  NaN  NaN   NaN
9  NaN  NaN   NaN

g = (df.notna() & df.shift().isna()).cumsum()

def f(x):
    non_miss = x.dropna()
    return np.tile(non_miss, int(len(x) // len(non_miss) + 2))[:len(x)]

df = df.apply(lambda x: x.groupby(g[x.name]).transform(f))
print (df)
     0    1     2
0  1.0  2.0   3.0
1  4.0  2.0   8.0
2  1.0  5.0   9.0
3  7.0  5.0  10.0
4  0.0  2.0   3.0
5  5.0  2.0   8.0
6  7.0  6.0   9.0
7  0.0  8.0  10.0
8  5.0  6.0   3.0
9  7.0  8.0   8.0
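
One edge case: if a column starts with NaN, the first group has no non-missing values and f divides by zero. Below is a hedged variant that leaves such a leading run untouched. The name tile_non_missing and the sample series are hypothetical, not from the original answer.

```python
import numpy as np
import pandas as pd

def tile_non_missing(x):
    """Repeat the non-missing values of x cyclically up to len(x)."""
    non_miss = x.dropna().to_numpy()
    if len(non_miss) == 0:
        # Leading NaN run with nothing before it: nothing to repeat.
        return x.to_numpy()
    reps = len(x) // len(non_miss) + 1
    return np.tile(non_miss, reps)[:len(x)]

# Hypothetical column that starts with a missing value.
s = pd.Series([np.nan, 1, 2, np.nan, np.nan, np.nan, 5, np.nan])

# A group starts at each non-missing value that follows a missing one (or the start).
g = (s.notna() & s.shift().isna()).cumsum()
out = s.groupby(g).transform(tile_non_missing)
print(out)
```

For a whole DataFrame, the same function can be passed to the df.apply(...) call shown above in place of f.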

Forward fill only certain value

mask = (df.ffill() == 0) should be sufficient for your use case.

First, df.ffill propagates the last valid observation forward, so NaN rows that follow a 0 are filled with 0s, and NaN rows that follow a 1 are filled with 1s. Comparing that result to 0 selects only the rows that hold a 0 after the fill; use it as a mask to get your final df.

Example (I added a 0 and a few NaNs to the end of your df):

>>> s = [np.nan, 0, np.nan, np.nan, 1, np.nan, np.nan, 0, np.nan, 1, np.nan, np.nan, 0, np.nan, np.nan, np.nan]
>>> df = pd.DataFrame(s, columns=["s"])
>>> df
      s
0   NaN
1   0.0
2   NaN
3   NaN
4   1.0
5   NaN
6   NaN
7   0.0
8   NaN
9   1.0
10  NaN
11  NaN
12  0.0
13  NaN
14  NaN
15  NaN
>>> 
>>> 
>>> df[df.ffill() == 0] = 0
>>> df
      s
0   NaN
1   0.0
2   0.0
3   0.0
4   1.0
5   NaN
6   NaN
7   0.0
8   0.0
9   1.0
10  NaN
11  NaN
12  0.0
13  0.0
14  0.0
15  0.0
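
The same idea can be wrapped in a small helper that works for any fill value and does not modify its input. This is only a sketch; ffill_only is a hypothetical name, not a pandas function.

```python
import numpy as np
import pandas as pd

def ffill_only(s, value):
    """Forward fill `value` into the NaNs that follow it; leave other NaNs alone."""
    # s.ffill() tells us which value each row *would* inherit;
    # rows that would inherit `value` are set to it, everything else is kept.
    return s.mask(s.ffill().eq(value), value)

s = pd.Series([np.nan, 0, np.nan, np.nan, 1, np.nan, np.nan, 0,
               np.nan, 1, np.nan, np.nan, 0, np.nan, np.nan, np.nan])
filled = ffill_only(s, 0)
print(filled)
```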

Pandas forward fill, but only between equal values

If I understand correctly, what you want can be done like this. You want to fill the NaNs where backfill and forward fill give the same value.

ff = df.aux.ffill()
bf = df.aux.bfill()
df.aux = ff[ff == bf]
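
A runnable sketch of the same approach, assuming made-up data in a column named aux as in the snippet above. It uses ff.where(ff == bf), which should give the same result as the index-aligned assignment of ff[ff == bf].

```python
import numpy as np
import pandas as pd

# Hypothetical data: a gap between two 1s, a gap between 1 and 2,
# a gap between two 2s, and a trailing gap.
df = pd.DataFrame({"aux": [1, np.nan, 1, np.nan, 2, np.nan, np.nan, 2, np.nan]})

ff = df["aux"].ffill()   # last valid value before each row
bf = df["aux"].bfill()   # next valid value after each row

# Keep the filled value only where both directions agree; otherwise NaN.
df["aux"] = ff.where(ff == bf)
print(df)
```

Only the gaps enclosed by equal values are filled; the gap between 1 and 2 and the trailing gap stay NaN.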

How to forward propagate/fill a specific value in a Pandas DataFrame Column/Series?

You can still use ffill, but first you have to mask the False values so that only True is propagated:

s.mask(~s).ffill(limit=2).fillna(s)

0     True
1     True
2     True
3    False
4    False
5     True
6     True
7     True
8    False
Name: 0, dtype: bool
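
The input series is not shown above. The sketch below uses a hypothetical input chosen to reproduce that output and splits the one-liner into steps. The trailing astype(bool) is an addition here, since the intermediate result can be object dtype depending on the pandas version.

```python
import pandas as pd

# Hypothetical input chosen to reproduce the output shown above.
s = pd.Series([True, False, False, False, False, True, False, False, False])

only_true = s.mask(~s)                    # False becomes NaN, True is kept
spread = only_true.ffill(limit=2)         # each True reaches at most 2 rows further
result = spread.fillna(s).astype(bool)    # remaining gaps get their original False
print(result)
```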

Pandas dataframe fillna() only some columns in place

You can select your desired columns and do it by assignment:

df[['a', 'b']] = df[['a','b']].fillna(value=0)

The resulting output is as expected:

     a    b    c
0  1.0  4.0  NaN
1  2.0  5.0  NaN
2  3.0  0.0  7.0
3  0.0  6.0  8.0
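
A self-contained sketch, assuming a hypothetical input that matches the output above. It also shows the dict form of fillna, which fills only the listed columns without selecting them first.

```python
import numpy as np
import pandas as pd

# Hypothetical input consistent with the output shown above.
df = pd.DataFrame({
    "a": [1, 2, 3, np.nan],
    "b": [4, 5, np.nan, 6],
    "c": [np.nan, np.nan, 7, 8],
})

# Option 1: select the columns, fill, and assign back.
by_assignment = df.copy()
by_assignment[["a", "b"]] = by_assignment[["a", "b"]].fillna(value=0)

# Option 2: pass a dict mapping column name to fill value; column c is not listed,
# so its NaNs are left alone.
by_dict = df.fillna({"a": 0, "b": 0})

print(by_dict)
```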

Pandas dataframe column forward fill from first non-zero value

Use the .values attribute so the result is assigned back by position:

df['c']=df.groupby('ID',as_index = False)['c'].apply(lambda x: x.replace(to_replace=0, method='ffill')).values

Now if you print df you will get your desired output:

    ID  b   c
0   1   0   0
1   1   5   1
2   1   8   1
3   2   4   0
4   2   8   1
5   2   81  1
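
In recent pandas releases the method argument of replace is deprecated, so the line above may warn or fail. A possible alternative that avoids it is sketched below: turn zeros into NaN, forward fill within each ID, then restore the leading zeros. The input frame is hypothetical, reconstructed to match the output shown.

```python
import pandas as pd

# Hypothetical input consistent with the output above.
df = pd.DataFrame({
    "ID": [1, 1, 1, 2, 2, 2],
    "b": [0, 5, 8, 4, 8, 81],
    "c": [0, 1, 0, 0, 1, 0],
})

# Treat zeros as missing so that ffill can overwrite them.
nonzero = df["c"].mask(df["c"].eq(0))

# Forward fill per ID; zeros before the first non-zero value have nothing
# to inherit, so they are set back to 0.
df["c"] = nonzero.groupby(df["ID"]).ffill().fillna(0).astype(int)
print(df)
```

Because the result keeps the original index, no .values round trip is needed here.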

Forward fill on custom value in pandas dataframe

You can combine df.mask, df.isin and df.replace. Only the cells holding '*' are overwritten, so genuine NaN values are left as they are:

df.mask(df.isin(['*']),df.replace('*',np.nan).ffill())

     a   b
0  1.0  10
1  2.0  10
2  3.0  10
3  4.0  10
4  NaN  50
5  6.0  60
6  7.0  70
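
A runnable sketch of this one-liner split into steps, assuming a hypothetical input in which column b holds strings and '*' is the placeholder. The NaN in column a is there to show that it is not filled.

```python
import numpy as np
import pandas as pd

# Hypothetical input: '*' marks cells to fill; the NaN in column a must survive.
df = pd.DataFrame({
    "a": [1, 2, 3, 4, np.nan, 6, 7],
    "b": ["10", "*", "*", "*", "50", "60", "70"],
})

is_star = df.isin(["*"])                      # True only where the cell is '*'
filled = df.replace("*", np.nan).ffill()      # everything forward filled
result = df.mask(is_star, filled)             # take filled values only at '*' cells
print(result)
```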
