Pandas Interpolate Within a Groupby

Pandas interpolate within a groupby

>>> df.groupby('filename').apply(lambda group: group.interpolate(method='index'))
    filename  val1  val2
t
1  file1.csv     5    10
2  file1.csv    10    15
3  file1.csv    15    20
6  file2.csv   NaN   NaN
7  file2.csv    10    20
8  file2.csv    12    15
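
For reference, a minimal sketch of an input frame consistent with that output (the index, column names and values are read off the snippet; the original data is assumed):

import numpy as np
import pandas as pd

# assumed reconstruction of the example frame, indexed by 't'
df = pd.DataFrame(
    {'filename': ['file1.csv'] * 3 + ['file2.csv'] * 3,
     'val1': [5, np.nan, 15, np.nan, 10, 12],
     'val2': [10, np.nan, 20, np.nan, 20, 15]},
    index=pd.Index([1, 2, 3, 6, 7, 8], name='t'))

# interpolate each file separately, using the index values as the x-axis;
# on newer pandas, pass group_keys=False to groupby to keep the flat index shown above
df.groupby('filename').apply(lambda group: group.interpolate(method='index'))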

How to interpolate missing values with groupby?

And based on what you need, pass method='spline':

df.groupby('state')['population'].apply(
    lambda x: x.interpolate(method="spline", order=1, limit_direction="both"))
0    100.0
1    150.0
2    200.0
3    250.0
4     50.0
5    125.0
6    200.0
7    275.0
Name: population, dtype: float64
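
A minimal sketch of data consistent with that output (the state labels and NaN positions are assumed); note that method="spline" requires SciPy:

import numpy as np
import pandas as pd

# assumed input: two states, each with gaps in the middle and at an edge
df = pd.DataFrame({
    'state': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'population': [100, np.nan, 200, np.nan, np.nan, 125, np.nan, 275],
})

# an order-1 spline is piecewise linear; limit_direction="both" also fills the
# leading/trailing NaNs by extrapolation
df.groupby('state')['population'].apply(
    lambda x: x.interpolate(method="spline", order=1, limit_direction="both"))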

Pandas dataframe groupby id and interpolate values

I believe you need DataFrame.groupby with DataFrame.resample and Resampler.interpolate:

# build a DatetimeIndex from the year column
df.index = pd.to_datetime(df['year'], format='%Y').rename('datetimes')

# resample each id to month start ('MS') and interpolate between the yearly values
df = (df.groupby('id')['value']
        .apply(lambda x: x.resample('MS').interpolate())
        .reset_index())
print(df)
      id  datetimes     value
0      1 2020-01-01  0.090000
1      1 2020-02-01  0.090083
2      1 2020-03-01  0.090167
3      1 2020-04-01  0.090250
4      1 2020-05-01  0.090333
..   ...        ...       ...
477    2 2039-09-01  0.109667
478    2 2039-10-01  0.109750
479    2 2039-11-01  0.109833
480    2 2039-12-01  0.109917
481    2 2040-01-01  0.110000

[482 rows x 3 columns]
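
The question's input frame is not shown; a sketch of annual data that reproduces output of this shape (the ids, years and the 0.09-0.11 ramp are assumptions read off the printed rows):

import numpy as np
import pandas as pd

# assumed input: one value per year and id, 2020 through 2040,
# rising linearly from 0.09 to 0.11
df = pd.DataFrame({
    'id': [1] * 21 + [2] * 21,
    'year': [str(y) for y in range(2020, 2041)] * 2,
    'value': np.tile(np.linspace(0.09, 0.11, 21), 2),
})

Running the snippet above on this frame yields 241 monthly rows per id, i.e. the 482 rows printed.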

Interpolate annual data for each group separately

Thanks to Henry Ecker, who answered my question in a comment on my previous post.

df['observations'] = (
    df['observations']
    .mask(df['observations'].eq(0))                      # replace 0 with NaN
    .groupby(df['station'])                              # group by station
    .transform(pd.Series.interpolate, method='linear')   # interpolate
)

He also suggested this post for more information.
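
A small, self-contained illustration of that chain (the station names and zero placements are made up):

import pandas as pd

# hypothetical data: zeros mark missing observations
df = pd.DataFrame({
    'station': ['S1', 'S1', 'S1', 'S2', 'S2', 'S2'],
    'observations': [4.0, 0.0, 8.0, 10.0, 0.0, 30.0],
})

df['observations'] = (
    df['observations']
    .mask(df['observations'].eq(0))                      # 0 -> NaN
    .groupby(df['station'])                              # group by station
    .transform(pd.Series.interpolate, method='linear')   # interpolate per group
)
print(df)   # S1 gets 6.0 for its gap, S2 gets 20.0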

Groupby and interpolate in Pandas

I think your function should be chained onto the groupby object like this:

df = (df.set_index('year week')
        .groupby('Account Id')[cols_to_interpolate]
        .resample('D')
        .ffill()
        .interpolate() / 7)

The solution from the comments is different: interpolate is applied to each group separately:

df.groupby('Account Id').apply(interpolator)
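
The interpolator function itself is not shown in the question; a hypothetical per-group version that mirrors the chained approach above (reusing cols_to_interpolate from the question) might look like:

def interpolator(group):
    # hypothetical sketch: resample one account's rows to daily frequency,
    # forward-fill, interpolate, and divide by 7 as in the snippet above
    return (group.set_index('year week')[cols_to_interpolate]
                 .resample('D')
                 .ffill()
                 .interpolate() / 7)

df.groupby('Account Id').apply(interpolator)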

