Pandas fill in missing date within each group with information in the previous row

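First, convert the date column to a datetime type:

x['dt'] = pd.to_datetime(x['dt'])

Then drop duplicate (dt, sub_id) pairs, reindex each sub_id group to daily frequency, and forward-fill the new rows:

cols = ['dt', 'sub_id']

# asfreq('D') inserts the missing days as NaN rows; ffill() copies the previous row into them.
# astype(int) assumes the remaining columns hold integers.
pd.concat([
    d.asfreq('D').ffill().astype(int)
    for _, d in x.drop_duplicates(cols, keep='last').set_index('dt').groupby('sub_id')
]).reset_index()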

          dt  amount  sub_id
0 2016-01-01      10       1
1 2016-01-02      10       1
2 2016-01-03      30       1
3 2016-01-04      40       1
4 2016-01-01      80       2
5 2016-01-02      80       2
6 2016-01-03      80       2
7 2016-01-04      82       2
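To try this end to end, here is a minimal sketch. The input frame is hypothetical (the original input is not shown above); it is chosen so that each sub_id is missing 2016-01-02, and the name `filled` is only for illustration.

```python
import pandas as pd

# Hypothetical input: each sub_id is missing one day (2016-01-02).
x = pd.DataFrame({
    'dt': ['2016-01-01', '2016-01-03', '2016-01-04',
           '2016-01-01', '2016-01-03', '2016-01-04'],
    'amount': [10, 30, 40, 80, 80, 82],
    'sub_id': [1, 1, 1, 2, 2, 2],
})
x['dt'] = pd.to_datetime(x['dt'])

# Reindex each group to daily frequency, then forward-fill the inserted rows.
filled = pd.concat([
    d.asfreq('D').ffill().astype(int)
    for _, d in x.drop_duplicates(['dt', 'sub_id'], keep='last')
                 .set_index('dt')
                 .groupby('sub_id')
]).reset_index()

print(filled)
```

Each group is filled independently, so a value from sub_id 1 never leaks into sub_id 2.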

Fill in missing date values and populate second column based on previous row

First make sure the date column has a datetime type; you can then resample to daily frequency and forward-fill the gaps:

# parse the dates (day first) before resampling
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)

new_df = df.set_index('Date').resample('D').ffill().reset_index()


         Date  Rate
0  2019-01-01  1.12
1  2019-01-02  1.13
2  2019-01-03  1.12
3  2019-01-04  1.12
4  2019-01-05  1.12
5  2019-01-06  1.11
6  2019-01-07  1.13
7  2019-01-08  1.14
8  2019-01-09  1.13
9  2019-01-10  1.11
10 2019-01-11  1.11
11 2019-01-12  1.12
12 2019-01-13  1.13
13 2019-01-14  1.14
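A small runnable sketch of the same resample approach follows. The three input rows are hypothetical (the original input is not shown), written day-first so that `dayfirst=True` matters.

```python
import pandas as pd

# Hypothetical input in day-first format; 2019-01-03 and 2019-01-04 are absent.
df = pd.DataFrame({
    'Date': ['01/01/2019', '02/01/2019', '05/01/2019'],
    'Rate': [1.12, 1.13, 1.11],
})
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)

# resample('D') creates one row per calendar day; ffill() repeats the last known Rate.
new_df = df.set_index('Date').resample('D').ffill().reset_index()
print(new_df)
```

The two inserted days take the rate from 2019-01-02, the last row before the gap.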

Pandas filling missing dates and values within group

Initial Dataframe:

          dt user  val
0 2016-01-01    a    1
1 2016-01-02    a   33
2 2016-01-05    b    2
3 2016-01-06    b    1

First, convert the dates to datetime:

x['dt'] = pd.to_datetime(x['dt'])

Then, generate the dates and unique users:

dates = x.set_index('dt').resample('D').asfreq().index

>> DatetimeIndex(['2016-01-01', '2016-01-02', '2016-01-03', '2016-01-04',
'2016-01-05', '2016-01-06'],
dtype='datetime64[ns]', name='dt', freq='D')

users = x['user'].unique()

>> array(['a', 'b'], dtype=object)

This will allow you to create a MultiIndex:

idx = pd.MultiIndex.from_product((dates, users), names=['dt', 'user'])

>> MultiIndex(levels=[[2016-01-01 00:00:00, 2016-01-02 00:00:00, 2016-01-03 00:00:00, 2016-01-04 00:00:00, 2016-01-05 00:00:00, 2016-01-06 00:00:00], ['a', 'b']],
labels=[[0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5], [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]],
names=['dt', 'user'])

You can use that to reindex your DataFrame:

x.set_index(['dt', 'user']).reindex(idx, fill_value=0).reset_index()

           dt user  val
0  2016-01-01    a    1
1  2016-01-01    b    0
2  2016-01-02    a   33
3  2016-01-02    b    0
4  2016-01-03    a    0
5  2016-01-03    b    0
6  2016-01-04    a    0
7  2016-01-04    b    0
8  2016-01-05    a    0
9  2016-01-05    b    2
10 2016-01-06    a    0
11 2016-01-06    b    1

The result can then be sorted by user (and by date within each user):

x.set_index(['dt', 'user']).reindex(idx, fill_value=0).reset_index().sort_values(by=['user', 'dt'])

           dt user  val
0  2016-01-01    a    1
2  2016-01-02    a   33
4  2016-01-03    a    0
6  2016-01-04    a    0
8  2016-01-05    a    0
10 2016-01-06    a    0
1  2016-01-01    b    0
3  2016-01-02    b    0
5  2016-01-03    b    0
7  2016-01-04    b    0
9  2016-01-05    b    2
11 2016-01-06    b    1
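If you want the missing rows to carry the previous value rather than 0, one possible variation (a sketch, not part of the answer above) is to reindex without fill_value and then forward-fill within each user. It uses the initial DataFrame shown earlier; `pd.date_range` stands in for the resample call, and the name `full` is only for illustration.

```python
import pandas as pd

x = pd.DataFrame({
    'dt': pd.to_datetime(['2016-01-01', '2016-01-02', '2016-01-05', '2016-01-06']),
    'user': ['a', 'a', 'b', 'b'],
    'val': [1, 33, 2, 1],
})

dates = pd.date_range(x['dt'].min(), x['dt'].max(), freq='D', name='dt')
users = x['user'].unique()
idx = pd.MultiIndex.from_product((dates, users), names=['dt', 'user'])

# Reindex without fill_value so the new rows hold NaN, then forward-fill per user.
full = x.set_index(['dt', 'user']).reindex(idx).reset_index()
full['val'] = full.groupby('user')['val'].ffill()
full = full.sort_values(['user', 'dt']).reset_index(drop=True)
print(full)
```

Rows that precede a user's first observation (here, user b before 2016-01-05) stay NaN, because there is no earlier value to copy.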

Pandas - Filling missing dates within groups with different time ranges

Create a DatetimeIndex so that you can use groupby with a custom lambda function and Series.asfreq:

x['dt'] = pd.to_datetime(x['dt'])
x = (x.set_index('dt')
      .groupby('user')['val']
      .apply(lambda s: s.asfreq('MS', fill_value=0))
      .reset_index())
print(x)
   user         dt  val
0     a 2015-01-01    1
1     a 2015-02-01   33
2     a 2015-03-01    0
3     a 2015-04-01    0
4     a 2015-05-01    4
5     a 2015-06-01    0
6     a 2015-07-01    2
7     a 2015-08-01   66
8     b 2016-01-01    2
9     b 2016-02-01    1
10    b 2016-03-01    0
11    b 2016-04-01    0
12    b 2016-05-01    5
13    b 2016-06-01    0
14    b 2016-07-01    0
15    b 2016-08-01    0
16    b 2016-09-01    1
17    c 2017-01-01    5
18    c 2017-02-01    0
19    c 2017-03-01    7
20    c 2017-04-01    0
21    c 2017-05-01    0
22    c 2017-06-01    0
23    c 2017-07-01    0
24    c 2017-08-01    5

Or use Series.reindex with the minimum and maximum datetimes of each group:

x = (x.set_index('dt')
      .groupby('user')['val']
      .apply(lambda s: s.reindex(pd.date_range(s.index.min(),
                                               s.index.max(), freq='MS'), fill_value=0))
      .rename_axis(['user', 'dt'])
      .reset_index())
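For a quick check of the per-group range behaviour, here is a minimal sketch using the asfreq variant. The input is hypothetical (two users whose monthly ranges do not overlap), and the name `out` is only for illustration.

```python
import pandas as pd

# Hypothetical input: two users with different, gappy monthly ranges.
x = pd.DataFrame({
    'user': ['a', 'a', 'b', 'b'],
    'dt': ['2015-01-01', '2015-03-01', '2016-01-01', '2016-04-01'],
    'val': [1, 4, 2, 5],
})
x['dt'] = pd.to_datetime(x['dt'])

# Each group is expanded only between its own first and last month.
out = (x.set_index('dt')
        .groupby('user')['val']
        .apply(lambda s: s.asfreq('MS', fill_value=0))
        .reset_index())
print(out)
```

User a gets one inserted month (2015-02) and user b gets two (2016-02 and 2016-03); no rows are created outside either user's own range.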

Fill in missing pandas data with previous non-missing value, grouped by key

You could perform a groupby/forward-fill operation on each group:

import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1,1,2,2,1,2,1,1], 'x':[10,20,100,200,np.nan,np.nan,300,np.nan]})
df['x'] = df.groupby(['id'])['x'].ffill()


   id      x
0   1   10.0
1   1   20.0
2   2  100.0
3   2  200.0
4   1   20.0
5   2  200.0
6   1  300.0
7   1  300.0

Fill missing date record by duplication former date record in pandas

Alternatively, you can simply use reindex with a full daily date range:

idx=pd.date_range(start='2015-02-20',end='2015-10-23', freq='D')

         Date Value
0  2015-10-23   75%
1  2015-10-22   50%
2  2015-10-21   50%
3  2015-10-20   50%
4  2015-10-19   50%
5  2015-10-18   50%
6  2015-10-17   50%
7  2015-10-16   50%
8  2015-10-15   50%
9  2015-10-14   50%
10 2015-10-13   50%
11 2015-10-12   50%
12 2015-10-11   50%
13 2015-10-10   50%
14 2015-10-09   50%
15 2015-10-08   50%
16 2015-10-07   50%
17 2015-10-06   50%
18 2015-10-05   50%
19 2015-10-04   50%
20 2015-10-03   50%
21 2015-10-02   50%
22 2015-10-01   50%
23 2015-09-30   50%
24 2015-09-29   50%
25 2015-09-28   50%
26 2015-09-27   50%
27 2015-09-26   50%
28 2015-09-25   50%
29 2015-09-24   50%
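The reindex step itself is not shown above, so here is one way it could look. This is a sketch under assumptions: the two input rows and the shortened date range are hypothetical, and `method='ffill'` is one option for duplicating the former record, not necessarily what produced the output above.

```python
import pandas as pd

# Hypothetical input: two observations several days apart.
df = pd.DataFrame({
    'Date': ['2015-10-20', '2015-10-23'],
    'Value': ['50%', '75%'],
})
df['Date'] = pd.to_datetime(df['Date'])

idx = pd.date_range(start='2015-10-20', end='2015-10-23', freq='D')

# reindex(method='ffill') copies the most recent earlier record into each new date.
filled = (df.set_index('Date')
            .reindex(idx, method='ffill')
            .rename_axis('Date')
            .sort_index(ascending=False)
            .reset_index())
print(filled)
```

Note that `method='ffill'` requires the existing index to be sorted; the final `sort_index(ascending=False)` only changes the display order to newest first.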

