Adding Missing Dates to Dataframe

Add missing dates to pandas dataframe

You could use Series.reindex:

import pandas as pd

idx = pd.date_range('09-01-2013', '09-30-2013')

s = pd.Series({'09-02-2013': 2,
               '09-03-2013': 10,
               '09-06-2013': 5,
               '09-07-2013': 1})
s.index = pd.DatetimeIndex(s.index)

s = s.reindex(idx, fill_value=0)
print(s)

yields

2013-09-01     0
2013-09-02     2
2013-09-03    10
2013-09-04     0
2013-09-05     0
2013-09-06     5
2013-09-07     1
2013-09-08     0
...

Fill missing dates in a pandas DataFrame

You could create a monthly date range, set the "Fecha" column as the index with set_index, and reindex against that range to add the missing months. Then fillna and reset_index produce the desired outcome:

df['Fecha'] = pd.to_datetime(df['Fecha'])
df = (df.set_index('Fecha')
      .reindex(pd.date_range('2020-01-01', '2021-12-01', freq='MS'))
      .rename_axis(['Fecha'])
      .fillna(0)
      .reset_index())

Output:

        Fecha  unidades
0  2020-01-01       2.0
1  2020-02-01       0.0
2  2020-03-01       0.0
3  2020-04-01       0.0
4  2020-05-01       0.0
5  2020-06-01       0.0
6  2020-07-01       0.0
7  2020-08-01       0.0
8  2020-09-01       4.0
9  2020-10-01      11.0
10 2020-11-01       4.0
11 2020-12-01       2.0
12 2021-01-01       0.0
13 2021-02-01       0.0
14 2021-03-01       9.0
15 2021-04-01       2.0
16 2021-05-01       1.0
17 2021-06-01       0.0
18 2021-07-01       1.0
19 2021-08-01       0.0
20 2021-09-01       0.0
21 2021-10-01       0.0
22 2021-11-01       0.0
23 2021-12-01       0.0

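Add missing dates to datetime column in Pandas using last value

Try converting the column to datetime, setting it as the index, and calling asfreq with method='ffill' so that each missing day takes the last known value: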

# If your date format is dayfirst, then use the following code

df['date (dd/mm/yyyy)'] = pd.to_datetime(df['date (dd/mm/yyyy)'], dayfirst=True)
out = df.set_index('date (dd/mm/yyyy)').asfreq('D', method='ffill').reset_index()
print(out)
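
A minimal runnable sketch of the same idea follows. The sample rows and the "value" column are hypothetical and only illustrate the behaviour; the date column name is the one used in the answer.

```python
import pandas as pd

# Hypothetical sample data with gaps between the dates (day-first strings).
df = pd.DataFrame({
    "date (dd/mm/yyyy)": ["01/01/2022", "04/01/2022", "06/01/2022"],
    "value": [10, 20, 30],
})

# Parse the strings as day-first dates.
df["date (dd/mm/yyyy)"] = pd.to_datetime(df["date (dd/mm/yyyy)"], dayfirst=True)

# asfreq("D") inserts one row per missing day; method="ffill" copies
# the last known row into each inserted day.
out = (
    df.set_index("date (dd/mm/yyyy)")
      .asfreq("D", method="ffill")
      .reset_index()
)
print(out)
```

With this input the result should span 1 to 6 January 2022, with the gaps carrying the previous value forward.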

Fill in missing dates for a pandas dataframe with multiple series

Group by Item and Category, then generate a time series from the min to the max date:

result = (
    df.groupby(["Item", "Category"])["Date"]
    .apply(lambda s: pd.date_range(s.min(), s.max()))
    .explode()
    .reset_index()
)
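
The snippet above only builds the complete list of dates per group. One possible way to bring the original values back, sketched here under the assumption that the frame has a "Value" column (a hypothetical name) and that gaps should become 0, is a left merge:

```python
import pandas as pd

# Hypothetical sample data: two groups, each with gaps in its dates.
df = pd.DataFrame({
    "Item": ["A", "A", "B", "B"],
    "Category": ["x", "x", "y", "y"],
    "Date": pd.to_datetime(["2022-01-01", "2022-01-03",
                            "2022-01-02", "2022-01-05"]),
    "Value": [1, 2, 3, 4],
})

# Full daily range per (Item, Category), as in the answer above.
full = (
    df.groupby(["Item", "Category"])["Date"]
      .apply(lambda s: pd.date_range(s.min(), s.max()))
      .explode()
      .reset_index()
)
# explode() leaves an object column, so convert back to datetime64
# before merging on it.
full["Date"] = pd.to_datetime(full["Date"])

# Left-merge the original rows; dates that were missing get NaN, then 0.
filled = full.merge(df, on=["Item", "Category", "Date"], how="left")
filled["Value"] = filled["Value"].fillna(0)
print(filled)
```

Replacing the fillna(0) step with a per-group forward fill would be the natural variation if the last value should be repeated instead.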

Including the missing dates in the date column of pandas dataframe for a specific timespan

Set Date as index and reindex it with df_date:

df_date = pd.date_range(start='1/1/2019', end='11/1/2020', freq='MS')
df = df.set_index('Date').reindex(df_date)

Output:

>>> df
                 Value
2019-01-01         NaN
2019-02-01         NaN
2019-03-01         NaN
2019-04-01         NaN
2019-05-01         NaN
2019-06-01         NaN
2019-07-01         NaN
2019-08-01         NaN
2019-09-01         NaN
2019-10-01  46486868.0
2019-11-01  36092742.0
2019-12-01  32839185.0
2020-01-01         NaN
2020-02-01         NaN
2020-03-01         NaN
2020-04-01         NaN
2020-05-01         NaN
2020-06-01         NaN
2020-07-01         NaN
2020-08-01         NaN
2020-09-01         NaN
2020-10-01         NaN
2020-11-01         NaN

Dataframe: Add new rows for missing dates

You can use .reindex + .ffill():

min_date = df.index.min()
max_date = df.index.max()
date_list = pd.date_range(min_date, max_date, freq="D")

df = df.reindex(date_list).ffill()
print(df)

Prints:

              S&P500    Europe     Japan
2002-12-23  0.247683  0.245252  0.203916
2002-12-24  0.241855  0.237858  0.200971
2002-12-25  0.241855  0.237858  0.200971
2002-12-26  0.237095  0.230614  0.197621
2002-12-27  0.241104  0.250323  0.191855

Or pass the method= parameter to reindex:

df = df.reindex(date_list, method="ffill")
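
The two variants are not always interchangeable. As a small sketch with hypothetical data (the "price" column is made up for illustration): reindex(..., method="ffill") fills by looking up the nearest earlier index label and does not repair NaN values that were already in the frame, while a separate .ffill() call fills every NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one NaN already present, and 2002-12-25 missing.
df = pd.DataFrame(
    {"price": [1.0, np.nan, 3.0]},
    index=pd.to_datetime(["2002-12-23", "2002-12-24", "2002-12-26"]),
)
date_list = pd.date_range(df.index.min(), df.index.max(), freq="D")

# Variant 1: add the rows first, then forward-fill every NaN.
a = df.reindex(date_list).ffill()

# Variant 2: each new row copies the row at the previous index label,
# so the pre-existing NaN on 2002-12-24 stays and is copied to 12-25.
b = df.reindex(date_list, method="ffill")

print(a)
print(b)
```

The method= form also requires a monotonically increasing or decreasing index.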

Add missing dates to pandas dataframe with zeros as values

In your first approach, you are reindexing a DatetimeIndex with a PeriodIndex (created by period_range). Using date_range instead of period_range works:

idx = pd.date_range(date_period, date_now)
df.index = pd.DatetimeIndex(df.date)

df.reindex(idx, fill_value=0)
#                  date  quantity
#2022-08-13           0         0
#2022-08-14           0         0
#2022-08-15           0         0
#2022-08-16           0         0
#2022-08-17  2022-08-17         1
#2022-08-18  2022-08-18         2
#2022-08-19  2022-08-19         3
#2022-08-20           0         0
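
Note that in the output above the "date" column itself is also filled with 0 on the new rows, because it is still a regular column. A possible variation, sketched with hypothetical sample data and fixed range endpoints standing in for date_period and date_now, is to move "date" into the index with set_index so only "quantity" is filled:

```python
import pandas as pd

# Hypothetical sample data matching the shape of the output above.
df = pd.DataFrame({
    "date": pd.to_datetime(["2022-08-17", "2022-08-18", "2022-08-19"]),
    "quantity": [1, 2, 3],
})

# Assumed endpoints for illustration only.
idx = pd.date_range("2022-08-13", "2022-08-20")

out = (
    df.set_index("date")             # the date column becomes the index
      .reindex(idx, fill_value=0)    # missing days get quantity 0
      .rename_axis("date")           # restore the index name lost by reindex
      .reset_index()
)
print(out)
```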

How to add missing dates in pandas

Use DataFrame.reindex, which also works if you need custom start and end datetimes:

df = df.reindex(pd.date_range(start, end, freq='D'))

Or use DataFrame.asfreq to add the missing datetimes between existing data:

df = df.asfreq('D')
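
To make the difference concrete, here is a small sketch with hypothetical data and hypothetical start and end values. reindex can extend the frame beyond the existing dates, while asfreq only fills between the first and last existing index values.

```python
import pandas as pd

# Hypothetical data with a DatetimeIndex and one missing day in the middle.
df = pd.DataFrame(
    {"value": [1, 2]},
    index=pd.to_datetime(["2022-01-02", "2022-01-04"]),
)

# Assumed custom bounds, wider than the data itself.
start, end = "2022-01-01", "2022-01-05"

by_reindex = df.reindex(pd.date_range(start, end, freq="D"))  # 01-01 .. 01-05
by_asfreq = df.asfreq("D")                                    # 01-02 .. 01-04

print(by_reindex)
print(by_asfreq)
```

Both leave NaN in the new rows unless a fill_value or fill method is supplied.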

Filling missing dates on a DataFrame across different groups

Let's try it with pivot + date_range + reindex + stack:

# pandas 2.0+ requires keyword arguments for pivot
tmp = df.pivot(index='date', columns='customer', values='attended')
tmp.index = pd.to_datetime(tmp.index)
out = (tmp.reindex(pd.date_range(tmp.index[0], tmp.index[-1]))
          .fillna(False).stack().reset_index()
          .rename(columns={0: 'attended'}))

Output:

     level_0 customer  attended
0 2022-01-01     John      True
1 2022-01-01     Mark     False
2 2022-01-02     John      True
3 2022-01-02     Mark     False
4 2022-01-03     John     False
5 2022-01-03     Mark     False
6 2022-01-04     John      True
7 2022-01-04     Mark     False
8 2022-01-05     John     False
9 2022-01-05     Mark      True
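
As an alternative to the pivot and stack route (a different technique, not the one used in the answer), the same result can be sketched by reindexing against the full product of dates and customers. The sample rows below are hypothetical and chosen to reproduce the output shown above; this version also keeps the "date" column name instead of "level_0".

```python
import pandas as pd

# Hypothetical sample data: only attended days are recorded.
df = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-01", "2022-01-02",
                            "2022-01-04", "2022-01-05"]),
    "customer": ["John", "John", "John", "Mark"],
    "attended": [True, True, True, True],
})

# Every (date, customer) combination in the observed date span.
full_idx = pd.MultiIndex.from_product(
    [pd.date_range(df["date"].min(), df["date"].max()),
     df["customer"].unique()],
    names=["date", "customer"],
)

out = (
    df.set_index(["date", "customer"])
      .reindex(full_idx, fill_value=False)  # missing combinations -> False
      .reset_index()
)
print(out)
```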

pandas fill missing dates in time series

You need to use period_range rather than date_range:

In [11]: idx = pd.period_range(min(df.date), max(df.date))
    ...: results.reindex(idx, fill_value=0)
    ...:
Out[11]:
                  f1        f2        f3        f4
2000-01-01  2.049157  1.962635  2.756154  2.224751
2000-01-02  2.675899  2.587217  1.540823  1.606150
2000-01-03  0.000000  0.000000  0.000000  0.000000
2000-01-04  0.000000  0.000000  0.000000  0.000000
2000-01-05  0.000000  0.000000  0.000000  0.000000
2000-01-06  0.000000  0.000000  0.000000  0.000000
2000-01-07  0.000000  0.000000  0.000000  0.000000
2000-01-08  0.000000  0.000000  0.000000  0.000000
2000-01-09  0.000000  0.000000  0.000000  0.000000
2000-01-10  0.000000  0.000000  0.000000  0.000000
2000-01-11  0.000000  0.000000  0.000000  0.000000
2000-01-12  0.000000  0.000000  0.000000  0.000000
2000-01-13  0.000000  0.000000  0.000000  0.000000
2000-01-14  0.000000  0.000000  0.000000  0.000000
2000-01-15  0.000000  0.000000  0.000000  0.000000
2000-01-16  0.000000  0.000000  0.000000  0.000000
2000-01-17  0.000000  0.000000  0.000000  0.000000
2000-01-18  0.000000  0.000000  0.000000  0.000000
2000-01-19  0.000000  0.000000  0.000000  0.000000
2000-01-20  0.000000  0.000000  0.000000  0.000000
2000-01-21  0.000000  0.000000  0.000000  0.000000
2000-01-22  0.000000  0.000000  0.000000  0.000000
2000-01-23  0.000000  0.000000  0.000000  0.000000
2000-01-24  0.000000  0.000000  0.000000  0.000000
2000-01-25  0.000000  0.000000  0.000000  0.000000
2000-01-26  0.000000  0.000000  0.000000  0.000000
2000-01-27  0.000000  0.000000  0.000000  0.000000
2000-01-28  0.000000  0.000000  0.000000  0.000000
2000-01-29  0.000000  0.000000  0.000000  0.000000
2000-01-30  0.000000  0.000000  0.000000  0.000000
2000-01-31  0.000000  0.000000  0.000000  0.000000
2000-02-01  0.000000  0.000000  0.000000  0.000000
2000-02-02  0.000000  0.000000  0.000000  0.000000
2000-02-03  0.000000  0.000000  0.000000  0.000000
2000-02-04  1.856158  2.892620  2.986166  2.793448

This is because your groupby uses PeriodIndex, rather than datetime:

df.groupby(pd.PeriodIndex(data=df.date, freq='D'))

You could have instead used a pd.Grouper:

df.groupby(pd.Grouper(key="date", freq='D'))

which would have given a DatetimeIndex.
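
A small sketch of the pd.Grouper route, using hypothetical data with a single "f1" column and sum as a stand-in aggregation: because the grouped result has a DatetimeIndex, it can be reindexed with date_range directly.

```python
import pandas as pd

# Hypothetical data: two rows on the first day, then a gap.
df = pd.DataFrame({
    "date": pd.to_datetime(["2000-01-01", "2000-01-01", "2000-01-04"]),
    "f1": [1.0, 2.0, 3.0],
})

# Grouping with pd.Grouper yields a DatetimeIndex, not a PeriodIndex.
results = df.groupby(pd.Grouper(key="date", freq="D")).sum()

# The index types now match, so date_range works for the reindex.
idx = pd.date_range(df["date"].min(), df["date"].max())
filled = results.reindex(idx, fill_value=0)
print(filled)
```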
