I have a dataframe which has aggregated data for some days. I want to add in the missing days.
I was following another post, Add missing dates to pandas dataframe, u
From cᴏʟᴅsᴘᴇᴇᴅ's hints in the comments:
resample fits well here.
Resample: Convenience method for frequency conversion and resampling of time series. Object must have a datetime-like index (DatetimeIndex, PeriodIndex, or TimedeltaIndex), or pass datetime-like values to the on or level keyword.
import datetime as dt
import numpy as np
import pandas as pd
def generate_row(year, month, day):
    while True:
        date = dt.datetime(year=year, month=month, day=day)
        data = np.random.random(size=4)
        yield [date] + list(data)
# days I have data for
dates = [(2000, 1, 1), (2000, 1, 2), (2000, 2, 4)]
generators = [generate_row(*date) for date in dates]
# get 5 points for each
data = [next(generator) for generator in generators for _ in range(5)]
# make dataframe
df = pd.DataFrame(data, columns=['date'] + ['f'+str(i) for i in range(1,5)])
# using the resample method
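# index by the column name so 'date' is dropped from the columns;
# otherwise recent pandas versions fail when sum() reaches the datetime column
df = df.set_index('date')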
df = df.resample('D').sum().fillna(0)
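As the quoted docs mention, you can also skip set_index and pass the datetime column through the on keyword. Here is a minimal sketch of that on a small made-up frame (the toy and daily names and the values are only for illustration, not from the question), so the filled-in days are easy to see:

```python
import pandas as pd

# Hypothetical toy data: two rows on Jan 1, one on Jan 2, one on Jan 5.
toy = pd.DataFrame({
    'date': pd.to_datetime(['2000-01-01', '2000-01-01',
                            '2000-01-02', '2000-01-05']),
    'f1': [1.0, 2.0, 3.0, 4.0],
})

# on='date' resamples on that column directly, so no set_index call is needed.
# The result is indexed by day, with one row per calendar day in the range.
daily = toy.resample('D', on='date').sum()

print(daily)
# Jan 3 and Jan 4 had no rows and show up with f1 == 0.0
```

Note that summing an empty day already gives 0, so the fillna(0) step mostly matters if you switch to an aggregation such as mean(), which returns NaN for days with no rows.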