Changing monthly values to daily by evenly distributing between dates

℡╲_俬逩灬. submitted on 2021-02-08 08:32:10

Question


I have a monthly dataset:

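import pandas as pd

df = pd.DataFrame({'Month': [1, 2],
                   'Plan': [310, 620],
                   'Month_start_date': ['2020-01-01', '2020-02-01']})
print(df)

# Parse the dates (the format must match the 'YYYY-MM-DD' strings) and
# normalize each one to the first day of its month.
df['Month_start_date'] = (pd.to_datetime(df['Month_start_date'], format='%Y-%m-%d')
                          .dt.to_period('M').dt.to_timestamp())

df = df.set_index('Month_start_date')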

I created a daily range of dates that I would like to reindex to:

start = '2020-01-01'
end = '2020-02-29'
dates = pd.date_range(start, end, freq='D')
dates

When I try to convert the dataframe to daily frequency using this code:

df_daily = df.reindex(dates, method='ffill')
print(df_daily)

this is the output I get:

           Month  Plan
2020-01-01      1   310
2020-01-02      1   310
2020-01-03      1   310
2020-01-04      1   310
2020-01-05      1   310
2020-01-06      1   310
2020-01-07      1   310
2020-01-08      1   310
2020-01-09      1   310
2020-01-10      1   310
...

The list goes on until Feb 29th, as expected. However, Plan stays the same for every day. How can I make it look like this?

            Month  Plan
2020-01-01      1   10
2020-01-02      1   10
2020-01-03      1   10
2020-01-04      1   10
2020-01-05      1   10
2020-01-06      1   10
2020-01-07      1   10
2020-01-08      1   10
2020-01-09      1   10
2020-01-10      1   10
...

2020-02-17      2   21.38
2020-02-18      2   21.38
2020-02-19      2   21.38
2020-02-20      2   21.38
2020-02-21      2   21.38
2020-02-22      2   21.38
2020-02-23      2   21.38
2020-02-24      2   21.38
2020-02-25      2   21.38
2020-02-26      2   21.38
2020-02-27      2   21.38
2020-02-28      2   21.38
2020-02-29      2   21.38

In other words, spread each month's plan evenly across its dates by dividing it by the number of days in the month. Since February has 620 as its plan and February 2020 has 29 days, every day gets 620/29, which is about 21.38.


Answer 1:


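A pandas DatetimeIndex exposes the number of days in each date's month through its daysinmonth attribute (also available as days_in_month), so you can divide the forward-filled plan by it directly: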

df_daily["Daily plan"] = df_daily["Plan"] / df_daily.index.daysinmonth
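
For reference, here is a minimal end-to-end sketch of this approach, assuming the same two-month data as in the question. The frame is built with the month-start dates already as its index, which is a shortcut for illustration rather than the question's exact setup.

```python
import pandas as pd

# Monthly plan, indexed by the first day of each month (same data as the question).
df = pd.DataFrame(
    {"Month": [1, 2], "Plan": [310, 620]},
    index=pd.to_datetime(["2020-01-01", "2020-02-01"]),
)

# Daily index covering both months in full.
dates = pd.date_range("2020-01-01", "2020-02-29", freq="D")

# Forward-fill repeats each month's total on every day of that month...
df_daily = df.reindex(dates, method="ffill")

# ...so dividing by the length of the month spreads the total evenly.
df_daily["Daily plan"] = df_daily["Plan"] / df_daily.index.daysinmonth

print(df_daily.tail())
```

Because the divisor comes from the calendar rather than from the number of rows present, this should still give the right daily value when the date range covers only part of a month.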



Answer 2:


Keldorn's method is better when you have a convenient helper that tells you the length of each period. Here is a more general approach using groupby(), which divides each value by the number of rows in its group:

# EITHER OF THESE:
df.reindex(dates, method='ffill').groupby('Month').transform(lambda x: x/x.size)
df.reindex(dates, method='ffill').groupby('Month').transform(lambda x: x/len(x))

                Plan
2020-01-01  10.00000
2020-01-02  10.00000
...
2020-01-31  10.00000
2020-02-01  21.37931
2020-02-02  21.37931
...
2020-02-29  21.37931

You can then assign the resulting 'Plan' column back to the daily frame, for example as df_daily['Plan'] or df_daily['Plan_daily'].
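
As a rough sketch of that assignment, assuming the question's data and a hypothetical column name Plan_daily, the groupby() result can be stored alongside the monthly totals like this:

```python
import pandas as pd

# Same monthly data as in the question, indexed by month start.
df = pd.DataFrame(
    {"Month": [1, 2], "Plan": [310, 620]},
    index=pd.to_datetime(["2020-01-01", "2020-02-01"]),
)
dates = pd.date_range("2020-01-01", "2020-02-29", freq="D")
df_daily = df.reindex(dates, method="ffill")

# Divide each value by the number of rows in its 'Month' group.
# 'Plan_daily' is an illustrative column name, not one from the question.
df_daily["Plan_daily"] = (
    df_daily.groupby("Month")["Plan"].transform(lambda x: x / len(x))
)

# Each month's daily values should add back up to the monthly plan.
monthly_check = df_daily.groupby("Month")["Plan_daily"].sum()
print(monthly_check)
```

Note that the group size is the number of rows actually present, so this only matches the calendar length when the date range covers whole months and each 'Month' value identifies a single month. With data spanning several years, grouping by year and month together would be needed.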



Source: https://stackoverflow.com/questions/61517215/changing-monthly-values-to-daily-by-evenly-distributing-between-dates
