Strip out months from two date columns

為{幸葍}努か submitted on 2021-02-05 07:46:16

Question


I have a pandas DataFrame that holds contract start and end dates and a quantity. How can I expand each contract into its individual months so that they can be aggregated and graphed?

Example:
Start Date  End Date       Demanded     Customer
1/1/2017    3/31/2017        100            A
2/1/2017    3/31/2017         50            B

Expand this into one row per month, as follows:

Month       Demand    Customer
1/1/2017     100      A
2/1/2017     100      A
3/1/2017     100      A
2/1/2017      50      B
3/1/2017      50      B

The end goal is to pivot this and then graph it with months on the x-axis and total demand on the y-axis.


Answer 1:


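First convert the date columns with to_datetime. The converted dates are stored under the new names Start_Date and End_Date so that they are valid attribute names on the tuples returned by itertuples. Then loop over the rows with itertuples, build a date_range with frequency MS (month start) for each row, and concat the results into a new, expanded DataFrame. Finally, join the original columns Quantity Demanded and Customer back on: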

import pandas as pd

df['Start_Date'] = pd.to_datetime(df['Start Date'])
df['End_Date'] = pd.to_datetime(df['End Date'])

# one Series per row: index = month starts in the contract, value = original row label
df1 = pd.concat([pd.Series(r.Index,
                           pd.date_range(r.Start_Date, r.End_Date, freq='MS'))
                 for r in df.itertuples()]).reset_index()
df1.columns = ['Month','idx']
print(df1)
       Month  idx
0 2017-01-01    0
1 2017-02-01    0
2 2017-03-01    0
3 2017-02-01    1
4 2017-03-01    1

# adjust 'Quantity Demanded' to your column name (the sample in the question labels it 'Demanded')
df2 = df1.set_index('idx').join(df[['Quantity Demanded','Customer']]).reset_index(drop=True)
print(df2)
       Month  Quantity Demanded Customer
0 2017-01-01                100        A
1 2017-02-01                100        A
2 2017-03-01                100        A
3 2017-02-01                 50        B
4 2017-03-01                 50        B
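
As a variation on the same idea, here is a minimal sketch that builds one list of month starts per row and lets explode repeat the rows. It assumes pandas 0.25 or later (where DataFrame.explode is available), the column name Demanded from the question's sample, and month/day/year date strings; the name expanded is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Start Date': ['1/1/2017', '2/1/2017'],
    'End Date': ['3/31/2017', '3/31/2017'],
    'Demanded': [100, 50],
    'Customer': ['A', 'B'],
})
# Assumption: the strings are month/day/year.
df['Start Date'] = pd.to_datetime(df['Start Date'], format='%m/%d/%Y')
df['End Date'] = pd.to_datetime(df['End Date'], format='%m/%d/%Y')

# One list of month-start timestamps per contract row.
df['Month'] = pd.Series(
    [list(pd.date_range(s, e, freq='MS'))
     for s, e in zip(df['Start Date'], df['End Date'])],
    index=df.index,
)

# explode() repeats each row once per element of its list.
expanded = (df.explode('Month')
              .drop(columns=['Start Date', 'End Date'])
              .reset_index(drop=True))
# The exploded column has object dtype; convert it back to datetime64.
expanded['Month'] = pd.to_datetime(expanded['Month'])
print(expanded)
```

This avoids the intermediate idx column and the join, at the cost of holding a list in each cell before the explode.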



Answer 2:


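Use melt to stack the start and end dates into a single Date column, then, for each customer, resample('MS') and forward-fill the months in between: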

df['Start Date'] = pd.to_datetime(df['Start Date'])
df['End Date'] = pd.to_datetime(df['End Date'])

# stack the two date columns into one 'Date' column and index by it
d1 = pd.melt(
    df, ['Demanded', 'Customer'],
    ['Start Date', 'End Date'],
    value_name='Date'
).drop(columns='variable').set_index('Date')

# per customer: one row per month start, values forward-filled
d1.groupby('Customer').apply(lambda g: g.resample('MS').ffill()) \
    .reset_index(0, drop=True) \
    .reset_index()

        Date  Demanded Customer
0 2017-01-01       100        A
1 2017-02-01       100        A
2 2017-03-01       100        A
3 2017-02-01        50        B
4 2017-03-01        50        B
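
For the final step the question describes (months on the x-axis, total demand on the y-axis), a minimal sketch of the pivot might look like the following. It assumes an expanded frame with the columns Month, Demanded and Customer as in the output above; the names expanded, pivot and total are only illustrative, and the plotting call is left as a comment because it requires matplotlib.

```python
import pandas as pd

# Expanded frame with the same values as the output above,
# rebuilt here so the sketch runs on its own.
expanded = pd.DataFrame({
    'Month': pd.to_datetime(['2017-01-01', '2017-02-01', '2017-03-01',
                             '2017-02-01', '2017-03-01']),
    'Demanded': [100, 100, 100, 50, 50],
    'Customer': ['A', 'A', 'A', 'B', 'B'],
})

# One row per month, one column per customer; missing combinations become 0.
pivot = expanded.pivot_table(index='Month', columns='Customer',
                             values='Demanded', aggfunc='sum', fill_value=0)

# Total demand per month across all customers.
total = pivot.sum(axis=1)
print(pivot)
print(total)

# With matplotlib installed, either object can be graphed, for example:
# total.plot(kind='bar')
```

Plotting pivot instead of total would give one series per customer, which can be stacked to show both the breakdown and the total.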


Source: https://stackoverflow.com/questions/41902835/strip-out-months-from-two-date-columns
