Resampling dataframe by hours and date

Posted by 不想你离开。 on 2019-12-08 03:58:01

Question


I have a dataframe like this:

                     Maximum Demand (KVA)  Consumption (KVAh)  Power Factor
Timestamp
2017-04-01 01:00:00                 127.0               261.0          0.81   
2017-04-01 02:00:00                 133.0               268.0          0.79   
2017-04-01 03:00:00                 119.0               273.0          0.92   
2017-04-01 04:00:00                 118.0               263.0          0.78   
2017-04-01 05:00:00                 135.0               271.0          0.86   
2017-04-01 06:00:00                 130.0               257.0          0.82   
2017-04-01 23:00:00                 120.0               261.0          0.78   
2017-04-02 00:00:00                 121.0               272.0          0.83   
2017-04-02 01:00:00                 126.0               263.0          0.90   
2017-04-02 02:00:00                 132.0               266.0          0.83   
2017-04-02 03:00:00                 132.0               275.0          0.90   
2017-04-02 04:00:00                 122.0               259.0          0.77   
2017-04-02 05:00:00                 119.0               271.0          0.78   
2017-04-02 06:00:00                 122.0               259.0          0.81   
2017-04-02 23:00:00                 115.0               264.0          0.87   
2017-04-03 00:00:00                 129.0               273.0          0.86 

I want to resample the data into daily bins, where each bin runs from 01:00 on one date to 00:00 on the next date.

I tried this:

off_sum = offpeak_hist.resample('h', base=8).sum().dropna()

But this does not give the desired output. Please help me with this.


Answer 1:


I think you first need to shift the index back by one hour and then resample by day:

print (offpeak_hist.shift(-1, freq='H'))
                     Maximum Demand (KVA)  Consumption (KVAh)  Power Factor
Timestamp                                                                  
2017-04-01 00:00:00                 127.0               261.0          0.81
2017-04-01 01:00:00                 133.0               268.0          0.79
2017-04-01 02:00:00                 119.0               273.0          0.92
2017-04-01 03:00:00                 118.0               263.0          0.78
2017-04-01 04:00:00                 135.0               271.0          0.86
2017-04-01 05:00:00                 130.0               257.0          0.82
2017-04-01 22:00:00                 120.0               261.0          0.78
2017-04-01 23:00:00                 121.0               272.0          0.83
2017-04-02 00:00:00                 126.0               263.0          0.90
2017-04-02 01:00:00                 132.0               266.0          0.83
2017-04-02 02:00:00                 132.0               275.0          0.90
2017-04-02 03:00:00                 122.0               259.0          0.77
2017-04-02 04:00:00                 119.0               271.0          0.78
2017-04-02 05:00:00                 122.0               259.0          0.81
2017-04-02 22:00:00                 115.0               264.0          0.87
2017-04-02 23:00:00                 129.0               273.0          0.86
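
As a small illustration of what the `freq` argument does in the call above: with `freq`, `shift` moves the index labels and leaves each value on its row, while without `freq` it moves the values and leaves the index alone. This is only a sketch on a three-row excerpt of the first column; the names `s`, `moved_index`, and `moved_values` are mine, and the frequency is passed as a `pd.Timedelta` instead of the string 'H'.

```python
import pandas as pd

# Three-row excerpt of "Maximum Demand (KVA)" from the question, for illustration only.
s = pd.Series(
    [127.0, 133.0, 119.0],
    index=pd.to_datetime(
        ["2017-04-01 01:00", "2017-04-01 02:00", "2017-04-01 03:00"]
    ),
)

# With freq: every timestamp moves back one hour, values stay with their rows.
moved_index = s.shift(-1, freq=pd.Timedelta(hours=1))

# Without freq: timestamps are unchanged, values move up one row,
# and the last row becomes NaN.
moved_values = s.shift(-1)

print(moved_index)
print(moved_values)
```

Because the index labels move, the 00:00 row of the following date lands on 23:00 of the previous date, which is what lets a plain daily resample pick it up.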


df = offpeak_hist.shift(-1, freq='H').resample('D').sum().dropna()
print (df)
            Maximum Demand (KVA)  Consumption (KVAh)  Power Factor
Timestamp                                                         
2017-04-01                1003.0              2126.0          6.59
2017-04-02                 997.0              2130.0          6.72
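
The steps above can be reproduced end to end with a self-contained sketch. It rebuilds the sample data from the question and, as an alternative that is my own assumption and not part of the answer, also tries 24-hour bins that start at 01:00 by using the `offset` parameter of `resample` (available since pandas 1.1). The names `hours`, `one_hour`, `by_shift`, and `by_offset` are hypothetical. Frequencies are passed as `pd.Timedelta` objects because newer pandas versions deprecate the uppercase 'H' alias.

```python
import pandas as pd

# Rebuild the question's index: 01:00-06:00 and 23:00 of each day,
# plus 00:00 of the following day (written here as hour 24).
hours = [1, 2, 3, 4, 5, 6, 23, 24]
index = pd.DatetimeIndex(
    [pd.Timestamp(day) + pd.Timedelta(hours=h)
     for day in ("2017-04-01", "2017-04-02")
     for h in hours],
    name="Timestamp",
)
offpeak_hist = pd.DataFrame(
    {
        "Maximum Demand (KVA)": [127, 133, 119, 118, 135, 130, 120, 121,
                                 126, 132, 132, 122, 119, 122, 115, 129],
        "Consumption (KVAh)": [261, 268, 273, 263, 271, 257, 261, 272,
                               263, 266, 275, 259, 271, 259, 264, 273],
        "Power Factor": [0.81, 0.79, 0.92, 0.78, 0.86, 0.82, 0.78, 0.83,
                         0.90, 0.83, 0.90, 0.77, 0.78, 0.81, 0.87, 0.86],
    },
    index=index,
    dtype=float,
)

one_hour = pd.Timedelta(hours=1)

# Approach from the answer: move every timestamp back one hour,
# then sum per calendar day.
by_shift = offpeak_hist.shift(-1, freq=one_hour).resample("D").sum()

# Alternative (assumption): keep the timestamps as they are,
# but start each 24-hour bin at 01:00.
by_offset = offpeak_hist.resample(pd.Timedelta(hours=24), offset=one_hour).sum()

print(by_shift)
print(by_offset)
```

Both frames should contain the same sums. They differ only in their row labels: the shifted version is labelled by calendar date, while the offset version labels each bin by its 01:00 start.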



Answer 2:


If I understand you correctly, you want to sum the rows by time of day across all dates:

off_sum = offpeak_hist.groupby(offpeak_hist.index.time).sum()

to get this:

          Maximum Demand (KVA)  Consumption (KVAh)  Power Factor
00:00:00                 250.0               545.0          1.69
01:00:00                 253.0               524.0          1.71
02:00:00                 265.0               534.0          1.62
03:00:00                 251.0               548.0          1.82
04:00:00                 240.0               522.0          1.55
05:00:00                 254.0               542.0          1.64
06:00:00                 252.0               516.0          1.63
23:00:00                 235.0               525.0          1.65

If not, please update your question with the desired output.
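
For reference, the grouping by time of day in this answer can be checked with a self-contained sketch that rebuilds the question's sample data. The names `hours` and `by_time` are hypothetical. The grouped index holds `datetime.time` objects, so a single row is looked up with a `time` value and not with a string.

```python
from datetime import time

import pandas as pd

# Rebuild the question's index: 01:00-06:00 and 23:00 of each day,
# plus 00:00 of the following day (written here as hour 24).
hours = [1, 2, 3, 4, 5, 6, 23, 24]
index = pd.DatetimeIndex(
    [pd.Timestamp(day) + pd.Timedelta(hours=h)
     for day in ("2017-04-01", "2017-04-02")
     for h in hours],
    name="Timestamp",
)
offpeak_hist = pd.DataFrame(
    {
        "Maximum Demand (KVA)": [127, 133, 119, 118, 135, 130, 120, 121,
                                 126, 132, 132, 122, 119, 122, 115, 129],
        "Consumption (KVAh)": [261, 268, 273, 263, 271, 257, 261, 272,
                               263, 266, 275, 259, 271, 259, 264, 273],
        "Power Factor": [0.81, 0.79, 0.92, 0.78, 0.86, 0.82, 0.78, 0.83,
                         0.90, 0.83, 0.90, 0.77, 0.78, 0.81, 0.87, 0.86],
    },
    index=index,
    dtype=float,
)

# Group rows that share the same clock time, regardless of date, and sum them.
by_time = offpeak_hist.groupby(offpeak_hist.index.time).sum()
print(by_time)

# Look up one time of day; the index labels are datetime.time objects.
print(by_time.loc[time(23, 0)])
```

Note that this produces one row per clock time (eight rows for this sample), whereas the daily resample in Answer 1 produces one row per date.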



Source: https://stackoverflow.com/questions/44984707/resampling-dataframe-by-hours-and-date
