Question
I have a dataframe like this:
                     Maximum Demand (KVA)  Consumption (KVAh)  Power Factor
Timestamp
2017-04-01 01:00:00                 127.0               261.0          0.81
2017-04-01 02:00:00                 133.0               268.0          0.79
2017-04-01 03:00:00                 119.0               273.0          0.92
2017-04-01 04:00:00                 118.0               263.0          0.78
2017-04-01 05:00:00                 135.0               271.0          0.86
2017-04-01 06:00:00                 130.0               257.0          0.82
2017-04-01 23:00:00                 120.0               261.0          0.78
2017-04-02 00:00:00                 121.0               272.0          0.83
2017-04-02 01:00:00                 126.0               263.0          0.90
2017-04-02 02:00:00                 132.0               266.0          0.83
2017-04-02 03:00:00                 132.0               275.0          0.90
2017-04-02 04:00:00                 122.0               259.0          0.77
2017-04-02 05:00:00                 119.0               271.0          0.78
2017-04-02 06:00:00                 122.0               259.0          0.81
2017-04-02 23:00:00                 115.0               264.0          0.87
2017-04-03 00:00:00                 129.0               273.0          0.86
I want to resample the data into daily blocks, each running from 01:00 on one date to 00:00 on the next date.
I tried this:
off_sum = offpeak_hist.resample('h', base=8).sum().dropna()
But this does not give the desired output. How can I do this?
Answer 1:
I think you first need to shift the index back by one hour and then resample by day:
print (offpeak_hist.shift(-1, freq='H'))
                     Maximum Demand (KVA)  Consumption (KVAh)  Power Factor
Timestamp
2017-04-01 00:00:00                 127.0               261.0          0.81
2017-04-01 01:00:00                 133.0               268.0          0.79
2017-04-01 02:00:00                 119.0               273.0          0.92
2017-04-01 03:00:00                 118.0               263.0          0.78
2017-04-01 04:00:00                 135.0               271.0          0.86
2017-04-01 05:00:00                 130.0               257.0          0.82
2017-04-01 22:00:00                 120.0               261.0          0.78
2017-04-01 23:00:00                 121.0               272.0          0.83
2017-04-02 00:00:00                 126.0               263.0          0.90
2017-04-02 01:00:00                 132.0               266.0          0.83
2017-04-02 02:00:00                 132.0               275.0          0.90
2017-04-02 03:00:00                 122.0               259.0          0.77
2017-04-02 04:00:00                 119.0               271.0          0.78
2017-04-02 05:00:00                 122.0               259.0          0.81
2017-04-02 22:00:00                 115.0               264.0          0.87
2017-04-02 23:00:00                 129.0               273.0          0.86
df = offpeak_hist.shift(-1, freq='H').resample('D').sum().dropna()
print (df)
            Maximum Demand (KVA)  Consumption (KVAh)  Power Factor
Timestamp
2017-04-01                1003.0              2126.0          6.59
2017-04-02                 997.0              2130.0          6.72
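For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the shift-then-resample idea, rebuilt from the sample in the question (only the Maximum Demand column is kept for brevity). It subtracts a `pd.Timedelta` from the index, which should be equivalent to `shift(-1, freq='H')` without depending on the frequency-alias spelling. The second variant with `offset=` is an assumption that you are on pandas 1.1 or later; it skips the shift and labels each bin at 01:00 instead of at midnight.

```python
import pandas as pd

# Sample rebuilt from the question (Maximum Demand column only).
index = pd.DatetimeIndex(
    [
        "2017-04-01 01:00", "2017-04-01 02:00", "2017-04-01 03:00",
        "2017-04-01 04:00", "2017-04-01 05:00", "2017-04-01 06:00",
        "2017-04-01 23:00", "2017-04-02 00:00", "2017-04-02 01:00",
        "2017-04-02 02:00", "2017-04-02 03:00", "2017-04-02 04:00",
        "2017-04-02 05:00", "2017-04-02 06:00", "2017-04-02 23:00",
        "2017-04-03 00:00",
    ],
    name="Timestamp",
)
demand = [127.0, 133.0, 119.0, 118.0, 135.0, 130.0, 120.0, 121.0,
          126.0, 132.0, 132.0, 122.0, 119.0, 122.0, 115.0, 129.0]
offpeak_hist = pd.DataFrame({"Maximum Demand (KVA)": demand}, index=index)

# Variant 1: move every timestamp back one hour, so the 00:00 reading
# of the next date falls into the previous calendar day, then sum per day.
shifted = offpeak_hist.copy()
shifted.index = shifted.index - pd.Timedelta(hours=1)
daily_shifted = shifted.resample("D").sum()

# Variant 2 (assumes pandas >= 1.1): keep the timestamps and start each
# daily bin at 01:00 instead of midnight.
daily_offset = offpeak_hist.resample("D", offset=pd.Timedelta(hours=1)).sum()

print(daily_shifted)
print(daily_offset)
```

Both variants should give 1003.0 and 997.0 for the two days, matching the Maximum Demand column in the output above.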
Answer 2:
If I understand you correctly, you want to sum the values per time of day across all dates (df here is your original dataframe):
off_sum = df.groupby(df.index.time).sum()
to achieve this:
          Maximum Demand (KVA)  Consumption (KVAh)  Power Factor
00:00:00                 250.0               545.0          1.69
01:00:00                 253.0               524.0          1.71
02:00:00                 265.0               534.0          1.62
03:00:00                 251.0               548.0          1.82
04:00:00                 240.0               522.0          1.55
05:00:00                 254.0               542.0          1.64
06:00:00                 252.0               516.0          1.63
23:00:00                 235.0               525.0          1.65
If not, please update your question with the desired output.
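A minimal self-contained sketch of this group-by-time-of-day approach, rebuilt from the question's sample with a single column. The variable names `by_time` and `COL` are mine, not from the question.

```python
import pandas as pd

COL = "Maximum Demand (KVA)"

# Sample rebuilt from the question (Maximum Demand column only).
index = pd.DatetimeIndex(
    [
        "2017-04-01 01:00", "2017-04-01 02:00", "2017-04-01 03:00",
        "2017-04-01 04:00", "2017-04-01 05:00", "2017-04-01 06:00",
        "2017-04-01 23:00", "2017-04-02 00:00", "2017-04-02 01:00",
        "2017-04-02 02:00", "2017-04-02 03:00", "2017-04-02 04:00",
        "2017-04-02 05:00", "2017-04-02 06:00", "2017-04-02 23:00",
        "2017-04-03 00:00",
    ],
    name="Timestamp",
)
demand = [127.0, 133.0, 119.0, 118.0, 135.0, 130.0, 120.0, 121.0,
          126.0, 132.0, 132.0, 122.0, 119.0, 122.0, 115.0, 129.0]
df = pd.DataFrame({COL: demand}, index=index)

# index.time yields one datetime.time per row; grouping on it collapses
# all dates together and sums the readings for each clock time.
by_time = df.groupby(df.index.time).sum()

print(by_time)
```

Note that this gives one row per clock time rather than one row per day, so it answers a different question from the daily resample in the first answer.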
Source: https://stackoverflow.com/questions/44984707/resampling-dataframe-by-hours-and-date