Add missing dates to pandas dataframe

春和景丽 2020-11-22 09:47

My data can have multiple events on a given date or NO events on a date. I take these events, get a count by date and plot them. However, when I plot them, my two series do

5 answers

挽巷 (original poster)

2020-11-22 10:34
An alternative approach is resample, which can handle duplicate dates in addition to missing dates. For example:
```
df.resample('D').mean()
```
resample is a deferred operation, like groupby, so you need to follow it with an aggregation. It works on a datetime-like index, such as the date index used here. In this case mean works well, but you can also use many other pandas methods such as max or sum.
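To make the "deferred" part concrete, here is a minimal sketch. The frame, its dates and the `val` column are made up for illustration and are not the OP's data. It shows that `resample('D')` on its own only sets up the daily bins, and that the same resampler can then be followed by different aggregations.

```python
import pandas as pd

# Hypothetical data: two events on 2021-01-01, none on the 2nd and 3rd.
df = pd.DataFrame(
    {"val": [1, 3, 5]},
    index=pd.to_datetime(["2021-01-01", "2021-01-01", "2021-01-04"]),
)

# No computation happens here; this is a Resampler object, not a DataFrame.
resampler = df.resample("D")

# Each aggregation produces one row per calendar day in the range.
daily_mean = resampler.mean()  # empty days become NaN
daily_max = resampler.max()    # empty days become NaN
daily_sum = resampler.sum()    # empty days become 0, since an empty sum is 0

print(daily_mean)
print(daily_max)
print(daily_sum)
```

Note that sum differs from mean and max on the empty days: it yields 0 rather than NaN, which may already be what you want when plotting totals.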

Here is the original data, but with an extra entry for '2013-09-03':
```
             val
date           
2013-09-02     2
2013-09-03    10
2013-09-03    20    <- duplicate date added to OP's data
2013-09-06     5
2013-09-07     1
```
And here are the results:
```
             val
date            
2013-09-02   2.0
2013-09-03  15.0    <- mean of original values for 2013-09-03
2013-09-04   NaN    <- NaN b/c date not present in orig
2013-09-05   NaN    <- NaN b/c date not present in orig
2013-09-06   5.0
2013-09-07   1.0
```
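If you want to reproduce the tables above, something along these lines should work. The values are taken from the tables in this answer; building the frame with `pd.DatetimeIndex` is just one assumed way to get a date index, and the OP's real frame may be constructed differently.

```python
import pandas as pd

# Rebuild the example data, including the duplicate entry for 2013-09-03.
df = pd.DataFrame(
    {"val": [2, 10, 20, 5, 1]},
    index=pd.DatetimeIndex(
        ["2013-09-02", "2013-09-03", "2013-09-03", "2013-09-06", "2013-09-07"],
        name="date",
    ),
)

# One row per day: duplicates are averaged, missing days appear as NaN.
result = df.resample("D").mean()
print(result)
```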
I left the missing dates as NaN to make it clear how this works. To replace the NaNs with zeroes, as the OP requested, append fillna(0); alternatively, use something like interpolate() to fill them with non-zero values based on the neighboring rows.
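As a rough sketch of those two fill options, the snippet below starts from a frame shaped like the resampled result shown earlier (typed in by hand here so the example is self-contained) and applies each one. The variable names are only illustrative.

```python
import pandas as pd

# Same shape as the resampled result: NaN on the two days with no data.
result = pd.DataFrame(
    {"val": [2.0, 15.0, float("nan"), float("nan"), 5.0, 1.0]},
    index=pd.date_range("2013-09-02", periods=6, freq="D", name="date"),
)

# Option 1: treat missing days as zero.
filled_zero = result.fillna(0)

# Option 2: fill linearly between the neighboring rows (the default method).
filled_interp = result.interpolate()

print(filled_zero)
print(filled_interp)
```

Zero-filling is probably the better fit when the values are event counts, since a day with no rows really had zero events. Interpolation makes more sense for measurements where the missing days had some unobserved value.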