Grouping and Summing Data by Irregular Time Intervals (R language)

Submitted by 霸气de小男生 on 2021-01-01 06:44:14

Question


I am looking at this Stack Overflow post: R: Count Number of Observations within a group

Here, daily data is created and summed/grouped at monthly intervals (as well as weekly intervals):

library(xts)
library(dplyr)

#create data

date_decision_made = seq(as.Date("2014/1/1"), as.Date("2016/1/1"),by="day")

date_decision_made <- format(as.Date(date_decision_made), "%Y/%m/%d")

property_damages_in_dollars <- rnorm(731,100,10)

final_data <- data.frame(date_decision_made, property_damages_in_dollars)


# weekly

weekly = final_data %>%
    mutate(date_decision_made = as.Date(date_decision_made)) %>%
    group_by(week = format(date_decision_made, "%W-%y")) %>%
    summarise( total = sum(property_damages_in_dollars, na.rm = TRUE), Count = n())


# monthly 

final_data %>%
    mutate(date_decision_made = as.Date(date_decision_made)) %>%
    group_by(month = format(date_decision_made, "%Y-%m")) %>%
    summarise( total = sum(property_damages_in_dollars, na.rm = TRUE), Count = n())

It seems that R's format() function (https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/format) is being used to turn each date into a label (for example "2014-01" for a month), so that the data can be grouped and summed at some fixed interval.
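A minimal base-R sketch of what that seems to amount to, using three made-up dates (the names `d`, `month_key`, and `counts` are only illustrative): format() returns a text label per date, and rows that share a label end up in the same group.

```r
# Three illustrative dates spanning two calendar months
d <- as.Date(c("2014-01-30", "2014-01-31", "2014-02-01"))

# format() turns each date into a text label; equal labels form one group
month_key <- format(d, "%Y-%m")
month_key

# Base-R counterpart of group_by() + summarise(Count = n()): rows per label
counts <- table(month_key)
counts
```

Because the label is built from calendar fields, this approach can only produce intervals that a format string can express, such as a week, a month, or a year.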

My question: is there a way to "instruct" the computer to "group and sum" by irregular intervals? E.g. by 11 day periods, by 3 month periods, by 2 year periods? (I guess 3 months can be written as 90 days...2 years can be written as 730 days).

Is this possible?

Thanks


Answer 1:


You can use lubridate's ceiling_date/floor_date to create groups at irregular intervals.

library(dplyr)
library(lubridate)

final_data %>%
  mutate(date_decision_made = as.Date(date_decision_made)) %>%
  group_by(group = ceiling_date(date_decision_made, '11 days')) %>%
  summarise(amount = sum(property_damages_in_dollars))

You can also specify intervals like ceiling_date(date_decision_made, '3 years') or ceiling_date(date_decision_made, '2 months').
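One caveat worth checking on your own data: as far as I understand lubridate's rounding, a multi-day unit such as '11 days' is applied to the day of the month, so the bins restart at the beginning of every month instead of running as unbroken 11-day windows. If consecutive fixed-length windows are what you need, one option is to integer-divide each date's offset from a chosen start date. The sketch below assumes the windows are counted from the earliest date in the data; `toy`, `bin_width`, `origin`, `bin`, and `bin_start` are illustrative names, not part of the original answer.

```r
library(dplyr)

# Toy data: 25 consecutive days, one unit of damage per day (illustrative values)
toy <- data.frame(
  date_decision_made = seq(as.Date("2014-01-25"), by = "day", length.out = 25),
  property_damages_in_dollars = rep(1, 25)
)

bin_width <- 11                          # window length in days (assumed)
origin <- min(toy$date_decision_made)    # windows are counted from the first date

fixed_bins <- toy %>%
  mutate(
    # number of whole windows elapsed since the origin: 0, 1, 2, ...
    bin = as.integer(date_decision_made - origin) %/% bin_width,
    # label each window by its first day
    bin_start = origin + bin * bin_width
  ) %>%
  group_by(bin_start) %>%
  summarise(total = sum(property_damages_in_dollars), Count = n())

fixed_bins
```

With this toy input the windows cross the January/February boundary without resetting, and the last window is shorter because the data runs out. Setting `bin_width` to 90 or 730 gives the day-based approximations of 3 months or 2 years mentioned in the question.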




Answer 2:


Using data.table

library(data.table)
library(lubridate)
# by = assigns each row to the group returned by ceiling_date()
setDT(final_data)[, .(amount = sum(property_damages_in_dollars)),
      by = .(group = ceiling_date(as.IDate(date_decision_made), "11 days"))]


Source: https://stackoverflow.com/questions/65367282/grouping-and-summing-data-by-irregular-time-intervals-r-language
