Conditional rolling mean (moving average) on irregular time series

Submitted by 懵懂的女人 on 2019-12-17 18:41:33

Question


I have data in the following format:

ID    Minutes Value
xxxx  118     3 
xxxx  121     4 
xxxx  122     3 
yyyy  122     6 
xxxx  123     4 
yyyy  123     8 
...   ...     .... 

Each ID is a patient and each Value is, say, the blood pressure for that minute. I would like to compute a rolling average over the 60 minutes before and the 60 minutes after each point. However, as you can see, some minutes are missing (so I cannot simply use row numbers), and the average must be computed separately for each unique ID (so the average for ID xxxx cannot include values assigned to ID yyyy). It sounds like rollapply or rollingstat might be options, but I have had little success piecing this together.

Please let me know if further clarity is needed.


Answer 1:


You can fill in the missing Minutes (their Value will be set to NA) and then use rollapply within each ID:

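library(data.table)
library(zoo)

## Convert to data.table, keyed by the ID and Minutes columns from the question
DT <- data.table(DF, key=c("ID", "Minutes"))

## Missing Minutes will be added in. Value will be set to NA.
DT <- DT[CJ(unique(ID), seq(min(Minutes), max(Minutes)))]

## Run your function per ID.
## Width 121 = 60 minutes before + the current minute + 60 minutes after;
## partial=TRUE keeps the shorter windows at the start and end of each series.
DT[, rollapply(Value, 121, mean, na.rm=TRUE, partial=TRUE), by=ID]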

Alternatively, if you don't need to keep the 'padded' Minutes / NA Values, you can do it all in one shot:

## Convert your DF to a data.table
DT <- data.table(DF, key=c("ID", "Minutes"))

## Compute rolling means, with on-the-fly padded minutes
DT[ CJ(unique(ID), seq(min(Minutes), max(Minutes))) ][, 
  rollapply(Value, 121, mean, na.rm=TRUE, partial=TRUE), by=ID]
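
To see what the padding step does, here is a minimal sketch on a toy table. The data are made up, the column names ID, Minutes and Value are taken from the question, and a window of width 3 (one minute either side) is an assumption chosen only to keep the numbers easy to follow; the real case would use a wider window.

```r
library(data.table)
library(zoo)

# Toy data (invented for illustration): minute 3 is missing for xxxx,
# minutes 1 and 3 are missing for yyyy.
DF <- data.frame(
  ID      = c("xxxx", "xxxx", "xxxx", "yyyy", "yyyy"),
  Minutes = c(1L, 2L, 4L, 2L, 4L),
  Value   = c(3, 4, 5, 6, 8)
)
DT <- data.table(DF, key = c("ID", "Minutes"))

# Join against every ID x minute combination; unmatched rows get Value = NA.
padded <- DT[CJ(unique(ID), seq(min(Minutes), max(Minutes)))]

# Centered rolling mean per ID. Width 3 is an assumption for this toy example.
padded[, roll := rollapply(Value, 3, mean, na.rm = TRUE, partial = TRUE), by = ID]

# Keep only the minutes that were actually observed.
result <- padded[!is.na(Value)]
print(result)
```

Because every minute now occupies exactly one row, a fixed-width row window corresponds to a fixed span of time, which is what makes rollapply usable here.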



Answer 2:


An alternative approach that uses tidyr/dplyr instead of data.table and RcppRoll instead of zoo:

library(dplyr)
library(tidyr)
library(RcppRoll)

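d %>% 
  group_by(ID) %>%
  # add rows for unobserved minutes (Value is NA there)
  complete(Minutes = full_seq(Minutes, 1)) %>%
  # RcppRoll::roll_mean() is written in C++ for speed;
  # width 121 = 60 minutes before + the current minute + 60 minutes after,
  # and fill = NA leaves NA where a full window is not available
  mutate(moving_mean = roll_mean(Value, 121, fill = NA, na.rm = TRUE)) %>%
  # keep only the rows that were originally observed
  filter(!is.na(Value))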

Sample data:

# tibble() replaces the deprecated data_frame(); Value is random, so results vary per run
d <- tibble(
  ID = rep(1:3, each = 5),
  Minutes = rep(c(1, 30, 60, 120, 200), 3),
  Value = rpois(15, lambda = 10)
)
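
As a cross-check on a small sample, the same ±60-minute mean can also be computed directly from the timestamps without padding. This is only a sketch in base R and is not taken from either answer: window_mean is a hypothetical helper, the data are made up, and the approach compares every row with every other row in its group, so it is likely too slow for large tables.

```r
# Hypothetical helper: for each observation, average all values of the same
# series whose minute lies within half_width of that observation's minute.
window_mean <- function(minutes, value, half_width = 60) {
  vapply(seq_along(minutes), function(i) {
    in_window <- abs(minutes - minutes[i]) <= half_width
    mean(value[in_window], na.rm = TRUE)
  }, numeric(1))
}

# Toy data (invented for illustration), irregularly spaced in time.
d <- data.frame(
  ID      = rep(c("xxxx", "yyyy"), each = 4),
  Minutes = rep(c(1, 30, 100, 200), times = 2),
  Value   = c(2, 4, 6, 8, 10, 20, 30, 40)
)

# Apply the helper separately to each ID, then restore the original row order.
pieces <- lapply(split(d, d$ID), function(g) {
  g$moving_mean <- window_mean(g$Minutes, g$Value)
  g
})
result <- unsplit(pieces, d$ID)
print(result)
```

Unlike the padded approaches, the window here is defined in minutes rather than rows, so no NA rows need to be added or filtered out afterwards.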


Source: https://stackoverflow.com/questions/21372735/conditional-rolling-mean-moving-average-on-irregular-time-series
