Linearly interpolate missing values in a time series

悲哀的现实 2020-12-09 21:01

I would like to add all missing dates between the min and max date in a data.frame and linearly interpolate all missing values, like

df <- data.fra         


        
4 Answers
  •  时光取名叫无心
    2020-12-09 21:06

    Here are a few solutions.

    1) zoo. Convert the data frame to a zoo series, then use na.approx with xout= set to a daily sequence of dates to get the final series:

    library(zoo)
    z <- read.zoo(mydf)
    zz <- na.approx(z, xout = seq(start(z), end(z), "day"))
    

    giving:

    > zz
    2015-10-05 2015-10-06 2015-10-07 2015-10-08 2015-10-09 2015-10-10 2015-10-11 
      8.000000   6.333333   4.666667   3.000000   9.000000   8.200000   7.400000 
    2015-10-12 2015-10-13 2015-10-14 
      6.600000   5.800000   5.000000 
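
    As a check on what na.approx is doing, one of the interpolated values can be reproduced by hand. This is only a sketch: it assumes the observed points on either side of the first gap were 2015-10-05 = 8 and 2015-10-08 = 3, which is consistent with the output above but is not shown in the question.

    ```r
    # Assumed observations bracketing the gap (inferred from the output above)
    t0 <- as.Date("2015-10-05"); y0 <- 8
    t1 <- as.Date("2015-10-08"); y1 <- 3

    # Date to fill in
    t <- as.Date("2015-10-06")

    # Linear interpolation: move from y0 toward y1 in proportion to elapsed time
    w <- as.numeric(t - t0) / as.numeric(t1 - t0)   # 1 day out of 3
    y <- y0 + w * (y1 - y0)
    y
    ```

    Under those assumptions y is 8 - 5/3, i.e. 6.333333, which agrees with the value printed for 2015-10-06.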
    

    It may be more convenient to leave it in zoo form so you can use all the facilities of zoo, but if you need it in data frame form, just use:

    DF <- fortify.zoo(zz)
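
    Since the question's data is cut off, here is a self-contained sketch of the whole zoo route. The contents of mydf below are hypothetical, chosen only so that the result matches the output shown above; the actual input may have differed.

    ```r
    library(zoo)

    # Hypothetical input, chosen to be consistent with the output shown above
    mydf <- data.frame(
      date  = as.Date(c("2015-10-05", "2015-10-08", "2015-10-09", "2015-10-14")),
      value = c(8, 3, 9, 5)
    )

    z  <- read.zoo(mydf)   # first column becomes the time index
    zz <- na.approx(z, xout = seq(start(z), end(z), "day"))   # one value per day

    DF <- fortify.zoo(zz)  # back to a data frame, index in the first column
    head(DF)
    ```

    The daily sequence passed as xout= is what creates the missing dates; na.approx then fills each of them by linear interpolation between the neighbouring observations.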
    

    1a) zoo/magrittr. The above could alternatively be expressed as a magrittr pipeline:

    library(magrittr)
    mydf %>% read.zoo %>% na.approx(xout = seq(start(.), end(.), "day")) %>% fortify.zoo
    

    (or omit the fortify.zoo part if you want zoo output).

    2) base R. We can do essentially the same thing without any packages:

    n <- nrow(mydf)
    with(mydf, data.frame(approx(date, value, xout = seq(date[1], date[n], "day"))))
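
    A slightly more defensive variant of the base R approach might look like the sketch below. The input data is hypothetical (picked to be consistent with the output shown earlier), and all_days, res and out are names introduced here for illustration. It uses min() and max() instead of date[1] and date[n], so it does not rely on the rows being sorted, though it still assumes date is of class Date.

    ```r
    # Hypothetical input, chosen to be consistent with the output shown earlier
    mydf <- data.frame(
      date  = as.Date(c("2015-10-05", "2015-10-08", "2015-10-09", "2015-10-14")),
      value = c(8, 3, 9, 5)
    )

    # Every calendar day between the first and last observation
    all_days <- seq(min(mydf$date), max(mydf$date), by = "day")

    # Linear interpolation of value at each of those days
    res <- approx(mydf$date, mydf$value, xout = all_days)

    # approx() returns components named x and y, so restore the original names
    out <- data.frame(date = all_days, value = res$y)
    out
    ```

    Building the data frame explicitly keeps the date and value column names, whereas wrapping approx() directly in data.frame() yields columns called x and y.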
    
