I have a data frame I read from a csv file that has daily observations:
Date Value
2010-01-04 23.4
2010-01-05 12.7
2010-01-04 20.1
2010-01-07 18
One option is to expand your date index to include the missing observations, and use na.approx from zoo to fill in the missing values via interpolation.
allDates <- seq.Date(
  min(values$Date),
  max(values$Date),
  "day")
##
allValues <- merge(
  x=data.frame(Date=allDates),
  y=values,
  all.x=TRUE)
R> head(allValues,7)
Date Value
1 2010-01-05 -0.6041787
2 2010-01-06 0.2274668
3 2010-01-07 -1.2751761
4 2010-01-08 -0.8696818
5 2010-01-09 NA
6 2010-01-10 NA
7 2010-01-11 -0.3486378
##
zooValues <- zoo(allValues$Value,allValues$Date)
R> head(zooValues,7)
2010-01-05 2010-01-06 2010-01-07 2010-01-08 2010-01-09 2010-01-10 2010-01-11
-0.6041787 0.2274668 -1.2751761 -0.8696818 NA NA -0.3486378
##
approxValues <- na.approx(zooValues)
R> head(approxValues,7)
2010-01-05 2010-01-06 2010-01-07 2010-01-08 2010-01-09 2010-01-10 2010-01-11
-0.6041787 0.2274668 -1.2751761 -0.8696818 -0.6960005 -0.5223192 -0.3486378
Even with missing values, zooValues is still a legitimate zoo object, e.g. plot(zooValues) will work (with discontinuities at missing values), but if you plan on fitting some sort of model to the data, you will most likely be better off using na.approx to replace the missing values.
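To make it easier to see what na.approx is doing, here is a minimal sketch on a toy series. The names toyDates, toyValues and edgeValues are made up for illustration and are not part of the data above. It assumes the default arguments of na.approx, which interpolate linearly over the date index and drop leading or trailing NAs that have no neighbour on one side.

```r
library(zoo)

# Hypothetical toy series: Friday and Monday observed, weekend missing.
toyDates <- as.Date("2010-01-08") + 0:3
toyValues <- zoo(c(1, NA, NA, 4), toyDates)

# Linear interpolation over the date index fills the two interior gaps.
filled <- na.approx(toyValues)
print(filled)

# A leading NA has nothing to interpolate from; with the default
# na.rm = TRUE it is dropped from the result instead of being filled.
edgeValues <- zoo(c(NA, 2, NA, 4), toyDates)
trimmed <- na.approx(edgeValues)
print(trimmed)
```

If your real series starts or ends on a missing day, it is worth checking the length of the result, since it may be shorter than the input.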
Data:
library(zoo)
library(lubridate)
##
t0 <- "2010-01-04"
Dates <- as.Date(ymd(t0))+1:120
weekDays <- Dates[!(weekdays(Dates) %in% c("Saturday","Sunday"))]
##
set.seed(123)
values <- data.frame(Date=weekDays,Value=rnorm(length(weekDays)))