I'm trying to use the na.approx() function from the zoo library (in conjunction with xts) to interpolate missing values from repeated
The solution I've gone with is based on the first comment from @docendodiscimus.
Rather than attempting to create a new data frame, as I'd been doing, this approach simply adds a column to the existing data frame using dplyr's mutate() function.
My code is now...
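library(dplyr)
library(zoo)

df %>%
  group_by(variable) %>%
  arrange(variable, event.date) %>%
  # na.rm = FALSE keeps the result the same length as the input, as mutate() requires
  mutate(ip.value = na.approx(value, maxgap = 4, rule = 2, na.rm = FALSE))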
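Setting maxgap = 4 means that runs of up to four consecutive NAs are interpolated, while longer runs are left as NA. The rule = 2 option is passed on to approx() and fills leading and trailing NAs with the nearest observed value, so the series is extended into the flanking time points.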
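To see what the two arguments do in isolation, here is a minimal sketch on a toy vector. The vector x is made up for illustration, and na.rm = FALSE is my own addition so that the result keeps the length of the input:

```r
library(zoo)

# Hypothetical toy series: a leading NA, a single-NA gap,
# a run of five NAs, and a trailing NA.
x <- c(NA, 2, NA, 4, NA, NA, NA, NA, NA, 10, NA)

# maxgap = 4: only runs of at most four NAs are filled.
# rule = 2:   leading/trailing NAs take the nearest observed value.
filled <- na.approx(x, maxgap = 4, rule = 2, na.rm = FALSE)
print(filled)
```

The single-NA gap is interpolated linearly, the leading and trailing NAs take the nearest observed value, and the run of five NAs stays NA because it is longer than maxgap.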
Use the approx() function for linear interpolation:
df %>%
  group_by(variable) %>%
  arrange(variable, event.date) %>%
  mutate(time = seq_len(n())) %>%
  mutate(ip.value = approx(time, value, xout = time)$y) %>%
  select(-time)
or the spline() function for non-linear interpolation:
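df %>%
  group_by(variable) %>%
  arrange(variable, event.date) %>%
  mutate(time = seq_len(n())) %>%
  # xout = time evaluates the spline at every original row; with n = n() alone
  # the output grid spans only the non-NA range and can be misaligned with the rows
  mutate(ip.value = spline(time, value, xout = time)$y) %>%
  select(-time)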
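For comparison, here is a small base R sketch of the two approaches on a toy vector. The data are made up for illustration, and the row index stands in for time, which assumes the observations are equally spaced:

```r
# Hypothetical toy series with two interior gaps.
value <- c(1, NA, 9, 16, NA, 36)
time  <- seq_along(value)

# Linear: each gap lies on the straight line between its neighbours.
linear <- approx(time, value, xout = time)$y

# Cubic spline, evaluated at the original time points.
smooth <- spline(time, value, xout = time)$y

print(data.frame(time, value, linear, smooth))
```

Both versions reproduce the observed values and differ only in how they fill the gaps. Note that approx() leaves leading and trailing NAs as NA unless rule = 2 is given, whereas spline() extrapolates beyond the observed range, so the ends of each group are worth checking.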