Question
Below is the sample data (out of approximately 8000 rows of data). How can I replace all NAs with values from a smoothing spline fit to the rest of the data?
Date Max Min Rain RHM RHE
4/24/1981 35.9 24.7 0.0 71 37
4/25/1981 36.8 22.8 0.0 62 40
4/26/1981 36.0 22.6 0.0 47 37
4/27/1981 35.1 24.2 0.0 51 39
4/28/1981 35.4 23.8 0.0 61 47
4/29/1981 35.4 25.1 0.0 67 43
4/30/1981 37.4 24.8 0.0 72 34
5/1/1981 NA NA NA NA NA
5/2/1981 39.0 25.3 NA NA 55
5/3/1981 35.9 23.0 0.0 68 66
5/4/1981 28.4 22.4 0.7 70 30
5/5/1981 35.5 24.6 0.0 47 31
5/6/1981 37.4 25.5 0.0 51 31
Answer 1:
I'm using some simplified data for the purposes of answering this query. Take this dataset:
dat <- structure(list(x = c(1.6, 1.6, 4.4, 4.5, 6.1, 6.7, 7.3, 8, 9.5,
9.5, 10.7), y = c(2.2, 4.5, 1.6, 4.3, NA, NA, 4.8, 7.3, 8.7, 6.3, 12.3)),
.Names = c("x", "y"), row.names = c(NA, -11L), class = "data.frame")
Plotted with plot(dat,type="o",pch=19), the data looks like this:
[Figure: plot of dat]

Now fit a smoothing spline to the data without the NA values:
smoo <- with(dat[!is.na(dat$y),],smooth.spline(x,y))
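The call above lets smooth.spline choose the amount of smoothing itself (generalized cross-validation is its default criterion). If the fitted curve looks too wiggly or too flat, the df or spar arguments set the smoothness by hand. A small sketch on the same dat; df = 3 is an arbitrary value picked for illustration, not a recommendation:

```r
dat <- data.frame(
  x = c(1.6, 1.6, 4.4, 4.5, 6.1, 6.7, 7.3, 8, 9.5, 9.5, 10.7),
  y = c(2.2, 4.5, 1.6, 4.3, NA, NA, 4.8, 7.3, 8.7, 6.3, 12.3)
)
obs <- dat[!is.na(dat$y), ]

# Default: smoothing parameter chosen by generalized cross-validation.
smoo_auto <- smooth.spline(obs$x, obs$y)

# Fixed equivalent degrees of freedom; a lower df gives a stiffer curve.
# df = 3 is an arbitrary illustrative value.
smoo_stiff <- smooth.spline(obs$x, obs$y, df = 3)

# Compare the two fits at the x positions where y is missing.
x_missing <- dat$x[is.na(dat$y)]
pred_auto <- predict(smoo_auto, x_missing)$y
pred_stiff <- predict(smoo_stiff, x_missing)$y
```

Printing pred_auto next to pred_stiff shows how much the filled values depend on the chosen smoothness.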
And then predict the y values for x, where y is currently NA:
result <- with(dat,predict(smoo,x[is.na(y)]))
points(result,pch=19,col="red")

To fill the values back into the original data you can then do:
dat[is.na(dat$y),] <- result
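The same steps carry over to the data in the question if the date is converted to a number and each column is fitted separately. A minimal sketch, assuming the dates are in month/day/year format and using only the sample rows shown; weather and fill_with_spline are hypothetical names introduced here:

```r
# Sample rows from the question; the full data has roughly 8000 rows.
weather <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Date Max Min Rain RHM RHE
4/24/1981 35.9 24.7 0.0 71 37
4/25/1981 36.8 22.8 0.0 62 40
4/26/1981 36.0 22.6 0.0 47 37
4/27/1981 35.1 24.2 0.0 51 39
4/28/1981 35.4 23.8 0.0 61 47
4/29/1981 35.4 25.1 0.0 67 43
4/30/1981 37.4 24.8 0.0 72 34
5/1/1981 NA NA NA NA NA
5/2/1981 39.0 25.3 NA NA 55
5/3/1981 35.9 23.0 0.0 68 66
5/4/1981 28.4 22.4 0.7 70 30
5/5/1981 35.5 24.6 0.0 47 31
5/6/1981 37.4 25.5 0.0 51 31
")

# Use the date as a numeric predictor (days since 1970-01-01).
weather$Date <- as.Date(weather$Date, format = "%m/%d/%Y")
day <- as.numeric(weather$Date)

# Hypothetical helper: fit a smoothing spline to the observed values of y
# and replace only the NA entries with predictions from that fit.
fill_with_spline <- function(y, x) {
  missing <- is.na(y)
  if (!any(missing)) return(y)
  fit <- smooth.spline(x[!missing], y[!missing])
  y[missing] <- predict(fit, x[missing])$y
  y
}

# Apply the helper to every column except Date.
value_cols <- setdiff(names(weather), "Date")
weather[value_cols] <- lapply(weather[value_cols], fill_with_spline, x = day)
```

A smoothing spline knows nothing about physical limits, so filled values for a column such as Rain can come out slightly negative and may need clamping afterwards.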
Answer 2:
One thing to check out might be the na.spline function in the zoo package. It appears custom designed for this purpose.
From its documentation: "Missing values (NAs) are replaced by linear interpolation via approx or cubic spline interpolation via spline, respectively."
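Note what that description implies: spline interpolates, so the curve passes exactly through every observed value, whereas smooth.spline in the first answer smooths over them. A rough sketch of the interpolating variant, using base R's spline directly on the Max column from the question (the rows are consecutive days, so the row position serves as x):

```r
# Max temperature from the question's sample rows; NA on 5/1/1981.
max_temp <- c(35.9, 36.8, 36.0, 35.1, 35.4, 35.4, 37.4,
              NA, 39.0, 35.9, 28.4, 35.5, 37.4)
day <- seq_along(max_temp)
missing <- is.na(max_temp)

# Interpolating cubic spline through the observed points only,
# evaluated at the positions where the value is missing.
interp <- spline(day[!missing], max_temp[!missing], xout = day[missing])

filled <- max_temp
filled[missing] <- interp$y
```

With zoo installed, na.spline(max_temp) is meant to fill the same gap in a single call. Which approach fits better depends on whether the observed values should be reproduced exactly or treated as noisy.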
Source: https://stackoverflow.com/questions/18695335/replacing-all-nas-with-smoothing-spline