R: replacing missing values with the mean of the surrounding values

Submitted by 折月煮酒 on 2019-12-07 14:32:21

Question


My dataset looks like the following (let's call it "a"):

date value
2013-01-01 12.2
2013-01-02 NA
2013-01-03 NA
2013-01-04 16.8
2013-01-05 10.1
2013-01-06 NA
2013-01-07 12.0

I would like to replace each NA with the mean of its closest surrounding values (the previous and the next non-missing values in the series).

I tried the following but I am not convinced by the output...

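miss.val <- which(is.na(a$value))   # positions of the missing values
library(zoo)
z <- zoo(a$value, a$date)
z.corr <- na.approx(z)              # linear interpolation across each gap
# show each filled value together with its neighbours
idx <- sort(unique(c(miss.val - 1, miss.val, miss.val + 1)))
z.corr[idx[idx >= 1 & idx <= length(z.corr)]]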

Answer 1:


Using na.locf (Last Observation Carried Forward) from package zoo:

R> library("zoo")
R> x <- c(12.2, NA, NA, 16.8, 10.1, NA, 12.0)
R> (na.locf(x) + rev(na.locf(rev(x))))/2
[1] 12.20 14.50 14.50 16.80 10.10 11.05 12.00

(This does not work if the first or last element of x is NA: by default na.locf drops leading NAs, so the two vectors no longer have the same length.)
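The edge case above can be handled with a small variation. The following is only a sketch, not part of the original answer: fill_neighbour_mean is a hypothetical helper name, and falling back to the single available neighbour at the ends of the series is an assumption about what is wanted there. It relies on the na.rm and fromLast arguments of na.locf.

```r
library(zoo)

# Hypothetical helper (not from the original answer): mean of the nearest
# non-missing value on each side. At the ends of the series only one side
# exists, so that single value is used (an assumption, adjust as needed).
fill_neighbour_mean <- function(x) {
  prev <- na.locf(x, na.rm = FALSE)                   # carry last observation forward
  nxt  <- na.locf(x, na.rm = FALSE, fromLast = TRUE)  # carry next observation backward
  rowMeans(cbind(prev, nxt), na.rm = TRUE)            # ignore the missing side at the edges
}

x <- c(NA, 12.2, NA, NA, 16.8, 10.1, NA, 12.0, NA)
filled <- fill_neighbour_mean(x)
filled
# [1] 12.20 12.20 14.50 14.50 16.80 10.10 11.05 12.00 12.00
```

Keeping na.rm = FALSE preserves the length of both vectors, which is what makes the element-wise mean safe. A vector that is entirely NA comes back as NaN.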




Answer 2:


You can do this in one line of code with the moving-average function na_ma of the imputeTS package (named na.ma in older releases of the package):

library(imputeTS)
na_ma(yourData, k = 1)

With k = 1, each missing value is replaced by a moving average over one observation on either side, i.e. the mean of its closest surrounding values. Further parameters can be set as well:

na_ma(yourData, k = 2, weighting = "simple")

In this case the algorithm takes the next 2 values in each direction and weights them equally. You can also choose a different weighting ("linear" or "exponential") if you want closer values to have more influence.
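To see how this behaves on the series from the question, something like the following can be run. It is a sketch only: the variable names are illustrative, and the target column is simply the neighbour-mean result from Answer 1. According to the package documentation, the window widens automatically when fewer than two non-missing values fall inside it, so runs of consecutive NAs may not receive the plain neighbour mean; comparing the columns side by side makes any difference visible.

```r
library(imputeTS)

# The series from the question (variable names are illustrative)
x <- c(12.2, NA, NA, 16.8, 10.1, NA, 12.0)

# One observation on each side, equally weighted
filled <- na_ma(x, k = 1, weighting = "simple")

# Neighbour-mean result from Answer 1, for comparison
target <- c(12.2, 14.5, 14.5, 16.8, 10.1, 11.05, 12.0)

data.frame(x, filled, target)
```

If the two columns disagree inside a gap, the na.locf approach from Answer 1 is the one that matches the question's definition exactly.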



Source: https://stackoverflow.com/questions/18612715/r-replacing-missing-values-with-the-mean-of-surroundings-values
