How do I prevent interpolation between values where there are more than 2 missing rows of data?

Posted by 江枫思渺然 on 2019-12-08 07:54:55

Question


I would like to write a conditional statement inside mutate_at() so that approx() does not interpolate between values where there are more than 2 missing rows of data.

Here are the data:

dat <- data.frame(
  time = 1:10, 
  var1 = c(10, 10, 10, 12, 12, 12, 15, 15, 15, 15), 
  var2 = c( 1, NA,  3,  6, NA, NA, NA, 10,  9,  8), 
  var3 = c(10, NA, NA, 13, 14, 16, NA, 18, 19, 20)
)

This is the chunk of code I would like to adapt so that it does NOT interpolate where there are more than 2 NAs between values (i.e., rows 5-7 in the var2 column should remain NA, and all other NAs should be replaced with interpolated values).

library(tidyverse)

dat_int <- dat %>%
  mutate_at(vars(c(var2, var3)),
            funs(approx(time, ., time, rule = 1, method = "linear")[["y"]]))

Answer 1:


Step 1: Create a function, consecutiveNA, that flags every run of consecutive NAs in a vector whose length is at least a threshold (set by the argument len).

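consecutiveNA <- function(x, len = 2){
  # Run-length encode the NA indicator: one entry per run of NA / non-NA values
  rl <- rle(is.na(x))
  # Keep TRUE only for NA runs that are at least `len` long
  logi <- rl$lengths >= len & rl$values
  rl$values <- logi
  # Expand the runs back into a logical vector of the same length as x
  inver <- inverse.rle(rl)
  return(inver)
}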
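
To make the run-length step concrete, here is a small sketch of what rle() and inverse.rle() produce for the var2 values from the question. The variable names (x, runs, mask) are only illustrative and do not come from the original answer.

```r
# var2 from the question
x <- c(1, NA, 3, 6, NA, NA, NA, 10, 9, 8)

# One entry per run of identical values in is.na(x)
runs <- rle(is.na(x))
runs$lengths  # 1 1 2 3 3
runs$values   # FALSE TRUE FALSE TRUE FALSE

# Keep only NA runs of length >= 2, then expand back to full length
runs$values <- runs$values & runs$lengths >= 2
mask <- inverse.rle(runs)
which(mask)   # 5 6 7
```

The single NA in row 2 is not flagged because its run has length 1, while the run covering rows 5-7 is.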

Step 2: Apply the approx function to target columns (as you did).

library(tidyverse)

dat_int <- dat %>%
  mutate_at(vars(c(var2, var3)),
            funs(approx(time, ., time, rule = 1, method = "linear")[["y"]]))
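
Note that funs() has been deprecated since dplyr 0.8.0. As a rough equivalent, and assuming dplyr 1.0.0 or later is installed, the same interpolation can be sketched with across() and a lambda. This is an alternative spelling of the step, not the code from the original answer.

```r
library(dplyr)

dat <- data.frame(
  time = 1:10,
  var1 = c(10, 10, 10, 12, 12, 12, 15, 15, 15, 15),
  var2 = c( 1, NA,  3,  6, NA, NA, NA, 10,  9,  8),
  var3 = c(10, NA, NA, 13, 14, 16, NA, 18, 19, 20)
)

# Linearly interpolate var2 and var3 against time; .x is the current column
dat_int <- dat %>%
  mutate(across(c(var2, var3),
                ~ approx(time, .x, time, rule = 1, method = "linear")[["y"]]))
```

At this point every NA has been filled, including the long gap in var2, so the masking in the following steps is still needed.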

Step 3: Apply the consecutiveNA function to all columns in dat and convert the result to a matrix.

m_NA <- map(dat, consecutiveNA, len = 2) %>%
  as.data.frame() %>%
  as.matrix()

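Step 4: Use m_NA to set the flagged (TRUE) positions in dat_int back to NA, and the work is done. Note that len = 2 flags runs of two or more NAs, which is why rows 2-3 of var3 remain NA in the output below. To leave only runs of more than 2 NAs uninterpolated, as the question asks, use len = 3 in Step 3.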

dat_int[m_NA] <- NA

dat_int
#    time var1 var2 var3
# 1     1   10    1   10
# 2     2   10    2   NA
# 3     3   10    3   NA
# 4     4   12    6   13
# 5     5   12   NA   14
# 6     6   12   NA   16
# 7     7   15   NA   17
# 8     8   15   10   18
# 9     9   15    9   19
# 10   10   15    8   20
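
The question asks for gaps of more than 2 NAs to be left alone, whereas the output above also keeps the 2-row gap in var3. As a possible variation, the interpolation and the masking can be folded into one base-R helper that uses a strict "longer than maxgap" test. The function name approx_maxgap and its maxgap argument are hypothetical and are not part of the original answer.

```r
# Hypothetical helper: interpolate linearly, but restore NA for any run of
# missing values longer than `maxgap`.
approx_maxgap <- function(x, y, maxgap = 2) {
  filled <- approx(x, y, xout = x, rule = 1, method = "linear")[["y"]]
  runs <- rle(is.na(y))
  runs$values <- runs$values & runs$lengths > maxgap
  filled[inverse.rle(runs)] <- NA
  filled
}

dat <- data.frame(
  time = 1:10,
  var1 = c(10, 10, 10, 12, 12, 12, 15, 15, 15, 15),
  var2 = c( 1, NA,  3,  6, NA, NA, NA, 10,  9,  8),
  var3 = c(10, NA, NA, 13, 14, 16, NA, 18, 19, 20)
)

dat_int <- dat
dat_int$var2 <- approx_maxgap(dat$time, dat$var2)
dat_int$var3 <- approx_maxgap(dat$time, dat$var3)
```

With maxgap = 2, rows 5-7 of var2 stay NA while the shorter gaps in var2 and var3 are filled, which matches the behaviour described in the question.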


Source: https://stackoverflow.com/questions/55599319/how-do-i-prevent-interpolation-between-values-where-there-are-more-than-2-missin
