Identify NA's in sequence row-wise

Posted by 故事扮演 on 2019-12-19 09:49:50

Question


I want to fill NA values in a row-wise sequence based on a condition. Please see the example below.

ID | Observation 1 | Observation 2 | Observation 3 | Observation 4 | Observation 5
 A         NA              0               1             NA             NA

The condition is:

  • all NA values before !NA values in the sequence should be left as NA;
  • but all NAs after !NA values in the sequence should be tagged ("remove")

In the example above, NA value in Observation 1 should remain NA. However, the NA values in Observations 4 and 5 should be changed to "Remove".


Answer 1:


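You can define a function that replaces every NA after the last non-NA value of a vector with val:

replace.na <- function(r,val) {
  i <- is.na(r)                   # TRUE where r is NA
  j <- which(i)                   # positions of the NAs
  k <- which(!i)                  # positions of the non-NA values
  r[j[j > k[length(k)]]] <- val   # NAs located after the last non-NA
  r
}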

Then, assuming that you have a data.frame like so:

r <- data.frame(ID=c('A','B'),obs1=c(NA,1),obs2=c(0,NA),obs3=c(1,2),obs4=c(NA,3),obs5=c(NA,NA))
##  ID obs1 obs2 obs3 obs4 obs5
##1  A   NA    0    1   NA   NA
##2  B    1   NA    2    3   NA

We can apply the function over the rows of the observation columns of r (every column except ID):

r[,-1] <- t(apply(r[,-1],1,replace.na,999))    
##  ID obs1 obs2 obs3 obs4 obs5
##1  A   NA    0    1  999  999
##2  B    1   NA    2    3  999

apply coerces r[,-1] to a matrix and calls replace.na once per row. Each returned vector is stored as a column of the result, so the result has one column per row of r. Therefore, we have to transpose the resulting matrix before assigning it back into the columns of r.
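To see why the transpose is needed, here is a small sketch on a made-up 2 x 3 matrix. The matrix m and the times-ten function are illustrative only and are not part of the answer:

```r
# Illustrative 2 x 3 matrix (2 rows, 3 columns); values are made up
m <- matrix(c(NA, 1, 0, NA, 1, 2), nrow = 2)

# Apply a function over rows that returns a vector as long as the row
res <- apply(m, 1, function(x) x * 10)

dim(m)       # 2 3
dim(res)     # 3 2: each row's result is stored as a column
dim(t(res))  # 2 3: transposing restores the original orientation
```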

Another way to call replace.na is:

r[,-1] <- do.call(rbind,lapply(data.frame(t(r[,-1])),replace.na,999))

Here, we first transpose the observation columns of r and convert the result to a data.frame. Because a data frame is a list of columns, each row of r becomes one element of that list. We then use lapply over these columns to apply replace.na, and rbind the results back into rows.
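The intermediate object can be inspected directly. This is only a sketch: it rebuilds the same r as above, and the name cols is introduced here for illustration:

```r
# Same data as in the answer
r <- data.frame(ID=c('A','B'),obs1=c(NA,1),obs2=c(0,NA),obs3=c(1,2),obs4=c(NA,3),obs5=c(NA,NA))

# Transpose the observation columns, then convert back to a data.frame
cols <- data.frame(t(r[,-1]))

dim(cols)      # 5 2: one column per original row
cols[[1]]      # NA 0 1 NA NA, i.e. row A of r
is.list(cols)  # TRUE, which is why lapply visits one column at a time
```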


If you instead want to flag every NA after the first non-NA value, index k with 1 rather than length(k), so replace.na becomes:

replace.na <- function(r,val) {
  i <- is.na(r)
  j <- which(i)
  k <- which(!i)
  r[j[j > k[1]]] <- val
  r
}

Applying it to the data:

r[,-1] <- do.call(rbind,lapply(data.frame(t(r[,-1])),replace.na,999))
##  ID obs1 obs2 obs3 obs4 obs5
##1  A   NA    0    1  999  999
##2  B    1  999    2    3  999
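The question asks for the tag "Remove" rather than a numeric code such as 999. As a minimal sketch, assuming the last-non-NA variant of replace.na and using the single row from the question (row_a and tagged are names introduced here), a character tag can be passed as val. Note that the whole vector is then coerced to character:

```r
# Last-non-NA variant of replace.na, repeated so the example is self-contained
replace.na <- function(r, val) {
  i <- is.na(r)
  j <- which(i)
  k <- which(!i)
  r[j[j > k[length(k)]]] <- val
  r
}

# Row A from the question
row_a <- c(NA, 0, 1, NA, NA)

tagged <- replace.na(row_a, "Remove")
tagged
# [1] NA       "0"      "1"      "Remove" "Remove"
```

Because the numbers become strings, a numeric sentinel may be more convenient if the observation columns are used in later calculations.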


Source: https://stackoverflow.com/questions/41184721/identify-nas-in-sequence-row-wise
