How to select the last test without NA in R

Posted by 落爺英雄遲暮 on 2019-12-10 23:40:01

Question


My data frame looks like this:

Person  W.1   W.2   W.3   W.4   W.5   
1       62    57    52    59    NA
2       49    38    60    NA    NA
3       59    34    NA    NA    NA

Is there a way to select the first and the last test that is not "NA"? I have 300 rows; W.1 is the first test, W.2 is the second test, and W.n is the nth test. I want to compare each person's score on the first test with their score on the last test they took. For example, I want to compare:

1    62 59
2    49 60
3    59 34

But the "NA" values start at a different position for each person. Can someone help me?

Thank you!


Answer 1:


You can use this solution:

> t(apply(d[-1],1,function(rw) rw[range(which(!is.na(rw)))]))

     [,1] [,2]
[1,]   62   59
[2,]   49   60
[3,]   59   34

where d is your data set.

How it works: for each row of d (rows are scanned using apply(d[-1],1,...), where d[-1] excludes the first column), get the indices of non-NA test results (which(!is.na(rw))), then get the lowest and highest value of indices by using range(), and obtain the test scores that correspond to those indices (rw[...]). The final result is transposed using t().
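
To make the steps concrete, here is a minimal sketch that rebuilds the sample data from the question and walks through each step for a single row. The intermediate names (non_na, first_last, scores, result) are introduced only for illustration and are not part of the original answer.

```r
# Sample data from the question (the name d follows the answer above).
d <- data.frame(
  Person = 1:3,
  W.1 = c(62, 49, 59),
  W.2 = c(57, 38, 34),
  W.3 = c(52, 60, NA),
  W.4 = c(59, NA, NA),
  W.5 = rep(NA_real_, 3)
)

# Walk through the steps for the second person only.
rw <- unlist(d[2, -1])          # named numeric vector W.1 ... W.5
non_na <- which(!is.na(rw))     # positions of the non-NA scores: 1 2 3
first_last <- range(non_na)     # lowest and highest position: 1 3
scores <- rw[first_last]        # scores at those positions: 49 60

# The same steps applied to every row at once, then transposed.
result <- t(apply(d[-1], 1, function(rw) rw[range(which(!is.na(rw)))]))
```

If a person has only one non-NA score, range() returns the same position twice, so that score appears as both the first and the last value.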

Note that this solution will work properly even in the case of NAs in the middle of the test scores, e.g. c(NA, 57, NA, 52, NA).
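
A small sketch of that case, using a hypothetical row with missing scores at the start, in the middle, and at the end (the W.* names and the variable names are chosen only for illustration):

```r
# Hypothetical row: NA at the start, in the middle, and at the end.
rw <- c(W.1 = NA, W.2 = 57, W.3 = NA, W.4 = 52, W.5 = NA)

non_na <- which(!is.na(rw))      # positions 2 and 4
first_last <- rw[range(non_na)]  # first score 57, last score 52
```

Because range() looks only at the smallest and largest non-NA position, gaps in between do not affect the result.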




Answer 2:


Here's a possible vectorized solution using max.col, where df is your data set (I'm assuming that the first test is never NA, though this is easy to fix if it can be):

indx <- cbind(seq_len(nrow(df)), max.col(!is.na(df), ties.method = "last"))
cbind(df[, 2], df[indx])
#      [,1] [,2]
# [1,]   62   59
# [2,]   49   60
# [3,]   59   34
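
One way the first-test assumption might be lifted is to call max.col twice, once with ties.method = "first" and once with ties.method = "last". The following is a sketch under that assumption, not the answerer's own code; the data are hypothetical (person 2 missed the first test), and the Person column is dropped so it can never be selected as a score.

```r
# Hypothetical data in which person 2 missed the first test.
df <- data.frame(
  Person = 1:3,
  W.1 = c(62, NA, 59),
  W.2 = c(57, 38, 34),
  W.3 = c(52, 60, NA),
  W.4 = c(59, NA, NA),
  W.5 = rep(NA_real_, 3)
)

scores <- as.matrix(df[-1])   # score columns only, without Person
rows <- seq_len(nrow(scores))
not_na <- !is.na(scores)

first_col <- max.col(not_na, ties.method = "first")  # first non-NA column
last_col  <- max.col(not_na, ties.method = "last")   # last non-NA column

# Matrix indexing picks one (row, column) cell per person.
result <- cbind(first = scores[cbind(rows, first_col)],
                last  = scores[cbind(rows, last_col)])
```

For a row with no scores at all, every column ties, so both lookups land on an NA cell and the result for that person is NA.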

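Another similar solution is to use rowSums (this one counts the non-NA cells in each row, so it assumes NAs appear only after a person's last test):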

cbind(df[, 2], df[cbind(seq_len(nrow(df)), rowSums(!is.na(df)))])
#      [,1] [,2]
# [1,]   62   59
# [2,]   49   60
# [3,]   59   34
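
The two variants differ once a row has a gap before its last score. A short sketch with hypothetical data (person 2 has an NA at W.2) illustrates this; the names by_count and by_position are made up for the example.

```r
# Hypothetical data: person 2 has a gap at W.2, not only trailing NAs.
df <- data.frame(
  Person = 1:2,
  W.1 = c(62, 49),
  W.2 = c(57, NA),
  W.3 = c(52, 60),
  W.4 = c(59, NA)
)
rows <- seq_len(nrow(df))

# rowSums counts non-NA cells, so for person 2 it points at column 3 (W.2).
by_count <- df[cbind(rows, rowSums(!is.na(df)))]

# max.col finds the last non-NA column, so it points at column 4 (W.3).
by_position <- df[cbind(rows, max.col(!is.na(df), ties.method = "last"))]
```

Here by_count yields NA for person 2, while by_position yields 60, so the max.col variant is the safer choice when gaps are possible.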


Source: https://stackoverflow.com/questions/28799753/how-to-select-the-last-one-test-without-na-in-r
