Error when subsetting based on adjusted values of different data frame in R

对着背影说爱祢 提交于 2019-12-25 17:08:32

问题


I am asking a side-question about the method I learned here from @redmode :

Subsetting based on values of a different data frame in R

When I try to dynamically adjust the level I want to subset by:

N <- nrow(A)
cond <- sapply(3:N, function(i) sum(A[i,] > 0.95*B[i,])==2)
rbind(A[1:2,], subset(A[3:N,], cond))

I get an error

Error in FUN(left, right) : non-numeric argument to binary operator. 

Can you think of a way I can get rows pertaining to values in A that are greater than 95% of the value in B? Thank you.

Here is code for A and B to play with.

A <- structure(list(name1 = c("trt", "0", "1", "10", "1", "1", "10"
), name2 = c("ctrl", "3", "1", "1", "1", "1", "10")), .Names = c("name1", 
"name2"), row.names = c("cond", "hour", "A", "B", "C", "D", "E"
), class = "data.frame")
B <- structure(list(name1 = c("trt", "0", "1", "1", "1", "1", "9.4"), 
    name2 = c("ctrl", "3", "1", "10", "1", "1", "9.4")), .Names = c("name1", 
"name2"), row.names = c("cond", "hour", "A", "B", "C", "D", "E"
), class = "data.frame")

回答1:


You have some serious formatting issues with your data.

First, columns should be of the same data type, rows should be observations. (not always true, but a very good way to start) Here you have a row called cond, then a row called hour, then a series of classifications I'm guessing. The way you're data is presented to begin with doesn't make much sense and doesn't lend itself to easy manipulation of your data. But all is not lost. This is what I would do:

Reorganize my data:

C <- data.frame(matrix(as.numeric(unlist(A)), ncol=2)[-(1:2), ])

colnames(C) <- c('A.trt', 'A.cntr')
rownames(C) <- LETTERS[1:nrow(C)]

D <- data.frame(matrix(as.numeric(unlist(B)), ncol=2)[-(1:2), ])

colnames(D) <- c('B.trt', 'B.cntr')

(df <- cbind(C, D))

Which gives:

#   A.trt A.cntr B.trt B.cntr
# A     1      1   1.0    1.0
# B    10      1   1.0   10.0
# C     1      1   1.0    1.0
# D     1      1   1.0    1.0
# E    10     10   9.4    9.4

Then you're problem is easily solved by:

df[which(df[, 1] > 0.95*df[, 3] & df[, 2] > 0.95*df[, 4]), ]

#   A.trt A.cntr B.trt B.cntr
# A     1      1   1.0    1.0
# C     1      1   1.0    1.0
# D     1      1   1.0    1.0
# E    10     10   9.4    9.4


来源:https://stackoverflow.com/questions/15147336/error-when-subsetting-based-on-adjusted-values-of-different-data-frame-in-r

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!