Reason for unexpected output in subsetting data frame - R

时光怂恿深爱的人放手 submitted on 2019-12-02 00:29:45

Working through an example shows where it is going wrong:

a <- data.frame(VAL=c(1,1,1,23,24))
a
#  VAL
#1   1
#2   1
#3   1
#4  23
#5  24

These work:

a$VAL %in% c(23,24)
#[1] FALSE FALSE FALSE  TRUE  TRUE
a$VAL==23 | a$VAL==24
#[1] FALSE FALSE FALSE  TRUE  TRUE
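
Either logical vector can then be used to pick out the rows. A minimal sketch, assuming the goal is to keep the rows where VAL is 23 or 24 (the name keep is only for illustration and is not part of the original question):

```r
a <- data.frame(VAL = c(1, 1, 1, 23, 24))

# TRUE for every row whose VAL is one of the wanted values
keep <- a$VAL %in% c(23, 24)

# drop = FALSE keeps the result a data frame even with a single column
res <- a[keep, , drop = FALSE]
res
#  VAL
#4  23
#5  24

# subset() expresses the same filter
res2 <- subset(a, VAL %in% c(23, 24))
```

%in% asks "is each element a member of this set?", so neither the order nor the length of c(23,24) matters.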

The following doesn't work, because == recycles the shorter vector when comparing. Take note of the warning message:

a$VAL ==c(23,24)
#[1] FALSE FALSE FALSE FALSE FALSE
#Warning message:
#In a$VAL == c(23, 24) :
#  longer object length is not a multiple of shorter object length

This last bit of code recycles the vector you are testing against until it is as long as a$VAL, then compares element by element. It is basically comparing:

c( 1,  1,  1, 23, 24) #to
c(23, 24, 23, 24, 23)

...so you don't get any rows returned. Changing the order to c(24,23) will give you:

c( 1,  1,  1, 23, 24) #to
c(24, 23, 24, 23, 24)

...and you will get two rows returned (which gives the intended result by pure luck, but it is not appropriate to use).
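
The recycling can be made explicit with rep_len(), which is a rough way to see what == is lining up. This is only a sketch: the names recycled, recycled_rev and b are illustrative, and b is a hypothetical second vector added to show how the "lucky" order breaks.

```r
a <- data.frame(VAL = c(1, 1, 1, 23, 24))

# Spell out the recycling that == performs implicitly
recycled <- rep_len(c(23, 24), nrow(a))      # 23 24 23 24 23
first <- a$VAL == recycled
first
#[1] FALSE FALSE FALSE FALSE FALSE

recycled_rev <- rep_len(c(24, 23), nrow(a))  # 24 23 24 23 24
second <- a$VAL == recycled_rev
second
#[1] FALSE FALSE FALSE  TRUE  TRUE

# With one fewer leading row the reversed order stops lining up,
# and because 4 is a multiple of 2 there is not even a warning
b <- c(1, 1, 23, 24)
third <- b == c(24, 23)
third
#[1] FALSE FALSE FALSE FALSE
```

Whether == happens to match depends entirely on where the values fall in the vector, which is why %in% is the safer test.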
