How can I compare two factors with different levels?

Posted by 大城市里の小女人 on 2019-11-27 08:13:12

Question


Is it possible to compare two factors of the same length but with different levels? For example, if we have these two factor variables:

A <- factor(1:5)

str(A)
 Factor w/ 5 levels "1","2","3","4",..: 1 2 3 4 5

B <- factor(c(1:3,6,6))

str(B)
 Factor w/ 4 levels "1","2","3","6": 1 2 3 4 4

If I try to compare them using, for example, the == operator:

mean(A == B)

I get the following error:

Error in Ops.factor(A, B) : level sets of factors are different


Answer 1:


Convert to character then compare:

# data
A <- factor(1:5)
B <- factor(c(1:3,6,6))

str(A)
# Factor w/ 5 levels "1","2","3","4",..: 1 2 3 4 5
str(B)
# Factor w/ 4 levels "1","2","3","6": 1 2 3 4 4

mean(A == B)
# Error in Ops.factor(A, B) : level sets of factors are different

mean(as.character(A) == as.character(B))
# [1] 0.6

Another approach is to index each factor's levels by the factor itself:

mean(levels(A)[A] == levels(B)[B])

This version is about 2 times slower on a dataset of 1e8 elements.
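To see why the comparison has to go through the labels rather than the underlying integer codes, here is a minimal sketch using the same A and B. It checks that the two approaches above agree, and it shows that comparing as.integer() codes gives a misleading answer, because the label "6" in B is stored as code 4. The names by_char and by_level are only illustrative.

```r
A <- factor(1:5)
B <- factor(c(1:3, 6, 6))

# Both label-based comparisons give the same element-wise result
by_char  <- as.character(A) == as.character(B)
by_level <- levels(A)[A] == levels(B)[B]
identical(by_char, by_level)
# [1] TRUE

# The integer codes are positions in each factor's own level set,
# so comparing them mixes up different labels
as.integer(B)
# [1] 1 2 3 4 4
mean(as.integer(A) == as.integer(B))
# [1] 0.8   (not the correct 0.6)
```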




Answer 2:


Converting to character as in @zx8754's answer is the easiest solution to this problem, and probably the one you'd want to use almost always. Another option, though, is to correct the two variables so that they have the same levels. You might want to do this if you need to keep these variables as factors for some reason and don't want to clutter your code with repeated calls to as.character.

A <- factor(1:5)
B <- factor(c(1:3,6,6))

mean(A == B)
Error in Ops.factor(A, B) : level sets of factors are different

We can take the union of the levels of both factors to get all levels in either factor, and then remake the factors using that union as the levels. Now, even though the two factors have different values, the levels are the same between them and you can compare them:

C <- factor(A, levels = union(levels(A), levels(B)))
D <- factor(B, levels = union(levels(A), levels(B)))

mean(C == D)
[1] 0.6

As you can see, the values are unchanged, but the levels are now identical.

C
[1] 1 2 3 4 5
Levels: 1 2 3 4 5 6

D
[1] 1 2 3 6 6
Levels: 1 2 3 4 5 6
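
If this comes up repeatedly, the union step can be wrapped in a small function. The following is only a sketch: align_levels is a hypothetical helper name, not something from base R or from the answer above, and it assumes both inputs are unordered factors.

```r
# Hypothetical helper (not part of base R): rebuild both factors on the
# union of their level sets so that == no longer raises an error
align_levels <- function(x, y) {
  stopifnot(is.factor(x), is.factor(y))
  all_levels <- union(levels(x), levels(y))
  list(x = factor(x, levels = all_levels),
       y = factor(y, levels = all_levels))
}

A <- factor(1:5)
B <- factor(c(1:3, 6, 6))

aligned <- align_levels(A, B)
identical(levels(aligned$x), levels(aligned$y))
# [1] TRUE
mean(aligned$x == aligned$y)
# [1] 0.6
```

Note that union() keeps the levels of the first factor in their original order and appends any new levels from the second, so the combined level set is not re-sorted.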


Source: https://stackoverflow.com/questions/37962082/how-can-i-compare-two-factors-with-different-levels
