Factors in R: more than an annoyance?

梦如初夏 2020-11-28 19:30

Factors are one of the basic data types in R. In my experience factors are basically a pain and I never use them; I always convert to characters. I feel oddly like I'm missi

7 answers
  •  独厮守ぢ
    2020-11-28 20:05

    tapply (and aggregate) rely on factors. The information-to-effort ratio of these functions is very high.

    For instance, a single line of code (the call to tapply below) gives you the mean price of diamonds by Cut and Color:

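    > data(diamonds, package="ggplot2")
    
    > # assumption: dm is diamonds with these columns renamed, to match the output shown
    > dm = with(diamonds, data.frame(Carat=carat, Cut=cut, Clarity=clarity, Price=price, Color=color))
    
    > head(dm, 3)
    
       Carat     Cut    Clarity Price Color
    1  0.23     Ideal     SI2   326     E
    2  0.21   Premium     SI1   326     E
    3  0.23      Good     VS1   327     E
    
    
    > tx = with(dm, tapply(X=Price, INDEX=list(Cut=Cut, Color=Color), FUN=mean))
    
    > a = rev(seq_len(ncol(tx)))  # reverse columns for readability
    
    > round(tx[,a])  # means rounded to whole numbers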
    
             Color
    Cut         J    I    H    G    F    E    D
    Fair      4976 4685 5136 4239 3827 3682 4291
    Good      4574 5079 4276 4123 3496 3424 3405
    Very Good 5104 5256 4535 3873 3779 3215 3470
    Premium   6295 5946 5217 4501 4325 3539 3631
    Ideal     4918 4452 3889 3721 3375 2598 2629
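
A minimal sketch of why the factor matters in calls like the one above, using made-up toy prices (not the diamonds data) and a hypothetical `quality_order` vector. tapply coerces its INDEX to a factor, so a character column ends up grouped in alphabetical order, while an explicit factor lets the level order drive the rows of both tapply and aggregate.

```r
# Toy data: hypothetical prices, chosen only for illustration
df <- data.frame(
  cut   = c("Ideal", "Fair", "Premium", "Good", "Very Good", "Ideal", "Fair"),
  price = c(500, 200, 600, 300, 400, 700, 400),
  stringsAsFactors = FALSE
)

# Character grouping: tapply converts it to a factor with alphabetically sorted levels
by_chr <- tapply(df$price, df$cut, mean)

# Explicit factor: the levels fix a meaningful (quality) order
quality_order <- c("Fair", "Good", "Very Good", "Premium", "Ideal")
df$cut_f <- factor(df$cut, levels = quality_order)
by_fct <- tapply(df$price, df$cut_f, mean)

# aggregate computes the same group means as a data frame, rows in level order
agg <- aggregate(price ~ cut_f, data = df, FUN = mean)

print(names(by_chr))
print(names(by_fct))
print(agg)
```

Both calls give the same group means; only the ordering and the return type (named array versus data frame) differ.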
    
