Factors in R: more than an annoyance?

梦如初夏 2020-11-28 19:30

One of the basic data types in R is factors. In my experience factors are basically a pain and I never use them. I always convert to characters. I feel oddly like I'm missi

7 answers
  •  情歌与酒
    2020-11-28 20:06

    Factors are fantastic when you are doing statistical analysis and actually exploring the data. Before that, while you are reading, cleaning, troubleshooting, merging, and generally manipulating the data, factors are a total pain. Over the past few years many functions have become better at handling factors; rbind, for instance, plays nicely with them. I still find it a total nuisance to have leftover empty levels after a subset.

    # Drop unused levels from every factor column of a data frame using gdata
    library(gdata)
    # Assign the result back; drop.levels() returns a cleaned copy
    dataframe <- drop.levels(dataframe)
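
    For readers who want to see the leftover-level problem itself, here is a minimal sketch using base R's droplevels() rather than gdata. The data frame and its column names are made up for illustration.

```r
# Hypothetical example data; the names are illustrative only.
df <- data.frame(
  group = factor(c("a", "b", "c", "a")),
  value = c(1, 2, 3, 4)
)

# Subsetting keeps every original level, even ones with no rows left.
sub <- subset(df, group != "c")
levels(sub$group)   # "c" is still listed as a level
table(sub$group)    # "c" appears with a count of 0

# Base R's droplevels() removes unused levels from all factor columns.
sub <- droplevels(sub)
levels(sub$group)   # only the levels that still occur
```

    This does not need an extra package, which may be enough when the only goal is to clean up after a subset.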
    

    I know that it is straightforward to recode the levels of a factor and to rejig the labels, and there are also wonderful ways to reorder the levels. My brain just cannot remember them, and I have to relearn them every time I need them. Recoding should be a lot easier than it is.
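
    As a memory aid, the reordering, relabelling, and recoding operations mentioned above can be sketched in base R roughly as follows. The factor values are invented for the example, and this is only one of several ways to do each step.

```r
# Hypothetical factor; by default the levels are sorted alphabetically.
f <- factor(c("lo", "hi", "mid", "lo"))

# Reorder: rebuild the factor with the levels in the order you want.
f <- factor(f, levels = c("lo", "mid", "hi"))

# Relabel: assign new names to levels() by position.
levels(f) <- c("low", "medium", "high")

# Recode or merge: a named list maps each new level to its old level(s).
levels(f) <- list(low = "low", not_low = c("medium", "high"))

# Change only the reference (first) level, e.g. before fitting a model.
f <- relevel(f, ref = "not_low")
```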

    R's string functions are quite easy and logical to use, so when manipulating data I generally prefer characters over factors.
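
    A minimal sketch of that workflow, assuming a data frame whose text columns arrived as factors (the column names here are hypothetical): convert the factor columns to character once, then use the ordinary string functions.

```r
# Hypothetical data with one factor column and one numeric column.
df <- data.frame(
  site = factor(c("north_1", "south_2", "north_3")),
  n = c(10, 20, 30)
)

# Convert every factor column to character, leaving other columns untouched.
is_fac <- vapply(df, is.factor, logical(1))
df[is_fac] <- lapply(df[is_fac], as.character)

# String manipulation now reads and writes plain character vectors.
df$region <- sub("_.*$", "", df$site)   # keep the part before the underscore
df$site <- toupper(df$site)
```

    The columns can be turned back into factors with factor() just before the modelling step.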
