How do I convert certain columns of a data frame to become factors? [duplicate]

北慕城南 提交于 2019-11-28 17:05:14

问题


Possible Duplicate:
identifying or coding unique factors using R

I'm having some trouble with R.

I have a data set similar to the following, but much longer.

A B Pulse
1 2 23
2 2 24
2 2 12
2 3 25
1 1 65
1 3 45

Basically, the first 2 columns are coded. A has 1, 2 which represent 2 different weights. B has 1, 2, 3 which represent 3 different times.

As they are coded numerical values, R will treat them as numerical variables. I need to use the factor function to convert these variables into factors.

Help?


回答1:


Here's an example:

#Create a data frame
> d<- data.frame(a=1:3, b=2:4)
> d
  a b
1 1 2
2 2 3
3 3 4

#currently, there are no levels in the `a` column, since it's numeric as you point out.
> levels(d$a)
NULL

#Convert that column to a factor
> d$a <- factor(d$a)
> d
  a b
1 1 2
2 2 3
3 3 4

#Now it has levels.
> levels(d$a)
[1] "1" "2" "3"

You can also handle this when reading in your data. See the colClasses and stringsAsFactors parameters in e.g. readCSV().

Note that, computationally, factoring such columns won't help you much, and may actually slow down your program (albeit negligibly). Using a factor will require that all values are mapped to IDs behind the scenes, so any print of your data.frame requires a lookup on those levels -- an extra step which takes time.

Factors are great when storing strings which you don't want to store repeatedly, but would rather reference by their ID. Consider storing a more friendly name in such columns to fully benefit from factors.




回答2:


Given the following sample

myData <- data.frame(A=rep(1:2, 3), B=rep(1:3, 2), Pulse=20:25)  

then

myData$A <-as.factor(myData$A)
myData$B <-as.factor(myData$B)

or you could select your columns altogether and wrap it up nicely:

# select columns
cols <- c("A", "B")
myData[,cols] <- data.frame(apply(myData[cols], 2, as.factor))

levels(myData$A) <- c("long", "short")
levels(myData$B) <- c("1kg", "2kg", "3kg")

To obtain

> myData
      A   B Pulse
1  long 1kg    20
2 short 2kg    21
3  long 3kg    22
4 short 1kg    23
5  long 2kg    24
6 short 3kg    25


来源:https://stackoverflow.com/questions/13613913/how-do-i-convert-certain-columns-of-a-data-frame-to-become-factors

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!