How to create dummy variables?

梦谈多话 2020-12-07 02:55

I have a variable that is a factor:

 $ year           : Factor w/ 8 levels "2003","2004",..: 4 6 4 2 4 1 3 3 7 2 ...

I would like to c

3 answers
  •  太阳男子
    2020-12-07 03:16

    The caret package provides a simple function, dummyVars, for creating dummy variables. It is especially convenient when you have more than one factor variable. The target variables must be factors, though: for example, if Sales$year is numeric, convert it first with as.factor(Sales$year).

    Suppose we have the original dataset 'Sales' as follows:

        year    Sales       Region
    1   2010    3695.543    North
    2   2010    9873.037    West
    3   2008    3579.458    West
    4   2005    2788.857    North
    5   2005    2952.183    North
    6   2008    7255.337    West
    7   2005    5237.081    West
    8   2010    8987.096    North
    9   2008    5545.343    North
    10  2008    1809.446    West
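
A minimal sketch for rebuilding the example data in R, so that the calls in this answer can be run as written. The values are copied from the table above; building the data with data.frame() is an assumption, since the answer does not show how Sales was loaded.

```r
# Rebuild the example data; values are copied from the table above.
Sales <- data.frame(
  year   = c(2010, 2010, 2008, 2005, 2005, 2008, 2005, 2010, 2008, 2008),
  Sales  = c(3695.543, 9873.037, 3579.458, 2788.857, 2952.183,
             7255.337, 5237.081, 8987.096, 5545.343, 1809.446),
  Region = c("North", "West", "West", "North", "North",
             "West", "West", "North", "North", "West")
)

# Convert the categorical columns explicitly so that year is treated
# as a category rather than as a number.
Sales$year   <- as.factor(Sales$year)
Sales$Region <- as.factor(Sales$Region)

str(Sales)
```

After the conversion, year has three levels (2005, 2008, 2010) and Region has two (North, West).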
    

    Now we can create dummy variables for both factors (year and Region) in a single call:

    > library(lattice)
    > library(ggplot2)
    > library(caret)
    > Salesdummy <- dummyVars(~ ., data = Sales, levelsOnly = TRUE)
    > Sdummy <- predict(Salesdummy, Sales)
    

    The outcome will be:

       2005 2008 2010   Sales    RegionNorth    RegionWest
    1   0    0    1   3695.543       1              0
    2   0    0    1   9873.037       0              1
    3   0    1    0   3579.458       0              1
    4   1    0    0   2788.857       1              0
    5   1    0    0   2952.183       1              0
    6   0    1    0   7255.337       0              1
    7   1    0    0   5237.081       0              1
    8   0    0    1   8987.096       1              0
    9   0    1    0   5545.343       1              0 
    10  0    1    0   1809.446       0              1
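
For comparison, a similar indicator matrix can be sketched in base R with model.matrix(), without caret. This is a swapped-in alternative, not the method used in the answer. The variable name dummies is hypothetical, and the data frame is rebuilt here from the table above so that the block runs on its own.

```r
# Example data copied from the table in this answer, with factors set up front.
Sales <- data.frame(
  year   = factor(c(2010, 2010, 2008, 2005, 2005, 2008, 2005, 2010, 2008, 2008)),
  Sales  = c(3695.543, 9873.037, 3579.458, 2788.857, 2952.183,
             7255.337, 5237.081, 8987.096, 5545.343, 1809.446),
  Region = factor(c("North", "West", "West", "North", "North",
                    "West", "West", "North", "North", "West"))
)

# "- 1" drops the intercept column. contrasts.arg with contrasts = FALSE
# asks for one indicator column per level of each factor, instead of the
# default treatment coding that omits a reference level.
dummies <- model.matrix(
  ~ . - 1,
  data = Sales,
  contrasts.arg = lapply(Sales[c("year", "Region")], contrasts, contrasts = FALSE)
)

colnames(dummies)
head(dummies)
```

Unlike the dummyVars call with levelsOnly = TRUE, model.matrix() prefixes each column with the variable name, for example year2005 and RegionNorth.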
    
