I have a variable that is a factor :
$ year : Factor w/ 8 levels \"2003\",\"2004\",..: 4 6 4 2 4 1 3 3 7 2 ...
I would like to c
library(caret)
provides a very simple function (dummyVars
) to create dummy variables, especially when you have more than one factor variables. But you have to make sure the target variables are factor. e.g. if your Sales$year
are numeric, you have to convert them to factor: as.factor(Sales$year)
Suppose we have the original dataset 'Sales' as follows:
year Sales Region
1 2010 3695.543 North
2 2010 9873.037 West
3 2008 3579.458 West
4 2005 2788.857 North
5 2005 2952.183 North
6 2008 7255.337 West
7 2005 5237.081 West
8 2010 8987.096 North
9 2008 5545.343 North
10 2008 1809.446 West
Now we can create two dummy variables simultaneously:
>library(lattice)
>library(ggplot2)
>library(caret)
>Salesdummy <- dummyVars(~., data = Sales, levelsOnly = TRUE)
>Sdummy <- predict(Salesdummy, Sales)
The outcome will be:
2005 2008 2010 Sales RegionNorth RegionWest
1 0 0 1 3695.543 1 0
2 0 0 1 9873.037 0 1
3 0 1 0 3579.458 0 1
4 1 0 0 2788.857 1 0
5 1 0 0 2952.183 1 0
6 0 1 0 7255.337 0 1
7 1 0 0 5237.081 0 1
8 0 0 1 8987.096 1 0
9 0 1 0 5545.343 1 0
10 0 1 0 1809.446 0 1