I want to dummy-code the Species column, i.e. create a flag variable for each of its values.
I wrote the code below:
create_dummies <- function(data, categorical_preds){
if/else should be used when you build a function, to run certain parts of it only when a given condition is true (a single condition, length == 1). ifelse is what you should use when transforming your data.frame.
Help on if/else:
cond A length-one logical vector that is not NA. Conditions of length greater than one are accepted with a warning, but only the first element is used. Other types are coerced to logical if possible, ignoring any class.
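To make the distinction concrete, here is a minimal sketch. The helper describe_column and the variable long_sepal are hypothetical names used only for illustration; the built-in iris data set is assumed.

```r
# Hypothetical helper: `if` chooses a branch from ONE logical value.
describe_column <- function(x) {
  if (is.factor(x)) {   # is.factor() always returns a single TRUE/FALSE
    "factor"
  } else {
    "not a factor"
  }
}

describe_column(iris$Species)       # "factor"
describe_column(iris$Sepal.Length)  # "not a factor"

# ifelse() is vectorized: it tests every element and returns a vector
# of the same length, which is what you want when transforming a column.
long_sepal <- ifelse(iris$Sepal.Length > 5, 1, 0)
length(long_sepal)                  # 150
```

Note that from R 4.2.0 onwards, a condition of length greater than one in if is an error rather than a warning.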
For this purpose (if the vector is a factor), you can use model.matrix to create the dummy variables.
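mat <- model.matrix(~ iris$Species - 1)  # "- 1" drops the intercept: one column per level
mat <- as.data.frame(mat)
# columns follow the factor's level order, so name them with levels(), not unique()
names(mat) <- levels(iris$Species)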
> str(mat)
'data.frame': 150 obs. of 3 variables:
$ setosa : num 1 1 1 1 1 1 1 1 1 1 ...
$ versicolor: num 0 0 0 0 0 0 0 0 0 0 ...
$ virginica : num 0 0 0 0 0 0 0 0 0 0 ...
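As a possible next step, the dummy columns can be attached to the original data. The sketch below is one way to do it, not the only one; it passes the data frame through the data argument of model.matrix, and iris_dummies is a hypothetical name.

```r
# Build one 0/1 column per level of Species (no intercept column).
mat <- model.matrix(~ Species - 1, data = iris)

# The default column names are prefixed with the variable name
# (e.g. "Speciessetosa"), so replace them with the plain level names.
colnames(mat) <- levels(iris$Species)

# Bind the dummy columns next to the original columns.
iris_dummies <- cbind(iris, mat)
str(iris_dummies)
```

Each row has exactly one of the three dummy columns set to 1.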
The warning message:
the condition has length > 1 and only the first element will be used
tells you that using a vector in an if condition is equivalent to using only its first element:
[if (v == 1)] ~ [if (v[1] == 1)] ## v here is a vector
You should use the vectorized ifelse instead. For example, you can write your condition like this:
create_dummies <- function(data, categorical_preds) {
  ## here I show only the first condition
  data$setosa_flg <- ifelse(categorical_preds == "setosa", 1, 0)
  data
}
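The same idea can be extended to every value of the column with a loop. This is only a sketch under assumptions: create_dummies_all and its column argument are hypothetical, and the _flg suffix simply follows the naming used above.

```r
# Hypothetical generalization: one 0/1 flag column per distinct value.
create_dummies_all <- function(data, column) {
  values <- data[[column]]
  for (lev in unique(as.character(values))) {
    # ifelse() compares every element with `lev` and returns a 0/1 vector
    data[[paste0(lev, "_flg")]] <- ifelse(values == lev, 1, 0)
  }
  data
}

flagged <- create_dummies_all(iris, "Species")
names(flagged)
```

For iris this adds setosa_flg, versicolor_flg and virginica_flg to the five original columns.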
iris$Species is a vector. An if statement is a control statement designed to work only on a scalar boolean condition. In R, when you compare a vector with a string, the output is a vector of booleans telling whether each element of the vector is equal to the string.
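A short illustration of that point, using the built-in iris data; the variable names are hypothetical. If a single TRUE/FALSE really is needed for an if statement, any() or all() can reduce the vector to one value.

```r
# Comparing a vector with a string yields one logical value per element.
is_setosa <- iris$Species == "setosa"
class(is_setosa)   # "logical"
length(is_setosa)  # 150

# Collapse the vector to a scalar before handing it to `if`.
has_setosa <- any(is_setosa)  # TRUE: at least one row is setosa
all_setosa <- all(is_setosa)  # FALSE: not every row is setosa

if (has_setosa) {
  message("At least one row is setosa")
}
```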