Create new dummy variable columns from categorical variable

礼貌的吻别 2020-11-28 04:03

I have several data sets with 75,000 observations each and a type variable that can take a value from 0 to 4. I want to add five new dummy variables to each data set f

8 answers
  •  悲哀的现实
    2020-11-28 04:56

    R has a "sub-language" for translating formulas into a design matrix, and in the spirit of the language you can take advantage of it. It is fast and concise. Example: you have a numeric predictor x, a categorical predictor catVar, and a response y.

    > binom <- data.frame(y=runif(1e5), x=runif(1e5), catVar=as.factor(sample(0:4,1e5,TRUE)))
    > head(binom)
              y          x catVar
    1 0.5051653 0.34888390      2
    2 0.4868774 0.85005067      2
    3 0.3324482 0.58467798      2
    4 0.2966733 0.05510749      3
    5 0.5695851 0.96237936      1
    6 0.8358417 0.06367418      2
    

    Then just call model.matrix:

    > A <- model.matrix(y ~ x + catVar,binom) 
    > head(A)
      (Intercept)          x catVar1 catVar2 catVar3 catVar4
    1           1 0.34888390       0       1       0       0
    2           1 0.85005067       0       1       0       0
    3           1 0.58467798       0       1       0       0
    4           1 0.05510749       0       0       1       0
    5           1 0.96237936       1       0       0       0
    6           1 0.06367418       0       1       0       0
    

    Done.
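
    Note that the output above has only four catVar columns: under R's default treatment contrasts, level 0 is the baseline and is absorbed into the intercept. If all five indicators are wanted, one possible sketch is to drop the intercept from the formula. The column name `type`, the helper functions `make_data` and `add_dummies`, and the list `datasets` are assumptions made for illustration, not part of the original question.

    ```r
    # Hypothetical stand-ins for the real data sets; `type` takes the values 0-4.
    set.seed(1)
    make_data <- function(n) {
      data.frame(type = factor(sample(0:4, n, TRUE), levels = 0:4))
    }
    datasets <- list(a = make_data(20), b = make_data(30))

    # `- 1` removes the intercept, so every factor level keeps its own column
    # (type0 ... type4); cbind() appends those columns to the data frame.
    add_dummies <- function(d) {
      cbind(d, model.matrix(~ type - 1, data = d))
    }

    # Apply the same step to each data set in the list.
    datasets <- lapply(datasets, add_dummies)
    head(datasets$a)
    ```

    Setting the factor levels explicitly to 0:4 should keep all five columns present even if some value happens not to occur in a given data set.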
