Recode categorical factor with N categories into N binary columns

野性不改 2020-12-14 09:12

Original data frame:

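v1 = sample(letters[1:3], 10, replace=TRUE)
v2 = sample(letters[1:3], 10, replace=TRUE)
# stringsAsFactors=TRUE keeps v1 and v2 as factors (the default is FALSE since R 4.0.0)
df = data.frame(v1, v2, stringsAsFactors=TRUE)
df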
         


        
7 answers
  •  既然无缘
    2020-12-14 09:29

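    Even better, with the help of @AnandaMahto's search capabilities: a single model.matrix call covers every column at once (the columns must be factors, since contrasts apply only to factors):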

    model.matrix(~ . + 0, data=df, contrasts.arg = lapply(df, contrasts, contrasts=FALSE))
    #    v1a v1b v1c v2a v2b v2c
    # 1    0   1   0   0   0   1
    # 2    1   0   0   1   0   0
    # 3    0   0   1   0   0   1
    # 4    0   1   0   1   0   0
    # 5    0   0   1   0   0   1
    # 6    0   0   1   0   1   0
    # 7    1   0   0   1   0   0
    # 8    1   0   0   0   1   0
    # 9    1   0   0   0   0   1
    # 10   1   0   0   0   1   0
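Why the contrasts.arg part matters can be seen in a small sketch. It assumes a fixed seed and explicitly declared factor levels (neither is in the original post) so that the column set is predictable. With the default treatment contrasts, only the first factor gets all of its levels once the intercept is removed; the second factor loses its reference level.

```r
set.seed(1)  # assumption: fixed seed, only to make the run repeatable

# Explicit levels guarantee that a, b and c all exist even if sample() misses one.
df <- data.frame(
  v1 = factor(sample(letters[1:3], 10, replace = TRUE), levels = letters[1:3]),
  v2 = factor(sample(letters[1:3], 10, replace = TRUE), levels = letters[1:3])
)

# Default contrasts: v1 is fully expanded, v2 drops its first level (v2a).
default_mm <- model.matrix(~ . + 0, data = df)

# contrasts = FALSE asks for a full indicator (identity) coding of every factor.
full_mm <- model.matrix(~ . + 0, data = df,
                        contrasts.arg = lapply(df, contrasts, contrasts = FALSE))

colnames(default_mm)
colnames(full_mm)
```

The first call yields five columns and the second six, one per level of each factor.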
    

    I think this is what you're looking for; I'd be happy to delete this answer if it isn't. Thanks to @G.Grothendieck (once again) for the excellent use of model.matrix!

    cbind(with(df, model.matrix(~ v1 + 0)), with(df, model.matrix(~ v2 + 0)))
    #    v1a v1b v1c v2a v2b v2c
    # 1    0   1   0   0   0   1
    # 2    1   0   0   1   0   0
    # 3    0   0   1   0   0   1
    # 4    0   1   0   1   0   0
    # 5    0   0   1   0   0   1
    # 6    0   0   1   0   1   0
    # 7    1   0   0   1   0   0
    # 8    1   0   0   0   1   0
    # 9    1   0   0   0   0   1
    # 10   1   0   0   0   1   0
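The cbind approach above names each column by hand. A possible way to extend it to any number of columns is sketched here; the small data frame and the names blocks and dummies are made up for illustration, and reformulate() is used only to build the same `~ col + 0` style formula from a column name.

```r
# Hypothetical four-row example with two factor columns.
df <- data.frame(
  v1 = factor(c("a", "b", "c", "a")),
  v2 = factor(c("c", "a", "b", "b"))
)

# One indicator block per column; intercept = FALSE keeps every level.
blocks <- lapply(names(df), function(col) {
  model.matrix(reformulate(col, intercept = FALSE), data = df)
})

# Bind the per-column blocks side by side into one matrix.
dummies <- do.call(cbind, blocks)
dummies
```

Because each factor is expanded in its own call, no level is dropped, which is the same effect the contrasts.arg version achieves in a single call.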
    

    Note: The output shown in your question corresponds to v2 alone, which is just:

    with(df, model.matrix(~ v2 + 0))
    

    Note 2: This gives a matrix. Fairly obvious, but still, wrap it with as.data.frame(.) if you want a data.frame.
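As a minimal sketch of the conversion mentioned in Note 2 (the data frame here is a made-up four-row example, and mm and out are illustrative names):

```r
# Hypothetical input with two factor columns.
df <- data.frame(
  v1 = factor(c("a", "b", "c", "a")),
  v2 = factor(c("c", "a", "b", "b"))
)

# model.matrix() returns a numeric matrix, and so does cbind() of two of them.
mm <- cbind(model.matrix(~ v1 + 0, data = df),
            model.matrix(~ v2 + 0, data = df))

# Wrap the result to get a data.frame with one 0/1 column per level.
out <- as.data.frame(mm)
str(out)
```

The column names (v1a, v1b, ...) carry over unchanged, so the result can be bound back onto the original data with cbind(df, out) if needed.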
