Recode categorical factor with N categories into N binary columns

野性不改 2020-12-14 09:12

Original data frame:

v1 = sample(letters[1:3], 10, replace=TRUE)
v2 = sample(letters[1:3], 10, replace=TRUE)
df = data.frame(v1,v2)
df
         


        
7 answers
  •  北荒 (original poster)
    2020-12-14 09:19

    A fairly direct approach is to use table on each column, cross-tabulating the row index (1 to nrow(df)) against the values in that column, so that each row gets a 1 under its own value and a 0 everywhere else:

    allLevels <- levels(factor(unlist(df)))
    do.call(cbind, 
            lapply(df, function(x) table(sequence(nrow(df)), 
                                         factor(x, levels = allLevels))))
    #    a b c a b c
    # 1  0 1 0 0 0 1
    # 2  1 0 0 1 0 0
    # 3  0 0 1 0 0 1
    # 4  0 1 0 1 0 0
    # 5  0 0 1 0 0 1
    # 6  0 0 1 0 1 0
    # 7  1 0 0 1 0 0
    # 8  1 0 0 0 1 0
    # 9  1 0 0 0 0 1
    # 10 1 0 0 0 1 0
    

    I've wrapped "x" in factor with levels = allLevels so that even when a column contains, say, no "c" values, the output still has a "c" column for it, filled with zeroes.
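
    The output above has repeated column names (a b c a b c), which can be awkward to index. As a minimal sketch of one way around that, the same table approach can be followed by a renaming step. The fixed four-row data frame and the "column_level" naming scheme below are assumptions made for illustration, not part of the original question.

```r
# Small fixed example (assumed data, not the poster's random sample);
# note that v2 deliberately contains no "c".
df <- data.frame(v1 = c("a", "b", "c", "a"),
                 v2 = c("b", "a", "a", "b"))

# Every level that occurs anywhere in the data frame.
allLevels <- levels(factor(unlist(df)))

# One indicator table per column: row index against value,
# with all levels forced to appear via factor(..., levels = allLevels).
indicators <- lapply(df, function(x) {
  table(seq_len(nrow(df)), factor(x, levels = allLevels))
})

out <- do.call(cbind, indicators)

# Prefix each level with its source column so the names are unique.
colnames(out) <- paste(rep(names(df), each = length(allLevels)),
                       allLevels, sep = "_")
out
```

    Here the v2_c column is all zeroes because v2 never takes the value "c", and every row sums to 2 (one indicator per original column).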
