Converting factors to binary in R

眼角桃花 2020-12-01 19:08

I am trying to convert a factor variable into binary / boolean (0 or 1).

Sample data:

df <- data.frame(a = c(1,2,3), b = c(1,1,2), c = c("Rose",


        
4 Answers
  •  猫巷女王i
    2020-12-01 19:41

    In base R, you could use sapply() over the factor levels, using == to test each row against a level and as.integer() to coerce the logical result to 0/1.

    cbind(df[1:2], sapply(levels(df$c), function(x) as.integer(x == df$c)), df[4])
    #   a b Pink Red Rose d
    # 1 1 1    0   0    1 2
    # 2 2 1    1   0    0 3
    # 3 3 2    0   1    0 4
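
    For reference, here is a self-contained sketch of the same idea. The question's sample data is cut off, so the values of c and d below are reconstructed from the printed output and are an assumption. Also note that since R 4.0.0, data.frame() no longer turns strings into factors by default, so c is built with factor() explicitly; otherwise levels(df$c) would be NULL.

    ```r
    # Sample data reconstructed from the printed output above (an assumption).
    df <- data.frame(
      a = c(1, 2, 3),
      b = c(1, 1, 2),
      c = factor(c("Rose", "Pink", "Red")),  # explicit factor, needed in R >= 4.0
      d = c(2, 3, 4)
    )

    # One 0/1 column per level: compare every row of c against each level.
    # sapply() names the resulting matrix columns after the levels.
    dummies <- sapply(levels(df$c), function(x) as.integer(x == df$c))

    # Swap the factor column for the indicator columns, keeping the column order.
    result <- cbind(df[1:2], dummies, df[4])
    print(result)
    ```

    The intermediate names dummies and result are only there for readability; the one-liner above does the same thing in a single expression.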
    

    But since you have a million rows, you may want to go with data.table.

    library(data.table)
    # Add one 0/1 column per level and drop column c (by assigning NULL) in a single := call
    setDT(df)[, c(levels(df$c), "c") :=
        c(lapply(levels(c), function(x) as.integer(x == c)), .(NULL))]
    

    which gives

    df
    #    a b d Pink Red Rose
    # 1: 1 1 2    0   0    1
    # 2: 2 1 3    1   0    0
    # 3: 3 2 4    0   1    0
    

    And you can reset the column order if you need to with setcolorder(df, c(1, 2, 4:6, 3)).
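
    If the single := call feels dense, the same steps can be split up. This is a sketch of a variant, not the exact code above: it assumes the data.table package is installed, reuses the reconstructed sample data (the names dt and lv are made up for illustration), and reorders columns by name rather than by position.

    ```r
    library(data.table)

    # Sample data reconstructed from the printed output (an assumption).
    dt <- data.table(
      a = c(1, 2, 3),
      b = c(1, 1, 2),
      c = factor(c("Rose", "Pink", "Red")),
      d = c(2, 3, 4)
    )

    lv <- levels(dt$c)

    # Step 1: add one integer 0/1 column per level, by reference.
    dt[, (lv) := lapply(lv, function(x) as.integer(c == x))]

    # Step 2: drop the original factor column, by reference.
    dt[, "c" := NULL]

    # Step 3: move the indicator columns in front of d, using names instead of indices.
    setcolorder(dt, c("a", "b", lv, "d"))
    print(dt)
    ```

    Ordering by name avoids having to recount positions if more columns are added later.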
