Reshaping a data.frame so a column containing multiple features becomes multiple binary columns

滥情空心 2021-01-27 08:06

I have a data frame like this:

df <- data.frame(id = c(1, 2),
                 value = c(25, 24),
                 features = c("A,B,D,F", "C,B,E"))

print(df)

i         


        
4 Answers
  •  执念已碎
    2021-01-27 08:46

    Another one using splitstackshape and data.table (installation instructions here):

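    library(splitstackshape)
    library(data.table) # v1.9.5+
    # Split the comma-separated features into one row per feature (long form)
    ans <- cSplit(df, 'features', sep = ',', direction = 'long')
    # Cast back to wide form, counting each feature per id/value pair
    dcast(ans, id + value ~ features, fun.aggregate = length)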
    #    id value A B C D E F
    # 1:  1    25 1 1 0 1 0 1
    # 2:  2    24 0 1 1 0 1 0
    

    If you're using data.table v1.9.4, then replace dcast with dcast.data.table.
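
If neither package is available, the same split-then-count idea can be sketched in base R. This is only an illustration of the technique, not the answer's own method; the names `parts`, `long`, `counts`, and `wide` are made up for the example, and it assumes `id` uniquely identifies each row.

```r
df <- data.frame(id = c(1, 2),
                 value = c(25, 24),
                 features = c("A,B,D,F", "C,B,E"),
                 stringsAsFactors = FALSE)

# Step 1: split each comma-separated string into its individual features.
parts <- strsplit(df$features, ",", fixed = TRUE)

# Step 2: build the long form, one row per (id, feature) pair.
long <- data.frame(id = rep(df$id, lengths(parts)),
                   feature = unlist(parts),
                   stringsAsFactors = FALSE)

# Step 3: count occurrences per id and feature (rows = id, columns = feature).
counts <- as.data.frame.matrix(table(long$id, long$feature))

# Step 4: reorder the counts to match df (assumes unique ids) and attach them.
counts <- counts[as.character(df$id), , drop = FALSE]
wide <- cbind(df[c("id", "value")], counts)
print(wide)
```

For the two-row example this should give the same 0/1 columns `A` through `F` as the `dcast` output above.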

    Alternatively, you can use cSplit_e, like this:

    cSplit_e(df, "features", ",", type = "character", fill = 0)
    ##   id value features features_A features_B features_C features_D features_E features_F
    ## 1  1    25  A,B,D,F          1          1          0          1          0          1
    ## 2  2    24    C,B,E          0          1          1          0          1          0
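
To see roughly what this kind of output involves, here is a base R sketch that adds one 0/1 indicator column per feature while keeping the original `features` column. It is an assumption-laden illustration rather than how `cSplit_e` is actually implemented; `parts`, `levels_found`, and `result` are hypothetical names, and the `features_` prefix is copied from the output shown above.

```r
df <- data.frame(id = c(1, 2),
                 value = c(25, 24),
                 features = c("A,B,D,F", "C,B,E"),
                 stringsAsFactors = FALSE)

# Split every row's string into a character vector of features.
parts <- strsplit(df$features, ",", fixed = TRUE)

# Collect the distinct features across all rows, in sorted order.
levels_found <- sort(unique(unlist(parts)))

# Add one numeric column per feature: 1 if the row contains it, else 0.
result <- df
for (f in levels_found) {
  result[[paste0("features_", f)]] <-
    vapply(parts, function(p) as.numeric(f %in% p), numeric(1))
}
print(result)
```

Unlike the counting approach, this records only presence or absence, so a feature repeated within one row would still yield 1.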
    
