Create new columns with dummies based on values [duplicate]

Submitted by 让人想犯罪 on 2019-12-23 02:42:11

Question


I want to create new columns based on the values of a single existing column. It is event data (from a website), so the number of values differs from row to row. For example:

row    Events 
1       237,2,236,102,106,111,114,115,116,117,118,119,125
2       237,111,116
3       102,106,111,114,115
4       237,2,236,102,106,111,114,115,116,117,118,119,125, 126

The result should be dummy (0/1) columns, one per distinct value:

row   237  2    236  102  106  111  114  115  116  117  118  119  125  126
1     1    1    1    1    1    1    1    1    1    1    1    1    1    0
2     1    0    0    0    0    1    0    0    1    0    0    0    0    0
3     0    0    0    1    1    1    1    1    0    0    0    0    0    0
4     1    1    1    1    1    1    1    1    1    1    1    1    1    1

I tried to solve this with tidyr's separate function in combination with createDummyFeatures (from the mlr package), but I had to name the columns manually; ideally each column would be named after its value, as in the example.


Answer 1:


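We can use the table approach: split Events on commas with strsplit, name each list element by its row with setNames, convert the list to a long data.frame with stack, and cross-tabulate the result with table.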

table(stack(setNames(strsplit(df1$Events, ",\\s*"), df1$row))[2:1])

data

df1 <- structure(list(row = 1:4, 
 Events = c("237,2,236,102,106,111,114,115,116,117,118,119,125", 
 "237,111,116", "102,106,111,114,115", 
 "237,2,236,102,106,111,114,115,116,117,118,119,125, 126"
)), .Names = c("row", "Events"), class = "data.frame", row.names = c(NA, 
 -4L))
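
For readers who find the one-liner dense, here is a step-by-step sketch of the same idea. The intermediate names (parts, long, dummies, dummies_df) are introduced here for illustration only and are not part of the original answer. Splitting on ",\\s*" rather than "," is an assumption on my part, made so that the stray space before 126 in row 4 does not end up in a column name.

```r
# Sample data mirroring the df1 used in the answer
df1 <- data.frame(
  row = 1:4,
  Events = c("237,2,236,102,106,111,114,115,116,117,118,119,125",
             "237,111,116",
             "102,106,111,114,115",
             "237,2,236,102,106,111,114,115,116,117,118,119,125, 126"),
  stringsAsFactors = FALSE
)

# 1. Split each string on commas (plus any whitespace that follows)
parts <- strsplit(df1$Events, ",\\s*")

# 2. Name each list element by its row id so stack() carries it along
parts <- setNames(parts, df1$row)

# 3. stack() gives a long data.frame: `values` (event code), `ind` (row id)
long <- stack(parts)

# 4. Cross-tabulate row id against event code; each cell is a count,
#    which is 0/1 as long as a code does not repeat within a row
dummies <- table(row = long$ind, event = long$values)

# Optional: reorder columns by first appearance, then convert to a data.frame
dummies <- dummies[, unique(long$values)]
dummies_df <- as.data.frame.matrix(dummies)
```

Without the reordering step, table orders the columns as character strings rather than numerically, so "102" would come before "2". If an event code can occur more than once in a row, the cells hold counts rather than 0/1; in that case something like (dummies > 0) + 0L would collapse them to indicators.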


Source: https://stackoverflow.com/questions/48165158/create-new-columns-with-dummies-based-on-values
