Split a column of concatenated comma-delimited data and recode output as factors

-上瘾入骨i 2020-11-27 08:09

I am trying to clean up some data that has been incorrectly entered. The question for the variable allows for multiple responses out of five choices, numbered as 1 to 5. The

2 answers
  •  南笙 (original poster)
    2020-11-27 08:24

    You just need to write a function and use apply. First some dummy data:

    ## Make sure the strings are kept as character, not factors
    dd = data.frame(V1 = c("1, 2, 3", "1, 2, 4", "2, 3, 4, 5",
                           "1, 3, 4", "1, 3, 5", "2, 3, 4, 5"),
                    stringsAsFactors = FALSE)
    

    Next, create a function that takes one row and turns it into a 0/1 indicator vector with one element per choice:

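    make_row = function(i, ncol=5) {
      ## Start from all zeros; could make the default NA if needed
      m = numeric(ncol)
      ## Split on commas; as.numeric tolerates the leading spaces
      v = as.numeric(strsplit(i, ",")[[1]])
      ## Flag the positions of the selected choices
      m[v] = 1
      return(m)
    }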
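
As a quick check of what the function produces for a single entry, the sketch below repeats `make_row` from the answer so it runs on its own and calls it on one of the dummy strings. The variable name `res` is only for illustration.

```r
# Same function as in the answer, repeated so this snippet is self-contained
make_row = function(i, ncol = 5) {
  m = numeric(ncol)                       # all zeros to begin with
  v = as.numeric(strsplit(i, ",")[[1]])   # "2, 3, 4, 5" -> c(2, 3, 4, 5)
  m[v] = 1                                # flag the chosen options
  return(m)
}

res = make_row("2, 3, 4, 5")
print(res)
# [1] 0 1 1 1 1
```

Choice 1 was not selected, so the first element stays 0, and the remaining four are set to 1.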
    

    Then use apply over the rows and transpose the result, since apply returns one column per input row:

    t(apply(dd, 1, make_row))
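
The title also asks for the output to be recoded as factors, which the steps above stop short of. The question text is cut off, so the intended format is not known; the sketch below assumes one factor column per choice. The column names `Q1` to `Q5` and the "No"/"Yes" labels are hypothetical and not taken from the original post.

```r
# Dummy data and helper repeated from the answer so the snippet runs alone
dd = data.frame(V1 = c("1, 2, 3", "1, 2, 4", "2, 3, 4, 5",
                       "1, 3, 4", "1, 3, 5", "2, 3, 4, 5"),
                stringsAsFactors = FALSE)

make_row = function(i, ncol = 5) {
  m = numeric(ncol)
  v = as.numeric(strsplit(i, ",")[[1]])
  m[v] = 1
  return(m)
}

# 6 x 5 indicator matrix: one row per respondent, one column per choice
ind = t(apply(dd, 1, make_row))

# Hypothetical column names, one per choice
out = as.data.frame(ind)
names(out) = paste0("Q", 1:5)

# Recode every 0/1 column as a factor (labels are an assumption)
out[] = lapply(out, factor, levels = c(0, 1), labels = c("No", "Yes"))
str(out)
```

Using `out[] = lapply(...)` keeps `out` a data frame while replacing each column, so the result can be passed straight to functions such as `table` or `summary`.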
    
