Convert multiple binary columns to single categorical column [duplicate]

╄→гoц情女王★ submitted on 2019-12-02 04:12:21

You can get the values by combining the column names with as.logical. However, since your "binary" columns are factors, you need to jump through a few more hoops:

> apply(data[-1], 1, function(x) names(x)[as.logical(as.numeric(as.character(x)))])
[1] "red"    "blue"   "blue"   "blue"   "red"    "blue"   "blue"   "blue"   "yellow"

Bind this back with the first column (data[1]) to get the output you want.

cbind(data[1], 
      color = apply(data[-1], 1, 
                    function(x) names(x)[as.logical(as.numeric(
                      as.character(x)))]))
#   id  color
# 1  1    red
# 2  2   blue
# 3  3   blue
# 4  4   blue
# 5  5    red
# 6  6   blue
# 7  7   blue
# 8  8   blue
# 9  9 yellow
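
To see why the as.numeric(as.character(x)) detour is needed, here is a minimal sketch. The factor f is made up for illustration and is not taken from the question. Calling as.numeric directly on a factor returns the internal level codes, not the labels:

```r
# Hypothetical factor standing in for one "binary" column
f <- factor(c("0", "1", "0"))

codes  <- as.numeric(f)                # level codes: 1 2 1
values <- as.numeric(as.character(f))  # labels read as numbers: 0 1 0
flags  <- as.logical(values)           # FALSE TRUE FALSE

# Going straight from the label to logical does not work either
direct <- as.logical("0")              # NA
```

Inside apply, each row arrives as a character vector because the data frame is coerced to a matrix first, so the numeric step is still required before as.logical.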

Alternatively, you can try the following:

data[-1] <- lapply(data[-1], function(x) as.numeric(as.character(x)))
temp <- subset(cbind(data[1], stack(data[-1])), values == 1, c("id", "ind"))
temp[order(temp$id), ]
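
As a rough illustration of what the stack step produces, the sketch below rebuilds a data frame that matches the output shown above. The column order (red, blue, yellow) and the factor encoding are assumptions, since the original data is not reproduced here.

```r
# Assumed reconstruction of the question's data (not the original object)
data <- data.frame(
  id     = 1:9,
  red    = factor(c(1, 0, 0, 0, 1, 0, 0, 0, 0)),
  blue   = factor(c(0, 1, 1, 1, 0, 1, 1, 1, 0)),
  yellow = factor(c(0, 0, 0, 0, 0, 0, 0, 0, 1))
)

# Convert the factor columns to plain 0/1 numbers
data[-1] <- lapply(data[-1], function(x) as.numeric(as.character(x)))

# stack() returns two columns: `values` (the 0/1 entries) and
# `ind` (the name of the column each entry came from)
long <- cbind(data[1], stack(data[-1]))

# Keep only the rows flagged with 1, then restore the id order
temp   <- subset(long, values == 1, c("id", "ind"))
result <- temp[order(temp$id), ]
```

Here `ind` plays the role of the color column; it can be renamed with names(result)[2] <- "color" if needed.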

Or, you can use a combination of "dplyr" and "tidyr", like this:

library(dplyr)
library(tidyr)

data %>%
  group_by(id) %>%
  mutate_each(funs(an = as.numeric(as.character(.)))) %>%
  gather(color, val, -id) %>%
  filter(val == 1) %>%
  select(-val) %>%
  arrange(id)
# Source: local data frame [9 x 2]
# 
#   id  color
# 1  1    red
# 2  2   blue
# 3  3   blue
# 4  4   blue
# 5  5    red
# 6  6   blue
# 7  7   blue
# 8  8   blue
# 9  9 yellow
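
Note that mutate_each and funs have since been deprecated in "dplyr". A possible equivalent for newer releases, assuming dplyr 1.0.0 or later and tidyr 1.0.0 or later, is sketched below. The data frame is an assumed reconstruction consistent with the output above, not the original object.

```r
library(dplyr)
library(tidyr)

# Assumed reconstruction of the question's data (not the original object)
data <- data.frame(
  id     = 1:9,
  red    = factor(c(1, 0, 0, 0, 1, 0, 0, 0, 0)),
  blue   = factor(c(0, 1, 1, 1, 0, 1, 1, 1, 0)),
  yellow = factor(c(0, 0, 0, 0, 0, 0, 0, 0, 1))
)

result <- data %>%
  # convert every column except id from factor to 0/1 numbers
  mutate(across(-id, ~ as.numeric(as.character(.x)))) %>%
  # reshape to long form: one row per id/color pair
  pivot_longer(-id, names_to = "color", values_to = "val") %>%
  # keep the color flagged with 1 for each id
  filter(val == 1) %>%
  select(-val) %>%
  arrange(id)
```

The result is a tibble with the same id and color columns as above, although it prints with a different header than the older "dplyr" output.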

Here's a simple vectorized base R solution using max.col:

cbind(data[1L], color = names(data[-1L])[max.col(data[-1L] == 1L)])
#   id  color
# 1  1    red
# 2  2   blue
# 3  3   blue
# 4  4   blue
# 5  5    red
# 6  6   blue
# 7  7   blue
# 8  8   blue
# 9  9 yellow
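
One caveat: max.col breaks ties at random by default, so the one-liner relies on every row having exactly one 1. A slightly more defensive sketch checks that assumption first and makes the tie rule explicit. The data frame below is an assumed reconstruction consistent with the output above, not the original object.

```r
# Assumed reconstruction of the question's data (not the original object)
data <- data.frame(
  id     = 1:9,
  red    = factor(c(1, 0, 0, 0, 1, 0, 0, 0, 0)),
  blue   = factor(c(0, 1, 1, 1, 0, 1, 1, 1, 0)),
  yellow = factor(c(0, 0, 0, 0, 0, 0, 0, 0, 1))
)

# Logical matrix: TRUE where a column is flagged for that row
flags <- data[-1L] == 1L

# Fail early if any row has zero or several flagged columns
stopifnot(all(rowSums(flags) == 1))

# ties.method = "first" makes the result deterministic
result <- cbind(data[1L],
                color = names(data[-1L])[max.col(flags, ties.method = "first")])
```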