How to combine duplicate rows in a data frame in R [duplicate]

Submitted by 天大地大妈咪最大 on 2020-06-13 12:22:31

Question


Given a data frame (my_data) in R such as the following:

category  Keyword1 Keyword2 Keyword3 Keyword4 Keyword5 Keyword6 Keyword7 Keyword8
123         0        1         1       0         0        0       0         1
155         1        0         0       0         1        0       1         1
144         0        0         1       0         0        0       1         1
123         1        1         0       0         0        0       1         1

I want to transform the data frame by combining the rows that share the same category id (e.g. category 123) into a single row. The result should look like:

category Keyword1 Keyword2 Keyword3 Keyword4 Keyword5 Keyword6 Keyword7 Keyword8
123         1        1         1       0         0        0       1         1
155         1        0         0       0         1        0       1         1
144         0        0         1       0         0        0       1         1

How can I do this in R?
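
To make the example reproducible, here is a minimal sketch that rebuilds my_data from the table above and lists the category ids that occur in more than one row. The values are copied from the question; the name dup_ids is only illustrative and does not appear in the original post.

```r
# Rebuild the example data frame from the question (values copied from the table).
my_data <- data.frame(
  category = c(123, 155, 144, 123),
  Keyword1 = c(0, 1, 0, 1),
  Keyword2 = c(1, 0, 0, 1),
  Keyword3 = c(1, 0, 1, 0),
  Keyword4 = c(0, 0, 0, 0),
  Keyword5 = c(0, 1, 0, 0),
  Keyword6 = c(0, 0, 0, 0),
  Keyword7 = c(0, 1, 1, 1),
  Keyword8 = c(1, 1, 1, 1)
)

# dup_ids (hypothetical name): category ids that appear in more than one row.
dup_ids <- unique(my_data$category[duplicated(my_data$category)])
print(dup_ids)
```

Here only category 123 is repeated, so those are the rows that need to be merged.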


Answer 1:


You can use dplyr, which is also handy for many similar grouping tasks. Group the rows by category and take the maximum of each keyword column:

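library(dplyr)
# For each category, keep the maximum of every keyword column (1 if any duplicate row has a 1).
# summarise_each() and funs() are deprecated; across() requires dplyr >= 1.0.0.
my_data %>% group_by(category) %>% summarise(across(everything(), max))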

The output is:

# A tibble: 3 × 9
  category Keyword1 Keyword2 Keyword3 Keyword4 Keyword5 Keyword6 Keyword7 Keyword8
     <int>    <int>    <int>    <int>    <int>    <int>    <int>    <int>    <int>
1      123        1        1        1        0        0        0        1        1
2      144        0        0        1        0        0        0        1        1
3      155        1        0        0        0        1        0        1        1
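
If you would rather not depend on dplyr, the same grouping can probably be done with base R's aggregate(). The sketch below is not part of the original answer; it reads the example data from inline text so that it runs on its own, and the name combined is only illustrative.

```r
# Same example data as in the question, read from inline text.
my_data <- read.table(header = TRUE, text = "
category Keyword1 Keyword2 Keyword3 Keyword4 Keyword5 Keyword6 Keyword7 Keyword8
123 0 1 1 0 0 0 0 1
155 1 0 0 0 1 0 1 1
144 0 0 1 0 0 0 1 1
123 1 1 0 0 0 0 1 1
")

# Group by category and take the column-wise maximum of all other columns.
# combined is a hypothetical name for the result.
combined <- aggregate(. ~ category, data = my_data, FUN = max)
print(combined)
```

Because the keyword columns only hold 0 and 1, taking the maximum acts as a logical OR across the duplicate rows.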


Source: https://stackoverflow.com/questions/41060599/how-to-combine-duplicate-rows-in-a-data-frame-in-r
