Aggregating rows with same Ids and retaining only unique entries in R

Submitted by 隐身守侯 on 2019-12-10 10:37:17

Question


I am a beginner in R. I have a data frame in R as follows:

 Id          Values
A_0_d   Low_5524; Low_6412; Hi_50567
A_0_d   Low_5509; Low_6412; Low_6897; Hi_16021
A_0_d   Low_5524; Low_4930; Low_5886
B_1_d   Low_3697; Low_4519; Low_5524
C_3_d   Low_5576; Low_5581
C_3_d   Hi_30246
C_3_d   Low_5576; Hi_30246

I would like to aggregate the data frame based on the Ids, i.e. group all the values of the same Id into a single row and retain only the unique entries, as follows:

A_0_d   Low_5524; Low_6412; Hi_50567; Low_5509; Low_6897; Hi_16021; Low_4930; Low_5886  
B_1_d   Low_3697; Low_4519; Low_5524
C_3_d   Low_5576; Low_5581; Hi_30246 

Can I make use of the aggregate function? Kindly guide me.


Answer 1:


Convert the 'data.frame' to a 'data.table' with setDT(df1). Then, grouping by 'Id', split 'Values' on "; ", unlist the result, keep the unique elements, and paste them back together.

library(data.table)
setDT(df1)[, .(Values = paste(unique(unlist(strsplit(Values, "; "))),
                              collapse = "; ")), by = Id]
#   Id
#1: A_0_d
#2: B_1_d
#3: C_3_d
#                                                                           Values
#1: Low_5524; Low_6412; Hi_50567; Low_5509; Low_6897; Hi_16021; Low_4930; Low_5886
#2:                                                   Low_3697; Low_4519; Low_5524
#3:                                                   Low_5576; Low_5581; Hi_30246
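
The same steps can also be traced in base R, which may help in seeing what each nested call contributes. This is only a sketch: the data frame is rebuilt from the question's table, stringsAsFactors = FALSE is an assumption so that 'Values' stays character, and combine_unique is a hypothetical helper name.

```r
# Sample data from the question (assumed to be character columns).
df1 <- data.frame(
  Id = c("A_0_d", "A_0_d", "A_0_d", "B_1_d", "C_3_d", "C_3_d", "C_3_d"),
  Values = c("Low_5524; Low_6412; Hi_50567",
             "Low_5509; Low_6412; Low_6897; Hi_16021",
             "Low_5524; Low_4930; Low_5886",
             "Low_3697; Low_4519; Low_5524",
             "Low_5576; Low_5581",
             "Hi_30246",
             "Low_5576; Hi_30246"),
  stringsAsFactors = FALSE
)

# One group, step by step.
a      <- df1$Values[df1$Id == "A_0_d"]               # the three strings for A_0_d
pieces <- unlist(strsplit(a, "; ", fixed = TRUE))     # every entry, duplicates included
uniq   <- unique(pieces)                              # keeps first occurrences, in order
joined <- paste(uniq, collapse = "; ")                # back to a single string

# The same steps applied to every Id (hypothetical helper).
combine_unique <- function(x) {
  paste(unique(unlist(strsplit(x, "; ", fixed = TRUE))), collapse = "; ")
}
result <- vapply(split(df1$Values, df1$Id), combine_unique, character(1))
```

Here result is a named character vector with one combined string per Id, rather than a data.table.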



Answer 2:


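Using aggregate, you could try the following. Each group's strings are split into individual entries before unique() is applied; without the split, only rows that are exact duplicates of each other would be dropped.

aggregate(Values ~ Id, df, function(x) paste(unique(unlist(strsplit(x, "; "))), collapse = "; "))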



Answer 3:


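Using aggregate, you can split each group's values on "; " (including the space, so that entries compare equal regardless of their position in the string) and keep the unique entries:

aggregate(Values ~ Id, df, function(x) unique(unlist(strsplit(x, "; "))))

#     Id                                                                         Values
#1 A_0_d Low_5524, Low_6412, Hi_50567, Low_5509, Low_6897, Hi_16021, Low_4930, Low_5886
#2 B_1_d                                                   Low_3697, Low_4519, Low_5524
#3 C_3_d                                                   Low_5576, Low_5581, Hi_30246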
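
The commas in that printout suggest that 'Values' is stored as a list column: when the groups yield different numbers of unique entries, aggregate cannot simplify the results into a plain vector. If one "; "-separated string per Id is wanted instead, the list can be collapsed afterwards. The sketch below uses a shortened, made-up subset of the question's data, and the names res and was_list are illustrative only.

```r
# Shortened illustrative data (assumed character columns).
df <- data.frame(
  Id = c("A_0_d", "A_0_d", "B_1_d"),
  Values = c("Low_5524; Low_6412; Hi_50567",
             "Low_5509; Low_6412",
             "Low_3697; Low_4519; Low_5524"),
  stringsAsFactors = FALSE
)

# Groups have 4 and 3 unique entries, so the result is kept as a list column.
res <- aggregate(Values ~ Id, df, function(x) unique(unlist(strsplit(x, "; "))))
was_list <- is.list(res$Values)

# Collapse each list element into a single "; "-separated string.
res$Values <- vapply(res$Values, paste, character(1), collapse = "; ")
```

After the last line, res$Values is an ordinary character column.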


Source: https://stackoverflow.com/questions/40057781/aggregating-rows-with-same-ids-and-retaining-only-unique-entries-in-r
