Select only the first row when merging data frames with multiple matches

醉梦人生 2020-12-03 07:52

I have two data frames, "data" and "scores", and want to merge them on the "id" column:

data = data.frame(id = c(1,2,3,4,5),
                  state =          


        
4 Answers
  •  眼角桃花
    2020-12-03 08:53

    Here is a base R method using aggregate and head:

    merge(data, aggregate(score ~ id, data=scores, head, 1), by="id") 
    

    The aggregate function splits the scores data frame by id, and head with n = 1 then returns the first observation for each id. Since aggregate returns a data.frame, the result can be merged directly onto data.
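To make the intermediate step visible, here is a minimal sketch of the aggregate approach. The question's original data frames are truncated, so the values below (and the name first_scores) are hypothetical and only illustrate the shape of the result.

```r
# Hypothetical sample data; not taken from the original question.
data <- data.frame(id = c(1, 2, 3), state = c("KS", "MN", "AL"))
scores <- data.frame(id = c(1, 1, 2, 3, 3, 3),
                     score = c(66, 75, 78, 86, 85, 76))

# One row per id, holding the first score in the original row order.
first_scores <- aggregate(score ~ id, data = scores, FUN = head, 1)

# Each id now matches at most one row, so merge cannot multiply rows.
merged <- merge(data, first_scores, by = "id")
print(merged)
```

Note that the formula interface of aggregate drops rows with a missing score by default (na.action = na.omit), so in that case "first" means the first non-missing score.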


    A probably more efficient option is to subset the scores data.frame with duplicated. This gives the same result as aggregate with less computational overhead, because it only flags repeated ids instead of splitting the data and applying a function to each group.

    merge(data, scores[!duplicated(scores$id),], by="id")
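As a rough illustration of why the duplicated approach works, the sketch below shows the logical vector it produces. The sample values and the names is_repeat and first_rows are hypothetical, chosen only for this example.

```r
# Hypothetical sample data; not taken from the original question.
data <- data.frame(id = c(1, 2, 3), state = c("KS", "MN", "AL"))
scores <- data.frame(id = c(1, 1, 2, 3, 3, 3),
                     score = c(66, 75, 78, 86, 85, 76))

# duplicated() is FALSE for the first occurrence of each id and TRUE for
# every later occurrence, so negating it keeps only the first row per id.
is_repeat <- duplicated(scores$id)

first_rows <- scores[!is_repeat, ]
merged <- merge(data, first_rows, by = "id")
print(merged)
```

Unlike the formula call to aggregate, this subset keeps every column of scores, which may be convenient if scores carries more than one value column.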
    
