R dplyr left join - multiple returned values and new rows: how to ask for the first match only?

Posted by 时间秒杀一切 on 2021-02-06 09:34:24

Question


Let's say I have a list of suburb names, crime rate and their council names on a separate table.

[Image: Tables Picture]

I know that left_join(table1, table2, by = "Suburb") returns a table with extra rows, because some suburbs match more than one council. The problem is that suburbs 3 and 4 each overlap two councils.

Is there a way to make the left join return only the first match, rather than creating new rows to accommodate the extra ones?

In addition, for Table 2, is there a function that keeps only the first row for each suburb and removes the second, third, or fourth rows that arise when a suburb overlaps several councils?
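The row duplication described above can be reproduced with a small sketch. The original tables are only available as a picture, so the data below is hypothetical: the column names CrimeRate and Council and all the values are assumptions chosen to mirror the description (suburbs 3 and 4 each appear under two councils).

```r
library(dplyr)

# Hypothetical stand-ins for the two tables in the picture
table1 <- data.frame(
  Suburb    = c("Suburb 1", "Suburb 2", "Suburb 3", "Suburb 4"),
  CrimeRate = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)
table2 <- data.frame(
  Suburb  = c("Suburb 1", "Suburb 2", "Suburb 3", "Suburb 3",
              "Suburb 4", "Suburb 4"),
  Council = c("Council A", "Council A", "Council A", "Council B",
              "Council B", "Council C"),
  stringsAsFactors = FALSE
)

# Each suburb in table1 is repeated once per matching row in table2
joined <- left_join(table1, table2, by = "Suburb")

nrow(table1)  # 4
nrow(joined)  # 6: suburbs 3 and 4 each contribute two rows
```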


Answer 1:


You can do this with the join() function from the plyr package. The equivalent of left_join(table1, table2, by = "Suburb") that uses only the first matching Suburb row from table2 is: join(table1, table2, by = "Suburb", type = "left", match = "first"). I'm not sure what the equivalent is in the dplyr package, though I would love to know myself.
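As a minimal sketch of this answer, the code below runs the plyr call on hypothetical data (the column names CrimeRate and Council and all values are assumptions, since the original tables are only shown as a picture). It also shows one possible dplyr route, which is not taken from the answer: drop the repeated suburbs from table2 with distinct() and then join as usual. The distinct() step is also one way to keep only the first row per suburb in Table 2.

```r
library(dplyr)

# Hypothetical stand-ins for the two tables in the picture
table1 <- data.frame(
  Suburb    = c("Suburb 1", "Suburb 2", "Suburb 3", "Suburb 4"),
  CrimeRate = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)
table2 <- data.frame(
  Suburb  = c("Suburb 1", "Suburb 2", "Suburb 3", "Suburb 3",
              "Suburb 4", "Suburb 4"),
  Council = c("Council A", "Council A", "Council A", "Council B",
              "Council B", "Council C"),
  stringsAsFactors = FALSE
)

# plyr approach from the answer; called with the plyr:: prefix so that
# plyr is not attached alongside dplyr (which would mask some functions)
first_plyr <- plyr::join(table1, table2, by = "Suburb",
                         type = "left", match = "first")

# One possible dplyr approach: keep the first row per Suburb, then join
table2_first <- distinct(table2, Suburb, .keep_all = TRUE)
first_dplyr  <- left_join(table1, table2_first, by = "Suburb")

nrow(first_plyr)     # 4
nrow(first_dplyr)    # 4
first_dplyr$Council  # "Council A" "Council A" "Council A" "Council B"
```

Both results keep one row per suburb. Which council counts as "first" depends entirely on the row order of table2, so it may be worth sorting table2 before the join if a particular council should win. Releases of dplyr from 1.1.0 onward also accept left_join(table1, table2, by = "Suburb", multiple = "first"), which avoids the separate distinct() step.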



Source: https://stackoverflow.com/questions/42431974/r-dplyr-left-join-multiple-returned-values-and-new-rows-how-to-ask-for-the-fi
