Surely this is not intended? Is this something that happens in other parts of dplyr's functionality, and should I be concerned? I love the performance and hat
From the dplyr documentation:
left_join() returns all rows from x, and all columns from x and y. Rows in x with no match in y will have NA values in the new columns. If there are multiple matches between x and y, all combinations of the matches are returned.
semi_join() returns all rows from x where there are matching values in y, keeping just columns from x. A semi join differs from an inner join because an inner join will return one row of x for each matching row of y, where a semi join will never duplicate rows of x.
Is semi_join() a valuable option for you?
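
To make the difference concrete, here is a minimal sketch. The data frames customers and orders and their columns are made up for illustration and are not taken from your data. Customer 1 matches two rows in orders and customer 3 matches none, so the two joins should behave differently.

```r
library(dplyr)

# Hypothetical example data (not from the question):
# id 1 appears twice in orders, id 3 does not appear at all.
customers <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cy"))
orders    <- data.frame(id = c(1, 1, 2), item = c("pen", "ink", "cap"))

# left_join keeps every row of customers, repeats the row for id 1
# once per matching order, and fills item with NA for id 3.
left <- left_join(customers, orders, by = "id")

# semi_join only filters customers: rows with at least one match
# are kept once each, and no columns from orders are added.
semi <- semi_join(customers, orders, by = "id")

nrow(left)   # 4
nrow(semi)   # 2
names(semi)  # "id" "name"
```

So if the extra rows you are seeing come from multiple matches in y, that is the documented behaviour of left_join() and not a bug. semi_join() avoids the duplication, but only if you do not need any of the columns from y.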