Deleting rows from a data frame that are not present in another data frame in R [duplicate]

Posted by 笑着哭i on 2019-12-12 07:03:37

Question


I'm new to R but from what I've been reading this one is a bit hard for me. I have two data frames, say DF1 and DF2, both of which have a variable of interest, say idFriends, and I want to create a new data frame where all the rows that do not appear in DF2 are deleted from DF1 based on the values of idFriends.

The thing is that in DF2 each value appears only once while DF1 has thousands of values, many of them repeated. BUT I don't want R to delete repetitions, I just want it to search DF2, see if EACH value of DF1 exists in DF2, and if it doesn't exist delete that row and if it exists leave it as is, and do the same for each row in DF1.

I hope it's clear.


Answer 1:


Hard to say without a reproducible example, but %in% is probably what you are looking for:

DF1[DF1$idFriends %in% DF2$idFriends, ]  # keep the rows of DF1 whose idFriends appears in DF2
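
Since the question has no reproducible example, here is a minimal sketch with made-up data showing how a %in% mask behaves. The score column and all the values are assumptions for illustration only; only DF1, DF2 and idFriends come from the question.

```r
# Hypothetical toy data: DF1 repeats ids, DF2 lists each id only once.
DF1 <- data.frame(
  idFriends = c(1, 2, 2, 3, 4, 4, 5),
  score     = c(10, 20, 21, 30, 40, 41, 50)
)
DF2 <- data.frame(idFriends = c(2, 4, 6))

# %in% returns one TRUE/FALSE per row of DF1, so every repeated id
# that exists in DF2 is kept; nothing is de-duplicated.
keep <- DF1$idFriends %in% DF2$idFriends
DF1_kept <- DF1[keep, , drop = FALSE]

# Negating the mask selects the opposite set: rows whose id is absent from DF2.
DF1_dropped <- DF1[!keep, , drop = FALSE]

print(DF1_kept)
print(DF1_dropped)
```

Note that ! binds more loosely than %in%, so !DF1$idFriends %in% DF2$idFriends is read as !(DF1$idFriends %in% DF2$idFriends), which selects the rows that are missing from DF2 rather than the ones present in it.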



Answer 2:


dplyr has a semi_join function that does that.

library(dplyr)
DF1 %>% semi_join(DF2, by = "idFriends") # keep rows with matching ID
DF1 %>% anti_join(DF2, by = "idFriends") # keep rows without matching ID
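
To make the difference between the two joins concrete, here is a small sketch on made-up data. It assumes the dplyr package is installed; the score column and all the values are hypothetical.

```r
library(dplyr)

# Hypothetical toy data: DF1 repeats ids, DF2 lists each id only once.
DF1 <- data.frame(
  idFriends = c(1, 2, 2, 3, 4, 4, 5),
  score     = c(10, 20, 21, 30, 40, 41, 50)
)
DF2 <- data.frame(idFriends = c(2, 4, 6))

# semi_join keeps the rows of DF1 that have a match in DF2.
# Repeated ids in DF1 are all kept, and no columns from DF2 are added.
matched <- DF1 %>% semi_join(DF2, by = "idFriends")

# anti_join keeps the complement: rows of DF1 with no match in DF2.
unmatched <- DF1 %>% anti_join(DF2, by = "idFriends")

print(matched)
print(unmatched)
```

For the question as asked (drop the rows of DF1 whose idFriends is not in DF2), semi_join is the one that applies; anti_join returns exactly the rows that would be dropped.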


Source: https://stackoverflow.com/questions/33041351/deleting-rows-from-a-data-frame-that-are-not-present-in-another-data-frame-in-r
