Subsetting a data frame to the rows not appearing in another data frame

温柔的废话 2021-01-23 13:53

I have a data frame A with observations

    Var1   Var2  Var3
     1       3    4
     2       5    6
     4       5    7
     4       5    8
     6       7    9         


        
5 answers
  •  没有蜡笔的小新
    2021-01-23 14:22

    One approach is to paste the columns of A and of B together row by row, then keep only the rows of A whose pasted representation doesn't appear among the pasted representations of B:

    A[!(do.call(paste, A) %in% do.call(paste, B)),]
    #   Var1 Var2 Var3
    # 3    4    5    7
    # 4    4    5    8
    # 5    6    7    9
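
    For anyone who wants to run the line above, here is a minimal self-contained sketch. A is copied from the question. B is not shown in the post, so the B below is an assumption, chosen as the first two rows of A because that is consistent with the output printed above. The names keyA, keyB and result are only for illustration.

```r
# A is taken from the question.
A <- data.frame(Var1 = c(1, 2, 4, 4, 6),
                Var2 = c(3, 5, 5, 5, 7),
                Var3 = c(4, 6, 7, 8, 9))

# ASSUMPTION: B is not shown in the post; these two rows are a guess
# that is consistent with the output printed in the answer.
B <- data.frame(Var1 = c(1, 2),
                Var2 = c(3, 5),
                Var3 = c(4, 6))

# do.call(paste, A) pastes the columns together row by row,
# giving one string per row, e.g. "1 3 4" for the first row of A.
keyA <- do.call(paste, A)
keyB <- do.call(paste, B)

# Keep the rows of A whose string is not among the strings built from B.
result <- A[!(keyA %in% keyB), ]
print(result)
```

    Splitting the one-liner into keyA and keyB makes it easy to inspect the intermediate strings if the result looks wrong.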
    

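    One obvious downside of this approach is that it assumes two rows with the same pasted representation are in fact identical. With character columns that is not guaranteed: paste joins values with a space by default, so the row ("a b", "c") and the row ("a", "b c") both become "a b c". Here is a slightly more clunky approach that doesn't have this limitation: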

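    # Stack B on top of A, so any row of A that is already in B counts as a duplicate
    combined <- rbind(B, A)
    # Keep non-duplicated rows positioned after the nrow(B) rows that came from B
    combined[!duplicated(combined) & seq_len(nrow(combined)) > nrow(B), ]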
    #   Var1 Var2 Var3
    # 5    4    5    7
    # 6    4    5    8
    # 7    6    7    9
    

    Basically I used rbind to append A below B and then limited to rows that are both non-duplicated and not originally from B (that is, rows positioned after the block that came from B).
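
    To illustrate the limitation of the pasted approach mentioned earlier, here is a small sketch with made-up character data. The data frames X and Y and the names pasted, combined2 and deduped are hypothetical, not from the question. The two approaches disagree on this input:

```r
# Hypothetical data: the first row of X and the only row of Y are different,
# but both paste to the same string "a b c".
X <- data.frame(p = c("a b", "x"), q = c("c", "y"), stringsAsFactors = FALSE)
Y <- data.frame(p = "a", q = "b c", stringsAsFactors = FALSE)

# Pasted approach: the first row of X is dropped even though it is not in Y.
pasted <- X[!(do.call(paste, X) %in% do.call(paste, Y)), ]

# rbind/duplicated approach: rows are compared column by column,
# so both rows of X are kept.
combined2 <- rbind(Y, X)
deduped <- combined2[!duplicated(combined2) &
                       seq_len(nrow(combined2)) > nrow(Y), ]

print(nrow(pasted))
print(nrow(deduped))
```

    One thing to keep in mind with the second approach: duplicated() also flags repeats within A itself, so a row that occurs several times in A and never in B is returned only once.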
