How can I subset rows in a data frame in R based on a vector of values?

挽巷 2020-11-28 04:48

I have two data sets that are supposed to be the same size but aren't. I need to trim the values from A that are not in B and vice versa in order to eliminate noise from a

4 Answers
  •  夕颜 (OP)
    2020-11-28 05:05

    If you really just want to subset each data frame by an index that exists in both data frames, you can do this with the 'match' function, like so:

    data_A[match(data_B$index, data_A$index, nomatch=0),]
    data_B[match(data_A$index, data_B$index, nomatch=0),]
    

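    To see what nomatch=0 does in the two lines above, here is a minimal sketch on hypothetical toy data (toy_A and toy_B_index are made up for illustration and are not from the question). Without nomatch=0, an index missing from the other data frame yields NA and an all-NA row; with it, the miss becomes 0, which row indexing silently drops.

```r
# Hypothetical toy data, for illustration only.
toy_A <- data.frame(index = c(10, 20, 30, 40),
                    value = c(0.1, 0.2, 0.3, 0.4))
toy_B_index <- c(30, 50, 10)  # 50 does not occur in toy_A$index

# Default behaviour: a miss is reported as NA.
pos_na <- match(toy_B_index, toy_A$index)                 # 3 NA 1

# With nomatch = 0 a miss is reported as 0 instead.
pos_zero <- match(toy_B_index, toy_A$index, nomatch = 0)  # 3 0 1

# Indexing by NA produces a row of NAs; indexing by 0 produces no row.
rows_na   <- toy_A[pos_na, ]    # 3 rows, the second one all NA
rows_zero <- toy_A[pos_zero, ]  # 2 rows: index 30, then index 10

print(rows_na)
print(rows_zero)
```
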
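    With unique index values, this selects the same rows as the %in% form below. The only difference is row order: match returns rows in the order of the other data frame's index, while %in% keeps each data frame's own order: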

    data_A[data_A$index %in% data_B$index,]
    data_B[data_B$index %in% data_A$index,]
    

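    A small sketch comparing the two forms, on hypothetical toy data (toy_A and toy_B are made up for illustration) and assuming the index values are unique within each data frame. Both forms keep the same rows, but in a different order. With duplicated index values the two would also differ in which rows they keep, because match only reports the first match.

```r
# Hypothetical toy data with unique index values.
toy_A <- data.frame(index = c(4, 1, 3, 2),
                    value = c("a4", "a1", "a3", "a2"))
toy_B <- data.frame(index = c(3, 5, 4),
                    value = c("b3", "b5", "b4"))

# match: rows of toy_A come back in the order of toy_B$index.
by_match <- toy_A[match(toy_B$index, toy_A$index, nomatch = 0), ]

# %in%: rows of toy_A keep their original order.
by_in <- toy_A[toy_A$index %in% toy_B$index, ]

print(by_match$index)  # 3 4
print(by_in$index)     # 4 3

# Same set of rows either way.
same_rows <- setequal(by_match$index, by_in$index)
print(same_rows)       # TRUE
```
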
    Here is a demo:

    # Set seed for reproducibility.
    set.seed(1)
    
    # Create two sample data sets.
    data_A <- data.frame(index=sample(1:200, 90, replace=FALSE), value=runif(90))
    data_B <- data.frame(index=sample(1:200, 120, replace=FALSE), value=runif(120))
    
    # Subset data of each data frame by the index in the other.
    t_A <- data_A[match(data_B$index, data_A$index, nomatch=0),]
    t_B <- data_B[match(data_A$index, data_B$index, nomatch=0),]
    
    # Make sure they match.
    data.frame(t_A[order(t_A$index),], t_B[order(t_B$index),])[1:20,]
    
    #    index     value index.1    value.1
    # 27     3 0.7155661       3 0.65887761
    # 10    12 0.6049333      12 0.14362694
    # 88    14 0.7410786      14 0.42021589
    # 56    15 0.4525708      15 0.78101754
    # 38    18 0.2075451      18 0.70277874
    # 24    23 0.4314737      23 0.78218212
    # 34    32 0.1734423      32 0.85508236
    # 22    38 0.7317925      38 0.56426384
    # 84    39 0.3913593      39 0.09485786
    # 5     40 0.7789147      40 0.31248966
    # 74    43 0.7799849      43 0.10910096
    # 71    45 0.2847905      45 0.26787813
    # 57    46 0.1751268      46 0.17719454
    # 25    48 0.1482116      48 0.99607737
    # 81    53 0.6304141      53 0.26721208
    # 60    58 0.8645449      58 0.96920881
    # 30    59 0.6401010      59 0.67371223
    # 75    61 0.8806190      61 0.69882454
    # 63    64 0.3287773      64 0.36918946
    # 19    70 0.9240745      70 0.11350771
    

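    Instead of checking the first 20 rows by eye, the match can also be verified programmatically. This is a sketch that rebuilds the demo data and then uses intersect, setequal and merge. The names common, same_size, same_keys and paired are introduced here for illustration. The random values drawn by sample depend on the R version, so the checks do not rely on any specific numbers.

```r
# Rebuild the demo data.
set.seed(1)
data_A <- data.frame(index = sample(1:200, 90, replace = FALSE), value = runif(90))
data_B <- data.frame(index = sample(1:200, 120, replace = FALSE), value = runif(120))

# Trim each data frame to the indexes present in the other.
t_A <- data_A[match(data_B$index, data_A$index, nomatch = 0), ]
t_B <- data_B[match(data_A$index, data_B$index, nomatch = 0), ]

# Indexes that occur in both original data frames.
common <- intersect(data_A$index, data_B$index)

# Both trimmed data frames should have the same size and the same keys.
same_size <- nrow(t_A) == nrow(t_B)
same_keys <- setequal(t_A$index, t_B$index) && setequal(t_A$index, common)
print(c(same_size = same_size, same_keys = same_keys))

# merge pairs the rows by index, giving one row per shared index.
paired <- merge(t_A, t_B, by = "index", suffixes = c("_A", "_B"))
print(head(paired))
```

    If only the paired table is needed, calling merge on data_A and data_B directly gives the same result, since merge keeps only the keys found in both by default.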