Merge data.frames with duplicates

囚心锁ツ 2020-12-11 20:29

I have many data.frames, for example:

df1 = data.frame(names=c('a','b','c','c','d'),data1=c(1,2,3,4,5))
df2 = data.frame(names=c('a','e','e',


        
3 Answers
  • 2020-12-11 21:16

    First define a function, run.seq, that numbers the duplicates of each name (1 for the first occurrence, 2 for the second, and so on). Judging from the output, the ith duplicate of a name in one data frame should be matched with the ith duplicate of that name in the others. Then put the data frames in a list and add a run.seq column to each one. Finally, use Reduce to merge them all on names and run.seq, and drop the run.seq column from the result.

    run.seq <- function(x) as.numeric(ave(paste(x), x, FUN = seq_along))
    
    L <- list(df1, df2, df3)
    L2 <- lapply(L, function(x) cbind(x, run.seq = run.seq(x$names)))
    
    out <- Reduce(function(...) merge(..., all = TRUE), L2)[-2]
    

    The last line gives:

    > out
      names data1 data2 data3
    1     a     1     1    NA
    2     b     2    NA    NA
    3     c     3     4     1
    4     c     4     5    NA
    5     d     5     6    NA
    6     e    NA     2     2
    7     e    NA     3    NA
    

    EDIT: Revised run.seq so that input need not be sorted.
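
    To make the numbering concrete, here is a minimal sketch of what run.seq returns and how it disambiguates a two-way merge. The vector x and the data frames left and right are made up for illustration; they are not the data frames from the question.

    ```r
    # Same definition as in the answer above
    run.seq <- function(x) as.numeric(ave(paste(x), x, FUN = seq_along))

    # Hypothetical input, deliberately unsorted
    x <- c("c", "a", "c", "b", "a")
    idx <- run.seq(x)
    print(idx)  # 1 1 2 1 2: the second "c" and the second "a" get 2

    # Two small illustrative data frames with duplicated names
    left  <- data.frame(names = c("a", "c", "c"), v1 = c(1, 2, 3))
    right <- data.frame(names = c("c", "c", "e"), v2 = c(10, 20, 30))

    # Add the occurrence counter so (names, run.seq) is unique in each frame
    left$run.seq  <- run.seq(left$names)
    right$run.seq <- run.seq(right$names)

    # merge() joins on the shared columns, here names and run.seq
    merged <- merge(left, right, all = TRUE)
    merged$run.seq <- NULL  # the helper column is no longer needed
    print(merged)
    ```

    Because each (names, run.seq) pair occurs at most once per data frame, the first "c" on the left is paired only with the first "c" on the right, and likewise for the second.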

  • 2020-12-11 21:23

    I don't think your example data frames contain enough information to do this. Which 'c' in data frame 1 should be paired with which 'c' in data frame 2? We cannot tell, so R cannot either. I suspect you will have to add another variable to each data frame that uniquely identifies these duplicate cases.
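
    The ambiguity can be seen in a small sketch. The data frames left and right and the id column are hypothetical, chosen only to show the effect: a plain merge on a duplicated key returns every combination, while an extra identifying column pins down the pairing.

    ```r
    # Hypothetical data: the name "c" appears twice on each side
    left  <- data.frame(names = c("c", "c"), v1 = c(3, 4))
    right <- data.frame(names = c("c", "c"), v2 = c(4, 5))

    # Merging on names alone pairs every left "c" with every right "c"
    plain <- merge(left, right, by = "names")
    print(nrow(plain))  # 4 rows: 2 x 2 combinations

    # An explicit identifier (an assumption about how the rows correspond)
    left$id  <- c(1, 2)
    right$id <- c(1, 2)

    # Merging on both columns yields one row per identified case
    keyed <- merge(left, right, by = c("names", "id"))
    print(nrow(keyed))  # 2 rows
    ```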

  • 2020-12-11 21:25

    See other questions:

    • How to join data frames in R (inner, outer, left, right)
    • recombining-a-list-of-data-frames-into-a-single-data-frame
    • ...

    Examples:

    library(reshape)
    out <- merge_recurse(L)
    

    or

    library(plyr)
    
    out<-join(df1, df2, type="full")
    out<-join(out, df3, type="full")
    # the joins above can be done in a loop over a list of data frames
    

    or

    library(plyr)
    out<-ldply(L)
    