Merge multiple data tables with duplicate column names

感情败类 2020-12-29 10:19

I am trying to merge (join) multiple data tables (obtained with fread from 5 csv files) to form a single data table. I get an error when I try to merge 5 data tables, but wo

6 Answers
  •  太阳男子
    2020-12-29 11:00

    stack and reshape

    I don't think this maps exactly to the merge function, but...

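    library(data.table)

    mycols <- "x"                            # key column(s) shared by every table
    DTlist <- list(DT1, DT2, DT3, DT4, DT5)  # the five tables from the question

    # Stack the tables (.id records the source table), then cast wide: one column per table
    dcast(rbindlist(DTlist, idcol = TRUE), paste0(paste0(mycols, collapse = "+"), "~.id"))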
    
    #    x  1  2  3  4  5
    # 1: a 10 11 12 13 14
    # 2: b 11 12 13 14 15
    # 3: c 12 13 14 15 16
    # 4: d 13 14 15 16 17
    # 5: e 14 15 16 17 18
    # 6: f 15 16 17 18 19
    

    I'm not sure whether this would extend to tables that have value columns other than y.
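
    One way to probe that, as a sketch rather than a tested recipe: data.table's dcast accepts several columns in value.var, so the stacked table can be cast on y and a second column at once. The inputs below are made up for illustration (a hypothetical z column next to y, with y built to match the values shown above); they are not the OP's data.

    ```r
    library(data.table)

    # Hypothetical inputs: five tables sharing key column x, each with value columns y and z
    DTlist <- lapply(1:5, function(k) {
      data.table(x = letters[1:6], y = 9L + k + 0:5, z = 100L * k + 0:5)
    })
    mycols <- "x"

    # Stack the tables; idcol = TRUE adds a .id column holding the source table index
    stacked <- rbindlist(DTlist, idcol = TRUE)

    # Cast wide on both value columns at once
    f <- as.formula(paste(paste(mycols, collapse = " + "), "~ .id"))
    wide <- dcast(stacked, f, value.var = c("y", "z"))

    names(wide)
    ```

    With more than one value column, the result columns are named by value column and table index (y_1, ..., z_5) rather than by index alone.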

    merge-assign

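    # Full outer join of the key columns alone, so DT holds every key combination
    DT <- Reduce(function(...) merge(..., all = TRUE, by = mycols),
      lapply(DTlist, function(d) d[, mycols, with = FALSE]))

    # Update join: copy each table's non-key columns into DT, suffixed with the table index
    for (k in seq_along(DTlist)) {
      js <- setdiff(names(DTlist[[k]]), mycols)
      DT[DTlist[[k]], paste0(js, ".", k) := mget(paste0("i.", js)), on = mycols, by = .EACHI]
    }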
    
    #    x y.1 y.2 y.3 y.4 y.5
    # 1: a  10  11  12  13  14
    # 2: b  11  12  13  14  15
    # 3: c  12  13  14  15  16
    # 4: d  13  14  15  16  17
    # 5: e  14  15  16  17  18
    # 6: f  15  16  17  18  19
    

    (I'm not sure whether this fully extends to other cases. It's hard to say, because the OP's example doesn't demand the full functionality of merge. In the OP's case, with mycols = "x" and x identical across all DT*, a merge is obviously inappropriate, as @eddi mentioned. The general problem is interesting, though, so that's what I'm trying to address here.)
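
    To illustrate the general case, here is a small sketch of merge-assign on tables whose keys only partly overlap. The three input tables are invented for the example, and the name `keys` is just a placeholder for the merged key table; keys missing from a given table should come out as NA in that table's column.

    ```r
    library(data.table)

    # Hypothetical inputs: key sets only partly overlap
    DTlist <- list(
      data.table(x = c("a", "b", "c"), y = 1:3),
      data.table(x = c("b", "c", "d"), y = 4:6),
      data.table(x = c("c", "d", "e"), y = 7:9)
    )
    mycols <- "x"

    # Full outer join of the key columns only
    keys <- Reduce(function(...) merge(..., all = TRUE, by = mycols),
                   lapply(DTlist, function(d) d[, mycols, with = FALSE]))

    # Update join: add each table's value columns under a suffixed name
    for (k in seq_along(DTlist)) {
      js <- setdiff(names(DTlist[[k]]), mycols)
      keys[DTlist[[k]], paste0(js, ".", k) := mget(paste0("i.", js)),
           on = mycols, by = .EACHI]
    }

    keys
    ```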
