Dataframes in a list; adding a new variable with name of dataframe

猫巷女王i 2020-12-14 03:39

I have a list of dataframes which I eventually want to merge while maintaining a record of their original dataframe name or list index. This will allow me to subset etc acro

4 Answers
  • 2020-12-14 03:43

    Your first attempt was very close. It works if you iterate over indices instead of values. Your second attempt failed because you didn't name the elements in your list.

    Both solutions below use the fact that lapply can pass extra parameters (mylist) to the function.

    df1 <- data.frame(x=c(1:5),y=c(11:15))
    df2 <- data.frame(x=c(1:5),y=c(11:15))
    mylist <- list(df1=df1,df2=df2) # Name each data.frame!
    # names(mylist) <- c("df1", "df2") # Alternative way of naming...
    
    # Use indices - and pass in mylist
    mylist1 <- lapply(seq_along(mylist), 
            function(i, x){
                x[[i]]$id <- i
                return (x[[i]])
            }, mylist
    )
    
    # Now the names work - but I pass in mylist instead of using portfolio.results.
    mylist2 <- lapply(names(mylist), 
        function(n, x){
            x[[n]]$id <- n
            return (x[[n]])
        }, mylist
    )
    
  • 2020-12-14 03:54

    names() could work if the list had names, but you didn't give it any. It's an unnamed list. You will need to use numeric indices:

    > for(i in 1:length(mylist) ){ mylist[[i]] <- cbind(mylist[[i]], id=rep(i, nrow(mylist[[i]]) ) ) }
    > mylist
    [[1]]
      x  y id
    1 1 11  1
    2 2 12  1
    3 3 13  1
    4 4 14  1
    5 5 15  1
    
    [[2]]
      x  y id
    1 1 11  2
    2 2 12  2
    3 3 13  2
    4 4 14  2
    5 5 15  2
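
    Once every data frame carries its index, merging and subsetting could look roughly like this. This is a sketch assuming all data frames share the same columns; combined and second are illustrative names.

    ```r
    df1 <- data.frame(x = 1:5, y = 11:15)
    df2 <- data.frame(x = 1:5, y = 11:15)
    mylist <- list(df1, df2)  # unnamed list, as in the question

    # Tag each data.frame with its position in the list
    for (i in seq_along(mylist)) {
        mylist[[i]]$id <- i
    }

    # Stack everything, then pull out the rows that came from the second data.frame
    combined <- do.call(rbind, mylist)
    second <- combined[combined$id == 2, ]
    ```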
    
  • 2020-12-14 03:59

    The ldply function from the plyr package could be an answer:

    library('plyr')
    df1 <- data.frame(x=c(1:5),y=c(11:15))
    df2 <- data.frame(x=c(1:5),y=c(11:15))
    mylist <- list(df1 = df1, df2 = df2)
    
    all <- ldply(mylist)
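
    For a named list, ldply should record each element's name in a column called .id. A small sketch of renaming that column afterwards, assuming plyr is installed (the name combined is illustrative and avoids masking base R's all function):

    ```r
    library(plyr)

    df1 <- data.frame(x = 1:5, y = 11:15)
    df2 <- data.frame(x = 1:5, y = 11:15)
    mylist <- list(df1 = df1, df2 = df2)

    # Stack the list; the element names end up in the ".id" column
    combined <- ldply(mylist)

    # Rename ".id" to "id" so it is easier to refer to
    names(combined)[names(combined) == ".id"] <- "id"
    ```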
    
  • Personally, I think it's easier to add the names after collapsing the list into one data frame:

    df1 <- data.frame(x=c(1:5),y=c(11:15))
    df2 <- data.frame(x=c(1:5),y=c(11:15))
    mylist <- list(df1 = df1, df2 = df2)
    
    all <- do.call("rbind", mylist)
    all$id <- rep(names(mylist), sapply(mylist, nrow))
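
    The rep(names, counts) call also copes with data frames of different lengths, because each name is repeated once per row of its own data frame. A sketch with made-up data (a, b, parts and combined are illustrative names):

    ```r
    a <- data.frame(x = 1:2, y = 11:12)
    b <- data.frame(x = 1:3, y = 21:23)
    parts <- list(a = a, b = b)

    combined <- do.call(rbind, parts)

    # "a" is repeated 2 times and "b" 3 times, matching the row counts
    combined$id <- rep(names(parts), sapply(parts, nrow))

    # rbind on a named list creates row names such as "a.1"; reset them if unwanted
    rownames(combined) <- NULL
    ```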
    