Using one data.frame to update another


Given two data frames that are identical in terms of column names and data types, where some columns uniquely identify the rows, is there an efficient function/method for one data.frame to update the other?

7 Answers
  •  梦毁少年i
    2020-12-20 12:41

    I wrote a function that uses the indexing method (see the answer by John Colby above). Hopefully it is useful to anyone who needs to update one data frame with the values from another data frame.

    update.df.with.df <- function(original, replacement, key, value) 
    {
        ## PURPOSE: Update a data frame with the values in another data frame
        ## ----------------------------------------------------------------------
        ## ARGUMENT:
        ##   original: a data frame to update,
        ##   replacement: a data frame that has the updated values,
        ##   key: a character vector of variable names to form the unique key
        ##   value: a character vector of variable names to form the values that need to be updated
        ## ----------------------------------------------------------------------
        ## RETURN: The updated data frame from the old data frame "original". 
        ## ----------------------------------------------------------------------
        ## AUTHOR: Feiming Chen,  Date:  2 Dec 2015, 15:08
    
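        ## Build one composite key per row and use it as the row name.
        ## paste() is applied column-wise because apply() first converts the data
        ## frame to a matrix, which can pad numbers and break key matching.
        n1 <- rownames(original) <- do.call(paste, c(unname(as.list(original[key])), sep="."))
        n2 <- rownames(replacement) <- do.call(paste, c(unname(as.list(replacement[key])), sep="."))
    
        n4 <- intersect(n1, n2)             # keys present in both data frames
    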
    
        original[n4, value] <- replacement[n4, value] # update values on the common keys
        original
    }
    if (FALSE) {                            # Unit tests (not run when the file is sourced)
        original <- data.frame(x=c(1, 2, 3), y=c(10, 20, 30))
        replacement <- data.frame(x=2, y=25)
        update.df.with.df(original, replacement, key="x", value="y") # data.frame(x=c(1, 2, 3), y=c(10, 25, 30))
    
        original <- data.frame(x=c(1, 2, 3), w=c("a", "b", "c"), y=c(10, 20, 30))
        replacement <- data.frame(x=2, w="b", y=25)
        update.df.with.df(original, replacement, key=c("x", "w"), value="y") # data.frame(x=c(1, 2, 3), w=c("a", "b", "c"), y=c(10, 25, 30))
    
        original <- data.frame(Name=c("joe", "john"), Id=c(1, 2), Value1=c(1.2, NA), Value2=c(NA, 9.2))
        replacement <- data.frame(Name="john", Id=2, Value1=2.2, Value2=5.9)
        update.df.with.df(original, replacement, key="Id", value=c("Value1", "Value2"))
        ## goal = data.frame( Name = c("joe","john") , Id = c( 1 , 2) , Value1 = c(1.2,2.2), Value2 = c(NA,5.9) )
    }
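
For comparison, the same key-matching idea can be written with match(), which leaves the row names of the original data frame untouched. This is only a minimal sketch with made-up data: the column names id, grp and y and the helper variables k_orig, k_repl, idx and hit are assumptions for illustration, and it is not necessarily what the referenced answer does.

```r
# Hypothetical example data (not from the original post)
original    <- data.frame(id = c(1, 2, 3), grp = c("a", "b", "c"), y = c(10, 20, 30))
replacement <- data.frame(id = c(2, 4),    grp = c("b", "d"),      y = c(25, 40))

key <- c("id", "grp")

# One composite key string per row, built column-wise with paste()
k_orig <- do.call(paste, c(unname(as.list(original[key])), sep = "."))
k_repl <- do.call(paste, c(unname(as.list(replacement[key])), sep = "."))

# For each replacement row, the position of the matching row in original (NA if absent)
idx <- match(k_repl, k_orig)
hit <- !is.na(idx)

# Overwrite only the matched rows; unmatched replacement rows are ignored
original$y[idx[hit]] <- replacement$y[hit]
updated <- original
```

Replacement rows whose key does not occur in the original (here id 4) are skipped; if they should be added instead, they could be appended afterwards, for example with rbind().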
    
