Merging two columns into one in R

情深已故 2020-11-29 02:06

I have the following data frame and am trying to merge the two columns into one, replacing the NAs with the numeric values.

ID    A     B
1     3     NA
2     NA    2
3     NA    4
4     1     NA

7 Answers
  • 2020-11-29 02:23

    You could try

    New <- do.call(pmax, c(df1[-1], na.rm=TRUE))
    

    Or

    New <-  df1[-1][cbind(1:nrow(df1),max.col(!is.na(df1[-1])))]
    d1 <- data.frame(ID=df1$ID, New)
    d1
    #  ID New
    #1  1   3
    #2  2   2
    #3  3   4
    #4  4   1
    
  • 2020-11-29 02:25

    Another very simple solution in this case is to use the rowSums function.

    df$New <- rowSums(df[, c("A", "B")], na.rm = TRUE)
    df <- df[, c("ID", "New")]
    

    Update: Thanks @Artem Klevtsov for mentioning that this method only works with numeric data.
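
    Two further caveats are easy to miss: when both columns hold a value, rowSums adds them together, and when both are NA, na.rm = TRUE yields 0 rather than NA. A small sketch with made-up data (the values and the names `sums` and `first_non_na` are illustrative, not taken from the question):

    ```r
    # Hypothetical data: row 2 has both values present, row 3 has neither
    df <- data.frame(ID = 1:3, A = c(3, 5, NA), B = c(NA, 2, NA))

    # Row 1 gives 3 (one value), row 2 gives 7 (5 + 2), row 3 gives 0 (not NA)
    sums <- rowSums(df[, c("A", "B")], na.rm = TRUE)

    # One possible guard (an assumption, not part of the answer above):
    # take A where it is present, otherwise fall back to B
    first_non_na <- ifelse(is.na(df$A), df$B, df$A)
    ```

    If every row is guaranteed to have exactly one non-missing value, as in the question, both approaches give the same result.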

  • 2020-11-29 02:40

    This probably didn't exist when the answers were written, but since I came here with the same question and found a better solution, here it is for future googlers:

    What you want is the coalesce() function from dplyr:

    library(dplyr)

    y <- c(1, 2, NA, NA, 5)
    z <- c(NA, NA, 3, 4, 5)
    coalesce(y, z)
    
    [1] 1 2 3 4 5
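
    Applied to the data frame from the question, this could look like the sketch below. It assumes the columns are named A and B and hold the values shown in the other answers; `df` and `result` are illustrative names only.

    ```r
    library(dplyr)

    # Data assumed from the other answers in this thread
    df <- data.frame(ID = 1:4, A = c(3, NA, NA, 1), B = c(NA, 2, 4, NA))

    # coalesce() takes the first non-missing value per position:
    # A where it is present, otherwise B
    result <- df %>%
      mutate(New = coalesce(A, B)) %>%
      select(ID, New)
    ```

    Unlike a sum-based approach, coalesce() leaves NA in place when both columns are missing and prefers A when both are present.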
    
  • 2020-11-29 02:41

    You can use unite from tidyr:

    library(tidyr)
    
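    # Replace NAs with empty strings so unite() does not paste the text "NA"
    df[is.na(df)] <- ''
    # Note: the resulting column is character, not numeric
    unite(df, new, A:B, sep = '')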
    #  ID new
    #1  1   3
    #2  2   2
    #3  3   4
    #4  4   1
    
  • 2020-11-29 02:41

    Assuming exactly one of A or B is NA in each row, this works just fine:

    # creating initial data frame (actually data.table in this case)
    library(data.table)
    x <- as.data.table(list(ID = c(1, 2, 3, 4), A = c(3, NA, NA, 1), B = c(NA, 2, 4, NA)))
    x
    #   ID  A  B
    #1:  1  3 NA
    #2:  2 NA  2
    #3:  3 NA  4
    #4:  4  1 NA
    
    
    # solution: keep the single non-NA value per ID, then drop A and B by reference
    x[, New := na.omit(c(A, B)), by = ID][, c("A", "B") := NULL]
    x
    #   ID New
    #1:  1   3
    #2:  2   2
    #3:  3   4
    #4:  4   1
    
  • 2020-11-29 02:43

    This question's been around for a while, but just to add another possible approach that does not depend on any libraries:

    df$new <- t(df[-1])[!is.na(t(df[-1]))]
    
    #   ID  A  B new
    # 1  1  3 NA   3
    # 2  2 NA  2   2
    # 3  3 NA  4   4
    # 4  4  1 NA   1
    