Replace an NA value with the value from another column in R

日久生厌 2020-12-03 15:52

I want to replace each NA value in column A of dfABy with the value from column B in the same row, based on the year in the year column. For example, my df is:

               


        
5 Answers
  • 2020-12-03 16:02

    Perhaps the easiest answer to read and understand, in idiomatic R, is to use ifelse(). Borrowing Richard's data frame, we could do:

    df <- structure(list(A = c(56L, NA, NA, 67L, NA),
                         B = c(75L, 45L, 77L, 41L, 65L),
                         Year = c(1921L, 1921L, 1922L, 1923L, 1923L)),
                    .Names = c("A", "B", "Year"),
                    class = "data.frame",
                    row.names = c(NA, -5L))
    df$A <- ifelse(is.na(df$A), df$B, df$A)
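
    A small sketch of what the ifelse() call does on the data above, plus one caveat that is not part of the answer itself: ifelse() drops attributes, so a Date (or factor) column loses its class. The vectors d1 and d2 and the names filled, via_ifelse and via_index are made up for illustration.

    ```r
    # Same values as the data frame in the answer above.
    df <- data.frame(A = c(56L, NA, NA, 67L, NA),
                     B = c(75L, 45L, 77L, 41L, 65L),
                     Year = c(1921L, 1921L, 1922L, 1923L, 1923L))

    # ifelse() chooses element by element: B where A is NA, otherwise A.
    filled <- ifelse(is.na(df$A), df$B, df$A)
    print(filled)

    # Caveat (illustrative data): ifelse() strips attributes,
    # so Date inputs come back as plain numbers.
    d1 <- as.Date(c("2020-01-01", NA))
    d2 <- as.Date(c("2020-02-01", "2020-03-01"))
    via_ifelse <- ifelse(is.na(d1), d2, d1)
    print(class(via_ifelse))

    # Indexed replacement keeps the Date class.
    via_index <- d1
    via_index[is.na(d1)] <- d2[is.na(d1)]
    print(class(via_index))
    ```

    For plain integer or numeric columns, as in the question, the caveat does not matter and ifelse() is fine.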
    
  • 2020-12-03 16:03

    dplyr's coalesce() function can really simplify these situations. It operates on vectors, so it has to be called inside mutate():

    library(dplyr)
    
    dfABy %>% 
        mutate(A = coalesce(A, B))
    
  • 2020-12-03 16:08

    You could use simple replacement with [<-, subsetting for the NA elements.

    df$A[is.na(df$A)] <- df$B[is.na(df$A)]
    

    Or, alternatively, use within(), which returns a modified copy of df rather than changing it in place:

    within(df, A[is.na(A)] <- B[is.na(A)])
    

    Both give

       A  B Year
    1 56 75 1921
    2 45 45 1921
    3 77 77 1922
    4 67 41 1923
    5 65 65 1923
    

    Data:

    df <- structure(list(A = c(56L, NA, NA, 67L, NA), B = c(75L, 45L, 77L, 
    41L, 65L), Year = c(1921L, 1921L, 1922L, 1923L, 1923L)), .Names = c("A", 
    "B", "Year"), class = "data.frame", row.names = c(NA, -5L))
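
    One practical difference between the two variants, shown as a hedged sketch: the `[<-` form changes df itself, while within() leaves df alone and returns a new data frame that has to be assigned. The name res is hypothetical.

    ```r
    # Same values as the Data block above.
    df <- data.frame(A = c(56L, NA, NA, 67L, NA),
                     B = c(75L, 45L, 77L, 41L, 65L),
                     Year = c(1921L, 1921L, 1922L, 1923L, 1923L))

    # within() returns a modified copy; df is untouched until we assign.
    res <- within(df, A[is.na(A)] <- B[is.na(A)])
    still_missing <- anyNA(df$A)    # original still contains NA
    copy_filled   <- !anyNA(res$A)  # the copy has no NA left

    # The `[<-` form modifies df directly.
    df$A[is.na(df$A)] <- df$B[is.na(df$A)]
    same_result <- identical(df$A, res$A)

    print(c(still_missing, copy_filled, same_result))
    ```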
    
  • 2020-12-03 16:12

    An easy option is ifelse() inside mutate(), which here writes the result to a new column, A_new:

    library(dplyr)
    
    dfABy %>%
      mutate(A_new = ifelse(is.na(A), B, A))
    
  • 2020-12-03 16:13

    Piping the data frame straight into coalesce(A, B), the form in which GGAnderson's solution was posted, returned an error message for me. Using coalesce() inside mutate(), however, worked fine.

    df <- structure(list(A = c(56L, NA, NA, 67L, NA),
                         B = c(75L, 45L, 77L, 41L, 65L),
                         Year = c(1921L, 1921L, 1922L, 1923L, 1923L)),
                    .Names = c("A", "B", "Year"), 
                    class = "data.frame", 
                    row.names = c(NA, -5L))
    df
    df %>% 
      coalesce(A, B) # returns an error
    
    df %>%
      mutate(A = coalesce(A, B)) # works
    

    (I am new to Stack Overflow; my low reputation does not allow me to comment on GGAnderson's answer directly.)
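
    The reason for the difference, as a minimal sketch assuming dplyr is installed: coalesce() takes vectors as arguments, so it can be given the columns directly or be used inside mutate(), where A and B resolve to columns of the data frame. The names filled and df2 are illustrative.

    ```r
    library(dplyr)

    df <- data.frame(A = c(56L, NA, NA, 67L, NA),
                     B = c(75L, 45L, 77L, 41L, 65L),
                     Year = c(1921L, 1921L, 1922L, 1923L, 1923L))

    # Called on the column vectors directly: first non-missing value per position.
    filled <- coalesce(df$A, df$B)
    print(filled)

    # Inside mutate(), the bare names A and B refer to the columns of df.
    df2 <- df %>%
      mutate(A = coalesce(A, B))
    print(df2)
    ```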
