Using `:=` in data.table to sum the values of two columns in R, ignoring NAs

耶瑟儿~ 2020-12-05 10:41

I have what I think is a very simple question about data.table and the `:=` operator. I don't think I quite understand the behaviour of `:=`.

2 Answers
  •  无人及你
    2020-12-05 11:28

    This is standard R behaviour and has nothing really to do with data.table.

    Adding anything to NA returns NA:

    NA + 1
    ## NA
    

    sum, by contrast, collapses everything it is given into a single number, so it does not produce one value per row.
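As a small base R sketch of the difference, assuming two hypothetical vectors named col1 and col2 (illustrative values, not taken from the question):

```r
# Hypothetical example vectors (values chosen only for illustration)
col1 <- c(1, NA, 3)
col2 <- c(2, 5, NA)

# `+` works element by element and propagates NA
elementwise <- col1 + col2                # 3 NA NA

# sum() pools all of its arguments into one total
total <- sum(col1, col2, na.rm = TRUE)    # 11, a single number

print(elementwise)
print(total)
```

The second result has length 1, which is why assigning it with `:=` would fill every row with the same grand total.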

    If you want 1 + NA to return 1, you will have to run something like:

    mat[,col3 := col1 + col2]
    mat[is.na(col1), col3 := col2]
    mat[is.na(col2), col3 := col1]
    

    The second and third lines handle the rows where col1 or col2 is NA.
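A runnable sketch of the three assignments, assuming mat is a data.table with numeric columns col1 and col2 (the sample values here are made up for illustration):

```r
library(data.table)

# Hypothetical sample data; `mat` mirrors the name used in the answer
mat <- data.table(col1 = c(1, NA, 3, NA),
                  col2 = c(2, 5, NA, NA))

mat[, col3 := col1 + col2]        # NA wherever either input is NA
mat[is.na(col1), col3 := col2]    # fall back to col2 when col1 is missing
mat[is.na(col2), col3 := col1]    # fall back to col1 when col2 is missing

print(mat)
```

With this approach, a row where both columns are NA keeps NA in col3.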


    EDIT - an easier solution

    You could also use rowSums, which has an na.rm argument:

    mat[, col3 := rowSums(.SD, na.rm = TRUE), .SDcols = c("col1", "col2")]
    

    rowSums is what you want: by definition, it computes the row sums of a matrix containing col1 and col2, removing NA values.

    (@JoshuaUlrich suggested this in a comment.)
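A minimal sketch of the rowSums approach on made-up data, again assuming mat is a data.table with numeric columns col1 and col2:

```r
library(data.table)

# Hypothetical sample data (values chosen only for illustration)
mat <- data.table(col1 = c(1, NA, 3, NA),
                  col2 = c(2, 5, NA, NA))

# .SD is restricted to the columns named in .SDcols;
# rowSums() drops NA values within each row before summing
mat[, col3 := rowSums(.SD, na.rm = TRUE), .SDcols = c("col1", "col2")]

print(mat)
```

One difference worth checking against your needs: with na.rm = TRUE, a row where both columns are NA sums to 0, whereas the step-by-step is.na() version leaves NA in that row.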
