I have what I think is a very simple question related to the use of data.table and the := function. I don't think I quite understand the behaviour of :=.
This is standard R behaviour, nothing really to do with data.table.
Adding anything to NA will return NA:
NA + 1
## NA
sum will return a single number (the total of everything passed to it), rather than one value per row.
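To make the difference concrete, here is a small base R sketch. The vectors x and y are made-up values for illustration (they are not the asker's full data):

```r
# Hypothetical vectors standing in for col1 and col2
x <- c(NA, 0, -0.015038)
y <- c(0.003745, 0.007463, -0.007407)

# `+` is vectorized: one result per element, and NA propagates
elementwise <- x + y

# sum() collapses everything it is given into one number;
# na.rm = TRUE drops the NA before adding
total <- sum(x, y, na.rm = TRUE)

print(elementwise)
print(total)
```

The first result has three elements (the first of which is NA), while the second is a single value.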
If you want 1 + NA to return 1, then you will have to run something like
mat[,col3 := col1 + col2]
mat[is.na(col1), col3 := col2]
mat[is.na(col2), col3 := col1]
to deal with the cases where col1 or col2 is NA.
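A self-contained version of those three steps might look like the following. The table here is a made-up sample (the fourth row, with both columns NA, is my addition to show the edge case):

```r
library(data.table)

# Hypothetical sample data; the last row has NA in both columns
mat <- data.table(col1 = c(NA, 0, -0.015038, NA),
                  col2 = c(0.003745, 0.007463, -0.007407, NA))

mat[, col3 := col1 + col2]        # plain sum, NA wherever either input is NA
mat[is.na(col1), col3 := col2]    # col1 missing: fall back to col2
mat[is.na(col2), col3 := col1]    # col2 missing: fall back to col1

print(mat)
```

With this approach a row where both columns are NA still ends up as NA in col3, since each fallback copies another NA.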
You could also use rowSums, which has a na.rm argument:
mat[, col3 := rowSums(.SD, na.rm = TRUE), .SDcols = c("col1", "col2")]
rowSums is what you want (by definition, the rowSums of a matrix containing col1 and col2, removing NA values).
(@JoshuaUlrich suggested this as a comment.)
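One thing worth checking before relying on rowSums is what happens when every value in a row is missing. A minimal sketch, using made-up data:

```r
library(data.table)

# Hypothetical sample data; the last row is NA in both columns
mat <- data.table(col1 = c(NA, 0, NA),
                  col2 = c(0.003745, 0.007463, NA))

# Row-wise sum over the two columns, ignoring NA
mat[, col3 := rowSums(.SD, na.rm = TRUE), .SDcols = c("col1", "col2")]

print(mat)
```

With na.rm = TRUE, a row with nothing left to add sums to 0 rather than NA, so decide whether that is the result you want for all-missing rows.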
It's not a lack of understanding of data.table but rather one regarding vectorized functions in R. You can define a dyadic operator that will behave differently than the "+" operator with regard to missing values:
`%+na%` <- function(x,y) {ifelse( is.na(x), y, ifelse( is.na(y), x, x+y) )}
mat[ , col3:= col1 %+na% col2]
#-------------------------------
col1 col2 col3
1: NA 0.003745 0.003745
2: 0.000000 0.007463 0.007463
3: -0.015038 -0.007407 -0.022445
4: 0.003817 -0.003731 0.000086
5: -0.011407 -0.007491 -0.018898
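The operator is an ordinary vectorized function, so you can try it on plain vectors as well. The inputs below are invented for illustration:

```r
# Same definition as above: treat a missing operand as "not there"
`%+na%` <- function(x, y) {
  ifelse(is.na(x), y, ifelse(is.na(y), x, x + y))
}

res <- c(NA, 1, 2, NA) %+na% c(3, NA, 4, NA)
print(res)
# 3 1 6 NA
```

Unlike the na.rm = TRUE approaches, this keeps NA when both operands are missing.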
You can use mrdwad's comment to do it with sum(..., na.rm=TRUE):
mat[ , col4 := sum(col1, col2, na.rm=TRUE), by=1:NROW(mat)]
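As a rough check, the by-row sum should agree with rowSums(..., na.rm = TRUE). This sketch uses made-up data and seq_len(nrow(mat)) as the grouping vector, which is my substitution for 1:NROW(mat):

```r
library(data.table)

# Hypothetical sample data
mat <- data.table(col1 = c(NA, 0, -0.015038, NA),
                  col2 = c(0.003745, 0.007463, -0.007407, NA))

# One group per row, so sum() sees a single col1 and col2 value each time
mat[, col4 := sum(col1, col2, na.rm = TRUE), by = seq_len(nrow(mat))]

# Vectorized equivalent for comparison
mat[, col5 := rowSums(.SD, na.rm = TRUE), .SDcols = c("col1", "col2")]

print(mat)
```

Grouping by row calls sum once per row, so on a large table the rowSums version is likely to be noticeably faster.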