Question
library(data.table)
dt = data.table(x = c(1,1,2,2,2,2,3,3,3,3))
dt[, y := if(.N > 2) .N else NA, by = x] # fails: plain NA is logical, but .N is integer
dt[, y := if(.N > 2) .N else NA_integer_, by = x] # good
The first grouping fails because NA has a type, and it's not integer. Is there a way to tell data table to ignore that and try to make all NAs to whatever type that keeps consistency?
I can manually set NA_integer_ here, but if I have lots of columns of different types, it's hard to get every NA type right.
BTW, what NA type should I use for Date/IDate/ITime?
Answer 1:
OP's first question: Is there a way to tell data table to ignore that and try to make all NAs to whatever type that keeps consistency?
No. You'll see a similar error without the assignment:
dt[, if(.N > 2) .N else NA, by = x]
# Error in `[.data.table`(dt, , if (.N > 2) .N else NA, by = x) :
# Column 1 of result for group 2 is type 'integer' but expecting type 'logical'. Column types must be consistent for each group.
In my opinion, this "Column types must be consistent for each group." message should be shown for your case as well.
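To make the type mismatch concrete, here is a minimal sketch using the same dt as in the question. It only relies on base R's typed NA constants and on the fix the OP already found; the name res is just for illustration.

```r
library(data.table)

# Each NA constant carries its own type; plain NA is logical.
typeof(NA)             # "logical"
typeof(NA_integer_)    # "integer"
typeof(NA_real_)       # "double"
typeof(NA_character_)  # "character"

dt <- data.table(x = c(1, 1, 2, 2, 2, 2, 3, 3, 3, 3))

# .N is an integer, so the else branch must return an integer NA as well;
# then every group yields the same column type.
res <- dt[, if (.N > 2) .N else NA_integer_, by = x]
res$V1  # NA 4 4
```

So the rule of thumb is to pick the NA constant that matches the type of the non-missing branch.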
OP's second question: BTW, what NA type should I use for Date/IDate/ITime?
For IDate et al, I always subset by NA_integer_, which seems to give a length-one NA slice, e.g., as.IDate(Sys.Date())[NA_integer_]. I don't know if that's what one should do, but I don't know of a better idea. An illustration:
z = IDateTime(factor(Sys.time()))
#         idate    itime
# 1: 2016-08-01 16:05:25
str( lapply(z, function(x) x[NA_integer_]) )
# List of 2
# $ idate: IDate[1:1], format: NA
# $ itime:Class 'ITime' int NA
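The same subsetting trick can be used inside a grouped j, so that the NA branch has the same class as the other branch. This is only a sketch under my own assumptions: the column d and the name first_d are made up for the example and are not from the question.

```r
library(data.table)

# Hypothetical data: the grouping column from the question plus an IDate column.
dt <- data.table(
  x = c(1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  d = as.IDate(seq(as.Date("2016-08-01"), by = "day", length.out = 10))
)

# For groups with more than 2 rows take the first date; otherwise return
# d[NA_integer_], a length-one NA that keeps the class of d.
res <- dt[, .(first_d = if (.N > 2) d[1L] else d[NA_integer_]), by = x]

class(res$first_d)  # "IDate" "Date"
```

Because the NA is cut from the column itself, there is no need to know which NA constant matches its type, which helps when there are many columns of different types.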
Source: https://stackoverflow.com/questions/38703518/r-data-table-na-type-consistency