While analysing some data, I came across a warning message that I suspect is a bug, since it comes from a pretty straightforward command that I have used many times.
UPDATE: Now fixed in v1.8.9 by Ricardo:
o rbind'ing data.tables containing duplicate, "" or NA column names now works, #2726 & #2384. Thanks to Garrett See and Arun Srinivasan for reporting. This also affected the printing of data.tables with duplicate column names since the head and tail are rbind-ed together internally.
Yes, it's a bug. It seems to be in the print method of data.tables with duplicated names.
ans = dt[, list(sum(V2), mean(V3)),by=V1]
head(ans)
V1 V1 V2 # notice the duplicated V1
1: acgmqyuwpe 140 78.07692
2: adcltygwsq 191 76.93333
3: adftozibnh 153 73.82143
4: aeuowtlskr 122 73.04348
5: ahfoqclkpg 143 75.83333
6: ahtczyuipw 135 73.54167
tail(ans)
V1 V1 V2
1: zugrnehpmq 189 72.63889
2: zuqegoxkpi 150 76.03333
3: zwpserimgf 180 74.81818
4: zxkpdrlcsf 115 72.57895
5: zxvoaeflhq 157 76.53571
6: zyiwcsanlm 145 72.79167
print(ans)
Error in rbindlist(allargs) :
(converted from warning) NAs introduced by coercion
rbind(head(ans),tail(ans))
Error in rbindlist(allargs) :
(converted from warning) NAs introduced by coercion
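As a workaround, don't create data.tables with column names V1, V2, etc. Unnamed expressions in j are auto-named V1, V2, ..., which is what collides with the existing V1 column here.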
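Another way to sidestep the clash, if renaming the input columns is inconvenient, is to name the expressions in j yourself so the result never gets the auto-generated V1, V2 names. A minimal sketch, assuming a small made-up table (the data and the names `total` and `avg` are mine, not from the question):

```r
library(data.table)

# Hypothetical stand-in for the table in the question: same column names, toy data
dt <- data.table(V1 = rep(c("a", "b", "c"), each = 2),
                 V2 = 1:6,
                 V3 = c(10, 20, 30, 40, 50, 60))

# Name each expression in j so the result has no auto-generated V1/V2 columns
ans <- dt[, list(total = sum(V2), avg = mean(V3)), by = V1]

# No duplicated column names, so binding head and tail (as print does internally) is safe
stopifnot(anyDuplicated(names(ans)) == 0L)
both <- rbind(head(ans, 2), tail(ans, 2))
print(ans)
```

With distinct names in the result, `print(ans)` and `rbind(head(ans), tail(ans))` no longer go through the duplicate-name path that triggers the error on affected versions.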
It arises from this known bug:
#2384 rbind of tables containing duplicate column names doesn't bind correctly
and I've added a link there back to this question.
Thanks!