rbindlist two data.tables where one has factor and other has character type for a column

醉话见心 2020-12-09 18:03

I just came across a slightly strange warning in my script:

# Warning message:
# In rbindlist(list(DT.1, DT.2)) : NAs introduced by coercion
         


        
3 answers
  •  既然无缘
    2020-12-09 18:11

    rbindlist is superfast because it skips the checks performed by rbind.fill or do.call(rbind.data.frame, ...).

    You can use a workaround like this to ensure that factors are coerced to characters.

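    library(data.table)

    DT.1 <- data.table(x = factor(letters[1:5]), y = 6:10)
    DT.2 <- data.table(x = LETTERS[1:5], y = 11:15)

    # Collect the tables in a list; set() below changes them by reference
    DDL <- list(DT.1, DT.2)

    for (ii in seq_along(DDL)) {
      # Names of the factor columns in the ii-th table
      ff <- Filter(function(x) is.factor(DDL[[ii]][[x]]), names(DDL[[ii]]))
      for (fn in ff) {
        # Replace each factor column with its character equivalent
        set(DDL[[ii]], j = fn, value = as.character(DDL[[ii]][[fn]]))
      }
    }
    rbindlist(DDL)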
    

    Or, less memory-efficiently:

    rbindlist(rapply(DDL, classes = 'factor', f = as.character, how = 'replace'))
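
The in-place workaround can also be wrapped in a small function that leaves the input tables untouched. The following is only a sketch: `rbind_as_character` is a hypothetical name, not part of data.table, and copying each table first is an assumption that trades extra memory for not modifying the caller's data.

```r
library(data.table)

# Hypothetical helper (not part of data.table): convert factor columns to
# character on copies of the inputs, then bind the copies.
rbind_as_character <- function(tables) {
  converted <- lapply(tables, function(dt) {
    dt <- copy(dt)  # work on a copy so the caller's table is left unchanged
    factor_cols <- names(dt)[vapply(dt, is.factor, logical(1))]
    for (col in factor_cols) {
      # Replace the factor column with its character equivalent
      set(dt, j = col, value = as.character(dt[[col]]))
    }
    dt
  })
  rbindlist(converted)
}

DT.1 <- data.table(x = factor(letters[1:5]), y = 6:10)
DT.2 <- data.table(x = LETTERS[1:5], y = 11:15)

combined <- rbind_as_character(list(DT.1, DT.2))
str(combined)
```

If memory is tight, dropping the `copy()` call gives the same in-place behaviour as the loop in the answer, at the cost of changing the original tables.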
    
