How can I apply ffdf to non-atomic data frames?

Posted by 爱⌒轻易说出口 on 2019-12-14 03:44:43

Question


Many posts (such as this) claim the ff package is superior to bigmemory because it can handle objects with both atomic and non-atomic components, but how? For example:

UNIT <- c(100,100, 200, 200, 200, 200, 200, 300, 300, 300,300)
STATUS <- c('ACTIVE','INACTIVE','ACTIVE','ACTIVE','INACTIVE','ACTIVE','INACTIVE','ACTIVE',
        'ACTIVE','ACTIVE','INACTIVE') 
TERMINATED <- as.Date(c('1999-07-06','2008-12-05','2000-08-18','2000-08-18','2000-08-18',
                    '2008-08-18','2008-08-18','2006-09-19','2006-09-19','2006-09-19',
                    '1999-03-15')) 
START <- as.Date(c('2007-04-23','2008-12-06','2004-06-01','2007-02-01','2008-04-19',
               '2010-11-29','2010-12-30','2007-10-29','2008-02-05','2008-06-30',
               '2009-02-07'))
STOP <- as.Date(c('2008-12-05','2012-12-31','2007-01-31','2008-04-18','2010-11-28',
              '2010-12-29','2012-12-31','2008-02-04','2008-06-29','2009-02-06',
              '2012-12-31'))
TEST <- data.frame(UNIT,STATUS,TERMINATED,START,STOP)
TEST                   

#install.packages('ff')            
library('ff')            
TEST2 <- ffdf(TEST)            
Error in ffdf(TEST) : ffdf components must be atomic ff objects

What can I do to make this work?


Answer 1:


Using

TEST2 <- as.ffdf(TEST)   

instead of

TEST2 <- ffdf(TEST)   

will work.

Explanation: as.ffdf converts your data.frame to an ffdf. If you really want to use ffdf directly, you need to supply atomic ff vectors, as the error message indicates. ff has no character type, so the character vector STATUS must be converted to a factor first. For the above example this would be

ffdf(UNIT = as.ff(UNIT),
     STATUS = as.ff(as.factor(STATUS)),
     TERMINATED = as.ff(TERMINATED),
     START = as.ff(START),
     STOP = as.ff(STOP))

See ?as.ffdf or ?ffdf, part of the ff package.
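As a minimal sketch of the conversion described above: the small data frame below is hypothetical (made-up rows with the same column types as TEST), not the asker's data. It assumes the ff package is installed. Since R 4.0, data.frame() keeps strings as character by default, so character columns probably need converting to factors before as.ffdf is called; the sketch does that explicitly.

```r
library(ff)

# Hypothetical three-row data frame with numeric, character, and Date columns
small <- data.frame(
  UNIT   = c(100, 200, 300),
  STATUS = c("ACTIVE", "INACTIVE", "ACTIVE"),
  START  = as.Date(c("2007-04-23", "2008-12-06", "2004-06-01")),
  stringsAsFactors = FALSE
)

# ff cannot store character vectors, so convert character columns to factors
is_chr <- vapply(small, is.character, logical(1))
small[is_chr] <- lapply(small[is_chr], factor)

# Convert the whole data frame in one step
small_ff <- as.ffdf(small)

class(small_ff)   # should include "ffdf"
dim(small_ff)     # rows and columns
small_ff[1:2, ]   # indexing pulls the selected rows back into RAM as a data.frame
```

The same factor conversion can be applied to TEST before calling as.ffdf(TEST) if STATUS turns out to be a character column in your R version.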

In real life, your data would come from other sources, such as CSV files or SQL databases, rather than from a data.frame already in R. See the ETLUtils package to get your data from SQL into ff easily.




Answer 2:


I tried to coerce the columns of the TEST data.frame to ff objects before the call to ffdf, but this did not work. Here is a workaround using read.csv.ffdf:

# row.names = FALSE keeps the row names from being read back as an extra column
write.csv(TEST, file='test.csv', row.names=FALSE)
TEST.ffd <- read.csv.ffdf(file='test.csv')


Source: https://stackoverflow.com/questions/15787221/how-can-i-apply-ffdf-to-non-atomic-data-frames
