Question
I would like to convert a data.frame to an ff object with as.ffdf, as described here:
df.apr = as.data.frame(df.apr)          # from data.table to data.frame
cols = df.apr[1, ]
cols = sapply(cols, class)              # per-column classes, e.g. "numeric", "character"
df_apr = as.ffdf(df.apr, vmode = cols)
This gives an error:
Error in ff(initdata = initdata, length = length, levels = levels, ordered = ordered, :
  vmode 'numeric' not implemented
Without the vmode argument, the following error is given instead:
Error in ff(initdata = initdata, length = length, levels = levels, ordered = ordered, :
  vmode 'character' not implemented
Writing the data out to a text file and then reading it directly into ff does work, however:
write.table(df.apr, file = 'df_apr.txt', sep = '\t', row.names = FALSE)
# write.table writes a header row by default, so read it back with header = TRUE
# and the same separator
df.apr.ff = read.table.ffdf(file = 'df_apr.txt', header = TRUE, sep = '\t', VERBOSE = TRUE)
But this is time-consuming [and clumsy]. Is there a better way?
Answer 1:
If you want to know all the vmodes that can be used in ff, type the following at the console:
require(ff)
.vimplemented
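As a small follow-up (a minimal sketch, assuming .vimplemented is a named logical vector keyed by vmode name, as in current ff versions), you can list just the implemented vmodes:
require(ff)
# keep only the vmodes that ff has actually implemented
# (assumes .vimplemented is a named logical vector, TRUE where implemented)
names(.vimplemented)[.vimplemented]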
You'll see that 'numeric' and 'character' are not among them: numerics are stored as doubles, and characters as factors. So you really don't need to specify the vmodes yourself. As long as the character columns are coded as factors, you can call as.ffdf directly on your data.frame. So this will work:
df.apr=as.data.frame(df.apr, stringsAsFactors=TRUE)
df_apr=as.ffdf(df.apr)
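As a quick sanity check (a hedged sketch, assuming the ff accessors physical() and vmode(), which return an ffdf's column vectors and a vector's storage mode respectively), you can confirm how the columns ended up being stored:
# each column of an ffdf is an ff vector whose storage vmode can be inspected;
# numeric columns should report "double", while factor columns are stored as
# integer codes (with the levels kept separately)
sapply(physical(df_apr), vmode)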
FYI: if your data is coming from flat files, consider using read.table.ffdf; if it is coming from an SQL data source, you can use read.dbi.ffdf or read.odbc.ffdf from the ETLUtils package; and if it is coming from Hadoop through Hive, you can use read.jdbc.ffdf, also from ETLUtils.
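For the flat-file route, here is a hedged sketch of what that typically looks like (the file name, column classes, and chunk sizes are illustrative assumptions, not taken from the question; first.rows and next.rows control how many rows read.table.ffdf reads per chunk):
library(ff)
# read a tab-separated file chunk-wise into an ffdf; supplying colClasses
# avoids re-guessing column types on every chunk (values here are made up)
df.apr.ff <- read.table.ffdf(file = "df_apr.txt", header = TRUE, sep = "\t",
                             colClasses = c("integer", "double", "factor"),
                             first.rows = 100000, next.rows = 500000,
                             VERBOSE = TRUE)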
Source: https://stackoverflow.com/questions/17251064/convert-data-frame-to-ff