How to use apply or sapply or lapply with ffdf?

China☆狼群 submitted on 2019-11-29 23:55:27

Question


Is there a way to use an apply-type construct directly on the columns of an ffdf object? I am trying to count the NAs in each column without converting it to a standard data frame. I can get the NA count for an individual column using:

sum(is.na(ffdf$columnname))

But is there a way to do this for all the columns in the data frame at once, something like:

lapply(ffdf, function(x){sum(is.na(x))})

When I run this I get:

$virtual
[1] 0

$physical
[1] 0

$row.names
[1] 0

I have not been able to find a special version of lapply or sapply in the ff documentation. Further, is there a simple way to count the NAs over the entire ffdf in one go?


Answer 1:


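An ffdf is basically a list with the elements "virtual", "physical" and "row.names", which is why lapply over the ffdf itself returns those three names instead of the columns. If you lapply over the physical element, which holds the column vectors, you get what you want.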

library(ffbase)
myffdf <- as.ffdf(iris)
# physical() returns the list of ff column vectors, so lapply visits each column
lapply(physical(myffdf), FUN = function(x) sum(is.na(x)))

As is.na and sum are generic, this will use is.na.ff and sum.ff from the ffbase package, so the data is loaded into RAM chunkwise, according to what your computer can handle.
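
The question also asks for a count over the entire ffdf in one go. A minimal sketch of one way to get it, assuming ffbase is installed and using a small made-up data frame (df, na_per_col, total_na and manual_count are illustrative names, not from the original post): sapply collapses the per-column counts into a named vector, and summing that vector gives the overall total. The loop at the end is only a rough illustration of what chunkwise processing looks like with ff's chunk(); it is not a claim about how ffbase implements sum.ff internally.

```r
library(ffbase)  # ffbase depends on ff, so ff is attached as well

# Hypothetical example data with known missing values
df <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6))
myffdf <- as.ffdf(df)

# Per-column NA counts as a named vector instead of a list
na_per_col <- sapply(physical(myffdf), function(x) sum(is.na(x)))

# Total NA count over the whole ffdf
total_na <- sum(na_per_col)

# Illustration only: count the NAs in one column chunk by chunk,
# so that only one chunk of the column is held in RAM at a time
x <- myffdf$a
manual_count <- 0
for (i in chunk(x)) {
  manual_count <- manual_count + sum(is.na(x[i]))
}
```

Using sapply rather than lapply is only a convenience here: the named vector is easier to sum and to print than a list of single numbers.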



Source: https://stackoverflow.com/questions/21885561/how-to-use-apply-or-sapply-or-lapply-with-ffdf
