In R, data is usually loaded into RAM. Are there any packages that keep the data on disk rather than loading it into RAM?
Check out the bigmemory package, along with related packages like bigtabulate, bigalgebra, biganalytics, and more. There's also ff, though I don't find it as user-friendly as the bigmemory suite. The bigmemory suite was reportedly motivated in part by the difficulty of using ff. I like it because it required very few changes to my code to access a big.matrix object: it can be manipulated in almost exactly the same way as a standard matrix, so my code is very reusable.
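For example, a file-backed big.matrix lives on disk and is paged in on demand. A minimal sketch (the file names and dimensions here are just placeholders):

    library(bigmemory)

    # Create a file-backed matrix: the data lives on disk, not in RAM
    x <- filebacked.big.matrix(nrow = 1e6, ncol = 10, type = "double",
                               backingfile = "data.bin",
                               descriptorfile = "data.desc")

    # Use ordinary matrix syntax; only the touched parts are paged in
    x[1, ] <- rnorm(10)
    mean(x[, 1])

    # A later session (or another process) can re-attach the same file
    y <- attach.big.matrix("data.desc")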
There's also support for HDF5 via NetCDF-4, in packages like RNetCDF and ncdf. This is a popular, multi-platform, multi-language method for efficient storage and access of large data sets.
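As a rough illustration with RNetCDF (the file and variable names below are made up), you can write a variable once and later read back only the slice you need, rather than the whole data set:

    library(RNetCDF)

    # Write a large variable to a NetCDF file
    nc <- create.nc("big.nc")
    dim.def.nc(nc, "obs", 1e6)
    var.def.nc(nc, "x", "NC_DOUBLE", "obs")
    var.put.nc(nc, "x", rnorm(1e6))
    close.nc(nc)

    # Later: read just a 1000-element slice from disk
    nc <- open.nc("big.nc")
    chunk <- var.get.nc(nc, "x", start = 1, count = 1000)
    close.nc(nc)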
If you want basic memory-mapping functionality, look at the mmap package.
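A minimal sketch of what that looks like (the binary file and its layout are assumed for the example):

    library(mmap)

    # Write some doubles to a raw binary file for demonstration
    writeBin(rnorm(1e6), "vec.bin")

    # Map the file as a vector of doubles; pages load on demand
    m <- mmap("vec.bin", mode = real64())
    m[1:5]        # index like an ordinary vector
    length(m)
    munmap(m)     # release the mapping when done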