Speed up RData load

清酒与你 2020-12-23 21:15

I've checked several related questions, such as this one:

How to load data quickly into R?

I'm quoting the relevant part of the top-rated answer:

3 Answers
  •  感动是毒
    2020-12-23 21:40

    save compresses by default, so it takes extra time to uncompress the file. Then it takes a bit longer to load the larger file into memory. Your pv example is just copying the compressed data to memory, which isn't very useful to you. ;-)

    UPDATE:

    I tested my theory and it was incorrect (at least on my Windows XP machine with a 3.3 GHz CPU and a 7200 RPM HDD). Loading compressed files is faster (probably because it reduces disk I/O).
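
A rough way to repeat this comparison on your own machine is sketched below. The data frame `df` and the temporary file names are made up for illustration; only `save`, `load`, `file.size`, and `system.time` from base R are used. Timings depend heavily on disk, CPU, and OS file caching, so treat the numbers as indicative only.

```r
# Hypothetical test data; substitute the objects you actually need to store.
set.seed(1)
df <- data.frame(x = rnorm(1e6),
                 y = sample(letters, 1e6, replace = TRUE))

f_comp <- tempfile(fileext = ".RData")  # compressed copy
f_raw  <- tempfile(fileext = ".RData")  # uncompressed copy

save(df, file = f_comp)                   # default: compress = TRUE (gzip)
save(df, file = f_raw, compress = FALSE)  # same data, no compression

# Compare file sizes on disk (in bytes).
sizes <- c(compressed = file.size(f_comp), uncompressed = file.size(f_raw))
print(sizes)

# Time each load into a scratch environment so the workspace is left alone.
t_comp <- system.time(load(f_comp, envir = new.env()))
t_raw  <- system.time(load(f_raw,  envir = new.env()))
print(c(compressed   = t_comp[["elapsed"]],
        uncompressed = t_raw[["elapsed"]]))
```

Running each load several times and discarding the first run helps separate disk effects from decompression cost.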

    The extra time is spent in RestoreToEnv (in saveload.c) and/or R_Unserialize (in serialize.c). So you could make loading faster by changing those files, or maybe by using saveRDS to individually save the objects in myGraph.RData, then somehow using readRDS across multiple R processes to load the data into shared memory...
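
A minimal sketch of the per-object part of that idea follows, assuming the contents of myGraph.RData can be treated as a named list. The objects `nodes` and `edges` are hypothetical stand-ins, and the files are read back sequentially. The multi-process, shared-memory part is not attempted here, and whether splitting the file helps at all would need to be measured.

```r
# Hypothetical stand-ins for the objects stored in myGraph.RData.
objs <- list(
  nodes = data.frame(id = 1:1000),
  edges = matrix(sample(1000, 4000, replace = TRUE), ncol = 2)
)

# Scratch directory for the example; use a real path in practice.
dir_rds <- tempfile("rds_")
dir.create(dir_rds)

# Write one .rds file per object, named after the object.
for (nm in names(objs)) {
  saveRDS(objs[[nm]], file.path(dir_rds, paste0(nm, ".rds")))
}

# Read the files back. readRDS returns the object itself, so the
# names are recovered from the file names.
files <- list.files(dir_rds, pattern = "\\.rds$", full.names = TRUE)
restored <- lapply(files, readRDS)
names(restored) <- sub("\\.rds$", "", basename(files))

str(restored, max.level = 1)
```

Unlike `load`, `readRDS` does not assign anything into the workspace, so each file can be read only when its object is actually needed.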
