Preferred block size when reading/writing big binary files


Let the OS make the decision for you. Use the mmap module:

https://docs.python.org/3.4/library/mmap.html

It uses your OS's underlying memory-mapping mechanism to map the contents of a file into your process's address space.

Be aware that you can't map files much larger than about 2 GB with 32-bit Python, since the mapping has to fit into the process's address space, so be sure to use the 64-bit version if you decide to go this route.
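
If you're not sure which build you're running, a quick check (just a minimal sketch using the standard sys module) is:

import sys

# True on a 64-bit build of Python, False on a 32-bit build
is_64bit = sys.maxsize > 2**32
print('64-bit Python' if is_64bit else '32-bit Python')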

For example:

import mmap

f1 = open('input_file', 'r+b')
m1 = mmap.mmap(f1.fileno(), 0)   # map the whole input file
f2 = open('out_file', 'a+b')     # out_file must already be > 0 bytes so it can be mapped
m2 = mmap.mmap(f2.fileno(), 0)
m2.resize(len(m1))               # grow the output mapping to the input's size
m2[:] = m1                       # copy input_file to out_file
m2.flush()                       # flush the changes back to disk

Note that you never had to call any read() functions or decide how many bytes to bring into RAM. This example just copies one file into another, but as you said in your example, you can do whatever processing you need in between. Note that while the entire file is mapped into your process's address space, that doesn't mean it has actually been copied into RAM; it will be paged in piecewise, at the discretion of the OS.
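
For instance, if you wanted to transform the data on the way through instead of copying it verbatim, you could work on the mapped regions a slice at a time. Here's a rough sketch; the 1 MiB block size and the swapcase() transform are just placeholders for whatever processing you actually need, and as above out_file has to be non-empty before it can be mapped:

import mmap

BLOCK = 1024 * 1024  # an arbitrary slice size; the OS still pages data in as it sees fit

with open('input_file', 'r+b') as f1, open('out_file', 'a+b') as f2:
    m1 = mmap.mmap(f1.fileno(), 0)
    m2 = mmap.mmap(f2.fileno(), 0)
    m2.resize(len(m1))
    for offset in range(0, len(m1), BLOCK):
        chunk = m1[offset:offset + BLOCK]          # slicing an mmap returns bytes
        # placeholder transform: swap the case of ASCII letters
        m2[offset:offset + len(chunk)] = chunk.swapcase()
    m2.flush()
    m2.close()
    m1.close()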
