Question
My intuition was that writing a file line by line was possible in Python, in the same way that reading it is. I was wrong. Consider the following code.
with open("source_file.csv", "r") as source, open("output_file.csv", "w") as dest:
    while True:
        # Read at most 16 * 1024 characters per iteration
        buf = source.read(16 * 1024)
        if not buf:  # an empty string means end of file
            break
        dest.write(buf)
I watched the output file being written with $ tail -f output_file.csv. Data was displayed as it was being written. I then interrupted the Python script and noticed that none of that data had been written to disk.
Here's my understanding of this so far:
- The read operation is indeed done line-by-line here, without loading the whole file into memory. The write operation, on the other hand, loads the data line-by-line into memory before it is serialized and written to disk.

I thought data could be written to disk chunk-by-chunk, to be memory efficient. Otherwise, a big file could not be copied (memory would eventually be insufficient), which is (probably) what happened in my previous post.

My questions:
- Is my analysis correct? If not, I would like additional information.
- How do we write chunk-by-chunk to disk?
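
One way the chunk-by-chunk write asked about above might be sketched, under a few assumptions: the helper name copy_in_chunks and its sync parameter are hypothetical, and both files are opened in binary mode so the bytes are copied unchanged. Only one chunk is held in memory at a time. write() hands the data to Python's file buffer, which is passed on to the operating system as it fills, and the optional flush() plus os.fsync() calls ask for each chunk to be pushed to disk right away.

```python
import os


def copy_in_chunks(src_path, dst_path, chunk_size=16 * 1024, sync=False):
    """Copy src_path to dst_path one chunk at a time (hypothetical helper).

    Returns the number of bytes copied.
    """
    total = 0
    with open(src_path, "rb") as source, open(dst_path, "wb") as dest:
        while True:
            buf = source.read(chunk_size)  # at most chunk_size bytes in memory
            if not buf:  # empty bytes object means end of file
                break
            dest.write(buf)
            total += len(buf)
            if sync:
                dest.flush()             # empty Python's buffer into the OS
                os.fsync(dest.fileno())  # ask the OS to commit it to disk
    return total
```

Calling copy_in_chunks("source_file.csv", "output_file.csv", sync=True) would copy the file from the question. Syncing after every chunk is slow and is only worth it when partial output must survive an interruption. The standard library's shutil.copyfileobj performs a similar chunked loop without the fsync step.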
Source: https://stackoverflow.com/questions/62681614/how-do-i-write-a-file-in-python-without-loading-it-into-memory