I have a long-running process which writes a lot of stuff to a file. The result should be everything or nothing, so I'm writing to a temporary file and renaming it to the rea
To write all or nothing to a file reliably:
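
    import os
    from contextlib import contextmanager
    from tempfile import NamedTemporaryFile

    if not hasattr(os, 'replace'):
        os.replace = os.rename  # NOTE: it won't work for existing files on Windows

    @contextmanager
    def FaultTolerantFile(name, mode='w'):  # text mode, so str can be written
        dirpath, filename = os.path.split(name)
        # use the same dir for os.replace() to work;
        # delete=False: the tmp file is renamed or removed explicitly below
        # (setting `f.delete = False` later has no effect on recent Pythons)
        f = NamedTemporaryFile(mode=mode, dir=dirpath, prefix=filename,
                               suffix='.tmp', delete=False)
        try:
            with f:  # closes the file on exit
                yield f
                f.flush()  # libc -> OS
                os.fsync(f.fileno())  # OS -> disc (note: on OSX it is not enough)
        except BaseException:
            os.remove(f.name)  # writing failed: drop the tmp file
            raise
        os.replace(f.name, name)  # tmp file is kept if `replace()` fails
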
See also "Is rename() without fsync() safe?" (mentioned by @Mihai Stan).
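The "on OSX it is not enough" comment and the linked question are both about durability: whether the data and the rename actually reach the disk. Here is a minimal sketch of what a stricter flush might look like, assuming a POSIX system. `full_fsync` and `fsync_directory` are hypothetical helper names and not part of the code above. `fcntl.F_FULLFSYNC` is only defined on platforms that support it (macOS), so the sketch checks for it and falls back to `os.fsync()`.

```python
import os

try:
    import fcntl  # POSIX only; not available on Windows
except ImportError:
    fcntl = None


def full_fsync(fd):
    """Ask the OS to push the file's data all the way to the storage device."""
    if fcntl is not None and hasattr(fcntl, 'F_FULLFSYNC'):
        # macOS: plain fsync() may leave data in the drive's write cache
        fcntl.fcntl(fd, fcntl.F_FULLFSYNC)
    else:
        os.fsync(fd)


def fsync_directory(dirpath):
    """Flush a directory entry (e.g. after a rename); POSIX only."""
    fd = os.open(dirpath or '.', os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
```

If you wanted this behaviour, you could call `full_fsync(f.fileno())` in place of `os.fsync()` and `fsync_directory(dirpath)` right after `os.replace()`. Whether the extra cost is worth it depends on how much you need the file to survive a power failure.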
Usage:

    with FaultTolerantFile('very_important_file') as file:
        file.write('either all ')
        file.write('or nothing is written')

To implement the missing os.replace() (it is not available before Python 3.3) on Windows, you could call MoveFileExW(src, dst, MOVEFILE_REPLACE_EXISTING) via the win32file or ctypes modules.
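A rough sketch of the ctypes variant, in case it helps. `replace_file` is a hypothetical name, and the argument types are my reading of the Win32 signature (`LPCWSTR`, `LPCWSTR`, `DWORD`, returning `BOOL`). On non-Windows systems the sketch simply falls back to `os.rename()`, which already overwrites the destination on POSIX.

```python
import ctypes
import os

MOVEFILE_REPLACE_EXISTING = 0x1  # value from the Win32 headers


def replace_file(src, dst):
    """Rename src to dst, overwriting dst if it already exists."""
    if os.name == 'nt':
        kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
        move = kernel32.MoveFileExW
        move.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p, ctypes.c_uint32]
        move.restype = ctypes.c_int  # BOOL: zero means failure
        if not move(src, dst, MOVEFILE_REPLACE_EXISTING):
            raise ctypes.WinError(ctypes.get_last_error())
    else:
        os.rename(src, dst)  # POSIX rename() replaces an existing file
```

You could then assign it as the fallback (`os.replace = replace_file`) instead of `os.rename`.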
In case of multiple threads, you could call queue.put(data) from different threads and write to the file in a dedicated thread:

    for data in iter(queue.get, None):
        file.write(data)

queue.put(None) breaks the loop.
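For completeness, here is a small self-contained sketch of that pattern for Python 3, where the module is called `queue` (so the queue object is named `q` here to avoid the clash). `writer` and `run_demo` are hypothetical names, and the demo writes to a plain file rather than through `FaultTolerantFile`.

```python
import queue
import threading


def writer(q, path):
    """Drain the queue into the file until the None sentinel arrives."""
    with open(path, 'w') as file:
        for data in iter(q.get, None):
            file.write(data)


def run_demo(path, n_threads=4):
    q = queue.Queue()
    writer_thread = threading.Thread(target=writer, args=(q, path))
    writer_thread.start()

    # each producer thread only puts data on the queue; it never touches the file
    producers = [
        threading.Thread(target=q.put, args=('line from thread %d\n' % i,))
        for i in range(n_threads)
    ]
    for t in producers:
        t.start()
    for t in producers:
        t.join()

    q.put(None)  # sentinel: breaks the loop in writer()
    writer_thread.join()
```

Only one thread ever writes, so no locking around the file is needed. The order of the lines depends on thread scheduling.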
As an alternative, you could use locks (threading, multiprocessing, filelock) to synchronize access:

    def write(self, data):
        with self.lock:
            self.file.write(data)
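The method above needs a class around it. A minimal sketch using `threading.Lock` might look like this, where `LockedFile` is a hypothetical name and `file` is any object with a `write()` method:

```python
import threading


class LockedFile:
    """Wrap a file object so that only one thread writes at a time."""

    def __init__(self, file):
        self.file = file
        self.lock = threading.Lock()

    def write(self, data):
        with self.lock:  # released automatically, even if write() raises
            self.file.write(data)
```

A `threading.Lock` only helps between threads of one process. For several processes you would need `multiprocessing.Lock` or a file-based lock instead.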