Threadsafe and fault-tolerant file writes

傲寒 2020-12-06 06:57

I have a long-running process that writes a lot of data to a file. The result should be all or nothing, so I'm writing to a temporary file and renaming it to the rea

4 answers
  •  眼角桃花
    2020-12-06 07:13

    To write all or nothing to a file reliably:

    import os
    from contextlib import contextmanager
    from tempfile import NamedTemporaryFile

    if not hasattr(os, 'replace'):  # Python < 3.3
        os.replace = os.rename  # NOTE: it won't work for existing files on Windows

    @contextmanager
    def FaultTolerantFile(name, mode='w'):
        dirpath, filename = os.path.split(name)
        # use the same dir so that os.replace() stays on one filesystem;
        # delete=False because setting `f.delete` later has no effect on Python 3
        f = NamedTemporaryFile(mode=mode, dir=dirpath or '.', prefix=filename,
                               suffix='.tmp', delete=False)
        try:
            with f:  # closes the tmp file on exit
                yield f
                f.flush()             # Python/libc buffers -> OS
                os.fsync(f.fileno())  # OS -> disk (note: on macOS it is not enough)
        except BaseException:
            os.unlink(f.name)  # writing failed: discard the partial tmp file
            raise
        os.replace(f.name, name)  # the tmp file is kept if `replace()` fails
    

    See also "Is rename() without fsync() safe?" (mentioned by @Mihai Stan).
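
    The durability caveats noted above (fsync() not being enough on macOS, and the rename itself needing to reach the disk) could be handled roughly as follows. This is only a sketch: fsync_file and fsync_dir are hypothetical helper names, not part of the answer's code. It assumes that F_FULLFSYNC is exposed by the fcntl module on macOS and that syncing the parent directory is how a rename is persisted on POSIX filesystems.

```python
import errno
import os

try:
    import fcntl  # POSIX only
except ImportError:  # e.g. on Windows
    fcntl = None


def fsync_file(f):
    """Flush f's buffers and ask the OS to push its data to the storage device."""
    f.flush()
    fd = f.fileno()
    if fcntl is not None and hasattr(fcntl, "F_FULLFSYNC"):
        try:
            # macOS: plain fsync() may leave data in the drive cache
            fcntl.fcntl(fd, fcntl.F_FULLFSYNC)
            return
        except OSError:
            pass  # not supported on this filesystem: fall back to fsync()
    os.fsync(fd)


def fsync_dir(dirpath):
    """Sync the directory entry so that a completed rename survives a crash."""
    if os.name != "posix":
        return  # directories cannot be opened like this on Windows
    fd = os.open(dirpath or ".", os.O_RDONLY)
    try:
        os.fsync(fd)
    except OSError as e:
        if e.errno != errno.EINVAL:  # some filesystems reject fsync on a directory
            raise
    finally:
        os.close(fd)
```

    In this sketch, fsync_file(f) would take the place of the flush()/fsync() pair before the rename, and fsync_dir(dirpath) would be called right after os.replace().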

    Usage

    with FaultTolerantFile('very_important_file') as file:
        file.write('either all ')
        file.write('or nothing is written')
    

    On Windows, with a Python version that lacks os.replace() (it was added in Python 3.3), you could implement it by calling MoveFileExW(src, dst, MOVEFILE_REPLACE_EXISTING) via the win32file or ctypes modules.
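
    A possible ctypes version of that call is sketched below. The function name replace is a placeholder chosen for illustration, and the Windows branch is an untested assumption about how the call would be wired up; on other platforms the sketch simply falls back to os.rename().

```python
import os

MOVEFILE_REPLACE_EXISTING = 0x1  # flag value from the Windows API headers


def replace(src, dst):
    """Rename src to dst, overwriting dst if it already exists."""
    if os.name != "nt":
        os.rename(src, dst)  # POSIX rename() already replaces dst
        return
    # Windows only: imported here so the module still loads elsewhere
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.MoveFileExW.argtypes = (wintypes.LPCWSTR, wintypes.LPCWSTR,
                                     wintypes.DWORD)
    kernel32.MoveFileExW.restype = wintypes.BOOL
    if not kernel32.MoveFileExW(src, dst, MOVEFILE_REPLACE_EXISTING):
        raise ctypes.WinError(ctypes.get_last_error())
```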

    With multiple threads, you could call queue.put(data) from the different threads and write to the file in a single dedicated thread:

     for data in iter(queue.get, None):
         file.write(data)
    

    queue.put(None) breaks the loop.
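
    Put together, the queue approach might look like the sketch below. The names writer, produce and run, and the generated lines, are made up for illustration; only the iter(q.get, None) loop and the None sentinel come from the snippet above.

```python
import queue
import threading


def writer(q, file):
    """Drain q into file until the None sentinel arrives."""
    for data in iter(q.get, None):
        file.write(data)


def produce(q, worker_id, count):
    """Hypothetical producer: it only enqueues data and never touches the file."""
    for i in range(count):
        q.put("worker %d line %d\n" % (worker_id, i))


def run(path, workers=4, count=100):
    q = queue.Queue()
    with open(path, "w") as file:
        w = threading.Thread(target=writer, args=(q, file))
        w.start()
        producers = [threading.Thread(target=produce, args=(q, n, count))
                     for n in range(workers)]
        for t in producers:
            t.start()
        for t in producers:
            t.join()
        q.put(None)  # sentinel: every producer has finished
        w.join()     # wait until the writer has drained the queue
```

    Because only one thread ever calls file.write(), no lock is needed around the file itself; the queue does the synchronization.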

    As an alternative you could use locks (threading, multiprocessing, filelock) to synchronize access:

    def write(self, data):
        with self.lock:
            self.file.write(data)
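
    For context, the write() method above could live in a small wrapper class like the one sketched here. LockedWriter and write_from_threads are hypothetical names. Note that threading.Lock only coordinates threads inside one process; separate processes would need multiprocessing or filelock instead.

```python
import threading


class LockedWriter:
    """Hypothetical wrapper that serializes write() calls from several threads."""

    def __init__(self, file):
        self.file = file
        self.lock = threading.Lock()

    def write(self, data):
        with self.lock:  # only one thread writes at a time
            self.file.write(data)


def write_from_threads(file, n_threads, n_lines):
    """Demo: several threads share one LockedWriter around the same file."""
    out = LockedWriter(file)

    def work(thread_id):
        for i in range(n_lines):
            out.write("thread %d line %d\n" % (thread_id, i))

    threads = [threading.Thread(target=work, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```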
    
