Recursively compare two directories to ensure they have the same files and subdirectories

猫巷女王i 2020-12-23 20:49

From what I observe filecmp.dircmp is recursive, but inadequate for my needs, at least in py2. I want to compare two directories and all their contained files. Do

11 Answers
  •  佛祖请我去吃肉
    2020-12-23 21:44

    filecmp.dircmp is the way to go, but by default it does not compare the content of files found at the same path in the two directories. It performs a shallow comparison based on each file's os.stat signature (file type, size, and modification time) and treats files with matching signatures as equal. Since dircmp is a class, you can fix that with a subclass that overrides its phase3 method, which compares the common files, so that content is compared instead of only os.stat attributes.

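    import filecmp
    
    class dircmp(filecmp.dircmp):
        """
        Compare the content of dir1 and dir2. In contrast with filecmp.dircmp, this
        subclass compares the content of files with the same path.
        """
        def phase3(self):
            """
            Find out differences between common files.
            Ensure we are using content comparison with shallow=False.
            """
            fcomp = filecmp.cmpfiles(self.left, self.right, self.common_files,
                                     shallow=False)
            self.same_files, self.diff_files, self.funny_files = fcomp
    
        # dircmp fills these attributes lazily through its methodmap table, which
        # still points at the base-class phase3. Remap them to the override above,
        # otherwise reading diff_files would silently use the shallow comparison.
        methodmap = dict(filecmp.dircmp.methodmap,
                         same_files=phase3, diff_files=phase3, funny_files=phase3)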
    

    Then you can use this to return a boolean:

    import os.path
    
    def is_same(dir1, dir2):
        """
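        Compare the content of two directory trees.
        Return False if they differ, True if they are the same.
        """
        compared = dircmp(dir1, dir2)
        # common_funny holds names whose type differs between the two sides
        # (for example a file on one side and a directory on the other).
        if (compared.left_only or compared.right_only or compared.diff_files
                or compared.funny_files or compared.common_funny):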
            return False
        for subdir in compared.common_dirs:
            if not is_same(os.path.join(dir1, subdir), os.path.join(dir2, subdir)):
                return False
        return True
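
    To see why the content comparison matters, here is a minimal, self-contained sketch. It builds two small trees whose only file has the same size and modification time but different bytes, then compares them with the stock filecmp.dircmp and with a content-comparing subclass. The names ContentDircmp, make_tree and demo are hypothetical, chosen for illustration only, and the fixed timestamp is an assumption made so that the shallow signatures match.

```python
import filecmp
import os
import tempfile


class ContentDircmp(filecmp.dircmp):
    """Hypothetical dircmp variant that compares file contents."""

    def phase3(self):
        # shallow=False forces a byte-by-byte comparison of common files.
        fcomp = filecmp.cmpfiles(self.left, self.right, self.common_files,
                                 shallow=False)
        self.same_files, self.diff_files, self.funny_files = fcomp

    # Route the lazily computed attributes to the override above.
    methodmap = dict(filecmp.dircmp.methodmap,
                     same_files=phase3, diff_files=phase3, funny_files=phase3)


def is_same(dir1, dir2):
    """Return True if both trees hold the same names and file contents."""
    compared = ContentDircmp(dir1, dir2)
    if (compared.left_only or compared.right_only or compared.diff_files
            or compared.funny_files or compared.common_funny):
        return False
    return all(is_same(os.path.join(dir1, name), os.path.join(dir2, name))
               for name in compared.common_dirs)


def make_tree(root, payload):
    """Create root/sub/data.txt holding `payload` with a fixed timestamp."""
    sub = os.path.join(root, "sub")
    os.makedirs(sub)
    path = os.path.join(sub, "data.txt")
    with open(path, "w") as handle:
        handle.write(payload)
    # Assumption for the demo: identical atime/mtime on every file.
    os.utime(path, (1_000_000_000, 1_000_000_000))


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        left, right, copy = (os.path.join(tmp, name)
                             for name in ("left", "right", "copy"))
        make_tree(left, "aaaa")
        make_tree(right, "bbbb")  # same size and mtime, different bytes
        make_tree(copy, "aaaa")
        # Differences reported by the stock class for the "sub" directory.
        stock_diff = filecmp.dircmp(left, right).subdirs["sub"].diff_files
        return stock_diff, is_same(left, right), is_same(left, copy)


if __name__ == "__main__":
    print(demo())
```

    In this setup the stock class should report no differing files for the subdirectory, because the os.stat signatures match, while is_same tells the modified tree apart from the identical copy.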
    

    In case you want to reuse this code snippet, it is hereby dedicated to the Public Domain or the Creative Commons CC0 at your choice (in addition to the default license CC-BY-SA provided by SO).
