From what I observe filecmp.dircmp is recursive, but inadequate for my needs, at least in py2. I want to compare two directories and all their contained files. Do
import filecmp
import os

def same(dir1, dir2):
    """Returns True if recursively identical, False otherwise
    """
    c = filecmp.dircmp(dir1, dir2)
    if c.left_only or c.right_only or c.diff_files or c.funny_files:
        return False
    else:
        same_so_far = True
        for i in c.common_dirs:
            # recurse into each subdirectory present on both sides
            same_so_far = same_so_far and same(os.path.join(dir1, i), os.path.join(dir2, i))
            if not same_so_far:
                break
        return same_so_far
Based on Python issue 12932 and the filecmp documentation, you may use the following example:
import os
import filecmp

# Force content comparison instead of comparing os.stat attributes only.
# Note: this changes the default of shallow for every caller of filecmp.cmpfiles.
filecmp.cmpfiles.__defaults__ = (False,)

def _is_same_helper(dircmp):
    # funny_files are files that could not be compared; treat them as a difference
    if dircmp.left_only or dircmp.right_only or dircmp.diff_files or dircmp.funny_files:
        return False
    for sub_dircmp in dircmp.subdirs.values():
        if not _is_same_helper(sub_dircmp):
            return False
    return True

def is_same(dir1, dir2):
    """
    Recursively compare two directories
    :param dir1: path to first directory
    :param dir2: path to second directory
    :return: True in case directories are the same, False otherwise
    """
    if not os.path.isdir(dir1) or not os.path.isdir(dir2):
        return False
    dircmp = filecmp.dircmp(dir1, dir2)
    return _is_same_helper(dircmp)
The report_full_closure() method is recursive:
comparison = filecmp.dircmp('/directory1', '/directory2')
comparison.report_full_closure()
Edit: After the OP's edit, I would say that it's best to just use the other functions in filecmp. I think os.walk is unnecessary; it is better to simply recurse through the lists produced by common_dirs, etc., although for very deeply nested directory trees a naive recursive implementation might hit a "maximum recursion depth exceeded" error.
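One way to sidestep the recursion-depth concern is to keep the pending directory pairs on an explicit stack. The following is only a minimal sketch of that idea; the function name trees_equal_iterative is made up for illustration, and it assumes that file contents (not just os.stat data) should be compared, which is why it calls filecmp.cmpfiles with shallow=False.

```python
import filecmp
import os


def trees_equal_iterative(dir1, dir2):
    """Return True if both trees contain the same entries and file contents.

    Hypothetical helper: visits pairs of directories using an explicit
    stack instead of recursive function calls.
    """
    stack = [(dir1, dir2)]
    while stack:
        left, right = stack.pop()
        dc = filecmp.dircmp(left, right)
        # Entries that exist on one side only, or whose types do not match.
        if dc.left_only or dc.right_only or dc.common_funny:
            return False
        # Compare file contents byte by byte rather than os.stat signatures.
        _, mismatch, errors = filecmp.cmpfiles(
            left, right, dc.common_files, shallow=False)
        if mismatch or errors:
            return False
        # Queue the subdirectories that are present on both sides.
        for name in dc.common_dirs:
            stack.append((os.path.join(left, name), os.path.join(right, name)))
    return True
```

The stack only ever holds directory pairs that still need a visit, so the nesting depth of the tree no longer matters for Python's call stack.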
Since a True or False result is all you want, if you have diff installed:
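from subprocess import Popen, PIPE

def are_dir_trees_equal(dir1, dir2):
    process = Popen(["diff", "-r", dir1, dir2], stdout=PIPE)
    process.communicate()  # drain stdout so a large diff cannot block the child
    return process.returncode == 0  # diff exits with 0 when nothing differs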
dircmp can be recursive: see report_full_closure.
As far as I know, dircmp does not offer a single function that tells you whether two directory trees are equal. It would be very easy to write your own, though: use left_only and right_only on dircmp to check that both directories contain the same names, check diff_files for files that exist on both sides but differ, and then recurse on the subdirs attribute.
filecmp.dircmp is the way to go. But by default it does not compare the content of files found at the same path in the two compared directories. Instead, filecmp.dircmp only looks at file attributes: two files whose os.stat signatures (type, size, modification time) match are reported as identical. Since dircmp is a class, you can fix that with a dircmp subclass that overrides its phase3 method, the one that compares files, so that content is compared instead of only os.stat attributes.
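import filecmp

class dircmp(filecmp.dircmp):
    """
    Compare the content of dir1 and dir2. In contrast with filecmp.dircmp, this
    subclass compares the content of files with the same path.
    """
    def phase3(self):
        """
        Find out differences between common files.
        Ensure we are using content comparison with shallow=False.
        """
        fcomp = filecmp.cmpfiles(self.left, self.right, self.common_files,
                                 shallow=False)
        self.same_files, self.diff_files, self.funny_files = fcomp

    # filecmp.dircmp resolves its lazy attributes through methodmap, which holds
    # the base-class functions, so it must point at the overridden phase3 as well.
    methodmap = dict(filecmp.dircmp.methodmap,
                     same_files=phase3, diff_files=phase3, funny_files=phase3)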
Then you can use this to return a boolean:
import os.path

def is_same(dir1, dir2):
    """
    Compare two directory trees content.
    Return False if they differ, True if they are the same.
    """
    compared = dircmp(dir1, dir2)
    if (compared.left_only or compared.right_only or compared.diff_files
            or compared.funny_files):
        return False
    for subdir in compared.common_dirs:
        if not is_same(os.path.join(dir1, subdir), os.path.join(dir2, subdir)):
            return False
    return True
In case you want to reuse this code snippet, it is hereby dedicated to the Public Domain or the Creative Commons CC0 at your choice (in addition to the default license CC-BY-SA provided by SO).
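To illustrate why the content comparison matters, here is a small sketch that builds two files with the same size and modification time but different bytes. The helper write_file and the file names are made up for this example; filecmp.cmp, os.utime and tempfile.mkdtemp are standard library calls.

```python
import filecmp
import os
import tempfile


def write_file(path, data, mtime):
    """Write data to path and pin its modification time (illustrative helper)."""
    with open(path, "w") as fh:
        fh.write(data)
    os.utime(path, (mtime, mtime))


workdir = tempfile.mkdtemp()
left = os.path.join(workdir, "left.txt")
right = os.path.join(workdir, "right.txt")

# Same size and same mtime, but different bytes.
write_file(left, "aaaa", 1000000000)
write_file(right, "bbbb", 1000000000)

# Default shallow=True: matching os.stat signatures count as "equal".
shallow_result = filecmp.cmp(left, right)
# shallow=False: the file contents are actually read and compared.
deep_result = filecmp.cmp(left, right, shallow=False)

print(shallow_result, deep_result)
```

On a typical filesystem the shallow comparison should report the files as equal while the deep comparison reports them as different, which is the gap that a content-based comparison closes.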