Comparing two directories with subdirectories to find any changes?

Submitted by 折月煮酒 on 2019-12-11 05:29:59

Question


For starters, I've only been playing with Python for about two weeks now and I'm relatively new to its processes. I'm trying to create a script that compares two directories, including their subdirectories, and prints out ANY changes. I've read articles on here about using os.walk to walk the directories, and I've managed to write a script that prints all the files in a directory and its subdirectories in an understandable manner. I've also read on here and learned how to compare two directories, but it only compares one level deep.

import os
x = 'D:\\xfiles'
y = 'D:\\yfiles'
q= [ filename for filename in x if filename not in y ]
print q 

Obviously, that does not do what I want it to. This, however, lists all files and all directories.

import os
x = 'D:\\xfiles'
x1 = os.walk(x)
for dirName, subdirList, fileList in x1:
    print('Directory: %s' % dirName)
    for fname in fileList:
        # each file name is printed with a leading backslash
        print('\\%s' % fname)

How do I combine them and get it to work?


Answer 1:


I guess the best way to go would be an external program, as @Robᵩ suggests in the comments.

Using Python, I would recommend doing the following:

import os

def fileIsSame(right, left, path):
    # "Same" here only means that a file with the same relative path
    # exists under left; contents and dates are not compared.
    relative = os.path.relpath(path, right)
    return os.path.exists(os.path.join(left, relative))

def compare(right, left):
    difference = []
    # os.walk already descends into subdirectories, so no explicit
    # recursion is needed
    for root, dirs, files in os.walk(right):
        for name in files:
            path = os.path.join(root, name)
            if not fileIsSame(right, left, path):
                # count file as difference
                difference.append(path)

    return difference

This approach lacks a proper fileIsSame function, one that would make sure the files are the same by content or by modification date; as written, it only checks that a file with the matching path exists. Be sure to handle paths correctly as well (I'm not sure this variant does). The algorithm requires you to specify full paths.

Usage example:

print(compare(r'c:\test', r'd:\copy_of_test'))

If the second folder is a copy of the first, the differences in the paths themselves (different drive letter and folder name) are ignored, and the output will be [].
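
As noted above, checking only that a path exists will miss files whose contents have changed. Here is a minimal sketch of a content-aware check, assuming a byte-by-byte comparison with the standard library's filecmp.cmp is acceptable for the file sizes involved. The names files_match and compare_contents are made up for this illustration and are not part of the code above.

```python
import filecmp
import os

def files_match(right, left, path):
    """Return True if the counterpart of path under left exists with identical contents."""
    # path lies under right; rebuild the same relative path under left
    counterpart = os.path.join(left, os.path.relpath(path, right))
    if not os.path.isfile(counterpart):
        return False
    # shallow=False forces a byte-by-byte comparison instead of trusting os.stat() data
    return filecmp.cmp(path, counterpart, shallow=False)

def compare_contents(right, left):
    """List files under right that are missing from left or differ in content."""
    difference = []
    for root, dirs, files in os.walk(right):
        for name in files:
            path = os.path.join(root, name)
            if not files_match(right, left, path):
                difference.append(path)
    return difference
```

Like the original, this only walks the first tree, so files that exist only in the second folder are not reported; calling it a second time with the arguments swapped would cover that case.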




Answer 2:


Write a function to aggregate your listing.

import os

def listfiles(path):
    files = []
    for dirName, subdirList, fileList in os.walk(path):
        for fname in fileList:
            # store each path relative to the root so the two trees are comparable
            files.append(os.path.relpath(os.path.join(dirName, fname), path))
    return files

x = listfiles('D:\\xfiles')
y = listfiles('D:\\yfiles')

You could use a list comprehension to extract the files that are in x but not in y.

q = [filename for filename in x if filename not in y]

But using sets is much more efficient and flexible.

files_only_in_x = set(x) - set(y) 
files_only_in_y = set(y) - set(x)
files_only_in_either = set(x) ^ set(y)
files_in_both = set(x) & set(y)
all_files = set(x) | set(y)
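
The set operations above can be wrapped into one helper so that each tree is walked and converted to a set only once. This is just a sketch: relative_files and summarize are hypothetical names, and it compares file paths only, not file contents.

```python
import os

def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dir_name, subdirs, file_names in os.walk(root):
        for fname in file_names:
            found.add(os.path.relpath(os.path.join(dir_name, fname), root))
    return found

def summarize(x_root, y_root):
    """Group relative file paths by the tree(s) they occur in."""
    x = relative_files(x_root)
    y = relative_files(y_root)
    return {
        'only_in_x': sorted(x - y),   # present in x, missing from y
        'only_in_y': sorted(y - x),   # present in y, missing from x
        'in_both': sorted(x & y),     # same relative path in both trees
    }
```

Calling summarize('D:\\xfiles', 'D:\\yfiles') would return a dictionary of three sorted lists, which is easier to print or loop over than five separate set expressions.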



Answer 3:


import os

def ls(path):
    # collect every subdirectory and file under path, relative to path
    entries = []
    for base, sub_f, files in os.walk(path):
        for sub in sub_f:
            entries.append(os.path.relpath(os.path.join(base, sub), path))

        for file in files:
            entries.append(os.path.relpath(os.path.join(base, file), path))
    entries.sort()
    return entries

def folder_diff(folder1_path, folder2_path):
    folder1_list = ls(folder1_path)
    folder2_list = ls(folder2_path)
    # entries only in folder1, followed by entries only in folder2
    diff = [item for item in folder1_list if item not in folder2_list]
    diff.extend([item for item in folder2_list if item not in folder1_list])
    return diff
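
Because this listing includes subdirectories as well as files, it also catches directories that are empty on one side. The flat list it returns does not say which folder an entry came from, though. Here is a small sketch of a variant that keeps the two sides apart; list_entries and labelled_diff are hypothetical names chosen for the illustration.

```python
import os

def list_entries(root):
    """Return relative paths of all subdirectories and files under root."""
    entries = set()
    for base, subdirs, files in os.walk(root):
        for name in subdirs + files:
            entries.add(os.path.relpath(os.path.join(base, name), root))
    return entries

def labelled_diff(folder1, folder2):
    """Return (only_in_folder1, only_in_folder2) as two sorted lists."""
    first = list_entries(folder1)
    second = list_entries(folder2)
    return sorted(first - second), sorted(second - first)
```

As with the code above, only names are compared, so a file that exists in both folders with different contents is not reported.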


Source: https://stackoverflow.com/questions/19251993/comparing-two-directories-with-subdirectories-to-find-any-changes
