Question
I am writing a housekeeping script and have files within a directory that I want to clean up. I want to move files from a source directory to another directory; there are many sub-directories, so some of the files could be identical. What I want to do is either run cmp or md5sum on each file: if there are no duplicates, move them all; if some files are the same, move only one copy.
I have the move part working correctly, as follows:
find /path/to/source -name "IMAGE_*.JPG" -exec mv '{}' /path/to/destination \;
I am assuming that I will have to loop through my directory, so I am thinking of something like this:
for files in /path/to/source
do
  if -name "IMAGE_*.JPG"
  then
    md5sum (or cmp) $files
    ...stuck here (I am worried about how this method will be able to compare all the files against each other and how I would filter them out)...
    then just do the mv to finish.
Thanks in advance.
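Answer 1:
find . -type f -exec md5sum {} \; | sort | uniq -w32 -d
That'll print one line (the hash plus one file name) for every md5 hash that occurs more than once. The -w32 option (GNU uniq) restricts the comparison to the 32-character hash; without it the file names make every line unique and nothing is printed. Then it's just a matter of figuring out which other files produced those duplicate hashes, for example by replacing -d with --all-repeated=separate to list every member of each group.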
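To connect the hash listing to the move step from the question, here is a minimal sketch rather than a definitive implementation. It assumes md5sum and mktemp are available and that the destination lies outside the source tree. The function name move_unique, its arguments, and the numeric-prefix renaming scheme are hypothetical choices made for illustration. It moves the first file found for each distinct hash and leaves later copies in place.

```shell
# Sketch: move one copy of each distinct file (compared by MD5 hash) from a
# source tree into a flat destination directory.
# move_unique and its arguments are illustrative names, not an existing tool.
# The body runs in a subshell so its variables do not leak into the caller.
move_unique() (
    src=$1
    dest=$2
    pattern=${3:-IMAGE_*.JPG}

    mkdir -p -- "$dest" || exit 1
    seen=$(mktemp) || exit 1        # temp file holding hashes already moved

    # Let find pass the file names as arguments to an inline script, so that
    # names containing spaces or newlines are handled safely.
    find "$src" -type f -name "$pattern" -exec sh -c '
        seen=$1
        dest=$2
        shift 2
        for file do
            # Reading from stdin yields "<hash>  -"; keep only the hash.
            hash=$(md5sum < "$file") || continue
            hash=${hash%% *}
            # Content already moved once: leave this copy in the source tree.
            if grep -qxF -- "$hash" "$seen"; then
                continue
            fi
            printf "%s\n" "$hash" >> "$seen"
            # Files with different content can share a base name across
            # sub-directories, so add a numeric prefix instead of overwriting.
            base=${file##*/}
            target=$dest/$base
            n=1
            while [ -e "$target" ]; do
                target=$dest/${n}_$base
                n=$((n + 1))
            done
            mv -- "$file" "$target"
        done
    ' sh "$seen" "$dest" {} +

    rm -f -- "$seen"
)
```

It could be called as move_unique /path/to/source /path/to/destination. Because an MD5 match is only strong evidence that two files are equal, a cautious variant could confirm each suspected duplicate with cmp before skipping it.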
Answer 2:
There's a tool designed for this purpose, fdupes:
fdupes -r dir/
Answer 3:
dupmerge is another such tool...
Source: https://stackoverflow.com/questions/20131492/bash-checking-if-files-are-duplicates-within-a-directory