Question:
We have 20 files named file*.txt, all in one directory:
file1.txt
file2.txt
...
file20.txt
In the same directory we have other files too, which we need to ignore:
someotherfile.csv
somemore.txt
etc.pdf
I need to find out whether the contents of all these files are the same. I tried diff, which obviously failed, since diff expects exactly two operands while the glob expands to all 20 file names:
diff -r ./file*.txt ./file*.txt
Answer 1:
If you just want a quick visual "are they the same" answer, I'd use:
md5sum file*.txt
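One way to make the visual scan easier with many files: since md5sum prints the checksum first on each line, a plain sort groups files with identical contents next to each other, so the odd one out stands out:
md5sum file*.txt | sort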
Answer 2:
A relatively simple one-liner might suffice. Tested on OSX:
md5 -q file*.txt | sort -u
If you see more than one line of output, the files are not all the same.
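A hedged extension of the same idea, turning the line count into an explicit verdict (still assuming the OSX md5 command, whose -q flag prints only the checksum):
if [ "$(md5 -q file*.txt | sort -u | wc -l)" -eq 1 ]; then
    echo "all files have the same contents"
else
    echo "contents differ"
fi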
Answer 3:
If you are just comparing two files, then try:
diff "$source_file" "$dest_file" # without -q
or
cmp "$source_file" "$dest_file" # without -s
in order to see the supposed differences.
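For scripting rather than inspecting, you can also rely on cmp's exit status directly (it exits 0 when the files are byte-for-byte identical); a minimal sketch using the same variables:
if cmp -s "$source_file" "$dest_file"; then
    echo "files are identical"
else
    echo "files differ"
fi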
You can also try md5sum:
md5sum "$source_file" "$dest_file"
If you have any other suggestions, please do reply!
Answer 4:
Put this script in the directory that contains the file*.txt files and run it:
#!/bin/bash
FILES=./file*.txt
# Compare every file against every other file
# (note: each pair is visited twice, once in each order)
for filename in $FILES; do
    for other in $FILES; do
        if [ "$filename" != "$other" ]; then
            # cmp -s is silent and exits 0 when the files are identical
            if cmp -s "$filename" "$other"; then
                echo "$filename $other are same"
            fi
        fi
    done
done
It will print both "file1.txt file3.txt are same" and "file3.txt file1.txt are same". You can figure out how to avoid that; one way is sketched below.
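A hedged sketch of one such fix: only compare a pair when the first name sorts before the second, so each pair is visited exactly once (this relies on bash's [[ ... < ... ]] string comparison):
#!/bin/bash
for filename in ./file*.txt; do
    for other in ./file*.txt; do
        # Report each identical pair only once, in lexicographic order
        if [[ "$filename" < "$other" ]] && cmp -s "$filename" "$other"; then
            echo "$filename $other are same"
        fi
    done
done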
Answer 5:
Linux seems to have a different set of tools on board than OSX. The md5 approach above looks nice, but doesn't work there: the Linux counterpart is md5sum, which also prints the name of the checked file on each line.
My version on RH Linux:
Create equal files first:
for i in `seq -w 1 20` ; do echo one > test${i}.txt ; done
Then run this:
md5sum *.txt | cut -d ' ' -f 1 | sort -u
Appending | wc -l gives you the number of distinct checksums; 1 means all files match. I'd personally go this way.
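Putting the pieces together as a hedged sketch (the echoed messages are illustrative, not part of any answer above):
distinct=$(md5sum *.txt | cut -d ' ' -f 1 | sort -u | wc -l)
if [ "$distinct" -eq 1 ]; then
    echo "all files have identical contents"
else
    echo "$distinct distinct contents found"
fi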
Source: https://stackoverflow.com/questions/29670259/compare-files-with-each-other-within-the-same-directory