Question
Is there a safe way to run a diff on two zip-compressed files? It seems this would not be deterministic, or is there a way to do it reliably?
Answer 1:
If you're using gzip, you can do something like this:
# diff <(zcat file1.gz) <(zcat file2.gz)
Answer 2:
Reliable: unzip both, diff.
I have no idea if that answer's good enough for your use, but it works.
Answer 3:
In general, you cannot avoid decompressing and then comparing. Different compressors will result in different DEFLATEd byte streams, which when INFLATEd result in the same original text. You cannot simply compare the DEFLATEd data, one to another. That will FAIL in some cases.
But in a ZIP scenario, a CRC32 is calculated and stored for each entry. So if you want to check files, you can simply compare the stored CRC32 associated with each DEFLATEd stream, keeping in mind that a CRC32 is not guaranteed to be unique: two different files can share the same value. Comparing the FileName and the CRC may fit your needs.
You would need a ZIP library that reads zip files and exposes those things as properties on the "ZipEntry" object. DotNetZip will do that for .NET apps.
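For readers not on .NET, the same name-plus-CRC comparison can be sketched with Python's standard zipfile module, whose ZipInfo objects expose the stored CRC and the uncompressed size. This is a stand-in for the DotNetZip approach described above, not that library's API, and the function names zip_manifest and compare_zips are made up for illustration. It only reads the archive directory, so nothing is decompressed.

```python
import zipfile


def zip_manifest(path):
    """Map each entry name to (stored CRC-32, uncompressed size)."""
    with zipfile.ZipFile(path) as zf:
        return {info.filename: (info.CRC, info.file_size)
                for info in zf.infolist()}


def compare_zips(path1, path2):
    """Return (names only in path1, names only in path2, names that differ)."""
    m1, m2 = zip_manifest(path1), zip_manifest(path2)
    only1 = sorted(m1.keys() - m2.keys())
    only2 = sorted(m2.keys() - m1.keys())
    # An entry counts as changed when its CRC or its size differs.
    changed = sorted(n for n in m1.keys() & m2.keys() if m1[n] != m2[n])
    return only1, only2, changed
```

If all three lists come back empty, the archives agree on names, sizes, and CRCs. As noted above, a CRC match is strong evidence of equal content rather than proof.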
Answer 4:
zipcmp compares the zip archives zip1 and zip2 and checks if they contain the same files, comparing their names, uncompressed sizes, and CRCs. File order and compressed size differences are ignored.
sudo apt-get install zipcmp
Answer 5:
This isn't particularly elegant, but you can use the FileMerge application that comes with Mac OS X developer tools to compare the contents of zip files using a custom filter.
Create a script ~/bin/zip_filemerge_filter.bash with contents:
#!/bin/bash
##
# List the size, CRC-32 checksum, and file path of each file in a zip archive,
# sorted in order by file path.
##
unzip -v -l "${1}" | cut -c 1-9,59-,49-57 | sort -k3
exit $?
Make the script executable (chmod +x ~/bin/zip_filemerge_filter.bash).
Open FileMerge, open the Preferences, and go to the "Filters" tab. Add an item to the list with: Extension:"zip", Filter:"~/bin/zip_filemerge_filter.bash $(FILE)", Display: Filtered, Apply*: No. (I've also added the filter for .jar and .war files.)
Then use FileMerge (or the command line "opendiff" wrapper) to compare two .zip files.
This won't let you diff the contents of files within the zip archives, but it will let you quickly see which files appear in only one archive and which files exist in both but have different content (i.e. different size and/or checksum).
Answer 6:
Beyond Compare has no problem with this.
Answer 7:
Actually gzip and bzip2 both come with dedicated tools for doing that.
With gzip:
$ zdiff file1.gz file2.gz
With bzip2:
$ bzdiff file1.bz2 file2.bz2
But keep in mind that for very large files, you might run into memory issues (I originally came here to find out about how to solve them, so I don't have the answer yet).
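On the memory concern, one option (an assumption on my part, not something zdiff itself does) is to skip the line diff and only ask whether the two files decompress to the same bytes, reading both streams in fixed-size chunks. A minimal sketch using Python's standard gzip module follows; the name gz_contents_equal and the 1 MiB default chunk size are arbitrary choices for illustration.

```python
import gzip


def gz_contents_equal(path1, path2, chunk_size=1 << 20):
    """Return True if two gzip files decompress to identical bytes."""
    with gzip.open(path1, "rb") as f1, gzip.open(path2, "rb") as f2:
        while True:
            # Only one chunk per file is held in memory at a time.
            c1 = f1.read(chunk_size)
            c2 = f2.read(chunk_size)
            if c1 != c2:
                return False
            if not c1:  # both streams ended at the same point
                return True
```

This gives a yes/no answer rather than a diff, so it is only a first check. For .bz2 files, bz2.open from the standard library should be usable the same way.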
Answer 8:
A Python solution for zip files:
import difflib
import zipfile

def diff(filename1, filename2):
    """Print the differences between two zip archives; return 0 if none, else 1."""
    differs = False
    # ZipFile accepts a path directly and opens it in binary mode.
    z1 = zipfile.ZipFile(filename1)
    z2 = zipfile.ZipFile(filename2)
    if len(z1.infolist()) != len(z2.infolist()):
        print("number of archive elements differ: {} in {} vs {} in {}".format(
            len(z1.infolist()), z1.filename, len(z2.infolist()), z2.filename))
        return 1
    for zipentry in z1.infolist():
        if zipentry.filename not in z2.namelist():
            print("no file named {} found in {}".format(zipentry.filename,
                                                        z2.filename))
            differs = True
        else:
            # Decode each entry as text so difflib can compare line by line.
            text1 = z1.read(zipentry.filename).decode(errors="replace")
            text2 = z2.read(zipentry.filename).decode(errors="replace")
            diff_lines = difflib.ndiff(text1.splitlines(True),
                                       text2.splitlines(True))
            delta = ''.join(x[2:] for x in diff_lines
                            if x.startswith('- ') or x.startswith('+ '))
            if delta:
                differs = True
                print("content for {} differs:\n{}".format(
                    zipentry.filename, delta))
    if not differs:
        print("all files are the same")
        return 0
    return 1
Use as
diff(filename1, filename2)
It compares files line-by-line in memory and shows changes.
Answer 9:
WinMerge (Windows only) has lots of features and one of them is:
- Archive file support using 7-Zip
Answer 10:
I found relief with this simple Perl script: diffzips.pl
It recursively diffs every zip file inside the original zip, which is especially useful for different Java package formats: jar, war, and ear.
zipcmp uses a simpler approach and doesn't recurse into nested zip archives.
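To illustrate what recursing into nested archives means, here is a minimal Python sketch. It is not a port of diffzips.pl: the names flatten and diff_nested, the suffix list, and the "!/" path separator are assumptions made for this example. It compares stored CRCs rather than file contents, and it loads each inner archive fully into memory.

```python
import io
import zipfile

# Entries with these suffixes are assumed to be valid zip archives themselves.
NESTED_SUFFIXES = (".zip", ".jar", ".war", ".ear")


def flatten(zf, prefix=""):
    """Map every entry path to its CRC-32, descending into nested archives."""
    entries = {}
    for info in zf.infolist():
        path = prefix + info.filename
        if info.filename.lower().endswith(NESTED_SUFFIXES):
            # Read the inner archive into memory and recurse into it.
            with zipfile.ZipFile(io.BytesIO(zf.read(info))) as inner:
                entries.update(flatten(inner, path + "!/"))
        else:
            entries[path] = info.CRC
    return entries


def diff_nested(path1, path2):
    """Return sorted paths that are missing on one side or differ in CRC."""
    with zipfile.ZipFile(path1) as z1, zipfile.ZipFile(path2) as z2:
        e1, e2 = flatten(z1), flatten(z2)
    return sorted(p for p in e1.keys() | e2.keys() if e1.get(p) != e2.get(p))
```

Comparing the inner entries instead of the inner archive's own CRC matters for Java packages: a rebuilt jar often differs byte for byte even when every class inside it is unchanged.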
Answer 11:
I generally use an approach like @mrabbit's, but run two unzip commands and diff the output as required. For example, I needed to compare two Java WAR files:
$ sdiff --width 160 \
<(unzip -l -v my_num1.war | cut -c 1-9,59-,49-57 | sort -k3) \
<(unzip -l -v my_num2.war | cut -c 1-9,59-,49-57 | sort -k3)
Resulting in output like so:
-------- ------- -------- -------
Archive: Archive:
-------- -------- ---- -------- -------- ----
48619281 130 files | 51043693 130 files
1116 060ccc56 index.jsp 1116 060ccc56 index.jsp
0 00000000 META-INF/ 0 00000000 META-INF/
155 b50f41aa META-INF/MANIFEST.MF | 155 701f1623 META-INF/MANIFEST.MF
Length CRC-32 Name Length CRC-32 Name
1179 b42096f1 version.jsp 1179 b42096f1 version.jsp
0 00000000 WEB-INF/ 0 00000000 WEB-INF/
0 00000000 WEB-INF/classes/ 0 00000000 WEB-INF/classes/
0 00000000 WEB-INF/classes/com/ 0 00000000 WEB-INF/classes/com/
...
...
Answer 12:
I gave up trying to use existing tools and wrote a little bash script that works for me:
#!/bin/bash
# Author: Onno Benschop, onno@itmaze.com.au
# Note: This requires enough space for both archives to be extracted in the tempdir
if [ $# -ne 2 ] ; then
    echo "Usage: $(basename "$0") zip1 zip2"
    exit 1
fi
# Make temporary directories
archive_1=$(mktemp -d)
archive_2=$(mktemp -d)
# Unzip the archives
unzip -qqd"${archive_1}" "$1"
unzip -qqd"${archive_2}" "$2"
# Compare them
diff -r "${archive_1}" "${archive_2}"
# Remove the temporary directories
rm -rf "${archive_1}" "${archive_2}"
Source: https://stackoverflow.com/questions/587442/is-there-a-safe-way-to-run-a-diff-on-two-zip-compressed-files