I've heard discussion about how OpenOffice (ODF) files are compressed zip files of XML and other data. So making a tiny change to the file can potentially totally change th
I've modified the Python program in Craig McQueen's answer just a bit. Changes include:
Actually checking the return value of testzip() (according to the docs, it appears that the original program would happily proceed past the check step with a corrupt zip file).
Rewriting the for-loop that checks for already-uncompressed files as a single if-statement.
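To show the two changes in isolation, here is a minimal sketch. The helper name check_archive and its return strings are made up for this example and are not part of the original program. It relies on the documented behaviour that testzip() returns the name of the first bad member, or None when every member reads cleanly, and it assumes Python 2.7 or 3, where ZipFile can be used as a context manager.

```python
import zipfile


def check_archive(path):
    """Hypothetical helper: classify a zip file before rewriting it."""
    with zipfile.ZipFile(path) as zf:
        # testzip() returns the name of the first bad member, or None if
        # all members pass their CRC and header checks.
        bad_member = zf.testzip()
        if bad_member is not None:
            return "corrupt: " + bad_member
        # A single all() call replaces an explicit loop over the members.
        if all(info.compress_type == zipfile.ZIP_STORED
               for info in zf.infolist()):
            return "already uncompressed"
        return "ok"
```

Note that all() is true for an empty archive, so a zip with no members is reported as already uncompressed.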
Here is the new program:
#!/usr/bin/python
# Note, written for Python 2.6
import sys
import shutil
import zipfile
# Get a single command-line argument containing filename
commandlineFileName = sys.argv[1]
backupFileName = commandlineFileName + ".bak"
inFileName = backupFileName
outFileName = commandlineFileName
checkFilename = commandlineFileName
# Check input file
# First, check it is valid (not corrupted)
checkZipFile = zipfile.ZipFile(checkFilename)
if checkZipFile.testzip() is not None:
    raise Exception("Zip file is corrupted")
# Second, check that it's not already uncompressed
if all(f.compress_type == zipfile.ZIP_STORED for f in checkZipFile.infolist()):
    raise Exception("File is already uncompressed")
checkZipFile.close()
# Copy to "backup" file and use that as the input
shutil.copy(commandlineFileName, backupFileName)
inputZipFile = zipfile.ZipFile(inFileName)
outputZipFile = zipfile.ZipFile(outFileName, "w", zipfile.ZIP_STORED)
# Copy each input file's data to output, making sure it's uncompressed
for fileObject in inputZipFile.infolist():
    fileData = inputZipFile.read(fileObject)
    # Reuse the original ZipInfo (name, date, attributes), but force "stored"
    outFileObject = fileObject
    outFileObject.compress_type = zipfile.ZIP_STORED
    outputZipFile.writestr(outFileObject, fileData)
outputZipFile.close()
inputZipFile.close()
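For anyone on Python 2.7 or 3, the copy step could also be wrapped in a function that uses with-blocks, so both archives are closed even if an exception is raised partway through. This is only a sketch of the same approach: the function name store_uncompressed is my own, and it skips the validity checks performed earlier in the script.

```python
import shutil
import zipfile


def store_uncompressed(path):
    """Hypothetical helper: rewrite the zip at `path` with no compression.

    The original file is kept as path + ".bak", as in the script.
    """
    backup_path = path + ".bak"
    shutil.copy(path, backup_path)
    with zipfile.ZipFile(backup_path) as src:
        with zipfile.ZipFile(path, "w", zipfile.ZIP_STORED) as dst:
            for info in src.infolist():
                # Read the member first, then reuse its ZipInfo for writing.
                data = src.read(info)
                info.compress_type = zipfile.ZIP_STORED
                dst.writestr(info, data)
    return backup_path
```

Calling store_uncompressed("document.odt") would leave the original in document.odt.bak and an uncompressed copy in document.odt (the file name here is just an example).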