Uncompress OpenOffice files for better storage in version control

后端未结

关注

 6  1358

天命终不由人 2020-12-15 07:00

I\'ve heard discussion about how OpenOffice (ODF) files are compressed zip files of XML and other data. So making a tiny change to the file can potentially totally change th

6条回答

盖世英雄少女心 (楼主)

2020-12-15 07:40

I've modified the python program in Craig McQueen's answer just a bit. Changes include:

Actually checking the return of testZip (according to the docs, it appears that the original program will happily proceed with a corrupt zip file past the checkzip step).
Rewrite the for-loop to check for already-uncompressed files to be a single if-statement.

Here is the new program:

#!/usr/bin/python
# Note, written for Python 2.6

import sys
import shutil
import zipfile

# Get a single command-line argument containing filename
commandlineFileName = sys.argv[1]

backupFileName = commandlineFileName + ".bak"
inFileName = backupFileName
outFileName = commandlineFileName
checkFilename = commandlineFileName

# Check input file
# First, check it is valid (not corrupted)
checkZipFile = zipfile.ZipFile(checkFilename)

if checkZipFile.testzip() is not None:
    raise Exception("Zip file is corrupted")

# Second, check that it's not already uncompressed
if all(f.compress_type==zipfile.ZIP_STORED for f in checkZipFile.infolist()):
    raise Exception("File is already uncompressed")

checkZipFile.close()

# Copy to "backup" file and use that as the input
shutil.copy(commandlineFileName, backupFileName)
inputZipFile = zipfile.ZipFile(inFileName)

outputZipFile = zipfile.ZipFile(outFileName, "w", zipfile.ZIP_STORED)

# Copy each input file's data to output, making sure it's uncompressed
for fileObject in inputZipFile.infolist():
    fileData = inputZipFile.read(fileObject)
    outFileObject = fileObject
    outFileObject.compress_type = zipfile.ZIP_STORED
    outputZipFile.writestr(outFileObject, fileData)

outputZipFile.close()

0 讨论(0)

查看其它6个回答