Delete file from zipfile with the ZipFile Module

Submitted by 自作多情 on 2019-11-26 09:33:04

Question


The only way I have come up with to delete a file from a zip file is to create a temporary zip file without the file to be deleted and then rename it to the original filename.

In Python 2.4 the ZipInfo class had a file_offset attribute, so it was possible to create a second zip file and copy the data to it without decompressing and recompressing.

This file_offset attribute is missing in Python 2.6, so is there another option besides creating a second zip file by decompressing every file and then recompressing it?

Is there perhaps a direct way of deleting a file in the zip file? I searched and didn't find anything.


Answer 1:


The following snippet worked for me (deletes all *.exe files from a Zip archive):

import zipfile

zin = zipfile.ZipFile('archive.zip', 'r')
zout = zipfile.ZipFile('archive_new.zip', 'w')
for item in zin.infolist():
    # Copy every entry except those ending in .exe
    if not item.filename.endswith('.exe'):
        zout.writestr(item, zin.read(item.filename))
zout.close()
zin.close()

If you read everything into memory, you can eliminate the need for a second file. However, this snippet recompresses everything.
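
The in-memory variant mentioned above could look roughly like the sketch below. It is only an illustration, not code from the original answer: the helper name remove_members and its should_remove callback are made up for this example, and it assumes the whole archive fits comfortably in memory. Like the snippet above, it recompresses every entry it keeps.

```python
import io
import zipfile


def remove_members(zip_path, should_remove):
    """Rewrite zip_path in place, dropping entries whose name matches.

    remove_members and should_remove are hypothetical names used only
    for this sketch; should_remove takes an entry name and returns bool.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(zip_path, "r") as zin, \
            zipfile.ZipFile(buffer, "w") as zout:
        for item in zin.infolist():
            if should_remove(item.filename):
                continue
            # Passing the ZipInfo keeps the entry's name, timestamp and
            # compression type; the data itself is recompressed.
            zout.writestr(item, zin.read(item.filename))
    # Both archives are closed here, so the original can be overwritten.
    with open(zip_path, "wb") as f:
        f.write(buffer.getvalue())
```

It could be called as remove_members('archive.zip', lambda name: name.endswith('.exe')). If the process is interrupted during the final write, the original archive is lost, so writing to a temporary file and renaming it may be the safer choice for important data.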

After closer inspection, ZipInfo.header_offset turns out to be the offset of an entry's local header from the start of the file. The name is misleading, since the main Zip header (the central directory) is actually stored at the end of the file. My hex editor confirms this.

So the problem you'll run into is the following: You need to delete the directory entry in the main header as well or it will point to a file that doesn't exist anymore. Leaving the main header intact might work if you keep the local header of the file you're deleting as well, but I'm not sure about that. How did you do it with the old module?

Without modifying the main header I get an error "missing X bytes in zipfile" when I open it. This might help you to find out how to modify the main header.
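
The layout described in this answer can be checked without a hex editor. The sketch below is an illustration only: it builds a tiny archive in memory (the entry names a.txt and b.txt are invented for the example) and then looks for the standard ZIP signatures in the raw bytes. It assumes a small archive with no ZIP64 records and no archive comment.

```python
import io
import zipfile

# Build a small throwaway archive in memory (names are made up).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("a.txt", "first")
    z.writestr("b.txt", "second")
raw = buf.getvalue()

# header_offset is where each entry's local header starts.
with zipfile.ZipFile(io.BytesIO(raw)) as z:
    offsets = {info.filename: info.header_offset for info in z.infolist()}

# Each local file header begins with the signature PK\x03\x04.
local_ok = all(raw[o:o + 4] == b"PK\x03\x04" for o in offsets.values())

# The central directory entries (PK\x01\x02) and the end-of-central-
# directory record (PK\x05\x06) come after all of the file data.
central_dir_pos = raw.find(b"PK\x01\x02")
end_record_pos = raw.rfind(b"PK\x05\x06")

print(offsets, local_ok, central_dir_pos, end_record_pos, len(raw))
```

Every central directory entry stores the offset of its local header, which is why simply cutting an entry's bytes out of the file, without rewriting the directory at the end, leaves the archive inconsistent.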




Answer 2:


Not very elegant, but this is how I did it:

import subprocess
import zipfile

# zip_filename is the path to the archive, defined elsewhere
with zipfile.ZipFile(zip_filename) as z:
    files_to_del = [f for f in z.namelist() if f.endswith('.exe')]

# Requires the external zip command-line tool on the PATH
cmd = ['zip', '-d', zip_filename] + files_to_del
subprocess.check_call(cmd)

# reload the modified archive
z = zipfile.ZipFile(zip_filename)
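
The same approach can be wrapped in a function with a couple of guards. This is a sketch under stated assumptions rather than part of the original answer: the name delete_with_zip_cli and its suffix parameter are invented here, and it assumes the Info-ZIP zip tool is the one found on the PATH. It skips the external call when nothing matches, since invoking zip -d with nothing to delete is likely to exit with an error.

```python
import shutil
import subprocess
import zipfile


def delete_with_zip_cli(zip_filename, suffix=".exe"):
    """Delete entries ending in suffix via the external `zip -d` command.

    Hypothetical helper for illustration; returns the names it asked
    zip to delete.
    """
    if shutil.which("zip") is None:
        raise RuntimeError("the external 'zip' command was not found on PATH")
    # Close the archive before another process modifies it.
    with zipfile.ZipFile(zip_filename) as z:
        names = [n for n in z.namelist() if n.endswith(suffix)]
    if names:
        subprocess.check_call(["zip", "-d", zip_filename] + names)
    return names
```

Entry names are passed to zip as patterns, so names containing wildcard characters may match more than intended.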



Answer 3:


The routine delete_from_zip_file from ruamel.std.zipfile¹ allows you to delete a file based on its full path within the ZIP, or based on regular expression (re) patterns. For example, you can delete all of the .exe files from test.zip using:

from ruamel.std.zipfile import delete_from_zip_file

delete_from_zip_file('test.zip', pattern='.*.exe')  

(please note the dot before the *).

This works similarly to mdm's solution (including the need for recompression), but it recreates the ZIP file in memory (using the class InMemZipFile()) and overwrites the old file after it has been fully read.


¹ Disclaimer: I am the author of that package.



Source: https://stackoverflow.com/questions/513788/delete-file-from-zipfile-with-the-zipfile-module
