Estimating zip size/creation time

巧了我就是萌 提交于 2019-12-05 05:16:38

Extract a bunch of small parts from the big file. Maybe 64 chunks of 64k each. Randomly selected.

Concatenate the data, compress it, measure the time and the compression ratio. Since you've randomly selected parts of the file chances are that you have compressed a representative subset of the data.

Now all you have to do is to estimate the time for the whole file based on the time of your test-data.

I suggest you measure the average time it takes to produce a zip of a certain size. Then you calculate the estimate from that measure. However I think the estimate will be very rough in any case if you don't know how well the data compresses. If the data you want to compress had a very similar "profile" each time you could probably make better predictions.

If its possible to get progress callbacks from the python module i would suggest finding out how many bytes are processed pr second ( By simply storing where in the file you where at start of the second, and where you are at the end ). When you have the data on how fast the computer your on you can off course save it, and use it as a basis for your next zip file. ( I normally collect about 5 samples before showing a time prognosses )

Using this method can give you Microsoft minutes so as you get more samples you would need to average it out. This would esp be the case if your making a zip file that contains a lot of files, as ZIP tends to slow down when compressing many small files compared to 1 large file.

If you're using the ZipFile.write() method to write your files into the archive, you could do the following:

  1. Get a list of the files you want to zip and their relative sizes
  2. Write one file to the archive and time how long it took
  3. Calculate ETA based on the number of files written, their size, and how much is left.

This won't work if you're only zipping one really big file though. I've never used the zip module myself, so I'm not sure if it would work, but for small numbers of large files, maybe you could use the ZipFile.writestr() function and read in / zip up your files in chunks?

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!