How to find a good/optimal dictionary for zlib 'setDictionary' when processing a given set of data?
Question: I have a (huge) set of similar data files, and the set is constantly growing. A single file is about 10 KB in size, and each file must be compressed on its own. The compression is done with the zlib library, which is used by the java.util.zip.Deflater class. When I pass a dictionary to the Deflate algorithm using setDictionary, I can improve the compression ratio. Is there a way (an algorithm) to find the 'optimal' dictionary, i.e. the dictionary that gives the best overall compression ratio across the whole set? See zlib
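For reference, here is a minimal sketch of the setup described above, so that candidate dictionaries can at least be compared against each other. It does not compute an optimal dictionary; it only measures the total compressed size that a given dictionary produces over a sample of files. The class name DictionaryDemo, the helper names and the sample records are made up for illustration; only the java.util.zip calls are the real API. Note that the decompressing side needs the very same dictionary.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class: names and sample data are assumptions for illustration.
final class DictionaryDemo {

    // Compresses one file on its own; a null dictionary means "no preset dictionary".
    static byte[] compress(byte[] input, byte[] dictionary) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            if (dictionary != null) {
                deflater.setDictionary(dictionary); // must happen before the first deflate()
            }
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    // Decompresses data produced by compress(); supplies the dictionary when the stream asks for it.
    static byte[] decompress(byte[] compressed, byte[] dictionary, int originalLength) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            byte[] result = new byte[originalLength];
            int total = 0;
            while (!inflater.finished() && total < result.length) {
                int n = inflater.inflate(result, total, result.length - total);
                if (n == 0) {
                    if (inflater.needsDictionary()) {
                        if (dictionary == null) {
                            throw new IllegalArgumentException("stream requires a dictionary");
                        }
                        inflater.setDictionary(dictionary);
                    } else if (inflater.needsInput()) {
                        break; // truncated input, nothing more to read
                    }
                }
                total += n;
            }
            return Arrays.copyOf(result, total);
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed data", e);
        } finally {
            inflater.end();
        }
    }

    // Scores a candidate dictionary: sum of compressed sizes over the sample (smaller is better).
    static long totalCompressedSize(List<byte[]> samples, byte[] dictionary) {
        long total = 0;
        for (byte[] sample : samples) {
            total += compress(sample, dictionary).length;
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up sample records standing in for the real ~10 KB files.
        List<byte[]> samples = List.of(
            bytes("<record><customer-id>4711</customer-id><status>shipped</status><currency>EUR</currency></record>"),
            bytes("<record><customer-id>815</customer-id><status>pending</status><currency>USD</currency></record>"));
        // Candidate dictionary: boilerplate that the files have in common.
        byte[] candidate =
            bytes("<record><customer-id></customer-id><status></status><currency></currency></record>");
        System.out.println("without dictionary: " + totalCompressedSize(samples, null));
        System.out.println("with dictionary:    " + totalCompressedSize(samples, candidate));
    }

    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

With a scoring function like totalCompressedSize, different candidate dictionaries can be tried on a representative sample of the files and the one with the smallest total kept. That is a comparison, not a proof of optimality.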