How to compute a good preset dictionary for deflate compression

穿精又带淫゛_ submitted on 2019-12-05 19:04:01

I am not aware of an algorithm to generate an optimal or even a good dictionary. This is generally done by hand. I think that a suffix tree would be a good approach to finding common strings for a dictionary, but I have never tried it.

The first thing to try is to simply concatenate 32K worth (the size of the deflate window) of your 1-3K examples and see how much gain that provides over no dictionary. From there, experiment: change the order of the examples, or pull pieces that repeat across examples out and move them to the end of the dictionary.
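As a minimal sketch of that first experiment, the following uses Python's `zlib` module, whose `compressobj` accepts a preset dictionary through the `zdict` argument. The sample messages and the helper names `build_dictionary` and `deflate_size` are made up for illustration; substitute your own examples.

```python
import zlib

def build_dictionary(samples, max_size=32 * 1024):
    """Concatenate the samples and keep only the last max_size bytes."""
    return b"".join(samples)[-max_size:]

def deflate_size(data, zdict=None):
    """Return the compressed size of data, optionally using a preset dictionary."""
    if zdict:
        comp = zlib.compressobj(level=9, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9)
    return len(comp.compress(data) + comp.flush())

# Hypothetical messages standing in for the real 1-3K examples.
samples = [
    b'{"user": "alice", "action": "login", "status": "ok"}',
    b'{"user": "bob", "action": "logout", "status": "ok"}',
    b'{"user": "carol", "action": "login", "status": "denied"}',
]
zdict = build_dictionary(samples)

# A new message that was not used to build the dictionary.
message = b'{"user": "dave", "action": "login", "status": "ok"}'
plain = deflate_size(message)
with_dict = deflate_size(message, zdict)
print("no dictionary:  ", plain, "bytes")
print("with dictionary:", with_dict, "bytes")
```

The decompressing side must be given the same dictionary, for example with `zlib.decompressobj(zdict=zdict)`. Measuring on messages that were held out of the dictionary gives a more honest picture of the gain.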

Note that the most common strings should be put at the end of the dictionary, where they sit closest to the data being compressed, since shorter match distances take fewer bits to encode.
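One crude way to apply that ordering rule automatically is sketched below. It is only a heuristic of my own for illustration, not the suffix-tree approach mentioned earlier: it counts fixed-length substrings by how many samples contain them, then writes the rarest first and the most common last. The function names, the substring length, and the sample data are all assumptions.

```python
from collections import Counter

def rank_substrings(samples, length=8, top=100):
    """Count fixed-length substrings by the number of samples containing them."""
    counts = Counter()
    for s in samples:
        # Use a set so each substring is counted at most once per sample.
        seen = {s[i:i + length] for i in range(len(s) - length + 1)}
        counts.update(seen)
    # Keep only substrings shared by at least two samples.
    return [(sub, n) for sub, n in counts.most_common(top) if n > 1]

def order_dictionary(ranked, max_size=32 * 1024):
    """Put the least common strings first and the most common last."""
    ordered = sorted(ranked, key=lambda item: item[1])  # ascending by count
    return b"".join(sub for sub, _ in ordered)[-max_size:]

# Hypothetical samples for illustration.
samples = [
    b"GET /index HTTP/1.1",
    b"GET /about HTTP/1.1",
    b"POST /form HTTP/1.1",
]
ranked = rank_substrings(samples, length=4)
zdict = order_dictionary(ranked)
print(zdict)
```

Overlapping substrings make the result redundant, so treat this as a starting point to trim by hand, and check any change by measuring the compressed size of held-out examples.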

I don't know how good this is, but it's a dictionary creator: https://github.com/vkrasnov/dictator
