Python - Compress Ascii String

面向向阳花 2020-12-31 03:01

I'm looking for a way to compress an ASCII-based string, any help?

I also need to decompress it. I tried zlib, but it didn't help.

What can I do to compress th

2 answers
  •  感情败类
    2020-12-31 03:45

    Using compression will not always reduce the length of a string!

    Consider the following code:

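    import zlib
    import bz2
    
    def comptest(s):
        # zlib and bz2 operate on bytes, so encode the ASCII text first.
        data = s.encode('ascii')
        print('original length:', len(data))
        print('zlib compressed length:', len(zlib.compress(data)))
        print('bz2 compressed length:', len(bz2.compress(data)))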
    

    Let's try this on an empty string:

    In [15]: comptest('')
    original length: 0
    zlib compressed length: 8
    bz2 compressed length: 14
    

    So zlib adds 8 bytes to the empty string, and bz2 adds 14. Compressed formats usually wrap the data in a 'header' (and often a trailing checksum) for use by the decompression program. This framing increases the length of the output.
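
To make the framing overhead visible, here is a minimal sketch that compares the normal zlib output with a raw deflate stream of the same data. It assumes Python 3; `raw_deflate` is a hypothetical helper name, and passing `wbits=-15` is the zlib module's way of asking for a stream without the zlib header and checksum.

```python
import zlib

def raw_deflate(data: bytes) -> bytes:
    # wbits=-15 requests a raw deflate stream: no zlib header, no checksum.
    comp = zlib.compressobj(wbits=-15)
    return comp.compress(data) + comp.flush()

for text in ("", "test"):
    data = text.encode("ascii")
    wrapped = zlib.compress(data)   # zlib header + deflate data + checksum
    raw = raw_deflate(data)         # deflate data only
    print(repr(text), "zlib:", len(wrapped), "raw:", len(raw),
          "framing:", len(wrapped) - len(raw))
```

A raw stream has to be decompressed with the matching setting, `zlib.decompress(raw, wbits=-15)`, and it loses the integrity check that the zlib wrapper provides, so this is mainly useful for seeing where the extra bytes come from.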

    Let's test a single word:

    In [16]: comptest('test')
    original length: 4
    zlib compressed length: 12
    bz2 compressed length: 40
    

    Even if you subtract the length of the header, the compression hasn't made the word shorter at all. That is because in this case there is little to compress: most of the characters in the string occur only once. Now for a short sentence:

    In [17]: comptest('This is a compression test of a short sentence.')
    original length: 47
    zlib compressed length: 52
    bz2 compressed length: 73
    

    Again the compressed output is larger than the input text. Because the text is so short, there is little repetition in it, so it doesn't compress well.
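
The role of repetition can be checked directly. The following is a small sketch, assuming Python 3, that compresses two ASCII strings of equal length: one made of a repeated pattern and one in which every character occurs exactly once. The variable names are illustrative only.

```python
import string
import zlib

# 62 distinct characters, each occurring exactly once: nothing to exploit.
unique = (string.ascii_letters + string.digits).encode("ascii")
# Same length, but built from a two-character pattern repeated over and over.
repetitive = ("ab" * (len(unique) // 2)).encode("ascii")

unique_size = len(zlib.compress(unique))
repetitive_size = len(zlib.compress(repetitive))

print("input length:", len(unique))
print("no repetition ->", unique_size, "bytes")
print("repetitive    ->", repetitive_size, "bytes")
```

The repetitive input should shrink to a fraction of its original size, while the input without repetition gains nothing from compression.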

    You need a fairly long block of text for compression to actually work:

    In [22]: rings = '''
       ....:     Three Rings for the Elven-kings under the sky, 
       ....:     Seven for the Dwarf-lords in their halls of stone, 
       ....:     Nine for Mortal Men doomed to die, 
       ....:     One for the Dark Lord on his dark throne 
       ....:     In the Land of Mordor where the Shadows lie. 
       ....:     One Ring to rule them all, One Ring to find them, 
       ....:     One Ring to bring them all and in the darkness bind them 
       ....:     In the Land of Mordor where the Shadows lie.'''
    
    In [23]: comptest(rings)                       
    original length: 410
    zlib compressed length: 205
    bz2 compressed length: 248
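
Since the question also asks about decompression, here is a minimal round-trip sketch using zlib. It assumes Python 3 and a string that is pure ASCII; `compress_text` and `decompress_text` are hypothetical helper names, not part of any library.

```python
import zlib

def compress_text(text: str) -> bytes:
    # Encode to bytes first; level 9 trades speed for the smallest output.
    return zlib.compress(text.encode("ascii"), 9)

def decompress_text(blob: bytes) -> str:
    # Reverse the two steps: decompress, then decode back to a string.
    return zlib.decompress(blob).decode("ascii")

sample = "One Ring to rule them all, One Ring to find them, " * 4
blob = compress_text(sample)
restored = decompress_text(blob)

print("original:", len(sample), "compressed:", len(blob))
print("round trip ok:", restored == sample)
```

The compressed result is binary data, not text, so it needs to be stored or transmitted as bytes. For short strings it may still be worth comparing `len(blob)` with the original length and keeping the uncompressed form when compression does not help.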
    
