How can I create a tree for Huffman encoding and decoding?

轮回少年 2021-02-05 21:04

For my assignment, I have to implement encoding and decoding with Huffman trees. I'm having trouble creating the tree, and I'm stuck.

Don't mind the print statements - they are just

5 answers
  •  没有蜡笔的小新
    2021-02-05 21:24

    The Huffman algorithm as described on Wikipedia tells you exactly how to create the node tree, so your program can be based on that algorithm or a similar one. Here is a Python program with comments showing the corresponding step of the Wikipedia algorithm. The test data is the frequencies of the letters of the alphabet in English text.

    Once the node tree is created, you need to walk down it to assign a Huffman code to each symbol in your dataset. Since this is homework, that step is up to you, but a recursive algorithm is the simplest and most natural way to handle it. It's only six more lines of code.

    import queue
    
    class HuffmanNode(object):
        def __init__(self, left=None, right=None, root=None):
            self.left = left
            self.right = right
            self.root = root     # Why?  Not needed for anything.
        def children(self):
            return (self.left, self.right)
    
    freq = [
        (8.167, 'a'), (1.492, 'b'), (2.782, 'c'), (4.253, 'd'),
        (12.702, 'e'),(2.228, 'f'), (2.015, 'g'), (6.094, 'h'),
        (6.966, 'i'), (0.153, 'j'), (0.747, 'k'), (4.025, 'l'),
        (2.406, 'm'), (6.749, 'n'), (7.507, 'o'), (1.929, 'p'), 
        (0.095, 'q'), (5.987, 'r'), (6.327, 's'), (9.056, 't'), 
        (2.758, 'u'), (1.037, 'v'), (2.365, 'w'), (0.150, 'x'),
        (1.974, 'y'), (0.074, 'z') ]
    
    def create_tree(frequencies):
        p = queue.PriorityQueue()
        for value in frequencies:    # 1. Create a leaf node for each symbol
            p.put(value)             #    and add it to the priority queue
        while p.qsize() > 1:         # 2. While there is more than one node
            l, r = p.get(), p.get()  # 2a. remove the two highest-priority
                                     #     (lowest-frequency) nodes
            node = HuffmanNode(l, r) # 2b. create internal node with children
            p.put((l[0]+r[0], node)) # 2c. add new node to queue
        return p.get()               # 3. tree is complete - return root node
    
    node = create_tree(freq)
    print(node)
    
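    # Recursively walk the tree down to the leaves,
    #   assigning a code value to each symbol.
    # The body is deliberately left as the homework exercise: as written
    #   it returns an empty dict, so the print loop below raises KeyError
    #   until the walk is filled in.
    def walk_tree(node, prefix="", code={}):
        return code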
    
    code = walk_tree(node)
    for i in sorted(freq, reverse=True):
        print(i[1], '{:6.2f}'.format(i[0]), code[i[1]])
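
The recursive walk described above could look roughly like the following. This is a minimal sketch and not the answerer's omitted code: it rebuilds a small tree with `heapq` instead of `queue.PriorityQueue`, stores bare symbols and nodes as children (the program above stores `(frequency, payload)` tuples), and adds a counter as a tie-breaker, because two entries with equal frequency would otherwise be compared by payload and a `HuffmanNode` cannot be ordered against a string. The four-symbol `sample` data is made up for illustration.

```python
import heapq
from itertools import count

class HuffmanNode:
    def __init__(self, left, right):
        self.left = left
        self.right = right

def create_tree(frequencies):
    tie = count()  # assumption: insertion order breaks frequency ties
    heap = [(weight, next(tie), symbol) for weight, symbol in frequencies]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # lowest frequency
        w2, _, right = heapq.heappop(heap)   # second lowest
        heapq.heappush(heap, (w1 + w2, next(tie), HuffmanNode(left, right)))
    return heap[0][2]                        # root payload

def walk_tree(node, prefix="", code=None):
    if code is None:                         # avoid a shared mutable default
        code = {}
    if isinstance(node, HuffmanNode):
        walk_tree(node.left, prefix + "0", code)   # left branch appends 0
        walk_tree(node.right, prefix + "1", code)  # right branch appends 1
    else:
        code[node] = prefix or "0"           # single-symbol input gets "0"
    return code

sample = [(5, 'a'), (2, 'b'), (1, 'c'), (1, 'd')]  # hypothetical data
codes = walk_tree(create_tree(sample))
for symbol in sorted(codes):
    print(symbol, codes[symbol])
```

To adapt the walk to the tree built by the program above, the payload would be read from the second element of each `(frequency, payload)` tuple before the `isinstance` check.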
    

    When the completed program (with walk_tree filled in) is run on the alphabet data, the resulting Huffman codes are:

    e  12.70 100
    t   9.06 000
    a   8.17 1110
    o   7.51 1101
    i   6.97 1011
    n   6.75 1010
    s   6.33 0111
    h   6.09 0110
    r   5.99 0101
    d   4.25 11111
    l   4.03 11110
    c   2.78 01001
    u   2.76 01000
    m   2.41 00111
    w   2.37 00110
    f   2.23 00100
    g   2.02 110011
    y   1.97 110010
    p   1.93 110001
    b   1.49 110000
    v   1.04 001010
    k   0.75 0010111
    j   0.15 001011011
    x   0.15 001011010
    q   0.10 001011001
    z   0.07 001011000
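
The question also asks about encoding and decoding. Once a code table like the one above exists, encoding is a per-symbol lookup, and decoding can read bits until they match a code word, which works because no Huffman code is a prefix of another. A minimal sketch, assuming a hypothetical four-symbol table; the function names `encode` and `decode` are illustrative and do not come from the answer:

```python
def encode(text, code):
    # Concatenate the code word of every symbol in the text.
    return "".join(code[ch] for ch in text)

def decode(bits, code):
    # Invert the table, then grow a buffer bit by bit until it
    # matches a code word; prefix-freeness makes the match unambiguous.
    inverse = {word: symbol for symbol, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code word")
    return "".join(out)

code = {'a': '1', 'b': '00', 'c': '010', 'd': '011'}  # hypothetical table
bits = encode("abacad", code)
print(bits)                # 10010101011
print(decode(bits, code))  # abacad
```

Decoding by walking the tree directly (left on 0, right on 1, emit a symbol at each leaf) is an equivalent alternative that avoids building the inverse table.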
    
