How to create objects of arbitrary memory size?

后端 未结 1 939
遇见更好的自我
遇见更好的自我 2020-12-22 12:55

I\'m writing a hash function to create hashes of some given size (e.g. 20 bits).

I have learnt how to write the hashes to files in a binary form (see my related ques

相关标签:
1条回答
  • 2020-12-22 13:14

    You could do something like you want by packing the bits for each object into a packed array of bit (or boolean) values. There are a number of existing Python bitarray extension modules available. Implementing a higher level "array of fixed bit width integer values" with one is a relatively straight-forward process.

    Here's an example based on one in pypi that's implemented in C for speed. You can also download an unofficial pre-built Windows version of it, created by Christoph Gohlke, from here.

    Updated — Now works in Python 2.7 & 3.x.

    from __future__ import print_function
    # uses https://pypi.python.org/pypi/bitarray
    from bitarray import bitarray as BitArray
    try:
        from functools import reduce  # Python 3.
    except:
        pass
    
    
    class PackedIntArray(object):
        """ Packed array of unsigned fixed-bit-width integer values. """
        def __init__(self, array_size, item_bit_width, initializer=None):
            self.array_size = array_size
            self.item_bit_width = item_bit_width
            self.bitarray = BitArray(array_size * item_bit_width)
            if initializer is not None:
                try:
                    iter(initializer)
                except TypeError:  # not iterable
                    self.bitarray.setall(initializer) # set all to bool(initializer)
                else:
                    for i in xrange(array_size):
                        self[i] = initializer[i]  # must be same length as array
    
        def __getitem__(self, index):
            offset = index * self.item_bit_width
            bits = self.bitarray[offset: offset+self.item_bit_width]
            return reduce(lambda x, y: (x << 1) | y, bits, 0)
    
        def __setitem__(self, index, value):
            bits = BitArray('{:0{}b}'.format(value, self.item_bit_width))
            offset = index * self.item_bit_width
            self.bitarray[offset: offset+self.item_bit_width] = bits
    
        def __len__(self):
            """ Return the number of items stored in the packed array.. """
            return self.array_size
    
        def length(self):
            """ Return the number of bits stored in the bitarray.. """
            return self.bitarray.length()
    
        def __repr__(self):
            return('PackedIntArray({}, {}, ('.format(self.array_size,
                                                        self.item_bit_width) +
                   ', '.join((str(self[i]) for i in xrange(self.array_size))) +
                   '))')
    
    
    if __name__ == '__main__':
        from random import randrange
    
        # hash function configuration
        BW = 8, 8, 4  # bit widths of each integer
        HW = sum(BW)  # total hash bit width
    
        def myhash(a, b, c):
            return (((((a & (2**BW[0]-1)) << BW[1]) |
                        b & (2**BW[1]-1)) << BW[2]) |
                        c & (2**BW[2]-1))
    
        hashes = PackedIntArray(3, HW)
    
        print('hash bit width: {}'.format(HW))
        print('length of hashes array: {:,} bits'.format(hashes.length()))
        print()
        print('populate hashes array:')
        for i in range(len(hashes)):
            hashed = myhash(*(randrange(2**bit_width) for bit_width in BW))
            print('  hashes[{}] <- {:,} (0b{:0{}b})'.format(i, hashed, hashed, HW))
            hashes[i] = hashed
        print()
        print('contents of hashes array:')
        for i in range(len(hashes)):
            print(('  hashes[{}]: {:,} '
                    '(0b{:0{}b})'.format(i, hashes[i], hashes[i], HW)))
    

    Sample output:

    hash bit width: 20
    length of hashes array: 60 bits
    
    populate hashes array:
      hashes[0] <- 297,035 (0b01001000100001001011)
      hashes[1] <- 749,558 (0b10110110111111110110)
      hashes[2] <- 690,468 (0b10101000100100100100)
    
    contents of hashes array:
      hashes[0]: 297,035 (0b01001000100001001011)
      hashes[1]: 749,558 (0b10110110111111110110)
      hashes[2]: 690,468 (0b10101000100100100100)
    

    Note: bitarray.bitarray objects also have methods to write and read their bits to and from files. These could be used to also provide similar functionality to the PackedIntArray class above.

    0 讨论(0)
提交回复
热议问题