How to implement an efficient bidirectional hash table? [duplicate]

只谈情不闲聊 提交于 2019-11-26 10:22:00

Here is a class for a bidirectional dict, inspired by Finding key from value in Python dictionary and modified to allow the following 2) and 3).

Note that :

  • 1) The inverse directory bd.inverse auto-updates itself when the standard dict bd is modified.
  • 2) The inverse directory bd.inverse[value] is always a list of key such that bd[key] == value.
  • 3) Unlike the bidict module from https://pypi.python.org/pypi/bidict, here we can have 2 keys having same value, this is very important.

Code:

class bidict(dict):
    def __init__(self, *args, **kwargs):
        super(bidict, self).__init__(*args, **kwargs)
        self.inverse = {}
        for key, value in self.items():
            self.inverse.setdefault(value,[]).append(key) 

    def __setitem__(self, key, value):
        if key in self:
            self.inverse[self[key]].remove(key) 
        super(bidict, self).__setitem__(key, value)
        self.inverse.setdefault(value,[]).append(key)        

    def __delitem__(self, key):
        self.inverse.setdefault(self[key],[]).remove(key)
        if self[key] in self.inverse and not self.inverse[self[key]]: 
            del self.inverse[self[key]]
        super(bidict, self).__delitem__(key)

Usage example:

bd = bidict({'a': 1, 'b': 2})  
print(bd)                     # {'a': 1, 'b': 2}                 
print(bd.inverse)             # {1: ['a'], 2: ['b']}
bd['c'] = 1                   # Now two keys have the same value (= 1)
print(bd)                     # {'a': 1, 'c': 1, 'b': 2}
print(bd.inverse)             # {1: ['a', 'c'], 2: ['b']}
del bd['c']
print(bd)                     # {'a': 1, 'b': 2}
print(bd.inverse)             # {1: ['a'], 2: ['b']}
del bd['a']
print(bd)                     # {'b': 2}
print(bd.inverse)             # {2: ['b']}
bd['b'] = 3
print(bd)                     # {'b': 3}
print(bd.inverse)             # {2: [], 3: ['b']}

You can use the same dict itself by adding key,value pair in reverse order.

d={'a':1,'b':2}
revd=dict([reversed(i) for i in d.items()])
d.update(revd)
miku

A poor man's bidirectional hash table would be to use just two dictionaries (these are highly tuned datastructures already).

There is also a bidict package on the index:

The source for bidict can be found on github:

The below snippet of code implements an invertible (bijective) map:

class BijectionError(Exception):
    """Must set a unique value in a BijectiveMap."""

    def __init__(self, value):
        self.value = value
        msg = 'The value "{}" is already in the mapping.'
        super().__init__(msg.format(value))


class BijectiveMap(dict):
    """Invertible map."""

    def __init__(self, inverse=None):
        if inverse is None:
            inverse = self.__class__(inverse=self)
        self.inverse = inverse

    def __setitem__(self, key, value):
        if value in self.inverse:
            raise BijectionError(value)

        self.inverse._set_item(value, key)
        self._set_item(key, value)

    def __delitem__(self, key):
        self.inverse._del_item(self[key])
        self._del_item(key)

    def _del_item(self, key):
        super().__delitem__(key)

    def _set_item(self, key, value):
        super().__setitem__(key, value)

The advantage of this implementation is that the inverse attribute of a BijectiveMap is again a BijectiveMap. Therefore you can do things like:

>>> foo = BijectiveMap()
>>> foo['steve'] = 42
>>> foo.inverse
{42: 'steve'}
>>> foo.inverse.inverse
{'steve': 42}
>>> foo.inverse.inverse is foo
True
Matt Anderson

Something like this, maybe:

import itertools

class BidirDict(dict):
    def __init__(self, iterable=(), **kwargs):
        self.update(iterable, **kwargs)
    def update(self, iterable=(), **kwargs):
        if hasattr(iterable, 'iteritems'):
            iterable = iterable.iteritems()
        for (key, value) in itertools.chain(iterable, kwargs.iteritems()):
            self[key] = value
    def __setitem__(self, key, value):
        if key in self:
            del self[key]
        if value in self:
            del self[value]
        dict.__setitem__(self, key, value)
        dict.__setitem__(self, value, key)
    def __delitem__(self, key):
        value = self[key]
        dict.__delitem__(self, key)
        dict.__delitem__(self, value)
    def __repr__(self):
        return '%s(%s)' % (type(self).__name__, dict.__repr__(self))

You have to decide what you want to happen if more than one key has a given value; the bidirectionality of a given pair could easily be clobbered by some later pair you inserted. I implemented one possible choice.


Example :

bd = BidirDict({'a': 'myvalue1', 'b': 'myvalue2', 'c': 'myvalue2'})
print bd['myvalue1']   # a
print bd['myvalue2']   # b        

First, you have to make sure the key to value mapping is one to one, otherwise, it is not possible to build a bidirectional map.

Second, how large is the dataset? If there is not much data, just use 2 separate maps, and update both of them when updating. Or better, use an existing solution like Bidict, which is just a wrapper of 2 dicts, with updating/deletion built in.

But if the dataset is large, and maintaining 2 dicts is not desirable:

  • If both key and value are numeric, consider the possibility of using Interpolation to approximate the mapping. If the vast majority of the key-value pairs can be covered by the mapping function (and its
    reverse function), then you only need to record the outliers in maps.

  • If most of access is uni-directional (key->value), then it is totally ok to build the reverse map incrementally, to trade time for
    space.

Code:

d = {1: "one", 2: "two" }
reverse = {}

def get_key_by_value(v):
    if v not in reverse:
        for _k, _v in d.items():
           if _v == v:
               reverse[_v] = _k
               break
    return reverse[v]
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!