Check that Python dicts have same shape and keys

Submitted by ﹥>﹥吖頭↗ on 2021-01-27 04:40:08

Question


For single-layer dicts like x = {'a': 1, 'b': 2}, the problem is easy and already answered on SO (Pythonic way to check if two dictionaries have the identical set of keys?), but what about nested dicts?

For example, y = {'a': {'c': 3}, 'b': {'d': 4}} has keys 'a' and 'b', but I want to compare its shape to another nested dict structure like z = {'a': {'c': 5}, 'b': {'d': 6}}, which has the same shape and keys as y (different values are fine). By contrast, w = {'a': {'c': 3}, 'b': {'e': 4}} also has keys 'a' and 'b', but it differs from y on the next layer because w['b'] has key 'e' while y['b'] has key 'd'.

I want a short, simple function that takes two arguments, dict_1 and dict_2, and returns True if they have the same shape and keys as described above, and False otherwise.


Answer 1:


This builds a copy of each dictionary in which every non-dictionary value is replaced by None, then compares the two copies:

def getshape(d):
    if isinstance(d, dict):
        return {k:getshape(d[k]) for k in d}
    else:
        # Replace all non-dict values with None.
        return None

def shape_equal(d1, d2):
    return getshape(d1) == getshape(d2)
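
As a quick check, here is a minimal sketch that applies this approach to the y, z, and w dicts from the question. The two functions are repeated so the snippet runs on its own; the printed values are what the logic above should produce.

```python
def getshape(d):
    # Keep only the key structure: every non-dict value becomes None.
    if isinstance(d, dict):
        return {k: getshape(v) for k, v in d.items()}
    return None

def shape_equal(d1, d2):
    return getshape(d1) == getshape(d2)

# Example dicts taken from the question.
y = {'a': {'c': 3}, 'b': {'d': 4}}
z = {'a': {'c': 5}, 'b': {'d': 6}}
w = {'a': {'c': 3}, 'b': {'e': 4}}

print(getshape(y))        # {'a': {'c': None}, 'b': {'d': None}}
print(shape_equal(y, z))  # True: same keys at every level
print(shape_equal(y, w))  # False: w['b'] has 'e' where y['b'] has 'd'
```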



Answer 2:


I liked nneonneo's answer, and it should be relatively fast, but I wanted something that didn't create extra, unnecessary data structures (I've been learning about memory fragmentation in Python). This may or may not be as fast or faster.

(EDIT: Spoiler!)

It is faster by a decent enough margin to make it preferable in all cases; see the other answer with the timing analysis.

And if you are dealing with lots and lots of these and running into memory problems, doing it this way is likely to be preferable anyway.

Implementation

This should work in Python 3, maybe 2.7 if you translate keys to viewkeys, definitely not 2.6. It relies on the set-like view of the keys that dicts have:

def sameshape(d1, d2):
    if isinstance(d1, dict):
        if isinstance(d2, dict):
            # then we have shapes to check
            return (d1.keys() == d2.keys() and
                    # so the keys are all the same
                    all(sameshape(d1[k], d2[k]) for k in d1.keys()))
                    # thus all values will be tested in the same way.
        else:
            return False # d1 is a dict, but d2 isn't
    else:
        return not isinstance(d2, dict) # if d2 is a dict, False, else True.

Edit: updated to remove a redundant type check, so it is now even more efficient.
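
The set-like behaviour of key views that this relies on can be sketched in isolation. The dicts d1, d2, and d3 below are made up for illustration; the snippet assumes Python 3, where dict.keys() returns a view that supports equality and set operations.

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 20, 'a': 10}  # same keys, different order and values
d3 = {'a': 1, 'c': 3}    # one key differs

# Key views compare like sets: order and values are ignored.
same_keys = d1.keys() == d2.keys()
different_keys = d1.keys() == d3.keys()

# Set difference on views returns a plain set of the unmatched keys.
only_in_d1 = d1.keys() - d3.keys()
only_in_d3 = d3.keys() - d1.keys()

print(same_keys)       # True
print(different_keys)  # False
print(only_in_d1)      # {'b'}
print(only_in_d3)      # {'c'}
```

No intermediate list or set of keys is built for the equality test, which is in keeping with the goal of avoiding extra data structures.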

Testing

To check:

print('expect false:')
print(sameshape({'foo':{'bar':{None:None}}}, {'foo':{'bar':{None: {} }}}))
print('expect true:')
print(sameshape({'foo':{'bar':{None:None}}}, {'foo':{'bar':{None:'foo'}}}))
print('expect false:')
print(sameshape({'foo':{'bar':{None:None}}}, {'foo':{'bar':{None:None, 'baz':'foo'}}}))

Prints:

expect false:
False
expect true:
True
expect false:
False
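
For comparison, a small sketch that runs the same three cases through both this function and the getshape-based approach from the first answer, to confirm they agree. Both functions are repeated here so the snippet is self-contained; the cases list and its expected values are taken from the test above.

```python
def getshape(d):
    if isinstance(d, dict):
        return {k: getshape(v) for k, v in d.items()}
    return None

def shape_equal(d1, d2):
    return getshape(d1) == getshape(d2)

def sameshape(d1, d2):
    if isinstance(d1, dict):
        if isinstance(d2, dict):
            return (d1.keys() == d2.keys() and
                    all(sameshape(d1[k], d2[k]) for k in d1.keys()))
        return False
    return not isinstance(d2, dict)

base = {'foo': {'bar': {None: None}}}
cases = [
    (base, {'foo': {'bar': {None: {}}}}, False),                    # leaf vs. empty dict
    (base, {'foo': {'bar': {None: 'foo'}}}, True),                  # only a value differs
    (base, {'foo': {'bar': {None: None, 'baz': 'foo'}}}, False),    # extra key
]

for a, b, expected in cases:
    # Each row: (sameshape result, shape_equal result, expected result)
    print(sameshape(a, b), shape_equal(a, b), expected)
```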



Answer 3:


To profile the two currently existing answers, first let's import timeit:

import timeit

Now we need to set up the code:

setup = '''
import copy

def getshape(d):
    if isinstance(d, dict):
        return {k:getshape(d[k]) for k in d}
    else:
        # Replace all non-dict values with None.
        return None

def nneo_shape_equal(d1, d2):
    return getshape(d1) == getshape(d2)

def aaron_shape_equal(d1,d2):
    if isinstance(d1, dict) and isinstance(d2, dict):
        return (d1.keys() == d2.keys() and 
                all(aaron_shape_equal(d1[k], d2[k]) for k in d1.keys()))
    else:
        return not (isinstance(d1, dict) or isinstance(d2, dict))

class Vividict(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

d = Vividict()

d['foo']['bar']
d['foo']['baz']
d['fizz']['buzz']
d['primary']['secondary']['tertiary']['quaternary']

d0 = copy.deepcopy(d)
d1 = copy.deepcopy(d)
d1['primary']['secondary']['tertiary']['extra']
# d == d0 is True
# d == d1 is now False!
'''
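
The Vividict class in the setup string is what builds the nested test data: a plain lookup of a missing key creates and stores a new nested Vividict. A minimal sketch of that behaviour, run outside timeit and using only one of the key paths from the setup:

```python
import copy

class Vividict(dict):
    def __missing__(self, key):
        # Called by dict lookup when key is absent:
        # create a nested Vividict, store it, and return it.
        value = self[key] = type(self)()
        return value

d = Vividict()
d['primary']['secondary']['tertiary']['quaternary']  # the lookup alone creates four levels

d0 = copy.deepcopy(d)
d1 = copy.deepcopy(d)
d1['primary']['secondary']['tertiary']['extra']  # adds one extra key at the third level

print(d)        # {'primary': {'secondary': {'tertiary': {'quaternary': {}}}}}
print(d == d0)  # True: identical structure
print(d == d1)  # False: d1 has the additional 'extra' key
```

So each timed statement exercises one comparison that should succeed (d0 against d) and one that should fail deep in the structure (d1 against d).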

And now let's test the two options out, first with Python 3.3!

>>> timeit.repeat('nneo_shape_equal(d0, d); nneo_shape_equal(d1,d)', setup=setup)
[36.784881490981206, 36.212246977956966, 36.29759863798972]

>>> timeit.repeat('aaron_shape_equal(d0, d); aaron_shape_equal(d1,d)', setup=setup)
[26.838892214931548, 26.61037168605253, 27.170253590098582]

So it looks like my solution takes between two thirds and three quarters of the time, making it more than 1.25 times as fast.

And on a version of Python 3.4 (an alpha) that I compiled myself:

>>> timeit.repeat('nneo_shape_equal(d0, d); nneo_shape_equal(d1,d)', setup=setup)
[272.5629618819803, 273.49581588001456, 270.13374400604516]
>>> timeit.repeat('aaron_shape_equal(d0, d); aaron_shape_equal(d1,d)', setup=setup)
[214.87033835891634, 215.69223327597138, 214.85333003790583]

Still about the same ratio. The much longer absolute times compared with the 3.3 runs are likely because I self-compiled 3.4 without optimizations.
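
The ratios can be checked directly from the reported numbers. This is only arithmetic on the timings listed above; the speedup helper and the choice to compare the best (minimum) run of each are assumptions made for this sketch.

```python
# Timings copied from the timeit.repeat output above, in seconds.
py33_nneo = [36.784881490981206, 36.212246977956966, 36.29759863798972]
py33_aaron = [26.838892214931548, 26.61037168605253, 27.170253590098582]
py34_nneo = [272.5629618819803, 273.49581588001456, 270.13374400604516]
py34_aaron = [214.87033835891634, 215.69223327597138, 214.85333003790583]

def speedup(slow, fast):
    # Hypothetical helper: ratio of the best run of each candidate.
    return min(slow) / min(fast)

s33 = speedup(py33_nneo, py33_aaron)
s34 = speedup(py34_nneo, py34_aaron)

print(round(s33, 2))  # 1.36
print(round(s34, 2))  # 1.26
```

Both figures are above 1.25, with the gap somewhat smaller on the unoptimized 3.4 build.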

Thanks to all readers!



Source: https://stackoverflow.com/questions/24192748/check-that-python-dicts-have-same-shape-and-keys
