Question
With the help of this answer, I'm trying to come up with a function that searches for a key in a nested Python dict and also records the "path" of each match. My function (see below) seems to work, but the result is not preserved when I save it in a list (see the code output). I'm fairly certain the difficulty lies in the yield statement, but I have not been able to figure it out yet.
o={
    'dict1': {
        'dict11': {
            'entry11_1':1,
            'entry11_2':2,
        },
        'dict12': {
            'entry12_1':12,
            'entry12_2':22,
        },
    },
    'dict2': {
        'dict21': {
            'entry21_1':21,
        }
    },
}
curr_pos=[]
def gen_dict_extract(key, var):
    global curr_pos
    if hasattr(var,'iteritems'):
        for k, v in var.iteritems():
            #print curr_pos
            if k == key:
                yield v,curr_pos
            if isinstance(v, dict):
                curr_pos.append(k)
                for result in gen_dict_extract(key, v):
                    yield result
            elif isinstance(v, list):
                for d in v:
                    for result in gen_dict_extract(key, d):
                        yield result
        if len(curr_pos)>0:
            curr_pos.pop()
result_list=[]
for ind,i in enumerate(gen_dict_extract('entry12_1',o)):
    result_list.append(i)
    print result_list[-1]
print result_list[-1]
Output:
(12, ['dict1', 'dict12'])
(12, [])
Answer 1:
In gen_dict_extract you use a global list curr_pos and yield it directly when you find the key (yield v,curr_pos). But a list is a mutable type, and you modify it later (curr_pos.pop()).
What you have stored in result_list is just a reference to the global object, so it contains the expected value inside the loop but is empty once the loop has finished. You should yield a shallow copy instead: yield v,curr_pos[:]
You will then get as expected:
(12, ['dict1', 'dict12'])
(12, ['dict1', 'dict12'])
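The aliasing effect described above can be reproduced without any recursion. The following is a minimal sketch in Python 3 syntax; the names path, by_reference and by_copy are made up for illustration and do not appear in the question.

```python
# Hypothetical names, for illustration only.
path = ['dict1', 'dict12']

by_reference = (12, path)   # the tuple stores a reference to the same list
by_copy = (12, path[:])     # the slice creates a new, independent list

# Later mutation, analogous to curr_pos.pop() in the generator
path.pop()
path.pop()

print(by_reference)  # (12, [])
print(by_copy)       # (12, ['dict1', 'dict12'])
```

The tuple itself is immutable, but the list it points to is not, which is why the stored result appears to change after the fact.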
BTW, if you want to avoid a global list, you could pass the list as an optional parameter:
def gen_dict_extract(key, var, curr_pos = None):
    if curr_pos is None:
        curr_pos = []
    ...
                for result in gen_dict_extract(key, v, curr_pos):
    ...
                    for result in gen_dict_extract(key, d, curr_pos):
    ...
That ensures a new list is used on each fresh invocation, while the same list is passed along correctly when recursing.
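The None default matters because Python evaluates default values once, when the function is defined, so a default of [] would be shared between calls. A small Python 3 sketch of the difference, using two hypothetical helper functions (shared_default and fresh_default) that are not part of the answer:

```python
def shared_default(item, acc=[]):
    # The same list object is reused on every call that omits acc.
    acc.append(item)
    return acc

def fresh_default(item, acc=None):
    # A new list is created for each call that omits acc.
    if acc is None:
        acc = []
    acc.append(item)
    return acc

first_shared = shared_default('a')
second_shared = shared_default('b')
first_fresh = fresh_default('a')
second_fresh = fresh_default('b')

print(second_shared)  # ['a', 'b']
print(second_fresh)   # ['b']
```

With a shared default, a second search would start with whatever path the first one left behind.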
Answer 2:
The problem is that i is a tuple that holds a reference to the mutable list curr_pos. You need to copy i to keep later changes from overwriting the stored result.
import copy
result_list = []
for ind, i in enumerate(gen_dict_extract('entry12_1',o)):
    # deep-copy the tuple so the stored path no longer aliases curr_pos
    result_list.append(copy.deepcopy(i))
print result_list
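A note on why copy.deepcopy is used here rather than copy.copy: a shallow copy of the tuple does not duplicate the list inside it. A brief Python 3 sketch with made-up variable names (path, item, shallow, deep):

```python
import copy

path = ['dict1', 'dict12']
item = (12, path)

shallow = copy.copy(item)    # shallow copy: the inner list is still the same object
deep = copy.deepcopy(item)   # new tuple holding a new, independent list

path.pop()

print(shallow)  # (12, ['dict1'])
print(deep)     # (12, ['dict1', 'dict12'])
```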
Answer 3:
For the sake of completeness, here's a version with Serge's suggestions. I also made some additional changes so that the function can cope with any combination of nested lists and dicts.
def gen_dict_extract(key, var,curr_pos=None):
    """
    key: key to search for
    var: nested dict to search in
    """
    #print curr_pos
    if curr_pos is None:
        curr_pos=[]
    if hasattr(var,'iteritems'):
        for k, v in var.iteritems():
            curr_pos.append(k)
            if k == key:
                yield v,curr_pos[:]
            if isinstance(v, dict):
                for result in gen_dict_extract(key, v,curr_pos):
                    yield result
            elif isinstance(v, list):
                curr_pos.append(0)
                for ind,d in enumerate(v):
                    curr_pos.pop()
                    curr_pos.append(ind)
                    for result in gen_dict_extract(key, d,curr_pos):
                        yield result
                curr_pos.pop()
            curr_pos.pop()
    elif isinstance(var, list):
        curr_pos.append(0)
        for ind,d in enumerate(var):
            curr_pos.pop()
            curr_pos.append(ind)
            for result in gen_dict_extract(key, d,curr_pos):
                yield result
        curr_pos.pop()
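For readers on Python 3, the same idea might be sketched as follows. This is an adaptation, not the answerer's code: it assumes .items() in place of .iteritems(), uses yield from (Python 3.3+) for delegation, and folds the two list branches into one recursive case. As in the version above, the key is appended before the comparison, so the recorded path includes the matched key itself. The sample dict data is invented for the example.

```python
def gen_dict_extract(key, var, curr_pos=None):
    """Yield (value, path) for every occurrence of key in nested dicts/lists."""
    if curr_pos is None:
        curr_pos = []
    if isinstance(var, dict):
        for k, v in var.items():
            curr_pos.append(k)
            if k == key:
                yield v, curr_pos[:]   # copy the path before it changes again
            if isinstance(v, (dict, list)):
                yield from gen_dict_extract(key, v, curr_pos)
            curr_pos.pop()
    elif isinstance(var, list):
        for ind, d in enumerate(var):
            curr_pos.append(ind)       # list positions become part of the path
            yield from gen_dict_extract(key, d, curr_pos)
            curr_pos.pop()

# Invented sample data: the key appears in a dict, a list, and a nested list.
data = {
    'dict1': {'dict12': {'entry12_1': 12}},
    'items': [{'entry12_1': 'a'}, [{'entry12_1': 'b'}]],
}
results = list(gen_dict_extract('entry12_1', data))
for value, path in results:
    print(value, path)
```

Because each yielded path is a copy, collecting the results with list() is safe even though curr_pos keeps changing during the traversal.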
Source: https://stackoverflow.com/questions/33940483/search-in-nested-python-dict-and-record-path