问题
I have 2 dictionaries, dict1
and dict2
which contain the same keys, but different values for the keys. What I want to do is for each dictionary, sort the values from largest to smallest, and then give each value a rank 1-N, 1 being the largest value. From here, I want to get the difference of the ranks for the values in each dictionary for the same key. For example:
dict1 = {a:0.6, b:0.3, c:0.9, d:1.2, e:0.2}
dict2 = {a:1.4, b:7.7, c:9.0, d:2.5, e:2.0}
# sorting by values would look like this:
dict1 = {d:1.2, c:0.9, a:0.6, b:0.3, e:0.2}
dict2 = {c:9.0, b:7.7, d:2.5, e:2.0, a:1.4}
#ranking the values would produce this:
dict1 = {d:1, c:2, a:3, b:4, e:5}
dict2 = {c:1, b:2, d:3, e:4, a:5}
#computing the difference between ranks would be something like this:
diffs = {}
for x in dict1.keys():
diffs[x] = (dict1[x] - dict2[x])
#diffs would look like this:
diffs[a] = -2
diffs[b] = 2
diffs[c] = 1
diffs[d] = -2
diffs[e] = 1
I know dictionaries are meant to be random and not sortable, but maybe there is a method to put the keys and values into a list? The main challenges I am facing are getting the keys and values sorted by value (largest to smallest) and then changing the value to its respective rank in the sorted list.
回答1:
A simple solution for small dicts is
dict1 = {"a":0.6, "b":0.3, "c":0.9, "d":1.2, "e":0.2}
dict2 = {"a":1.4, "b":7.7, "c":9.0, "d":2.5, "e":2.0}
k1 = sorted(dict1, key=dict1.get)
k2 = sorted(dict2, key=dict2.get)
diffs = dict((k, k2.index(k) - k1.index(k)) for k in dict1)
A more efficient, less readable version for larger dicts:
ranks1 = dict(map(reversed, enumerate(sorted(dict1, key=dict1.get))))
ranks2 = dict(map(reversed, enumerate(sorted(dict2, key=dict2.get))))
diffs = dict((k, ranks2[k] - ranks1[k]) for k in dict1)
回答2:
You may be interested in collections.OrderedDict
Here's a sample, my initial thougth is you were also looking for dictionaries with keys ordered by values, things that od1
and od2
are.
d1 = {"a":0.6, "b":0.3, "c":0.9, "d":1.2, "e":0.2}
d2 = {"a":1.4, "b":7.7, "c":9.0, "d":2.5, "e":2.0}
od1 = OrderedDict(sorted(d1.items(), key=lambda t: t[1]))
od2 = OrderedDict(sorted(d2.items(), key=lambda t: t[1]))
k1 = od1.keys()
k2 = od2.keys()
diff = dict((k, n - k2.index(k)) for n, k in enumerate(k1))
If you don't need them then Sven solution is probably faster.
edit: not that faster honestly... (sven.py is his second, more efficient version):
$ cat /tmp/mine.py | time python -m timeit
10000000 loops, best of 3: 0.0842 usec per loop
real 0m 3.69s
user 0m 3.38s
sys 0m 0.03s
$ cat /tmp/sven.py | time python -m timeit
10000000 loops, best of 3: 0.085 usec per loop
real 0m 3.86s
user 0m 3.42s
sys 0m 0.03s
If someone wants to post formatted bigger dicts I'll test them too.
回答3:
What version of python are you using? If 2.7, use OrderedDict.
Per the Python 2.7 docs:
OrderedDict(sorted(d.items(), key=d.get))
If you're using Python 2.4-2.6 you can still use OrderedDict by installing it from pypi here or if you have setuptools, run
easy_install ordereddict
回答4:
A dictionary is not the right data structure to solve this problem. You should convert to sorted lists as soon as possible and produce the dictionary only as the final result. The following sample solution uses iterators and generator expressions where possible, to avoid creating too many (potentially large) helper lists along the way:
def get_ranking(vals):
'''Return a list of pairs: (key, ranking), sorted by key.'''
ranking = sorted(((v, k) for k, v in vals.iteritems()), reverse=True)
return sorted((k, i) for (i, (_v, k)) in enumerate(ranking))
def ranking_diff(rank1, rank2):
return dict((k, v1 - v2) for (k, v1), (_, v2) in itertools.izip(rank1, rank2))
def get_diffs(dict1, dict2):
r1 = get_ranking(dict1)
r2 = get_ranking(dict2)
return ranking_diff(r1, r2)
print get_diffs(dict1, dict2)
# prints: {'a': -2, 'c': 1, 'b': 2, 'e': 1, 'd': -2}
Please note that this solution assumes that both dicts contain exactly the same keys.
来源:https://stackoverflow.com/questions/6046049/python-dictionary-values-sorting