Sort a dictionary with custom sorting function

纵然是瞬间 submitted on 2020-01-07 07:58:19

Question


I have some JSON data that I read from a file using json.load(data_file):

{
  "unused_account":{
    "logins": 0,
    "date_added": 150
  },
  "unused_account2":{
    "logins": 0,
    "date_added": 100
  },
  "power_user_2": {
    "logins": 500,
    "date_added": 400,
    "date_used": 500
  },
  "power_user": {
    "logins": 500,
    "date_added": 300,
    "date_used": 400
  },
  "regular_user": {
    "logins": 20,
    "date_added": 200,
    "date_used": 300
  }
}

I want to sort the entries in a specific order. I have found lots of examples to sort by key or one single value. But I would like to sort the values by these rules:

  1. group by logins descending, but users with 0 logins first
  2. sort users with 0 logins by date_added
  3. sort users with at least 1 login by date_used

Ideally I would write my own compare function like this:

def compare(elem1, elem2):
    """Return >0 if elem2 is greater than elem1
        <0 if elem2 is less than elem1
        0 if they are equal"""
    #rule 1 group by logins
    if elem1['logins'] != elem2['logins']:
        if elem1['logins'] == 0:
            return -1
        if elem2['logins'] == 0:
            return 1
        return elem2['logins'] - elem1['logins']
    # rule 2 sort on date_added
    if elem1['logins'] == 0 and elem2['logins'] == 0:
        return elem2['date_added'] - elem1['date_added']
    #rule 3 sort on date_used
    if elem1['logins'] == elem2['logins'] and elem1['loigns'] > 0:
        return elem2['date_used'] - elem1['date_used']
    return 0  # default

I don't know where or how to plug in my sorting function.


Answer 1:


I'm going to assume you know that dictionaries are unordered (true of Python 2, which these examples use) and that you want to sort either the values or the key-value pairs. The following examples sort the values.

Your comparison function already works, provided you fix the loigns typo in the last if:

>>> sorted(sample.itervalues(), cmp=compare)
[{'logins': 0, 'date_added': 150}, {'logins': 0, 'date_added': 100}, {'logins': 500, 'date_added': 400, 'date_used': 500}, {'logins': 500, 'date_added': 300, 'date_used': 400}, {'logins': 20, 'date_added': 200, 'date_used': 300}]
>>> pprint(_)
[{'date_added': 150, 'logins': 0},
 {'date_added': 100, 'logins': 0},
 {'date_added': 400, 'date_used': 500, 'logins': 500},
 {'date_added': 300, 'date_used': 400, 'logins': 500},
 {'date_added': 200, 'date_used': 300, 'logins': 20}]
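
The cmp argument exists only in Python 2. On Python 3, one way to reuse a comparison function is to wrap it with functools.cmp_to_key. The following is a minimal sketch, not the only way to do it: sample is an inlined stand-in for the data loaded from the file, and compare is a lightly tidied version of the function from the question with the loigns typo fixed.

```python
from functools import cmp_to_key

# Assumed stand-in for the result of json.load(data_file).
sample = {
    "unused_account": {"logins": 0, "date_added": 150},
    "unused_account2": {"logins": 0, "date_added": 100},
    "power_user_2": {"logins": 500, "date_added": 400, "date_used": 500},
    "power_user": {"logins": 500, "date_added": 300, "date_used": 400},
    "regular_user": {"logins": 20, "date_added": 200, "date_used": 300},
}

def compare(elem1, elem2):
    """Negative if elem1 sorts first, positive if elem2 sorts first, 0 if tied."""
    # Rule 1: group by logins, descending, but users with 0 logins first.
    if elem1['logins'] != elem2['logins']:
        if elem1['logins'] == 0:
            return -1
        if elem2['logins'] == 0:
            return 1
        return elem2['logins'] - elem1['logins']
    # Rule 2: both have 0 logins, so the larger date_added comes first.
    if elem1['logins'] == 0:
        return elem2['date_added'] - elem1['date_added']
    # Rule 3: same non-zero logins, so the larger date_used comes first.
    return elem2['date_used'] - elem1['date_used']

# cmp_to_key turns the two-argument comparison into a key object for sorted().
by_cmp = sorted(sample.values(), key=cmp_to_key(compare))
```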

However, you can use the following sort key too:

(not d['logins'], d['logins'], d['date_used'] if d['logins'] else d['date_added'])

This creates a tuple of (no_logins, num_logins, date), where no_logins is True for users who have never logged in and the date picked depends on whether or not the user has logged in.

Use it as the key argument to the sorted() function, and reverse the sort, like this:

>>> key = lambda d: (not d['logins'], d['logins'], d['date_used'] if d['logins'] else d['date_added'])
>>> pprint(sorted(sample.itervalues(), key=key, reverse=True))
[{'date_added': 150, 'logins': 0},
 {'date_added': 100, 'logins': 0},
 {'date_added': 400, 'date_used': 500, 'logins': 500},
 {'date_added': 300, 'date_used': 400, 'logins': 500},
 {'date_added': 200, 'date_used': 300, 'logins': 20}]
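
To see why the reversed sort gives this order, it can help to look at the tuple built for each account. The sketch below is written for Python 3 (values() instead of itervalues()); the names sort_key and keys are illustrative only, and sample is an inlined stand-in for the loaded JSON.

```python
# Assumed stand-in for the result of json.load(data_file).
sample = {
    "unused_account": {"logins": 0, "date_added": 150},
    "unused_account2": {"logins": 0, "date_added": 100},
    "power_user_2": {"logins": 500, "date_added": 400, "date_used": 500},
    "power_user": {"logins": 500, "date_added": 300, "date_used": 400},
    "regular_user": {"logins": 20, "date_added": 200, "date_used": 300},
}

def sort_key(d):
    """Build the (no_logins, num_logins, date) tuple for one account record."""
    date = d['date_used'] if d['logins'] else d['date_added']
    return (not d['logins'], d['logins'], date)

# The tuple each account is ranked by, for inspection.
keys = {name: sort_key(d) for name, d in sample.items()}

# Tuples compare element by element, so with reverse=True the True
# (no logins) entries come first, then higher login counts, then later dates.
by_key = sorted(sample.values(), key=sort_key, reverse=True)
```

Here keys['unused_account'] is (True, 0, 150) and keys['power_user'] is (False, 500, 400), so the unused accounts outrank every account that has logged in.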

If you needed the keys as well, use dict.iteritems() and update the key function to accept a (k, d) tuple:

>>> key = lambda (k, d): (not d['logins'], d['logins'], d['date_used'] if d['logins'] else d['date_added'])
>>> pprint(sorted(sample.iteritems(), key=key, reverse=True))
[('unused_account', {'date_added': 150, 'logins': 0}),
 ('unused_account2', {'date_added': 100, 'logins': 0}),
 ('power_user_2', {'date_added': 400, 'date_used': 500, 'logins': 500}),
 ('power_user', {'date_added': 300, 'date_used': 400, 'logins': 500}),
 ('regular_user', {'date_added': 200, 'date_used': 300, 'logins': 20})]
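
Note that lambda (k, d): ... relies on tuple parameter unpacking, which was removed in Python 3, and iteritems() became items(). A rough Python 3 equivalent, assuming the same data, unpacks the pair inside the function instead. The names item_key, ordered and names are made up for this sketch.

```python
# Assumed stand-in for the result of json.load(data_file).
sample = {
    "unused_account": {"logins": 0, "date_added": 150},
    "unused_account2": {"logins": 0, "date_added": 100},
    "power_user_2": {"logins": 500, "date_added": 400, "date_used": 500},
    "power_user": {"logins": 500, "date_added": 300, "date_used": 400},
    "regular_user": {"logins": 20, "date_added": 200, "date_used": 300},
}

def item_key(item):
    # item is a (name, record) pair; only the record decides the order.
    _, d = item
    return (not d['logins'], d['logins'],
            d['date_used'] if d['logins'] else d['date_added'])

ordered = sorted(sample.items(), key=item_key, reverse=True)
names = [name for name, _ in ordered]
```

On Python 3.7 and later, where dictionaries preserve insertion order, dict(ordered) would keep the entries in this sorted order.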


Source: https://stackoverflow.com/questions/31902857/sort-a-dictionary-with-custom-sorting-function
