Python: sort function breaks in the presence of nan

隐瞒了意图╮ 2020-11-29 07:34

sorted([2, float('nan'), 1]) returns [2, nan, 1]

(At least on Activestate Python 3.1 implementation.)
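A minimal sketch of what is probably going on, assuming the sort only ever asks "is a < b?": every such comparison involving nan evaluates to False, so the sort never sees a pair that looks out of order. The variable names here are only for illustration.

```python
nan = float('nan')

# Every ordering or equality comparison that involves NaN is False,
# including NaN compared with itself.
comparisons = [nan < 1, nan > 1, nan == 1, nan == nan, 1 < nan, 2 < nan]
print(comparisons)

# sorted() relies on "<", so NaN gives it no usable ordering information.
result = sorted([2, nan, 1])
print(result)
```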

I understand nan

6 answers
  •  执笔经年
    2020-11-29 08:01

    Regardless of standards, there are many cases where a user-defined ordering of float and NA values is useful. For instance, I was sorting stock returns and wanted them highest to lowest with NA last (since those were irrelevant). There are four possible combinations:

    1. Ascending float values, NA values last
    2. Ascending float values, NA values first
    3. Descending float values, NA values last
    4. Descending float values, NA values first

    Here is a key function that covers all four scenarios by conditionally replacing NA values with +/- inf:

    import math
    
    def sort_with_na(x, reverse=False, na_last=True):
        """Key function for sorting floats that may include NA (NaN) values
    
        For reliable behavior with NA values, we map each NA to +/- inf to
        guarantee its position rather than relying on the built-in
        ``sorted(reverse=True)``, which does not order NaN reliably. Descending
        order is handled here by negating the key, so do not also pass
        ``reverse=True`` to ``sorted``. To use the ``reverse`` parameter or
        other kwargs, wrap this function with functools.partial, i.e.
    
            sorted(iterable, key=partial(sort_with_na, reverse=True, na_last=False))
    
        :param x: Element to be sorted
        :param bool na_last: Whether NA values should come last or first
        :param bool reverse: Return ascending if ``False`` else descending
        :return float: Sort key for ``x``
        """
        if not math.isnan(x):
            # Negating the value flips the order for descending sorts
            return -x if reverse else x
        else:
            # NA sorts after everything (+inf) or before everything (-inf)
            return float('inf') if na_last else float('-inf')
    

    Testing each of the four combinations:

    from functools import partial
    
    a = [2, float('nan'), 1]
    sorted(a, key=sort_with_na)                                         # Default: [1, 2, nan]
    sorted(a, key=partial(sort_with_na, reverse=False, na_last=True))   # Ascend, NA last: [1, 2, nan]
    sorted(a, key=partial(sort_with_na, reverse=False, na_last=False))  # Ascend, NA first: [nan, 1, 2]
    sorted(a, key=partial(sort_with_na, reverse=True, na_last=True))    # Descend, NA last: [2, 1, nan]
    sorted(a, key=partial(sort_with_na, reverse=True, na_last=False))   # Descend, NA first: [nan, 2, 1]
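One caveat with the +/- inf replacement: if the data contains a genuine inf, it gets the same key as NA, and because sorted is stable their relative order then depends on input order. A possible variant, sketched here under the hypothetical name nan_aware_key (not part of the function above), returns a (group, value) tuple so that NA always lands strictly before or after every real number:

```python
import math
from functools import partial

def nan_aware_key(x, reverse=False, na_last=True):
    """Hypothetical variant of sort_with_na that returns a (group, value) tuple."""
    if math.isnan(x):
        # NaN gets its own group; 0.0 is a placeholder so NaN itself
        # is never used in a comparison.
        return (1 if na_last else -1, 0.0)
    # All real numbers (including +/- inf) share group 0.
    return (0, -x if reverse else x)

data = [float('inf'), 2, float('nan'), 1]
ascending_na_last = sorted(data, key=nan_aware_key)
descending_na_first = sorted(
    data, key=partial(nan_aware_key, reverse=True, na_last=False)
)
print(ascending_na_last)
print(descending_na_first)
```

The group number is compared first, so the placeholder value only matters between NA entries, which all compare equal and keep their input order.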
    
