Odd behavior of numpy.all with object dtypes

离开以前 2020-12-22 01:29

Given an array of dtype=object, numpy.all/any return the last object. For example:

>>> from string import ascii_lowercase         


        
1 answer
  •  梦毁少年i
    2020-12-22 01:30

    In numpy version 1.8.2, np.any and np.all on object arrays behave like classic short-circuit logical and/or functions, much as in LISP. Python's and and or operators do the same: they return one of their operands rather than a bool.

    Some examples:

    In [203]: np.all(np.array([[1,2],1,[],[1,2,3]],dtype=object))
    Out[203]: []
    
    In [204]: np.any(np.array([[1,2],1,[],[1,2,3]],dtype=object))
    Out[204]: [1, 2]
    
    In [205]: np.any(np.array([0,[],[1,2],1,[],[1,2,3]],dtype=object))
    Out[205]: [1, 2]
    
    In [206]: np.all(np.array([True,False,[1,2],1,[],[1,2,3]],dtype=object))
    Out[206]: False
    

    np.all returns the first item that is logically False, or else the last item. np.any returns the first item that is logically True, or else the last item.

    In the LISP world this is regarded as a useful feature. Not only does it stop evaluating elements as soon as the result is clear, but the identity of that return value can be used.
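
    The rule above can be written out as a plain Python loop. This is only a sketch of the observed behavior, not numpy's implementation; the names all_like and any_like are made up for illustration, and the values returned for an empty input (True and False) are an assumption of the sketch.

```python
def all_like(items):
    """Return the first falsy item, else the last item (like chained `and`)."""
    result = True  # returned for an empty input (assumption of this sketch)
    for result in items:
        if not result:
            break  # stop at the first falsy item
    return result


def any_like(items):
    """Return the first truthy item, else the last item (like chained `or`)."""
    result = False  # returned for an empty input (assumption of this sketch)
    for result in items:
        if result:
            break  # stop at the first truthy item
    return result


print(all_like([[1, 2], 1, [], [1, 2, 3]]))            # []
print(any_like([0, [], [1, 2], 1, [], [1, 2, 3]]))     # [1, 2]
print(all_like([True, False, [1, 2], 1, [], [1, 2, 3]]))  # False
```

    Unlike a reduction, these loops really do stop looking at the remaining items once the answer is known.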

    Is there a way of replicating this behavior using the and/or operators and some sort of map or reduce?

    In [8]: 0 or [] or [1,2] or 1 or [1,2,3]
    Out[8]: [1, 2]
    
    ???([0,[],[1,2],1,[1,2,3]])
    

    As suggested in a comment (in Python 3, reduce has to be imported from functools first):

    In [26]: reduce(lambda a,b:a and b, np.array([1,2,3,[1,2,3]],dtype=object))
    Out[26]: [1, 2, 3]
    

    This might not actually short-circuit the whole loop. Rather, it short-circuits each pairwise step and propagates that value forward. Using lambda a,b: b and a returns the first item in the list, not the last. Timings could be used to test whether it loops through the whole array or not.
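
    Instead of timing, one way to check this for the pure-Python reduce is to count how many items are actually consumed. The sketch below uses a hypothetical helper, counting, that records every item it hands out; it says nothing about how numpy's own reduction loops internally.

```python
from functools import reduce


def counting(items, seen):
    """Yield items one by one, recording each item that is consumed."""
    for item in items:
        seen.append(item)
        yield item


data = [1, 0, 2, 3, [1, 2, 3]]

# reduce applies `a and b` pairwise and has no way to stop early.
seen_by_reduce = []
reduce_result = reduce(lambda a, b: a and b, counting(data, seen_by_reduce))

# An explicit loop can break out at the first falsy item.
seen_by_loop = []
loop_result = True
for loop_result in counting(data, seen_by_loop):
    if not loop_result:
        break

print(reduce_result, len(seen_by_reduce))  # 0 5
print(loop_result, len(seen_by_loop))      # 0 2
```

    Both give the same value, but reduce consumes all five items while the loop stops after two.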


    np.all is a thin wrapper around the ufunc reduction np.logical_and.reduce.

    https://github.com/numpy/numpy/blob/master/numpy/core/_methods.py

    umr_all = um.logical_and.reduce
    def _all(a, axis=None, dtype=None, out=None, keepdims=False):
        return umr_all(a, axis, dtype, out, keepdims)
    

    logical_and for dtype=object is defined in the C source:

    https://github.com/numpy/numpy/blob/master/numpy/core/src/umath/funcs.inc.src

    /* Emulates Python's 'a and b' behavior */
    static PyObject *
    npy_ObjectLogicalAnd(PyObject *i1, PyObject *i2)
    

    The same holds for np.any, with a corresponding logical-or function. The numeric dtype versions are defined elsewhere.
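
    Going only by the C comment ("Emulates Python's 'a and b' behavior"), the pairwise operation can be paraphrased in Python as below. The function names are hypothetical, and this is a reading of the comment rather than a translation of the C code.

```python
from functools import reduce


def object_logical_and(a, b):
    """Same value as `a and b`: a if a is falsy, otherwise b."""
    return a if not a else b


def object_logical_or(a, b):
    """Same value as `a or b`: a if a is truthy, otherwise b."""
    return a if a else b


items = [[1, 2], 1, [], [1, 2, 3]]
and_result = reduce(object_logical_and, items)
or_result = reduce(object_logical_or, items)

print(and_result)  # []
print(or_result)   # [1, 2]
```

    Reducing with these two functions gives the same values as Out[203] and Out[204] above.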

    There's a patch that forces np.all/any to return dtype=bool. But by calling np.logical_and.reduce or np.logical_or.reduce directly you can control this yourself.

    In [304]: np.logical_or.reduce(np.array([0,[1,2,3],4],dtype=object))
    Out[304]: [1, 2, 3]
    
    In [305]: np.logical_or.reduce(np.array([0,[1,2,3],4],dtype=object),dtype=bool)
    Out[305]: True
    
