Question
I get some surprising results when trying to evaluate
logical expressions on data that might contain nan
values (as defined in numpy).
I would like to understand why these results arise and how to implement this correctly.
What I don't understand is why these expressions evaluate to the values they do:
from numpy import nan
nan and True
>>> True
# this is wrong.. I would expect to evaluate to nan
True and nan
>>> nan
# OK
nan and False
>>> False
# OK regardless the value of the first element
# the expression should evaluate to False
False and nan
>>> False
#ok
Similarly for or:
True or nan
>>> True #OK
nan or True
>>> nan #wrong the expression is True
False or nan
>>> nan #OK
nan or False
>>> nan #OK
How can I implement (in an efficient way) the correct boolean functions, so that they also handle nan values?
Answer 1:
You can use predicates from the numpy namespace:
>>> import numpy as np
>>> np.logical_and(True, np.nan), np.logical_and(False, np.nan)
(True, False)
>>> np.logical_and(np.nan, True), np.logical_and(np.nan, False)
(True, False)
>>>
>>> np.logical_or(True, np.nan), np.logical_or(False, np.nan)
(True, True)
>>> np.logical_or(np.nan, True), np.logical_or(np.nan, False)
(True, True)
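Note that np.logical_and and np.logical_or treat nan as True rather than propagating it. If the three-valued behaviour described in the question is what is wanted (False wins in and, True wins in or, otherwise nan propagates), one possible approach is sketched below. The names nan_and and nan_or are hypothetical helpers, not NumPy functions, and encoding the result as floats (1.0, 0.0, nan) is an assumption made only for this illustration.

```python
import numpy as np

def nan_and(a, b):
    """Three-valued AND: a definite False wins, otherwise NaN propagates."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # NaN == 0 is False, so only genuine zeros count as a definite False.
    definitely_false = (a == 0) | (b == 0)
    unknown = np.isnan(a) | np.isnan(b)
    return np.where(definitely_false, 0.0, np.where(unknown, np.nan, 1.0))

def nan_or(a, b):
    """Three-valued OR: a definite True wins, otherwise NaN propagates."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # NaN != 0 is True, so NaN has to be excluded explicitly here.
    definitely_true = ((a != 0) & ~np.isnan(a)) | ((b != 0) & ~np.isnan(b))
    unknown = np.isnan(a) | np.isnan(b)
    return np.where(definitely_true, 1.0, np.where(unknown, np.nan, 0.0))

# Element-wise use on arrays (hypothetical sample data).
left = [1, 0, np.nan, np.nan]
right = [np.nan, np.nan, 1, 0]
print(nan_and(left, right))
print(nan_or(left, right))
```

Both helpers work element-wise on arrays through np.where, so no Python-level loop is needed. For scalar inputs they return a zero-dimensional array, which can be converted with float() if a plain number is preferred.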
EDIT: The built-in boolean operators behave slightly differently. From the docs: x and y is equivalent to "if x is false, then x, else y". So if the first argument is falsy, the operator returns that argument itself (not its boolean equivalent, as it were). Therefore:
>>> (None and True) is None
True
>>> [] and True
[]
>>> [] and False
[]
and so on.
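The rule quoted from the docs can be written out as ordinary functions, which makes the results in the question easier to trace. This is only an illustrative sketch: py_and and py_or are hypothetical names, and unlike the real operators they receive both arguments already evaluated. The one fact everything hinges on is that nan is a non-zero float and therefore truthy.

```python
import math

nan = float("nan")  # numpy.nan is an ordinary Python float like this one

def py_and(x, y):
    # "if x is false, then x, else y"
    return y if x else x

def py_or(x, y):
    # "if x is false, then y, else x"
    return x if x else y

print(bool(nan))            # nan is truthy
print(py_and(nan, True))    # first operand truthy, so the second is returned
print(py_and(True, nan))    # first operand truthy, so nan is returned
print(py_or(nan, True))     # first operand truthy, so nan itself is returned
print(py_or(False, nan))    # first operand falsy, so the second is returned
```

Because the operators hand back one of their operands unchanged, nan never gets any special treatment: it simply behaves like any other truthy value.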
Answer 2:
When Python evaluates an expression containing and, it evaluates the left operand first and evaluates the right operand only if the left one is truthy. For the or operator, if the left operand is truthy, there is no need to evaluate the right operand at all.
For example, in the expression 2>2 and 3==3, Python first checks whether the first expression 2>2 is true. Because it is false, there is no need to evaluate the second expression: the result of the and is already False. If the expression were 2==2 and 3==3 instead, the first expression 2==2 is true, so Python must also evaluate the second expression; since that one is true as well, the result is True.
In nan and True, nan is truthy (bool(nan) is True), so Python goes on to evaluate the second operand and returns its value, which is True. By the same logic, True and nan returns nan.
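With the or operator, a truthy first operand is enough to decide the result, so True or nan returns True. For the same reason, nan or True returns nan: nan is truthy, so it is returned without the second operand being looked at.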
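The evaluation order described above can be observed directly by recording which operands actually get evaluated. The traced helper and the calls list below are hypothetical names introduced only for this demonstration.

```python
calls = []

def traced(label, value):
    # Record that this operand was evaluated, then hand back its value.
    calls.append(label)
    return value

nan = float("nan")

# and: the left operand (nan) is truthy, so the right one is evaluated too.
and_result = traced("left", nan) and traced("right", True)
and_calls = list(calls)

calls.clear()

# or: the left operand (nan) is truthy, so the right one is never evaluated.
or_result = traced("left", nan) or traced("right", True)
or_calls = list(calls)

print(and_calls, and_result)
print(or_calls, or_result)
```

The recorded labels show that and needed both operands while or stopped after the first, which is why nan or True hands back nan.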
Source: https://stackoverflow.com/questions/17273312/python-numpy-nan-and-logical-functions-wrong-results