Question
I implemented the __contains__ method on a class for the first time the other day, and the behavior wasn't what I expected. I suspect there's some subtlety to the in operator that I don't understand, and I was hoping someone could enlighten me.
It appears to me that the in operator doesn't simply wrap an object's __contains__ method, but also attempts to coerce the output of __contains__ to a boolean. For example, consider the class
class Dummy(object):
    def __contains__(self, val):
        # Don't perform a comparison; just return a list as
        # an example.
        return [False, False]
The in operator and a direct call to the __contains__ method return very different output:
>>> dum = Dummy()
>>> 7 in dum
True
>>> dum.__contains__(7)
[False, False]
Again, it looks like in is calling __contains__ but then coercing the result to bool. I can't find this behavior documented anywhere, except that the __contains__ documentation says __contains__ should only ever return True or False.
I'm happy following the convention, but can someone tell me the precise relationship between in and __contains__?
Epilogue
I decided to accept @eli-korvigo's answer, but everyone should look at @ashwini-chaudhary's comment about the bug, below.
Answer 1:
Use the source, Luke!
Let's trace the implementation of the in operator:
>>> import dis
>>> class test(object):
...     def __contains__(self, other):
...         return True
...
>>> def in_():
...     return 1 in test()
...
>>> dis.dis(in_)
  2           0 LOAD_CONST               1 (1)
              3 LOAD_GLOBAL              0 (test)
              6 CALL_FUNCTION            0 (0 positional, 0 keyword pair)
              9 COMPARE_OP               6 (in)
             12 RETURN_VALUE
As you can see, the in operator becomes the COMPARE_OP virtual machine instruction. You can find that in ceval.c:
TARGET(COMPARE_OP)
    w = POP();
    v = TOP();
    x = cmp_outcome(oparg, v, w);
    Py_DECREF(v);
    Py_DECREF(w);
    SET_TOP(x);
    if (x == NULL) break;
    PREDICT(POP_JUMP_IF_FALSE);
    PREDICT(POP_JUMP_IF_TRUE);
    DISPATCH();
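A side note on the instruction name: the listing above comes from an older CPython. Since CPython 3.9, membership tests compile to a dedicated CONTAINS_OP instruction rather than COMPARE_OP. A small sketch for checking which one your interpreter emits (the function in_ and the list membership_ops are just illustrative names):

```python
import dis


def in_(container):
    return 1 in container


# Collect the opcode names of the compiled function body.
opnames = [ins.opname for ins in dis.get_instructions(in_)]

# Keep only the instruction that performs the membership test:
# COMPARE_OP on CPython 3.8 and earlier, CONTAINS_OP on 3.9 and later.
membership_ops = [
    name for name in opnames if name in ("COMPARE_OP", "CONTAINS_OP")
]
print(membership_ops)
```

Whichever name is printed, the instruction ends up in PySequence_Contains, so the rest of the trace still applies.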
Take a look at one of the cases of the switch in cmp_outcome():
case PyCmp_IN:
    res = PySequence_Contains(w, v);
    if (res < 0)
        return NULL;
    break;
Here we have the PySequence_Contains call:
int
PySequence_Contains(PyObject *seq, PyObject *ob)
{
    Py_ssize_t result;
    PySequenceMethods *sqm = seq->ob_type->tp_as_sequence;
    if (sqm != NULL && sqm->sq_contains != NULL)
        return (*sqm->sq_contains)(seq, ob);
    result = _PySequence_IterSearch(seq, ob, PY_ITERSEARCH_CONTAINS);
    return Py_SAFE_DOWNCAST(result, Py_ssize_t, int);
}
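That always returns a C int: 1 for a match, 0 for no match, and -1 on error. cmp_outcome() then turns that int into True or False, so the object returned by __contains__ never reaches the caller. For a class written in Python, the sq_contains slot is a wrapper that calls __contains__ and reduces its result with PyObject_IsTrue(), which is where the list becomes True.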
P.S. Thanks to Martijn Pieters for showing how to find the implementation of the in operator.
Answer 2:
The Python reference for __contains__ says that __contains__ should return True or False.
If the return value is not a boolean, it is converted to one. Here is proof:
class MyValue:
    def __bool__(self):
        print("__bool__ function ran")
        return True

class Dummy:
    def __contains__(self, val):
        return MyValue()
Now, in the shell:
>>> dum = Dummy()
>>> 7 in dum
__bool__ function ran
True
And bool() of a nonempty list returns True.
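To make the truthiness rule concrete, here is a small sketch that returns a few different values from __contains__ and records what in gives back. Returns is a hypothetical helper class used only for this demonstration; operator.contains is included because it performs the same membership test.

```python
import operator


class Returns:
    """Hypothetical container whose __contains__ returns a fixed value."""

    def __init__(self, value):
        self.value = value

    def __contains__(self, item):
        return self.value


# Map the repr of each returned value to the result of the `in` test.
results = {}
for value in ([False, False], [], 0, "x", None):
    results[repr(value)] = 7 in Returns(value)
    print(repr(value), "->", results[repr(value)])

# operator.contains(a, b) is the function form of `b in a`.
via_operator = operator.contains(Returns([False, False]), 7)
print(via_operator)
```

Only the truthiness of the returned object matters: the nonempty list and the nonempty string give True, while the empty list, 0, and None give False.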
Edit:
That is only the documentation for __contains__. If you really want to see the precise relationship, you should look at the source code; I'm not sure exactly where, but another answer already covers that. The documentation for comparisons says:
However, these methods can return any value, so if the comparison operator is used in a Boolean context (e.g., in the condition of an if statement), Python will call bool() on the value to determine if the result is true or false.
So you can guess that something similar happens with __contains__.
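One difference is worth checking for yourself. With a rich comparison such as ==, the returned object is handed back unchanged unless something asks for its truth value, whereas in always performs the conversion. A minimal sketch, using a made-up class Loose whose methods both return the same list:

```python
class Loose:
    """Made-up class: both methods return a list instead of a bool."""

    def __eq__(self, other):
        return [False, False]

    def __contains__(self, item):
        return [False, False]


obj = Loose()

eq_result = (obj == 7)   # the list comes back as-is
in_result = (7 in obj)   # the list is reduced to a bool

print(eq_result)
print(in_result)

# Only in a Boolean context is the == result converted as well.
branch_taken = "yes" if obj == 7 else "no"
print(branch_taken)
```

So the conversion is the same in spirit, but with in it is unconditional.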
Source: https://stackoverflow.com/questions/38542543/functionality-of-python-in-vs-contains