Why is 'x' in ('x',) faster than 'x' == 'x'?

误落风尘 2021-01-29 18:15

>>> import timeit
>>> timeit.timeit("'x' in ('x',)")
0.04869917374131205
>>> timeit.timeit("'x' == 'x'")
0.06144205736110564

Also works for

2 answers

我在风中等你 (original poster)

2021-01-29 18:51
There are three factors at play here which, combined, produce this surprising behavior.

First: the in operator takes a shortcut and checks identity (x is y) before it checks equality (x == y):
```
>>> n = float('nan')
>>> n in (n, )
True
>>> n == n
False
>>> n is n
True
```
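The shortcut can also be observed without NaN, by counting how often == actually runs. This is only a minimal sketch: CountingEq is a hypothetical helper class invented for this illustration, and the call counts shown assume the built-in tuple.

```python
class CountingEq:
    """Hypothetical helper: counts how often == is evaluated on it."""

    def __init__(self):
        self.eq_calls = 0

    def __eq__(self, other):
        self.eq_calls += 1
        return self is other

    # Defining __eq__ sets __hash__ to None; restore identity hashing.
    __hash__ = object.__hash__


a = CountingEq()
b = CountingEq()

same_found = a in (a,)        # identity matches, so __eq__ is never called
calls_after_same = a.eq_calls

other_found = a in (b,)       # identity fails, so the tuple falls back to ==
calls_after_other = a.eq_calls + b.eq_calls
```

The two counters are added together because which operand's __eq__ gets called is an implementation detail of the container.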
Second: because of Python's string interning, both "x"s in "x" in ("x", ) will be identical:
```
>>> "x" is "x"
True
```
(Big warning: this is implementation-specific behavior! The is operator should never be used to compare strings because it will sometimes give surprising answers; for example, "x" * 100 is "x" * 100 may evaluate to False.)
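
To make the warning concrete, here is a small sketch using strings built at run time, which the compiler cannot merge into one constant. The variable names are made up for the example; sys.intern is the standard library function that explicitly requests the shared, interned copy of a string.

```python
import sys

# Both strings are assembled at run time, so they are not compile-time constants.
left = "".join(["py", "thon"])
right = "".join(["py", "thon"])

equal = left == right    # True: same characters
# left is right is typically False in CPython here (two separate objects),
# which is exactly why identity must not be used to compare strings.

# Interning maps equal strings to one shared object, so identity then holds.
interned_same = sys.intern(left) is sys.intern(right)
```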

Third: as detailed in Veedrac's fantastic answer, tuple.__contains__ (x in (y, ) is roughly equivalent to (y, ).__contains__(x)) gets to the point of performing the identity check faster than str.__eq__ (again, x == y is roughly equivalent to x.__eq__(y)) does.
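
As a quick sketch of what "roughly equivalent" means here (the variable names are only for illustration): the operator and the explicit method call agree in the simple string case, but __eq__ is allowed to return NotImplemented, which the == operator then resolves to a plain boolean.

```python
x = "x"
y = "x"

# The in operator and the explicit __contains__ call agree for a tuple.
via_operator = x in (y,)
via_method = (y,).__contains__(x)

# For two strings, == and the explicit __eq__ call agree as well.
eq_operator = x == y
eq_method = x.__eq__(y)

# "Roughly": __eq__ may decline a comparison, and == handles that case.
declined = (1).__eq__("a")   # NotImplemented
resolved = 1 == "a"          # False
```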

You can see evidence for this in the timings below: when the two operands are different objects, x in (y, ) is significantly slower than the logically equivalent x == y:
```
In [18]: %timeit 'x' in ('x', )
10000000 loops, best of 3: 65.2 ns per loop

In [19]: %timeit 'x' == 'x'    
10000000 loops, best of 3: 68 ns per loop

In [20]: %timeit 'x' in ('y', ) 
10000000 loops, best of 3: 73.4 ns per loop

In [21]: %timeit 'x' == 'y'    
10000000 loops, best of 3: 56.2 ns per loop
```
The x in (y, ) case is slower because, after the is comparison fails, the in operator falls back to normal equality checking (i.e., using ==), so the comparison takes about the same amount of time as ==, rendering the entire operation slower because of the overhead of creating the tuple, walking its members, etc.

Note also that a in (b, ) is only faster when a is b:
```
In [48]: a = 1             

In [49]: b = 2

In [50]: %timeit a is a or a == a
10000000 loops, best of 3: 95.1 ns per loop

In [51]: %timeit a in (a, )      
10000000 loops, best of 3: 140 ns per loop

In [52]: %timeit a is b or a == b
10000000 loops, best of 3: 177 ns per loop

In [53]: %timeit a in (b, )      
10000000 loops, best of 3: 169 ns per loop
```
(Why is a in (b, ) faster than a is b or a == b? My guess would be fewer virtual machine instructions: a in (b, ) is only ~3 instructions, whereas a is b or a == b takes quite a few more.)
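
The instruction-count guess can be checked with the standard dis module. The sketch below is not from the original answer, and count_instructions is a hypothetical helper. The exact numbers depend on the CPython version, and they include bookkeeping instructions such as the final return, so they will not match the ~3 figure exactly.

```python
import dis


def count_instructions(expression):
    """Compile an expression and count its bytecode instructions."""
    code = compile(expression, "<example>", "eval")
    return sum(1 for _ in dis.get_instructions(code))


n_in = count_instructions("a in (b, )")
n_or = count_instructions("a is b or a == b")

print(n_in, n_or)  # exact counts vary between CPython versions
```

Calling dis.dis on the same expression strings prints the individual instructions if you want to see where the difference comes from.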

Veedrac's answer (https://stackoverflow.com/a/28889838/71522) goes into much more detail on specifically what happens during each of == and in, and is well worth the read.