In Python 3, is a list comprehension simply syntactic sugar for a generator expression fed into the list function?
e.g. is the following code:
Both work differently. The list comprehension version takes advantage of the special bytecode LIST_APPEND which calls PyList_Append directly for us. Hence it avoids an attribute lookup to list.append and a function call at the Python level.
>>> def func_lc():
[x**2 for x in y]
...
>>> dis.dis(func_lc)
2 0 LOAD_CONST 1 (<code object <listcomp> at 0x10d3c6780, file "<ipython-input-42-ead395105775>", line 2>)
3 LOAD_CONST 2 ('func_lc.<locals>.<listcomp>')
6 MAKE_FUNCTION 0
9 LOAD_GLOBAL 0 (y)
12 GET_ITER
13 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
16 POP_TOP
17 LOAD_CONST 0 (None)
20 RETURN_VALUE
>>> lc_object = list(dis.get_instructions(func_lc))[0].argval
>>> lc_object
<code object <listcomp> at 0x10d3c6780, file "<ipython-input-42-ead395105775>", line 2>
>>> dis.dis(lc_object)
2 0 BUILD_LIST 0
3 LOAD_FAST 0 (.0)
>> 6 FOR_ITER 16 (to 25)
9 STORE_FAST 1 (x)
12 LOAD_FAST 1 (x)
15 LOAD_CONST 0 (2)
18 BINARY_POWER
19 LIST_APPEND 2
22 JUMP_ABSOLUTE 6
>> 25 RETURN_VALUE
On the other hand the list() version simply passes the generator object to list's __init__ method which then calls its extend method internally. As the object is not a list or tuple, CPython then gets its iterator first and then simply adds the items to the list until the iterator is exhausted:
>>> def func_ge():
list(x**2 for x in y)
...
>>> dis.dis(func_ge)
2 0 LOAD_GLOBAL 0 (list)
3 LOAD_CONST 1 (<code object <genexpr> at 0x10cde6ae0, file "<ipython-input-41-f9a53483f10a>", line 2>)
6 LOAD_CONST 2 ('func_ge.<locals>.<genexpr>')
9 MAKE_FUNCTION 0
12 LOAD_GLOBAL 1 (y)
15 GET_ITER
16 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
19 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
22 POP_TOP
23 LOAD_CONST 0 (None)
26 RETURN_VALUE
>>> ge_object = list(dis.get_instructions(func_ge))[1].argval
>>> ge_object
<code object <genexpr> at 0x10cde6ae0, file "<ipython-input-41-f9a53483f10a>", line 2>
>>> dis.dis(ge_object)
2 0 LOAD_FAST 0 (.0)
>> 3 FOR_ITER 15 (to 21)
6 STORE_FAST 1 (x)
9 LOAD_FAST 1 (x)
12 LOAD_CONST 0 (2)
15 BINARY_POWER
16 YIELD_VALUE
17 POP_TOP
18 JUMP_ABSOLUTE 3
>> 21 LOAD_CONST 1 (None)
24 RETURN_VALUE
>>>
Timing comparisons:
>>> %timeit [x**2 for x in range(10**6)]
1 loops, best of 3: 453 ms per loop
>>> %timeit list(x**2 for x in range(10**6))
1 loops, best of 3: 478 ms per loop
>>> %%timeit
out = []
for x in range(10**6):
out.append(x**2)
...
1 loops, best of 3: 510 ms per loop
Normal loops are slightly slow due to slow attribute lookup. Cache it and time again.
>>> %%timeit
out = [];append=out.append
for x in range(10**6):
append(x**2)
...
1 loops, best of 3: 467 ms per loop
Apart from the fact that list comprehension don't leak the variables anymore one more difference is that something like this is not valid anymore:
>>> [x**2 for x in 1, 2, 3] # Python 2
[1, 4, 9]
>>> [x**2 for x in 1, 2, 3] # Python 3
File "<ipython-input-69-bea9540dd1d6>", line 1
[x**2 for x in 1, 2, 3]
^
SyntaxError: invalid syntax
>>> [x**2 for x in (1, 2, 3)] # Add parenthesis
[1, 4, 9]
>>> for x in 1, 2, 3: # Python 3: For normal loops it still works
print(x**2)
...
1
4
9
You can actually show that the two can have different outcomes to prove they are inherently different:
>>> list(next(iter([])) if x > 3 else x for x in range(10))
[0, 1, 2, 3]
>>> [next(iter([])) if x > 3 else x for x in range(10)]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in <listcomp>
StopIteration
The expression inside the comprehension is not treated as a generator since the comprehension does not handle the StopIteration, whereas the list constructor does.
They aren't the same, list() will evaluate what ever is given to it after what is in the parentheses has finished executing, not before.
The [] in python is a bit magical, it tells python to wrap what ever is inside it as a list, more like a type hint for the language.
Both forms create and call an anonymous function. However, the list(...) form creates a generator function and passes the returned generator-iterator to list, while with the [...] form, the anonymous function builds the list directly with LIST_APPEND opcodes.
The following code gets decompilation output of the anonymous functions for an example comprehension and its corresponding genexp-passed-to-list:
import dis
def f():
[x for x in []]
def g():
list(x for x in [])
dis.dis(f.__code__.co_consts[1])
dis.dis(g.__code__.co_consts[1])
The output for the comprehension is
4 0 BUILD_LIST 0
3 LOAD_FAST 0 (.0)
>> 6 FOR_ITER 12 (to 21)
9 STORE_FAST 1 (x)
12 LOAD_FAST 1 (x)
15 LIST_APPEND 2
18 JUMP_ABSOLUTE 6
>> 21 RETURN_VALUE
The output for the genexp is
7 0 LOAD_FAST 0 (.0)
>> 3 FOR_ITER 11 (to 17)
6 STORE_FAST 1 (x)
9 LOAD_FAST 1 (x)
12 YIELD_VALUE
13 POP_TOP
14 JUMP_ABSOLUTE 3
>> 17 LOAD_CONST 0 (None)
20 RETURN_VALUE