I am trying to use the itertools.product
function to make a segment of my code (in an isotopic pattern simulator) easier to read and hopefully faster as well (th
I timed these two functions, which use the absolute minimum of extra code:
from itertools import product

def nested_for(first_iter, second_iter):
    for i in first_iter:
        for j in second_iter:
            pass

def using_product(first_iter, second_iter):
    for i in product(first_iter, second_iter):
        pass
Their bytecode instructions are similar:
dis(nested_for)
  2           0 SETUP_LOOP              26 (to 28)
              2 LOAD_FAST                0 (first_iter)
              4 GET_ITER
        >>    6 FOR_ITER                18 (to 26)
              8 STORE_FAST               2 (i)

  3          10 SETUP_LOOP              12 (to 24)
             12 LOAD_FAST                1 (second_iter)
             14 GET_ITER
        >>   16 FOR_ITER                 4 (to 22)
             18 STORE_FAST               3 (j)

  4          20 JUMP_ABSOLUTE           16
        >>   22 POP_BLOCK
        >>   24 JUMP_ABSOLUTE            6
        >>   26 POP_BLOCK
        >>   28 LOAD_CONST               0 (None)
             30 RETURN_VALUE
dis(using_product)
  2           0 SETUP_LOOP              18 (to 20)
              2 LOAD_GLOBAL              0 (product)
              4 LOAD_FAST                0 (first_iter)
              6 LOAD_FAST                1 (second_iter)
              8 CALL_FUNCTION            2
             10 GET_ITER
        >>   12 FOR_ITER                 4 (to 18)
             14 STORE_FAST               2 (i)

  3          16 JUMP_ABSOLUTE           12
        >>   18 POP_BLOCK
        >>   20 LOAD_CONST               0 (None)
             22 RETURN_VALUE
And here are the results:
>>> from functools import partial
>>> from timeit import timeit
>>> timer = partial(timeit, number=1000, globals=globals())
>>> timer("nested_for(range(100), range(100))")
0.1294467518782625
>>> timer("using_product(range(100), range(100))")
0.4335527486212385
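For anyone who wants to reproduce the comparison outside an interactive session, a minimal self-contained sketch might look like the following. The helper `best_time` and its parameters (`size`, `number`, `repeats`) are hypothetical names introduced here for illustration. It uses `timeit.repeat` and keeps the fastest run, which is one common way to reduce noise, not necessarily how the numbers above were obtained.

```python
from itertools import product
from timeit import repeat


def nested_for(first_iter, second_iter):
    for i in first_iter:
        for j in second_iter:
            pass


def using_product(first_iter, second_iter):
    for i in product(first_iter, second_iter):
        pass


def best_time(func, size=100, number=1000, repeats=3):
    """Hypothetical helper: fastest of `repeats` runs of `number` calls each."""
    timings = repeat(
        lambda: func(range(size), range(size)),
        number=number,
        repeat=repeats,
    )
    return min(timings)


if __name__ == "__main__":
    nested = best_time(nested_for)
    prod = best_time(using_product)
    print(f"nested_for:    {nested:.4f} s")
    print(f"using_product: {prod:.4f} s")
    print(f"ratio:         {prod / nested:.2f}")
```

The absolute timings will differ between machines and Python versions, so the ratio is the more useful figure to compare.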
The results of additional tests performed via timeit and manual use of perf_counter were consistent with those above. Using product is substantially slower than the nested for loops (roughly 3.3 times slower in the run above). However, based on the tests already displayed in previous answers, the discrepancy between the two approaches appears to shrink as the number of nested loops (and, of course, the size of the tuple containing the Cartesian product) grows.
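The dependence on nesting depth could be checked with a rough sketch along these lines, timed manually with perf_counter. The names `nested_2`, `nested_3`, `product_d` and `elapsed` are hypothetical, and the sizes are chosen arbitrarily so that both depths perform roughly 10**4 iterations per call. It is a starting point for experimentation, not a rigorous benchmark.

```python
from itertools import product
from time import perf_counter


def nested_2(n):
    for i in range(n):
        for j in range(n):
            pass


def nested_3(n):
    for i in range(n):
        for j in range(n):
            for k in range(n):
                pass


def product_d(n, depth):
    # product(range(n), repeat=depth) yields tuples of length `depth`
    for combo in product(range(n), repeat=depth):
        pass


def elapsed(func, *args, calls=200):
    """Hypothetical helper: seconds taken by `calls` invocations of func(*args)."""
    start = perf_counter()
    for _ in range(calls):
        func(*args)
    return perf_counter() - start


if __name__ == "__main__":
    # (depth, size per loop, matching nested-loop function)
    for depth, n, nested in ((2, 100, nested_2), (3, 22, nested_3)):
        t_nested = elapsed(nested, n)
        t_product = elapsed(product_d, n, depth)
        print(f"depth {depth}: product/nested = {t_product / t_nested:.2f}")
```

If the ratio printed for depth 3 is noticeably smaller than for depth 2, that would be consistent with the trend described above. A single run is noisy, so repeating the measurement is advisable.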