Why does a class definition always produce the same bytecode?

*爱你&永不变心* posted on 2020-01-01 02:04:06


Say I do:

#!/usr/bin/env python
# encoding: utf-8

class A(object):
    pass

Now I disassemble it:

python -m dis test0.py 
  4           0 LOAD_CONST               0 ('A')
              3 LOAD_NAME                0 (object)
              6 BUILD_TUPLE              1
              9 LOAD_CONST               1 (<code object A at 0x1004ebb30, file "test0.py", line 4>)
             12 MAKE_FUNCTION            0
             15 CALL_FUNCTION            0
             18 BUILD_CLASS         
             19 STORE_NAME               1 (A)
             22 LOAD_CONST               2 (None)
             25 RETURN_VALUE        

Now I add some statements in the class definition:

#!/usr/bin/env python
# encoding: utf-8

class A(object):
    print 'hello'

And I disassemble again:

  4           0 LOAD_CONST               0 ('A')
              3 LOAD_NAME                0 (object)
              6 BUILD_TUPLE              1
              9 LOAD_CONST               1 (<code object A at 0x1004ebb30, file "test0.py", line 4>)
             12 MAKE_FUNCTION            0
             15 CALL_FUNCTION            0
             18 BUILD_CLASS         
             19 STORE_NAME               1 (A)
             22 LOAD_CONST               2 (None)
             25 RETURN_VALUE        

Why don't the new statements appear in the new bytecode?


The new statements are stored in nested bytecode. You can see in your disassembly that another code object is loaded:

      9 LOAD_CONST               1 (<code object A at 0x1004ebb30, file "test0.py", line 4>)

You need to inspect that code object instead. The class body is executed much like a function call, and the local namespace that call produces is then used to form the class members.
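For the module-level case in the question, one way to reach that nested code object is to compile the source yourself and look through the module's constants. This is only a minimal sketch: SOURCE, module_code, nested and class_body are illustrative names, the source string uses print('hello') so that it parses on both Python 2 and Python 3, and the opcodes printed will differ between interpreter versions (Python 3, for example, uses LOAD_BUILD_CLASS rather than BUILD_CLASS).

```python
import dis
import types

# Source mirroring the question's second test0.py (illustrative string).
SOURCE = "class A(object):\n    print('hello')\n"

# Compiling does not execute the class body, so nothing is printed yet.
module_code = compile(SOURCE, "test0.py", "exec")

# The class body is a separate code object stored among the module's constants.
nested = [c for c in module_code.co_consts if isinstance(c, types.CodeType)]
class_body = nested[0]

print(class_body.co_name)  # the nested code object is named after the class
dis.dis(class_body)        # disassemble the class body itself
```

Filtering co_consts by type avoids hard-coding the constant's index, which depends on what else the module contains.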


>>> import dis
>>> def wrapper():
...     class A(object):
...         pass
>>> dis.dis(wrapper)
  2           0 LOAD_CONST               1 ('A')
              3 LOAD_GLOBAL              0 (object)
              6 BUILD_TUPLE              1
              9 LOAD_CONST               2 (<code object A at 0x104b99930, file "<stdin>", line 2>)
             12 MAKE_FUNCTION            0
             15 CALL_FUNCTION            0
             18 BUILD_CLASS         
             19 STORE_FAST               0 (A)
             22 LOAD_CONST               0 (None)
             25 RETURN_VALUE        
>>> dis.dis(wrapper.__code__.co_consts[2])
  2           0 LOAD_NAME                0 (__name__)
              3 STORE_NAME               1 (__module__)

  3           6 LOAD_LOCALS         
              7 RETURN_VALUE        

This is the same setup as your first sample, wrapped in a function so it can be disassembled interactively. The class body lives in the wrapper.__code__.co_consts tuple, which is what the LOAD_CONST bytecode indexes into; here the index is 2.

Now we can add a class body:

>>> def wrapper():
...     class A(object):
...         print 'hello'
...         1+1
...         pass
>>> dis.dis(wrapper)
  2           0 LOAD_CONST               1 ('A')
              3 LOAD_GLOBAL              0 (object)
              6 BUILD_TUPLE              1
              9 LOAD_CONST               2 (<code object A at 0x104b4adb0, file "<stdin>", line 2>)
             12 MAKE_FUNCTION            0
             15 CALL_FUNCTION            0
             18 BUILD_CLASS         
             19 STORE_FAST               0 (A)
             22 LOAD_CONST               0 (None)
             25 RETURN_VALUE        
>>> dis.dis(wrapper.__code__.co_consts[2])
  2           0 LOAD_NAME                0 (__name__)
              3 STORE_NAME               1 (__module__)

  3           6 LOAD_CONST               0 ('hello')
              9 PRINT_ITEM          
             10 PRINT_NEWLINE       

  4          11 LOAD_CONST               2 (2)
             14 POP_TOP             

  5          15 LOAD_LOCALS         
             16 RETURN_VALUE        

Now the class body appears; we can see the bytecode that will be executed when the class body runs. Note that the compiler has already folded 1+1 into the constant 2.

Of note are the LOAD_NAME and STORE_NAME bytecodes executed for every class body; they retrieve the module name and store it as a new local name __module__, so that your class will end up with a __module__ attribute once created.

The LOAD_LOCALS bytecode then gathers all the local names produced in this 'function' and returns them to the caller, so that the BUILD_CLASS bytecode can combine that namespace with the 'A' string and the bases tuple (created with BUILD_TUPLE) to produce your new class object.
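The sequence described above can be approximated in plain Python. This is a rough analogue and not what the interpreter literally does: class_body is a hypothetical stand-in for the nested code object, locals() plays the role of LOAD_LOCALS, and the three-argument form of type() stands in for BUILD_CLASS, assuming the default metaclass.

```python
def class_body():
    # Mirrors LOAD_NAME __name__ / STORE_NAME __module__ at the top of the body.
    __module__ = __name__
    # An ordinary assignment inside the class body becomes a class attribute.
    greeting = 'hello'
    # Mirrors LOAD_LOCALS / RETURN_VALUE: hand the local namespace back.
    return locals()

# Mirrors BUILD_CLASS: class name, bases tuple, and the namespace from the body.
A = type('A', (object,), class_body())

print(A.__name__)
print(A.greeting)
```

The resulting A behaves like a class written with an ordinary class statement: the names assigned inside class_body show up as attributes on it.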

