How references to variables are resolved in Python

無奈伤痛 2020-11-29 19:00

This message is a bit long with many examples, but I hope it will help me and others to better grasp the full story of variables and attribute lookup in Python 2.7.

3 answers
  •  执念已碎
    2020-11-29 19:43

    Long story short, this is a corner case of Python's scoping that is a bit inconsistent, but has to be kept for backwards compatibility (and because it's not that clear what the right answer should be). You can see lots of the original discussion about it on the Python mailing list when PEP 227 was being implemented, and some in the bug for which this behaviour is the fix.

    We can work out why there's a difference using the dis module, which lets us look inside code objects to see the bytecode a piece of code has been compiled to. I'm on Python 2.6, so the details of this might be slightly different - but I see the same behaviour, so I think it's probably close enough to 2.7.

    The code that initialises each nested MyClass lives in a code object that you can get to via the attributes of the top-level functions. (I'm renaming the functions from example 5 and example 6 to f1 and f2 respectively.)

    The code object has a co_consts tuple, which contains the myfunc code object, which in turn has the code that runs when MyClass gets created:

    In [20]: f1.func_code.co_consts
    Out[20]: (None,
     'x in f2',
     <code object myfunc at ..., file "...", line 4>)
    In [21]: myfunc1_code = f1.func_code.co_consts[2]
    In [22]: MyClass1_code = myfunc1_code.co_consts[3]
    In [23]: myfunc2_code = f2.func_code.co_consts[2]
    In [24]: MyClass2_code = myfunc2_code.co_consts[3]
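
The fixed indices above (`co_consts[2]`, `co_consts[3]`) depend on the interpreter version and on how the function is written. Here is a minimal sketch of the same digging for Python 3, where `func_code` is spelled `__code__`. The helper `find_code` and the body of `f1` are illustrative assumptions, not part of the original session:

```python
import types

def f1():
    x = 'x in f1'
    def myfunc():
        x = 'x in myfunc'
        class MyClass(object):
            x = x
        return MyClass
    return myfunc()

def find_code(code, name):
    """Return the code object called `name` stored in code.co_consts.

    Hypothetical helper: it searches by name instead of relying on a
    fixed position in the constants tuple.
    """
    for const in code.co_consts:
        if isinstance(const, types.CodeType) and const.co_name == name:
            return const
    raise LookupError(name)

# Walk down one level at a time: f1 -> myfunc -> class body of MyClass.
myfunc_code = find_code(f1.__code__, 'myfunc')
MyClass_code = find_code(myfunc_code, 'MyClass')
print(myfunc_code.co_name, MyClass_code.co_name)
```

Searching by `co_name` is only a convenience. It would pick the first match if two nested objects shared a name.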
    

    Then you can see the difference between them in bytecode using dis.dis:

    In [25]: from dis import dis
    In [26]: dis(MyClass1_code)
      6           0 LOAD_NAME                0 (__name__)
                  3 STORE_NAME               1 (__module__)
    
      7           6 LOAD_NAME                2 (x)
                  9 STORE_NAME               2 (x)
    
      8          12 LOAD_NAME                2 (x)
                 15 PRINT_ITEM          
                 16 PRINT_NEWLINE       
                 17 LOAD_LOCALS         
                 18 RETURN_VALUE        
    
    In [27]: dis(MyClass2_code)
      6           0 LOAD_NAME                0 (__name__)
                  3 STORE_NAME               1 (__module__)
    
      7           6 LOAD_DEREF               0 (x)
                  9 STORE_NAME               2 (y)
    
      8          12 LOAD_NAME                2 (y)
                 15 PRINT_ITEM          
                 16 PRINT_NEWLINE       
                 17 LOAD_LOCALS         
                 18 RETURN_VALUE        
    

    So the only difference is that in MyClass1, x is loaded using the LOAD_NAME op, while in MyClass2 it's loaded using LOAD_DEREF. LOAD_DEREF looks up a name in an enclosing scope, so it gets 'x in myfunc'. LOAD_NAME doesn't follow nested scopes: it checks the class body's own namespace, then the module globals, then the builtins. Since it can't see the x names bound in myfunc or f1, it gets the module-level binding.
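
For readers on Python 3, the same two lookups can be checked at run time without reading bytecode. This is a sketch that assumes the question's two examples look roughly like the functions below (the question's code is not reproduced in this thread). The opcode names differ in newer interpreters, but the class bodies are expected to resolve `x` the same way:

```python
x = 'x in module'

def f1():
    x = 'x in f1'
    def myfunc():
        x = 'x in myfunc'
        class MyClass(object):
            x = x   # x is assigned in the class body: class namespace, then globals
        return MyClass.x
    return myfunc()

def f2():
    x = 'x in f2'
    def myfunc():
        x = 'x in myfunc'
        class MyClass(object):
            y = x   # x is only read here: it is a free variable from myfunc
        return MyClass.y
    return myfunc()

print(f1())
print(f2())
```

The first call reports the module-level string and the second reports the one bound in `myfunc`, matching the LOAD_NAME and LOAD_DEREF paths described above.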

    Then the question is, why does the code of the two versions of MyClass get compiled to two different opcodes? In f1 the class body assigns to x, so the compiler treats x as a name bound in the class scope itself. In f2 the class body only reads x and binds a new name, y, so x is resolved as a free variable from the enclosing function. If the MyClass scopes were nested functions instead of classes, the y = x line in f2 would be compiled the same, but the x = x in f1 would be a LOAD_FAST. This is because the compiler would know that x is bound in the function, so it would use LOAD_FAST to retrieve a local variable. This would fail with an UnboundLocalError when it was called.
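
The compiler's classification of `x` in each scope can also be inspected directly with the standard `symtable` module. This is a sketch for Python 3 using simplified stand-ins for the examples (the intermediate `myfunc` layer is dropped, and `child` is a hypothetical helper):

```python
import symtable

SOURCE = '''
def f1():
    x = 'x in f1'
    class MyClass1(object):
        x = x

def f2():
    x = 'x in f2'
    class MyClass2(object):
        y = x

def f3():
    x = 'x in f3'
    def MyFunc():
        x = x
'''

def child(table, name):
    """Return the nested scope called `name` (hypothetical helper)."""
    for sub in table.get_children():
        if sub.get_name() == name:
            return sub
    raise LookupError(name)

top = symtable.symtable(SOURCE, '<demo>', 'exec')
report = {}
for outer, inner in [('f1', 'MyClass1'), ('f2', 'MyClass2'), ('f3', 'MyFunc')]:
    sym = child(child(top, outer), inner).lookup('x')
    report[inner] = (sym.is_local(), sym.is_free())
    print(inner, 'local:', sym.is_local(), 'free:', sym.is_free())
```

On CPython, `x` should come out as local in `MyClass1` and `MyFunc` and as free in `MyClass2`. The class and the function then differ only in which opcode a local load compiles to.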

    In [28]: x = 'x in module'
    def f3():
        x = 'x in f2'
        def myfunc():
            x = 'x in myfunc'
            def MyFunc():
                x = x
                print x
            return MyFunc()
        myfunc()
    f3()
    ---------------------------------------------------------------------------
    UnboundLocalError                         Traceback (most recent call last)
    <ipython-input-...> in <module>()
          9         return MyFunc()
         10     myfunc()
    ---> 11 f3()
    
    <ipython-input-...> in f3()
          8             print x
          9         return MyFunc()
    ---> 10     myfunc()
         11 f3()
    
    <ipython-input-...> in myfunc()
          7             x = x
          8             print x
    ----> 9         return MyFunc()
         10     myfunc()
         11 f3()
    
    <ipython-input-...> in MyFunc()
          5         x = 'x in myfunc'
          6         def MyFunc():
    ----> 7             x = x
          8             print x
          9         return MyFunc()
    
    UnboundLocalError: local variable 'x' referenced before assignment
    

    This fails because the MyFunc function then uses LOAD_FAST:

    In [31]: myfunc_code = f3.func_code.co_consts[2]
    MyFunc_code = myfunc_code.co_consts[2]
    In [33]: dis(MyFunc_code)
      7           0 LOAD_FAST                0 (x)
                  3 STORE_FAST               0 (x)
    
      8           6 LOAD_FAST                0 (x)
                  9 PRINT_ITEM          
                 10 PRINT_NEWLINE       
                 11 LOAD_CONST               0 (None)
                 14 RETURN_VALUE        
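
A rough Python 3 equivalent of this check is sketched below. It is simplified to one level of nesting, which is an assumption made for brevity. Newer interpreters may report a variant such as `LOAD_FAST_CHECK` instead of plain `LOAD_FAST`, so the sketch compares only the opcode prefix:

```python
import dis

x = 'x in module'

def f3():
    x = 'x in f3'
    def MyFunc():
        x = x      # x is assigned here, so every x in MyFunc is a local
        return x
    return MyFunc

inner = f3()

# Calling the function reads the local x before it has been bound.
try:
    inner()
    raised = False
except UnboundLocalError as exc:
    raised = True
    print('UnboundLocalError:', exc)

# Names the compiler stores as fast locals of MyFunc.
print(inner.__code__.co_varnames)

# Opcodes that load x inside MyFunc; none of them is LOAD_NAME or LOAD_DEREF.
loads = [ins.opname for ins in dis.get_instructions(inner)
         if ins.argval == 'x' and ins.opname.startswith('LOAD')]
print(loads)
```

Neither the module-level `x` nor the one in `f3` is consulted, because the assignment makes `x` local to `MyFunc` for the whole function body.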
    

    (As an aside, it's not a big surprise that there should be a difference in how scoping interacts with code in the body of classes and code in a function. You can tell this because bindings at the class level aren't available in methods - method scopes aren't nested inside the class scope in the same way as nested functions are. You have to explicitly reach them via the class, or by using self. (which will fall back to the class if there's not also an instance-level binding).)
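
The aside about methods can be illustrated with a small sketch. The class `Greeter` and its attribute are hypothetical names chosen for this example:

```python
class Greeter(object):
    greeting = 'hello'            # class-level binding

    def via_bare_name(self):
        return greeting           # not a local, enclosing, or global name

    def via_self(self):
        return self.greeting      # instance lookup falls back to the class

    def via_class(self):
        return Greeter.greeting   # explicit lookup through the class

g = Greeter()
try:
    g.via_bare_name()
    bare_name_failed = False
except NameError as exc:
    bare_name_failed = True
    print('NameError:', exc)

print(g.via_self(), g.via_class())

g.greeting = 'hi'                 # instance-level binding shadows the class one
print(g.via_self(), g.via_class())
```

The bare name fails because the method's scope is not nested inside the class body. Once the instance has its own `greeting`, `self.greeting` stops falling back to the class.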
