This message is a a bit long with many examples, but I hope it will help me and others to better grasp the full story of variables and attribute lookup in Python 2.7.
Long story short, this is a corner case of Python's scoping that is a bit inconsistent, but has to be kept for backwards compatibility (and because it's not that clear what the right answer should be). You can see lots of the original discussion about it on the Python mailing list when PEP 227 was being implemented, and some in the bug for which this behaviour is the fix.
We can work out why there's a difference using the dis module, which lets us look inside code objects to see the bytecode a piece of code has been compiled to. I'm on Python 2.6, so the details of this might be slightly different - but I see the same behaviour, so I think it's probably close enough to 2.7.
The code that initialises each nested MyClass
lives in a code object that you can get to via the attributes of the top-level functions. (I'm renaming the functions from example 5 and example 6 to f1
and f2
respectively.)
The code object has a co_consts
tuple, which contains the myfunc
code object, which in turn has the code that runs when MyClass
gets created:
In [20]: f1.func_code.co_consts
Out[20]: (None,
'x in f2',
", line 4>)
In [21]: myfunc1_code = f1.func_code.co_consts[2]
In [22]: MyClass1_code = myfunc1_code.co_consts[3]
In [23]: myfunc2_code = f2.func_code.co_consts[2]
In [24]: MyClass2_code = myfunc2_code.co_consts[3]
Then you can see the difference between them in bytecode using dis.dis
:
In [25]: from dis import dis
In [26]: dis(MyClass1_code)
6 0 LOAD_NAME 0 (__name__)
3 STORE_NAME 1 (__module__)
7 6 LOAD_NAME 2 (x)
9 STORE_NAME 2 (x)
8 12 LOAD_NAME 2 (x)
15 PRINT_ITEM
16 PRINT_NEWLINE
17 LOAD_LOCALS
18 RETURN_VALUE
In [27]: dis(MyClass2_code)
6 0 LOAD_NAME 0 (__name__)
3 STORE_NAME 1 (__module__)
7 6 LOAD_DEREF 0 (x)
9 STORE_NAME 2 (y)
8 12 LOAD_NAME 2 (y)
15 PRINT_ITEM
16 PRINT_NEWLINE
17 LOAD_LOCALS
18 RETURN_VALUE
So the only difference is that in MyClass1
, x
is loaded using the LOAD_NAME
op, while in MyClass2
, it's loaded using LOAD_DEREF
. LOAD_DEREF
looks up a name in an enclosing scope, so it gets 'x in myfunc'. LOAD_NAME
doesn't follow nested scopes - since it can't see the x
names bound in myfunc
or f1
, it gets the module-level binding.
Then the question is, why does the code of the two versions of MyClass
get compiled to two different opcodes? In f1
the binding is shadowing x
in the class scope, while in f2
it's binding a new name. If the MyClass
scopes were nested functions instead of classes, the y = x
line in f2
would be compiled the same, but the x = x
in f1
would be a LOAD_FAST
- this is because the compiler would know that x
is bound in the function, so it should use the LOAD_FAST
to retrieve a local variable. This would fail with an UnboundLocalError
when it was called.
In [28]: x = 'x in module'
def f3():
x = 'x in f2'
def myfunc():
x = 'x in myfunc'
def MyFunc():
x = x
print x
return MyFunc()
myfunc()
f3()
---------------------------------------------------------------------------
Traceback (most recent call last)
in ()
9 return MyFunc()
10 myfunc()
---> 11 f3()
in f3()
8 print x
9 return MyFunc()
---> 10 myfunc()
11 f3()
in myfunc()
7 x = x
8 print x
----> 9 return MyFunc()
10 myfunc()
11 f3()
in MyFunc()
5 x = 'x in myfunc'
6 def MyFunc():
----> 7 x = x
8 print x
9 return MyFunc()
UnboundLocalError: local variable 'x' referenced before assignment
This fails because the MyFunc
function then uses LOAD_FAST
:
In [31]: myfunc_code = f3.func_code.co_consts[2]
MyFunc_code = myfunc_code.co_consts[2]
In [33]: dis(MyFunc_code)
7 0 LOAD_FAST 0 (x)
3 STORE_FAST 0 (x)
8 6 LOAD_FAST 0 (x)
9 PRINT_ITEM
10 PRINT_NEWLINE
11 LOAD_CONST 0 (None)
14 RETURN_VALUE
(As an aside, it's not a big surprise that there should be a difference in how scoping interacts with code in the body of classes and code in a function. You can tell this because bindings at the class level aren't available in methods - method scopes aren't nested inside the class scope in the same way as nested functions are. You have to explicitly reach them via the class, or by using self.
(which will fall back to the class if there's not also an instance-level binding).)