How does method invocation work in Python? I mean, how does the Python virtual machine interpret it?
Names (methods, functions, variables) are all resolved by looking at the namespace. Namespaces are implemented in CPython as dicts (hash maps).
When a name is not found in the instance namespace (dict), Python looks in the class, and then in the base classes, following the method resolution order (MRO).
All of this resolution happens at runtime.
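As a rough illustration of that lookup order, the instance and class namespaces can be inspected directly. The class names Base and Child are made up for this sketch, and it is written so that it should also run on Python 3:

```python
class Base(object):
    def greet(self):
        return "hello from Base"

class Child(Base):
    kind = "child"

c = Child()
c.color = "red"  # stored in the instance namespace

# Each namespace is a dict-like mapping that can be inspected.
instance_names = sorted(vars(c))                 # names on the instance only
in_child = "greet" in vars(Child)                # not defined on Child itself
in_base = "greet" in vars(Base)                  # defined on the base class
mro_names = [k.__name__ for k in Child.__mro__]  # order in which classes are searched

# Resolution happens at call time: rebinding the name on the class
# changes what an already existing instance finds.
Base.greet = lambda self: "patched"
result = c.greet()
```

Here c.greet is not in the instance dict or in Child's dict, so it is found on Base, and the call after the rebinding picks up the replacement.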
You can play around with the dis module to see how that happens in bytecode.
Simple example (Python 2 syntax):
import dis
a = 1

class X(object):
    def method1(self):
        return 15

def test_namespace(b=None):
    x = X()
    x.method1()
    print a
    print b

dis.dis(test_namespace)
That prints:
  9           0 LOAD_GLOBAL              0 (X)
              3 CALL_FUNCTION            0
              6 STORE_FAST               1 (x)

 10           9 LOAD_FAST                1 (x)
             12 LOAD_ATTR                1 (method1)
             15 CALL_FUNCTION            0
             18 POP_TOP

 11          19 LOAD_GLOBAL              2 (a)
             22 PRINT_ITEM
             23 PRINT_NEWLINE

 12          24 LOAD_FAST                0 (b)
             27 PRINT_ITEM
             28 PRINT_NEWLINE
             29 LOAD_CONST               0 (None)
             32 RETURN_VALUE
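The LOAD_GLOBAL and LOAD_ATTR instructions are namespace (dict) lookups. LOAD_FAST reads a local variable from an indexed slot, and LOAD_CONST reads from the function's constants.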
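On Python 3 the same experiment can be done programmatically. This is only a sketch: the exact opcode names and offsets differ between CPython versions, so it collects the instruction names rather than assuming a particular listing, and it replaces the print statements with a return value.

```python
import dis

a = 1

class X(object):
    def method1(self):
        return 15

def test_namespace(b=None):
    x = X()
    x.method1()
    return a, b

# Collect opcode names instead of printing the listing; the exact
# names and offsets depend on the CPython version in use.
opnames = [ins.opname for ins in dis.get_instructions(test_namespace)]

# The distinct LOAD_* instructions this interpreter emitted.
load_ops = sorted({name for name in opnames if name.startswith("LOAD_")})
```

Printing load_ops shows which kinds of lookups the function performs on the interpreter you are running.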
It's true that method resolution could be slower in Python than in Java. What is late binding?
Late binding describes a strategy of how an interpreter or compiler of a particular language decides how to map an identifier to a piece of code. For example, consider writing obj.Foo() in C#. When you compile this, the compiler tries to find the referenced object and insert a reference to the location of the Foo method that will be invoked at runtime. All of this method resolution happens at compile time; we say that names are bound "early".
By contrast, Python binds names "late". Method resolution happens at run time: the interpreter simply looks up the name Foo on the object when the call is executed, and if it's not there, a runtime error (AttributeError) occurs.
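A small sketch of what late binding means in practice. The Greeter class and the call_foo helper are invented for illustration:

```python
class Greeter(object):
    def foo(self):
        return "original"

def call_foo(obj):
    # The name "foo" is looked up on obj every time this line runs.
    return obj.foo()

g = Greeter()
before = call_foo(g)

# Rebinding the attribute on the class changes what the same call site finds.
Greeter.foo = lambda self: "replaced"
after = call_foo(g)

# Removing it turns the same call into a runtime error, not a compile error.
del Greeter.foo
try:
    call_foo(g)
    outcome = "no error"
except AttributeError:
    outcome = "AttributeError"
```

Nothing about call_foo changed between the three calls; only the contents of the class namespace did.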
What are the differences between the reflection mechanisms of these two languages?
Dynamic languages tend to have better reflection facilities than static languages, and Python is very powerful in this respect. Still, Java has pretty extensive ways to get at the internals of classes and methods. Nevertheless, you can't get around the verbosity of Java; you'll write much more code to do the same thing in Java than you would in Python. See the java.lang.reflect API.
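For a sense of how little code reflection takes on the Python side, here is a minimal sketch. The Account class is hypothetical, and the example assumes Python 3 (inspect.signature is not available in Python 2):

```python
import inspect

class Account(object):
    def __init__(self, owner):
        self.owner = owner

    def deposit(self, amount, note=""):
        return amount

acct = Account("alice")

# Look an attribute up by a name that is only known at run time.
method = getattr(acct, "deposit")
returned = method(10)

# List the functions defined on the class and inspect one signature.
method_names = sorted(
    name for name, _ in inspect.getmembers(Account, inspect.isfunction)
)
params = list(inspect.signature(Account.deposit).parameters)

# Instance state is an ordinary dict.
state = vars(acct)
```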
Method invocation in Python consists of two distinct separable steps. First an attribute lookup is done, then the result of that lookup is invoked. This means that the following two snippets have the same semantics:
foo.bar()
method = foo.bar
method()
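A runnable version of the same idea, using a made-up Foo class, shows that the intermediate object is a bound method that remembers the instance it was looked up on:

```python
class Foo(object):
    def bar(self):
        return "called on %s" % type(self).__name__

foo = Foo()

direct = foo.bar()      # lookup and call in one expression

method = foo.bar        # step 1: attribute lookup yields a bound method
indirect = method()     # step 2: the result of the lookup is called

same_result = direct == indirect
remembers_instance = method.__self__ is foo
```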
Attribute lookup in Python is a rather complex process. Say we are looking up an attribute named attr on an object obj, which corresponds to the following expression in Python code: obj.attr
First obj's instance dictionary is searched for "attr", then the instance dictionary of the class of obj and the dictionaries of its parent classes are searched in method resolution order for "attr".
Normally, if a value is found on the instance, that value is returned. But if the lookup on the class results in a value that has both the __get__ and __set__ methods (to be exact, if a dictionary lookup on the value's class and its parent classes finds entries for both those keys), then the class attribute is regarded as a "data descriptor". In that case the __get__ method of that value is called, passing in the object on which the lookup occurred, and the result of that call is returned. If the class attribute isn't found or isn't a data descriptor, the value from the instance's dictionary is returned.
If there is no value in the instance dictionary, then the value from the class lookup is returned, unless it happens to be a "non-data descriptor", i.e. it has a __get__ method but no __set__. In that case the __get__ method is invoked and the resulting value is returned.
There is one more special case: if obj happens to be a class (an instance of the type type), then the instance value is also checked for being a descriptor and invoked accordingly.
If no value is found on either the instance or its class hierarchy, and obj's class has a __getattr__ method, that method is called.
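The precedence rules above can be observed with a short experiment. The class names DataDesc, NonDataDesc and Thing are invented for this sketch:

```python
class DataDesc(object):
    """Defines __get__ and __set__, so it is a data descriptor."""
    def __get__(self, obj, cls):
        return "from data descriptor"
    def __set__(self, obj, value):
        raise AttributeError("read-only")

class NonDataDesc(object):
    """Defines only __get__, so it is a non-data descriptor."""
    def __get__(self, obj, cls):
        return "from non-data descriptor"

class Thing(object):
    data = DataDesc()
    nondata = NonDataDesc()

    def __getattr__(self, name):
        # Called only when the normal lookup finds nothing.
        return "fallback for %s" % name

t = Thing()
# Write straight into the instance dict, bypassing __set__.
t.__dict__["data"] = "instance value"
t.__dict__["nondata"] = "instance value"

data_result = t.data          # data descriptor wins over the instance dict
nondata_result = t.nondata    # instance dict wins over a non-data descriptor
missing_result = t.missing    # nothing found anywhere, so __getattr__ runs
```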
The following shows the algorithm as encoded in Python, effectively doing what the getattr() function would do. (excluding any bugs that have slipped in)
NotFound = object()  # A singleton to signify not found values

def lookup_attribute(obj, attr):
    class_attr_value = lookup_attr_on_class(obj, attr)
    if is_data_descriptor(class_attr_value):
        return invoke_descriptor(class_attr_value, obj, obj.__class__)
    if attr in obj.__dict__:
        instance_attr_value = obj.__dict__[attr]
        if isinstance(obj, type) and is_descriptor(instance_attr_value):
            return invoke_descriptor(instance_attr_value, None, obj)
        return instance_attr_value
    if class_attr_value is NotFound:
        getattr_method = lookup_attr_on_class(obj, '__getattr__')
        if getattr_method is NotFound:
            raise AttributeError()
        return getattr_method(obj, attr)
    if is_descriptor(class_attr_value):
        return invoke_descriptor(class_attr_value, obj, obj.__class__)
    return class_attr_value

def lookup_attr_on_class(obj, attr):
    for parent_class in obj.__class__.__mro__:
        if attr in parent_class.__dict__:
            return parent_class.__dict__[attr]
    return NotFound

def is_descriptor(obj):
    if lookup_attr_on_class(obj, '__get__') is NotFound:
        return False
    return True

def is_data_descriptor(obj):
    if not is_descriptor(obj) or lookup_attr_on_class(obj, '__set__') is NotFound:
        return False
    return True

def invoke_descriptor(descriptor, obj, cls):
    descriptormethod = lookup_attr_on_class(descriptor, '__get__')
    return descriptormethod(descriptor, obj, cls)
What does all this descriptor nonsense have to do with method invocation, you ask? Well, the thing is that functions are also objects, and they happen to implement the descriptor protocol. If the attribute lookup finds a function object on the class, its __get__ method gets called and returns a "bound method" object. A bound method is just a small wrapper around the function object that stores the object that the function was looked up on and, when invoked, prepends that object to the argument list (where the self argument usually is for functions that are meant to be methods).
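This can be checked against real function objects by calling __get__ by hand. The Greeter class is a made-up example:

```python
class Greeter(object):
    def hello(self, name):
        return "hello %s from %s" % (name, type(self).__name__)

g = Greeter()

# The class dict holds a plain function object...
func = Greeter.__dict__["hello"]

# ...and calling its __get__ by hand produces the bound method.
bound = func.__get__(g, Greeter)

via_descriptor = bound("world")
via_attribute = g.hello("world")
explicit_self = func(g, "world")   # what the bound method does internally
```

All three calls run the same function with g as the first argument; the bound method only saves you from passing it yourself.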
Here's some illustrative code:
class Function(object):
    def __get__(self, obj, cls):
        return BoundMethod(obj, cls, self.func)

    # Init and call added so that it would work as a function
    # decorator if you'd like to experiment with it yourself
    def __init__(self, the_actual_implementation):
        self.func = the_actual_implementation

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

class BoundMethod(object):
    def __init__(self, obj, cls, func):
        self.obj, self.cls, self.func = obj, cls, func

    def __call__(self, *args, **kwargs):
        if self.obj is not None:
            # Looked up on an instance: prepend it as the self argument
            return self.func(self.obj, *args, **kwargs)
        elif args and isinstance(args[0], self.cls):
            # Looked up on the class: the caller must pass the instance
            return self.func(*args, **kwargs)
        raise TypeError("Unbound method expects an instance of %s as first arg" % self.cls)
For method resolution order (which in Python's case actually means attribute resolution order) Python uses the C3 algorithm from Dylan. It is too complicated to explain here, so if you're interested see this article. Unless you are doing some really funky inheritance hierarchies (and you shouldn't), it is enough to know that the lookup order is left to right, depth first, and all subclasses of a class are searched before that class is searched.
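For a simple diamond-shaped hierarchy the resulting order can be read off the __mro__ attribute. The classes A, B, C and D below are invented for this sketch:

```python
class A(object):
    def who(self):
        return "A"

class B(A):
    def who(self):
        return "B"

class C(A):
    def who(self):
        return "C"

class D(B, C):
    pass

# The order in which class dictionaries are searched for an attribute of a D instance.
order = [cls.__name__ for cls in D.__mro__]

# B is listed first among D's bases, so its implementation is found first.
winner = D().who()
```

Both B and C come before A in the order, which matches the rule that all subclasses of a class are searched before the class itself.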