Is the example of the descriptor protocol in the Python 3.6 documentation incorrect?

Question

I am new to Python, and while looking through its documentation I encountered the following example of the descriptor protocol, which in my opinion is incorrect.

It looks like this:

class IntField:
    def __get__(self, instance, owner):
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, int):
            raise ValueError(f'expecting integer in {self.name}')
        instance.__dict__[self.name] = value

    # this is the new initializer:
    def __set_name__(self, owner, name):
        self.name = name

class Model:
    int_field = IntField()

Here are my considerations.

The attribute int_field is a class-wide attribute, is it not?

So owner.__dict__ will have such a key. However, the method __get__ uses instance.__dict__, which does not have that key. So, judging by how it is used, the method does not deal with the class-wide attribute but with an instance-wide attribute.

On the other hand, the method __set__ also does not deal with the class-wide attribute but creates an instance-wide attribute, because it uses

instance.__dict__[self.name] = value

So it looks like each instance of the class creates its own instance-wide attribute. Moreover, a reference to the class is not even passed to the method.

Am I right, or have I missed something that I do not know yet?

To make my considerations clearer: the example is logically equivalent to the following

class MyClass:
    int_field = 10

instance_of = MyClass()

instance_of.__dict__["int_field"] = 20

print(MyClass.int_field)
print(instance_of.int_field)

The program output is

10
20

The instance attribute int_field has nothing in common with the class-wide attribute int_field except its name.

The same is true for the example from the documentation. Intuitively one can expect that it is the class-wide attribute that is attached to the descriptor. However it is not true. The descriptor just borrows the name of the class-wide attribute. On the other hand, a class-wide attribute can indeed be attached to a descriptor.

So the example from the documentation in my opinion just confuses readers.

Answer 1:

The example is fine.

"However, the method __get__ uses instance.__dict__, which does not have that key."

There will be such a key once you actually set instance.int_field, which invokes the descriptor's __set__ method and assigns the key.

"On the other hand, the method __set__ also does not deal with the class-wide attribute but creates an instance-wide attribute"

The setter (__set__) isn't supposed to create a class attribute. It assigns to the instance dict key that the getter (__get__) looks up.
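A minimal sketch of this behaviour, reusing the IntField and Model classes from the question. The variable names (a, b, missing_before_set and so on) are only for illustration. It suggests that each instance gets its own key once the attribute is assigned, while the class attribute stays the descriptor object:

```python
class IntField:
    def __get__(self, instance, owner):
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, int):
            raise ValueError(f'expecting integer in {self.name}')
        instance.__dict__[self.name] = value

    def __set_name__(self, owner, name):
        self.name = name


class Model:
    int_field = IntField()


a = Model()
b = Model()

# Before any assignment the instance dict is empty, so __get__ raises KeyError.
try:
    a.int_field
    missing_before_set = False
except KeyError:
    missing_before_set = True

a.int_field = 1  # routed through IntField.__set__
b.int_field = 2  # stored in b.__dict__, independent of a

a_dict = dict(a.__dict__)
b_dict = dict(b.__dict__)

# The entry in the class dict is still the descriptor object itself.
class_attr_is_descriptor = isinstance(Model.__dict__['int_field'], IntField)
```

Inspecting a_dict and b_dict after running this should show one 'int_field' key per instance, each with its own value.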

Answer 2:

I think I can see what piece of the puzzle you're missing here: The self in the __get__ is the descriptor! That's why we set state in the instance.__dict__. It is not usual to set the state on self (i.e. on the descriptor), because then each instance of Model would be sharing that state.
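To illustrate the sharing problem, here is a small sketch with a hypothetical SharedIntField that keeps the value on the descriptor itself instead of on the instance. The class names are made up for this example and are not from the documentation:

```python
class SharedIntField:
    """Hypothetical descriptor that keeps the value on itself (usually a mistake)."""

    def __get__(self, obj, type_=None):
        if obj is None:
            return self
        return self.value

    def __set__(self, obj, value):
        # Stored on the descriptor, not on obj, so every instance sees it.
        self.value = value


class SharedModel:
    int_field = SharedIntField()


x = SharedModel()
y = SharedModel()
x.int_field = 1
y.int_field = 2  # overwrites the single value held by the descriptor

x_value = x.int_field
y_value = y.int_field
x_dict = dict(x.__dict__)  # nothing was written to the instance
```

Because there is only one SharedIntField object for the whole class, the second assignment replaces the value seen through x as well, which is why the documentation's example writes to the instance dict instead.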

There is nothing contradictory in the example, but the names used here are somewhat ambiguous. Perhaps some renaming will help:

class IntField:
    def __get__(self, obj, type_):
        # typically: self is an IntField(), obj is a Model(), type_ is Model
        return obj.__dict__[self.the_internal_name]

    def __set__(self, obj, value):
        if not isinstance(value, int):
            raise ValueError(f'expecting integer, but received {type(value)}')
        obj.__dict__[self.the_internal_name] = value

    def __set_name__(self, type_, name):
        # this is called at class definition time, i.e. descriptor init time
        self.the_internal_name = name + '_internal'

class Model:
    int_field = IntField()

I've also disambiguated the name on the class ('int_field'), the name on the descriptor ('the_internal_name') and the name which gets used in the instance's dict ('int_field_internal').

Every self in the code above refers to the descriptor instance, and there is one descriptor instance that handles attribute access for all instances of Model. The type_ (what the docs called owner) will be Model. The obj will be an instance, Model().

There is also an important piece of code that is missing from the example. It's usual to put some logic into the __get__, in order to allow access to the descriptor object itself:

def __get__(self, obj, type_=None):
    print(f'Descriptor was accessed with obj {obj} and type_ {type_}')
    if obj is None:
        # the descriptor was invoked on the class instead of an instance
        # returning self here allows 'class attribute' access!
        return self  
    return obj.__dict__[self.the_internal_name]

Let's try it out:

>>> m = Model()
>>> m.__dict__
{}
>>> m.int_field = 123
>>> m.__dict__
{'int_field_internal': 123}
>>> m.int_field
Descriptor was accessed with obj <__main__.Model object at 0x7fffe8186080> and type_ <class '__main__.Model'>
123
>>> Model.int_field
Descriptor was accessed with obj None and type_ <class '__main__.Model'>
<__main__.IntField at 0x7fffe8174748>
>>> Model.int_field.__dict__
Descriptor was accessed with obj None and type_ <class '__main__.Model'>
{'the_internal_name': 'int_field_internal'}

So, as you can see, the descriptor instance (which you may think of as a class attribute, if you want) handles attribute access for the int_field name on Model instances and also on Model itself. When accessed on an instance, it gets/sets state under the int_field_internal name on the instance. Since I've disambiguated the names, you could also read or write m.int_field_internal as a normal attribute, getting or setting that same value while bypassing the descriptor, if you wanted to (it's common practice for properties to do this, using an internal _name).

In general usage, we use the same name in the instance dict as the descriptor's "label" for it, because... why not? The descriptor then shadows normal attribute access to the instance name, because a data descriptor (one that defines __set__ or __delete__) takes priority over the instance dict. But with that same-name layout, as in the documentation's IntField, you could still bypass the descriptor as shown below:

print(m.int_field)  # this access goes through the descriptor object
print(m.__dict__['int_field'])  # this is 'raw' access
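As a rough, self-contained illustration of the two access paths, assuming the documentation's version of IntField (which stores the value under the same name as the class attribute). The result variable names are only for this sketch:

```python
class IntField:
    def __get__(self, instance, owner):
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, int):
            raise ValueError(f'expecting integer in {self.name}')
        instance.__dict__[self.name] = value

    def __set_name__(self, owner, name):
        self.name = name


class Model:
    int_field = IntField()


m = Model()
m.int_field = 123

via_descriptor = m.int_field      # goes through IntField.__get__
raw = m.__dict__['int_field']     # reads the instance dict directly

# Writing to the dict directly skips __set__, so no type check happens.
m.__dict__['int_field'] = 'not an int'
after_raw_write = m.int_field

# The same value assigned through the attribute is rejected by __set__.
try:
    m.int_field = 'also not an int'
    rejected = False
except ValueError:
    rejected = True
```

Both reads return the same stored value, but only access through the attribute name is validated, so raw writes to the dict can leave the instance in a state the descriptor would never have allowed.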

See if you can guess what Model.int_field.the_internal_name will return, why, and try it out! Then try creating a new model like this:

class Model2:
    field_one = IntField()
    field_two = IntField()

And then you will see how __set_name__ is used. Descriptors allow you to make whatever you want happen during regular attribute access (be it get, set, or delete...). You can use them to manage state on the class, the instance, or the descriptor itself, as you want. That's why they can be confusing to explain, but that's also where the power and flexibility of this feature comes from.
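For reference, one way that experiment might look, assuming the renamed IntField from this answer with the obj is None check added to __get__. The names m2 and names are just for this sketch:

```python
class IntField:
    def __get__(self, obj, type_=None):
        if obj is None:
            # accessed on the class: hand back the descriptor itself
            return self
        return obj.__dict__[self.the_internal_name]

    def __set__(self, obj, value):
        if not isinstance(value, int):
            raise ValueError(f'expecting integer, but received {type(value)}')
        obj.__dict__[self.the_internal_name] = value

    def __set_name__(self, type_, name):
        # called once per descriptor when the owning class is created
        self.the_internal_name = name + '_internal'


class Model2:
    field_one = IntField()
    field_two = IntField()


m2 = Model2()
m2.field_one = 1
m2.field_two = 2

# Each descriptor was told the name it was assigned to in the class body.
names = (
    Model2.field_one.the_internal_name,
    Model2.field_two.the_internal_name,
)
```

Here each IntField instance learns its own attribute name through __set_name__, so the two fields end up under separate keys in m2.__dict__ without the name being passed to IntField() by hand.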

Source: https://stackoverflow.com/questions/46609987/is-the-example-of-the-descriptor-protocol-in-the-python-3-6-documentation-incorr

Tags

python

python-3.6

descriptor