Why do strings in Python 2.7 not have the "__iter__" attribute, while strings in Python 3.7 do?

Question

In both Python 2.7 and Python 3.7 I can do the following:

string = "hello world"
for letter in string:
    print(letter)

But if I check for the __iter__ attribute:

Python 2.7:

hasattr(string, "__iter__")
>> False

Python 3.7:

hasattr(string, "__iter__")
>> True

I have not found this documented anywhere for 2to3 migrations, and it has caused a bit of a headache because my code contains hasattr(entity, "__iter__") checks. In Python 2.7 this check distinguishes a string from a collection, but in Python 3.7 it does not.

What I am really asking is: what was the design decision, and why was it implemented this way? I hope that's not too speculative.

Answer 1:

CPython 3 implements a dedicated str_iterator class for str (run type(iter("str")) to see it). An instance of this class is returned when you call "str".__iter__() or iter("str"). CPython 2 did not implement a dedicated iterator class for str, so str has no __iter__ there, and iter("str") returns an instance of the generic sequence iterator class instead.
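The difference is easy to observe from the interpreter. This is a minimal sketch that assumes CPython 3. The exact name of the iterator class is an implementation detail and may differ between versions, so the code prints it rather than depending on it.

```python
# Assumes CPython 3; the iterator class name is an implementation detail.
text = "str"

# In Python 3, str defines __iter__, so this check succeeds.
has_dunder_iter = hasattr(text, "__iter__")

# iter() returns a dedicated string iterator object.
iterator = iter(text)
iterator_type_name = type(iterator).__name__

print(has_dunder_iter)
print(iterator_type_name)
print(list(iterator))
```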

See code snippet 1 below: f = t->tp_iter; fetches the type's custom iterator function, if one exists. Otherwise PySeqIter_New(o) returns an instance of the seqiterobject structure (see snippet 2). As snippet 3 shows, this basic iterator calls PySequence_GetItem(seq, it->it_index); on every iteration step and stops when that call raises IndexError or StopIteration.
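The fallback path can be approximated in pure Python. The sketch below is not CPython's code: SeqIterSketch, get_iter_sketch and Letters are hypothetical names, and the hasattr test for __getitem__ is only a rough stand-in for PySequence_Check, which differs in its details.

```python
class SeqIterSketch:
    """Hypothetical pure-Python approximation of CPython's seqiterobject."""

    def __init__(self, seq):
        self.it_seq = seq    # set to None once the iterator is exhausted
        self.it_index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.it_seq is None:
            raise StopIteration
        try:
            # Mirrors PySequence_GetItem(seq, it->it_index).
            result = self.it_seq[self.it_index]
        except (IndexError, StopIteration):
            # Either exception ends the iteration.
            self.it_seq = None
            raise StopIteration from None
        self.it_index += 1
        return result


def get_iter_sketch(obj):
    """Hypothetical approximation of the PyObject_GetIter excerpt."""
    dunder_iter = getattr(type(obj), "__iter__", None)
    if dunder_iter is not None:
        # The type provides its own iterator.
        return dunder_iter(obj)
    if hasattr(type(obj), "__getitem__"):
        # Assumption: a rough stand-in for PySequence_Check.
        return SeqIterSketch(obj)
    raise TypeError("'%.200s' object is not iterable" % type(obj).__name__)


class Letters:
    """Defines only __getitem__, so iteration uses the fallback path."""

    def __getitem__(self, index):
        return "abc"[index]


print(list(get_iter_sketch(Letters())))  # ['a', 'b', 'c']
```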

If you implement a class that supports iteration, you need to define either the __iter__ method or __getitem__. If you choose the first option, the object that __iter__ returns must implement the __iter__ and __next__ (CPython 3) or next (CPython 2) methods.
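Both options can be sketched side by side. The classes Countdown, CountdownIterator and Squares are hypothetical examples written for Python 3; the next alias only illustrates the Python 2 method name.

```python
class CountdownIterator:
    """Iterator object: implements __iter__ and __next__."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    next = __next__  # Python 2 looks for a method named next


class Countdown:
    """Option 1: iterable through __iter__."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class Squares:
    """Option 2: iterable through __getitem__ only."""

    def __init__(self, count):
        self.count = count

    def __getitem__(self, index):
        if index >= self.count:
            raise IndexError(index)  # signals the end of the sequence
        return index * index


print(list(Countdown(3)))  # [3, 2, 1]
print(list(Squares(4)))    # [0, 1, 4, 9]
```

Note that hasattr(Squares(4), "__iter__") is False even though the object can be used in a for loop, which is the same situation as str in Python 2.7.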

Checking hasattr(entity, "__iter__") is a bad idea.

If you need to check whether an object is iterable, use isinstance(entity, Iterable), where Iterable comes from collections.abc.

If you only need to exclude str, use isinstance(entity, Iterable) and not isinstance(entity, str) (CPython 3).
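The check can be wrapped in a small helper. This is a minimal sketch for Python 3; is_non_string_iterable is a hypothetical name, and whether bytes should also be excluded depends on your code.

```python
from collections.abc import Iterable


def is_non_string_iterable(entity):
    """Hypothetical helper: True for iterables other than str."""
    return isinstance(entity, Iterable) and not isinstance(entity, str)


print(is_non_string_iterable([1, 2, 3]))   # True
print(is_non_string_iterable("hello"))     # False
print(is_non_string_iterable(42))          # False
print(is_non_string_iterable(b"hello"))    # True: bytes is not excluded here
```

One caveat: isinstance(entity, Iterable) does not recognize classes that are iterable only through __getitem__. If your code has to handle such classes, calling iter(entity) and catching TypeError is the more reliable test.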

1

PyObject *
PyObject_GetIter(PyObject *o)
{
    PyTypeObject *t = o->ob_type;
    getiterfunc f;

    f = t->tp_iter;
    if (f == NULL) {
        if (PySequence_Check(o)) return PySeqIter_New(o);
        return type_error("'%.200s' object is not iterable", o);
    }
    else {
        ...
    }
}

2

typedef struct {
    PyObject_HEAD
    Py_ssize_t it_index;
    PyObject *it_seq; /* Set to NULL when iterator is exhausted */
} seqiterobject;

PyObject *
PySeqIter_New(PyObject *seq)
{
    seqiterobject *it;

    if (!PySequence_Check(seq)) {
        PyErr_BadInternalCall();
        return NULL;
    }
    it = PyObject_GC_New(seqiterobject, &PySeqIter_Type);
    if (it == NULL)
        return NULL;
    it->it_index = 0;
    Py_INCREF(seq);
    it->it_seq = seq;
    _PyObject_GC_TRACK(it);
    return (PyObject *)it;
}

3

static PyObject *
iter_iternext(PyObject *iterator)
{
    seqiterobject *it;
    PyObject *seq;
    PyObject *result;

    assert(PySeqIter_Check(iterator));
    it = (seqiterobject *)iterator;
    seq = it->it_seq;
    if (seq == NULL)
        return NULL;
    if (it->it_index == PY_SSIZE_T_MAX) {
        PyErr_SetString(PyExc_OverflowError, "iter index too large");
        return NULL;
    }

    result = PySequence_GetItem(seq, it->it_index);
    if (result != NULL) {
        it->it_index++;
        return result;
    }
    if (PyErr_ExceptionMatches(PyExc_IndexError) ||
        PyErr_ExceptionMatches(PyExc_StopIteration))
    {
        PyErr_Clear();
        it->it_seq = NULL;
        Py_DECREF(seq);
    }
    return NULL;
}

Source: https://stackoverflow.com/questions/58328946/why-do-strings-in-python-2-7-not-have-the-iter-attribute-but-strings-in-p

Tags

python

python-3.x

python-2.7