Why is repr(int) faster than str(int)?

I am wondering why repr(int) is faster than str(int). With the following code snippet:

ROUNDS = 10000

def concat_strings_str():
    return ''.join(map(str, range(ROUNDS)))

def concat_strings_repr():
    return ''.join(map(repr, range(ROUNDS)))

%timeit concat_strings_str()
%timeit concat_strings_repr()

I get these timings (Python 3.5.2, with very similar results on 2.7.12):

 1.9 ms ± 17.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
 1.38 ms ± 9.07 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

If I'm on the right track, the same function, long_to_decimal_string, is called under the hood in both cases.

Did I get something wrong or what else is going on that I am missing?

Update: This probably has nothing to do with int's __repr__ or __str__ methods, but rather with the differences between repr() and str(), as int.__str__ and int.__repr__ are in fact comparably fast:

def concat_strings_str():
    return ''.join([one.__str__() for one in range(ROUNDS)])

def concat_strings_repr():
    return ''.join([one.__repr__() for one in range(ROUNDS)])

%timeit concat_strings_str()
%timeit concat_strings_repr()

results in:

2.02 ms ± 24.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.05 ms ± 7.07 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Because str(obj) must first go through type.__call__, then str.__new__ (create a new string), then PyObject_Str (make a string out of the object), which invokes int.__str__ and, finally, uses the function you linked.

repr(obj), which corresponds to builtin_repr, directly calls PyObject_Repr (get the object's repr), which then calls int.__repr__, which uses the same function as int.__str__.
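
The layers described above can be walked through by hand from Python. This is only a rough sketch: it calls each layer explicitly to show that they all arrive at the same result, and it does not reproduce the C-level dispatch or its cost. The variable names are made up for illustration.

```python
value = 42

# str(value): calling a type goes through type.__call__ ...
via_type_call = type.__call__(str, value)
# ... which calls str.__new__ to build the new string object ...
via_new = str.__new__(str, value)
# ... which eventually asks the int for its string form.
via_slot_str = int.__str__(value)

# repr(value): a plain builtin function with fewer layers in between.
via_repr = repr(value)
via_slot_repr = int.__repr__(value)

print(via_type_call, via_new, via_slot_str, via_repr, via_slot_repr)
```

Every step yields the same string for an int; the difference is only in how many layers are crossed before the conversion function runs.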

Additionally, the path they take through call_function (the function that handles the CALL_FUNCTION opcode that's generated for calls) is slightly different.

From the master branch on GitHub (CPython 3.7):

str goes through _PyObject_FastCallKeywords (which is the one that calls type.__call__). Apart from performing more checks, this also needs to create a tuple to hold the positional arguments (see _PyStack_AsTuple).
repr goes through _PyCFunction_FastCallKeywords, which calls _PyMethodDef_RawFastCallKeywords. repr is also lucky because, since it only accepts a single argument (the switch leads it to the METH_O case in _PyMethodDef_RawFastCallKeywords), there's no need to create a tuple; the argument is simply indexed out of args.

As your update states, this isn't about int.__repr__ vs int.__str__; they are the same function, after all. It's all about how repr and str reach them. str just needs to work a bit harder.
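
For anyone without IPython at hand, here is a minimal sketch of the same comparison using the standard timeit module. The helper names (join_with, best_time, compare) and the smaller ROUNDS value are my own choices, not from the question. Absolute numbers, and possibly the size of the gap, will depend on the CPython version and the machine.

```python
import timeit

ROUNDS = 1000  # smaller than the question's 10000 to keep the run short


def join_with(convert):
    """Join the decimal strings of 0..ROUNDS-1 using the given converter."""
    return ''.join(map(convert, range(ROUNDS)))


def best_time(func, number=50, repeat=5):
    """Return the best per-call time in seconds over several repeats."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


def compare():
    """Time the four ways of turning an int into its decimal string."""
    variants = {
        'str(i)': lambda: join_with(str),
        'repr(i)': lambda: join_with(repr),
        'int.__str__(i)': lambda: join_with(int.__str__),
        'int.__repr__(i)': lambda: join_with(int.__repr__),
    }
    return {name: best_time(func) for name, func in variants.items()}


if __name__ == '__main__':
    for name, seconds in compare().items():
        print('{:<16} {:.1f} us'.format(name, seconds * 1e6))
```

Passing int.__str__ and int.__repr__ straight to map skips the str()/repr() call layer, so those two variants mainly measure the shared conversion function.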

aristotll

I just compared the str and repr implementations in the 3.5 branch. See here.

There seem to be more checks in str.

There are several possible reasons, because the CPython functions responsible for producing the str and repr results are slightly different.

But I guess the primary reason is that str is a type (a class) and the str.__new__ method has to call __str__ while repr can directly go to __repr__.
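
A quick way to see this distinction from Python itself is sketched below. Probe is a hypothetical helper class written only for this illustration; it records which special method each call ends up in.

```python
class Probe:
    """Hypothetical helper that records which special method was used."""

    def __init__(self):
        self.calls = []

    def __str__(self):
        self.calls.append('__str__')
        return 'probe'

    def __repr__(self):
        self.calls.append('__repr__')
        return 'Probe()'


probe = Probe()
str(probe)   # constructs a str instance, which ends up calling __str__
repr(probe)  # plain builtin function call, which ends up calling __repr__

str_is_type = isinstance(str, type)    # str is a class
repr_is_type = isinstance(repr, type)  # repr is not a class
repr_kind = type(repr).__name__        # name of the builtin function type

print(probe.calls, str_is_type, repr_is_type, repr_kind)
```

So str(x) is an object construction, while repr(x) is an ordinary call to a builtin function.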

Source: https://stackoverflow.com/questions/45376719/why-is-reprint-faster-than-strint

Tags: python, performance, python-internals