I am wondering why repr(int)
is faster than str(int)
. With the following code snippet:
ROUNDS = 10000
def concat_strings_str():
return ''.join(map(str, range(ROUNDS)))
def concat_strings_repr():
return ''.join(map(repr, range(ROUNDS)))
%timeit concat_strings_str()
%timeit concat_strings_repr()
I get these timings (python 3.5.2, but very similar results with 2.7.12):
1.9 ms ± 17.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
1.38 ms ± 9.07 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
If I'm on the right path, the same function long_to_decimal_string
is getting called below the hood.
Did I get something wrong or what else is going on that I am missing?
update:
This probably has nothing to with int
's __repr__
or __str__
methods but with the differences between repr()
and str()
, as int.__str__
and int.__repr__
are in fact comparably fast:
def concat_strings_str():
return ''.join([one.__str__() for one in range(ROUNDS)])
def concat_strings_repr():
return ''.join([one.__repr__() for one in range(ROUNDS)])
%timeit concat_strings_str()
%timeit concat_strings_repr()
results in:
2.02 ms ± 24.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.05 ms ± 7.07 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Because using str(obj)
must first go through type.__call__
then str.__new__
(create a new string) then PyObject_Str
(make a string out of the object) which invokes int.__str__
and, finally, uses the function you linked.
repr(obj)
, which corresponds to builtin_repr
, directly calls PyObject_Repr
(get the object repr) which then calls int.__repr__
which uses the same function as int.__str__
.
Additionally, the path they take through call_function
(the function that handles the CALL_FUNCTION
opcode that's generated for calls) is slightly different.
From the master branch on GitHub (CPython 3.7):
str
goes through_PyObject_FastCallKeywords
(which is the one that callstype.__call__
). Apart from performing more checks, this also needs to create a tuple to hold the positional arguments (see_PyStack_AsTuple
).repr
goes through_PyCFunction_FastCallKeywords
which calls_PyMethodDef_RawFastCallKeywords
.repr
is also lucky because, since it only accepts a single argument (the switch leads it to theMETH_0
case in_PyMethodDef_RawFastCallKeywords
) there's no need to create a tuple, just indexing of the args.
As your update states, this isn't about int.__repr__
vs int.__str__
, they are the same function after all; it's all about how repr
and str
reach them. str
just needs to work a bit harder.
I just compared the str
and repr
implementations in the 3.5 branch.
See here.
There are several possibilities because the CPython functions that are responsible for the str
and repr
return are slightly different.
But I guess the primary reason is that str
is a type
(a class) and the str.__new__
method has to call __str__
while repr
can directly go to __repr__
.
来源:https://stackoverflow.com/questions/45376719/why-is-reprint-faster-than-strint