I work daily with Python 2.4 at my company. I used the versatile logarithm function \'log\' from the standard math library, and when I entered log(2**31, 2) it returned 31.0
The repr
esentation (float.__repr__
) of a number in python tries to return a string of digits as close to the real value as possible when converted back, given that IEEE-754 arithmetic is precise up to a limit. In any case, if you print
ed the result, you wouldn't notice:
>>> from math import log
>>> log(2**31,2)
31.000000000000004
>>> print log(2**31,2)
31.0
print
converts its arguments to strings (in this case, through the float.__str__
method), which caters for the inaccuracy by displaying less digits:
>>> log(1000000,2)
19.931568569324174
>>> print log(1000000,2)
19.9315685693
>>> 1.0/10
0.10000000000000001
>>> print 1.0/10
0.1
usuallyuseless' answer is very useful, actually :)
float are imprecise
I don't buy that argument, because exact power of two are represented exactly on most platforms (with underlying IEEE 754 floating point).
So if we really want that log2 of an exact power of 2 be exact, we can.
I'll demonstrate it in Squeak Smalltalk, because it is easy to change the base system in that language, but the language does not really matter, floating point computation are universal, and Python object model is not that far from Smalltalk.
For taking log in base n, there is the log: function defined in Number, which naively use the Neperian logarithm ln
:
log: aNumber
"Answer the log base aNumber of the receiver."
^self ln / aNumber ln
self ln
(take the neperian logarithm of receiver) , aNumber ln
and /
are three operations that will round there result to nearest Float, and these rounding error can cumulate... So the naive implementation is subject to the rounding error you observe, and I guess that Python implementation of log function is not much different.
((2 raisedTo: 31) log: 2) = 31.000000000000004
But if I change the definition like this:
log: aNumber
"Answer the log base aNumber of the receiver."
aNumber = 2 ifTrue: [^self log2].
^self ln / aNumber ln
provide a generic log2 in Number class:
log2
"Answer the base-2 log of the receiver."
^self asFloat log2
and this refinment in Float class:
log2
"Answer the base 2 logarithm of the receiver.
Care to answer exact result for exact power of two."
^self significand ln / Ln2 + self exponent asFloat
where Ln2
is a constant (2 ln), then I effectively get an exact log2 for exact power of two, because significand of such number = 1.0 (including subnormal for Squeak exponent/significand definition), and 1.0 ln = 0.0
.
The implementation is quite trivial, and should translate without difficulty in Python (probably in the VM); the runtime cost is very cheap, so it's just a matter of how important we think this feature is, or is not.
As I always say, the fact that floating point operations results are rounded to nearest (or whatever rounding direction) representable value is not a license to waste ulp. Exactness has a cost, both in term of runtime penalty and implementation complexity, so it's trade-offs driven.
This is to be expected with computer arithmetic. It is following particular rules, such as IEEE 754, that probably don't match the math you learned in school.
If this actually matters, use Python's decimal type.
Example:
from decimal import Decimal, Context
ctx = Context(prec=20)
two = Decimal(2)
ctx.divide(ctx.power(two, Decimal(31)).ln(ctx), two.ln(ctx))
You should read "What Every Computer Scientist Should Know About Floating-Point Arithmetic".
http://docs.sun.com/source/806-3568/ncg_goldberg.html
floating-point operations are never exact. They return a result which has an acceptable relative error, for the language/hardware infrastructure.
In general, it's quite wrong to assume that floating-point operations are precise, especially with single-precision. "Accuracy problems" section from Wikipedia Floating point article :)
This is normal. I would expect log10 to be more accurate then log(x, y), since it knows exactly what the base of the logarithm is, also there may be some hardware support for calculating base-10 logarithms.