Inaccurate Logarithm in Python

甜味超标 asked 2020-12-04 01:37

I work daily with Python 2.4 at my company. I used the versatile logarithm function `log` from the standard math library, and when I entered log(2**31, 2) it returned 31.000000000000004, which bothered me. Shouldn't the base-2 logarithm of an exact power of 2 be exact?

9 Answers
  • 2020-12-04 01:49

    The representation (float.__repr__) of a number in Python tries to return a string of digits that converts back to a value as close to the real one as possible, which exposes the fact that IEEE-754 arithmetic is precise only up to a limit. In any case, if you printed the result, you wouldn't notice:

    >>> from math import log
    >>> log(2**31,2)
    31.000000000000004
    >>> print log(2**31,2)
    31.0
    

    print converts its arguments to strings (in this case, through the float.__str__ method), which papers over the inaccuracy by displaying fewer digits:

    >>> log(1000000,2)
    19.931568569324174
    >>> print log(1000000,2)
    19.9315685693
    >>> 1.0/10
    0.10000000000000001
    >>> print 1.0/10
    0.1
    

    usuallyuseless' answer is very useful, actually :)
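    For what it's worth, in Python 3 (and 2.7) float.__repr__ already returns the shortest string that round-trips to the same value, and str gives the same result, so the repr/str distinction above is specific to older Pythons; to shorten the display you format explicitly. A minimal sketch, assuming Python 3:

    ```python
    from math import log

    x = log(1000000, 2)
    print(repr(x))        # shortest string that converts back to x exactly
    print(f"{x:.10g}")    # explicitly limit output to 10 significant digits
    ```

    Here repr(x) gives 19.931568569324174 and the formatted version gives 19.93156857.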

  • 2020-12-04 01:57

    "Floats are imprecise."

    I don't buy that argument, because exact powers of two are represented exactly on most platforms (with underlying IEEE 754 floating point).

    So if we really want log2 of an exact power of 2 to be exact, we can make it so.
    I'll demonstrate it in Squeak Smalltalk, because it is easy to change the base system in that language, but the language does not really matter: floating-point computation is universal, and Python's object model is not that far from Smalltalk's.

    For taking the log in base n, there is the log: function defined in Number, which naively uses the natural (Napierian) logarithm ln:

    log: aNumber 
        "Answer the log base aNumber of the receiver."
        ^self ln / aNumber ln
    

    self ln (taking the natural logarithm of the receiver), aNumber ln, and / are three operations that each round their result to the nearest Float, and these rounding errors can accumulate... So the naive implementation is subject to the rounding error you observe, and I guess that Python's implementation of the log function is not much different.

    ((2 raisedTo: 31) log: 2) = 31.000000000000004
    

    But if I change the definition like this:

    log: aNumber 
        "Answer the log base aNumber of the receiver."
        aNumber = 2 ifTrue: [^self log2].
        ^self ln / aNumber ln
    

    provide a generic log2 in the Number class:

    log2
        "Answer the base-2 log of the receiver."
        ^self asFloat log2
    

    and this refinement in the Float class:

    log2
        "Answer the base 2 logarithm of the receiver.
        Care to answer exact result for exact power of two."
        ^self significand ln / Ln2 + self exponent asFloat
    

    where Ln2 is a constant (2 ln), then I effectively get an exact log2 for exact powers of two, because the significand of such a number is 1.0 (including subnormals, with Squeak's exponent/significand definition), and 1.0 ln = 0.0.

    The implementation is quite trivial and should translate without difficulty to Python (probably in the VM); the runtime cost is very cheap, so it's just a matter of how important we think this feature is, or is not.
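    A rough Python translation of that refinement, as a sketch at the library level rather than in the VM: math.frexp plays the role of the significand/exponent decomposition, except that it yields a significand m in [0.5, 1.0) rather than [1.0, 2.0), so an exact power of two decomposes as m == 0.5.

    ```python
    import math

    _LN2 = math.log(2)

    def log2_exact(x):
        # Decompose x == m * 2**e with m in [0.5, 1.0).
        m, e = math.frexp(x)
        # log2(x) = log2(m) + e.  For an exact power of two, m == 0.5,
        # and on typical IEEE-754 platforms log(0.5) / log(2) divides
        # to exactly -1.0, so no rounding error survives the final add.
        return math.log(m) / _LN2 + e

    print(log2_exact(2**31))   # 31.0 on typical IEEE-754 platforms
    ```

    The names here (log2_exact, _LN2) are illustrative, not part of any library; the point is only that the significand/exponent trick carries over directly.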

    As I always say, the fact that floating-point operation results are rounded to the nearest (or whatever rounding direction) representable value is not a license to waste ulps. Exactness has a cost, both in terms of runtime penalty and implementation complexity, so it's driven by trade-offs.

  • 2020-12-04 02:00

    This is to be expected with computer arithmetic, which follows particular rules, such as IEEE 754, that probably don't match the math you learned in school.

    If this actually matters, use Python's decimal type.

    Example:

    from decimal import Decimal, Context
    ctx = Context(prec=20)
    two = Decimal(2)
    ctx.divide(ctx.power(two, Decimal(31)).ln(ctx), two.ln(ctx))
    
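    As a quick check (assuming Python 3, where Decimal.ln accepts a context argument), the quotient comes out within a few units in the last place of exactly 31:

    ```python
    from decimal import Decimal, Context

    ctx = Context(prec=20)
    two = Decimal(2)
    result = ctx.divide(ctx.power(two, Decimal(31)).ln(ctx), two.ln(ctx))
    # The decimal module rounds each operation correctly at the chosen
    # precision, so the remaining error is on the order of 1 ulp at
    # 20 significant digits.
    print(result)
    ```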
  • 2020-12-04 02:03

    You should read "What Every Computer Scientist Should Know About Floating-Point Arithmetic".

    http://docs.sun.com/source/806-3568/ncg_goldberg.html

  • 2020-12-04 02:07

    Floating-point operations are not exact in general. They return a result with an acceptable relative error for the language/hardware infrastructure.

    In general, it's quite wrong to assume that floating-point operations are precise, especially with single precision. See the "Accuracy problems" section of the Wikipedia Floating point article :)
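    The practical consequence is that float results should be compared with a tolerance rather than for exact equality; a minimal sketch using math.isclose (Python 3.5+):

    ```python
    import math

    x = math.log(2**31, 2)   # 31.000000000000004 on typical IEEE-754 doubles
    print(x == 31.0)         # False on typical platforms: exact comparison
                             # trips over a 1-ulp rounding error
    print(math.isclose(x, 31.0, rel_tol=1e-9))  # True
    ```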

  • 2020-12-04 02:10

    This is normal. I would expect log10 to be more accurate than log(x, y), since it knows exactly what the base of the logarithm is; there may also be some hardware support for calculating base-10 logarithms.
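    The same reasoning applies to base 2: Python 3.3+ ships math.log2, which uses a dedicated base-2 routine instead of computing log(x)/log(2) with two intermediate roundings, and is typically exact for exact powers of two (a sketch assuming a typical IEEE-754 platform):

    ```python
    import math

    print(math.log(2**31, 2))   # 31.000000000000004 -- quotient of two rounded logs
    print(math.log2(2**31))     # 31.0 -- dedicated base-2 routine
    ```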
