How do languages such as Python overcome C's Integral data limits?


Question


While doing some random experimentation with factorial programs in C, Python and Scheme, I came across this fact:

In C, using the 'unsigned long long' data type, the largest factorial I can print is that of 65, which comes out as '9223372036854775808', i.e. 19 digits, as specified here.

In Python, I can find the factorial of a number as large as 999, which has far more than 19 digits.
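
For what it's worth, the gap is easy to reproduce with a short Python sketch (C's 64-bit 'unsigned long long' wrap-around is simulated here by masking each product with 2**64 - 1):

import math

# Python integers grow as needed, so even very large factorials stay exact.
print(len(str(math.factorial(999))))   # number of digits in 999!, far more than 19

# Simulate C's 64-bit unsigned arithmetic by reducing every product modulo 2**64.
MASK = (1 << 64) - 1
wrapped = 1
for i in range(1, 66):
    wrapped = (wrapped * i) & MASK
print(wrapped)   # 9223372036854775808 == 2**63: the product has wrapped around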

How does CPython achieve this? Does it use a data type like 'octaword'?

I might be missing some fundamental facts here. So, I would appreciate some insights and/or references to read. Thanks!

UPDATE: Thank you all for the explanation. Does that mean CPython uses the GNU Multiple Precision (GMP) library (or some other similar library)?

UPDATE 2: I was looking for Python's 'bignum' implementation in the sources. Where exactly is it? It's here: http://svn.python.org/view/python/trunk/Objects/longobject.c?view=markup. Thanks Baishampayan.


Answer 1:


It's called Arbitrary Precision Arithmetic. There's more here: http://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic




Answer 2:


Looking at the Python source code, it seems the long type (at least in pre-Python 3 code) is defined in longintrepr.h like this -

/* Long integer representation.
   The absolute value of a number is equal to
    SUM(for i=0 through abs(ob_size)-1) ob_digit[i] * 2**(SHIFT*i)
   Negative numbers are represented with ob_size < 0;
   zero is represented by ob_size == 0.
   In a normalized number, ob_digit[abs(ob_size)-1] (the most significant
   digit) is never zero.  Also, in all cases, for all valid i,
    0 <= ob_digit[i] <= MASK.
   The allocation function takes care of allocating extra memory
   so that ob_digit[0] ... ob_digit[abs(ob_size)-1] are actually available.

   CAUTION:  Generic code manipulating subtypes of PyVarObject has to be
   aware that longs abuse ob_size's sign bit.
*/

struct _longobject {
    PyObject_VAR_HEAD
    digit ob_digit[1];
};
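
In other words, the value is stored sign-magnitude as an array of base-2**SHIFT "digits". A rough Python sketch of the reconstruction rule described in the comment above (the function name is illustrative, not CPython's actual API; SHIFT is 15 or 30 depending on the build):

SHIFT = 30   # CPython uses 15- or 30-bit digits depending on how it was built

def value_from_digits(ob_digit, ob_size):
    # abs(value) = SUM over i of ob_digit[i] * 2**(SHIFT*i)
    magnitude = sum(d << (SHIFT * i) for i, d in enumerate(ob_digit))
    # negative numbers are flagged by a negative ob_size; zero by ob_size == 0
    return -magnitude if ob_size < 0 else magnitude

# 2**40 + 7 split into two 30-bit digits: low digit 7, high digit 2**10
print(value_from_digits([7, 1 << 10], 2))   # 1099511627783 == 2**40 + 7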

The actual usable interface of the long type is then defined in longobject.h by creating a new type PyLongObject like this -

typedef struct _longobject PyLongObject;

And so on.

There is more going on inside longobject.c; you can take a look there for more details.




Answer 3:


Data types such as int in C are directly mapped (more or less) to the data types supported by the processor. So the limits on C's int are essentially the limits imposed by the processor hardware.

But one can implement one's own int data type entirely in software. You can, for example, use an array of digits as your underlying representation, perhaps like this:

class MyInt {
    private int[] digits;              // one digit of the number per array slot
    public MyInt(int noOfDigits) {
        digits = new int[noOfDigits];  // allocate as many digits as requested
    }
}

Once you do that, you can use this class to store integers with as many digits as you want, as long as you don't run out of memory.

Perhaps Python is doing something like this inside its virtual machine. You may want to read this article on Arbitrary Precision Arithmetic to get the details.
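
A toy version of that idea in Python, just to make it concrete (decimal digits stored least-significant-first, addition done with an explicit carry; real bignum implementations use much larger bases and smarter algorithms):

def to_digits(n):
    # split a non-negative int into decimal digits, least significant first
    return [int(c) for c in reversed(str(n))]

def add_digits(a, b):
    # schoolbook addition of two digit arrays; the result can never overflow
    result, carry = [], 0
    for i in range(max(len(a), len(b))):
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        result.append(s % 10)
        carry = s // 10
    if carry:
        result.append(carry)
    return result

total = add_digits(to_digits(10**30), to_digits(999999))
print(''.join(map(str, reversed(total))))   # 1000000000000000000000000999999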




Answer 4:


Not an octaword. It implements a bignum structure to store arbitrary-precision numbers.




Answer 5:


Python gives long integers (all ints in Python 3) just as much space as they need: an array of "digits" (in a base that is a power of 2), allocated as needed.
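
You can watch that allocation grow from Python itself (the exact byte counts are CPython implementation details and vary between versions and platforms):

import sys

# each additional internal digit costs a few more bytes of storage
for n in (0, 1, 2**30, 2**60, 2**600):
    print(n.bit_length(), sys.getsizeof(n))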



Source: https://stackoverflow.com/questions/867393/how-do-languages-such-as-python-overcome-cs-integral-data-limits
