How do libraries/programming languages convert floats to strings

最后都变了- 提交于 2021-02-10 03:19:24

问题


This is a mystery that I was trying to figure out when I was 15, but I failed. I still don't know the answer.

Here's a naive and flawed solution (like some other failed attempts I've seen here on Stack Overflow):

const numberToString = number => {
  let result = '';
  let multiplier = Math.floor(Math.log10(number));
  while (number > 0) {
    const currentDigit = Math.floor(number / 10 ** multiplier);
    if (multiplier === -1) result += '.';
    result += `${currentDigit}`;
    number -= 10 ** multiplier * currentDigit;
    multiplier -= 1;
  }

  if (multiplier >= 0) {
    result += Array(multiplier + 1)
      .fill('0')
      .join('');
  }
  return result;
};

numberToString(0.3) //.29999999999999998010382707025852380980776467160900842259699366886095386217478302201335914442574948883370288946713085380211028267974348864228883494754227105763273602317743416839701366257194448416238466245093684421946526875873398794558223163136792877759774069929483218021428696258138483228158055137040848084556063610493291767

The language here is in Javascript, but the question is language agnostic. However, feel free to improve the existing code if it's possible.

If the way this works is language dependent, I would appreciate some insights how this might look in various programming languages, for example Javascript.


回答1:


I am not a JAVASCRIPT coder so I stick with C++ ...

Converting number to string in decadic base is more complicated then using binary or its powers bases (bin,oct,hex) due to fact that all the numbers on usual computer are stored in binary not in decadic. Also it is not the same if you converting integer or fractional part. Let assume we have number x and want string s encoded in ASCII so this is how basic conversion work:

  1. Handle sign

    s="+";
    if (x<0.0)  { x=-x; s="-"; }
    

    as you can see its easy. Some number formats have a separate sign bit (usually the msb one) so in such case the code can be converted to bit operations for example 32 bit float:

    DWORD* dw=(DWORD*)(&x); // allow bit manipulation
    s="+";
    s[0]+=(((*dw)>>30)&2);  // ASCII +,- codes are 2 apart
    (*dw)&=0x7FFFFFFF;      // x=abs(x)
    

    so we have extracted sign character for our string and make x unsigned.

  2. Handle Integer part of x

    integer is converted to string by dividing it by the printing base so:

    y=floor(x); // integer part
    if (y)
     for (;y;) // until number is nonzero
     {
     s+='0'+(y%10); // works only for up to 10 base
     y/=10;
     }
    else s+='0'; // handle y=0 separately 
    

    so the remainder of each division is the wanted digit of the string but in reverse order. So after the conversion reverse the digits in string by a single for loop or you can store the number digits in reverse order directly. But for tat you need to know the number of digits of the integer part of the number. That is done by

    digits = ceil(log(y)/log(base)) + 1
    

    so for decadic:

    digits = ceil(log10(y)) + 1
    
  3. handle fractional part of x

    this is converted by multiplying by the conversion base.

    z=x-floor(x); // fractional part
    if (z)
     for (s+='.';z;) // until number is nonzero here you can limit to number of digits
     {
     z*=10.0;
     s+='0'+int(floor(z)); // works only for up to 10 base
     z-=floor(z);
     }
    

    this is returning the digits in their order so no reversing this time...

I encoded all the code directly in the SO editor so there might be hidden syntax errors.

Now usual print functions have also formatting which adds zero or space padding or cut off fractional digits above some value etc ...

If you have a bignum x then this will be much much slower because you can not handle basic +,-,*,/ operations as O(1) anymore and its usually faster to create hex string and convert the string to decadic on 8bit arithmetics instead or use biggest power of 10 that fits into used DATA WORD the bignum is stored with. The hex -> dec conversion can be done like this:

  • string hex2dec conversion on integer math

but again for very big strings it will be slow. In such case it can be speed up by using FFT/NTT approaches similar to Schönhage-Strassen multiplication but I never tried to use it for printing before so I lack any insights on such approach.

Also beware that determining the number of digits of a value is not regular for fractional part of number (see the link above) so you need to mind that you can be off by 1-2 digits.

[Edit1] rounding the string

simply if you detect n consequent zeros or nines in the fractional part (after any nonzero digit) you need to stop printing and round. Zeros are just cut of and nines you need to cut off too and increment the rest by one in the string. Such operation might overflow to 1 digit not present in the string so in such case just insert 1.

When I put all together I come up with this C++/VCL code (based on VCL AnsiString data type):

AnsiString print(double x)
    {
    char c;
    int i,j;
    double y,a;
    AnsiString s;

    const int B=10;                 // chose base 2...16
    const double b=B;               // base
    const double _b=1.0/b;          // 1/base
    const char digit[16]="0123456789ABCDEF";

    #define _enable_rounding

    #ifdef _enable_rounding
    const int round_digits=5;       // min consequent 0s or B-1s to triger rounding
    int cnt0=0,cnt1=0;              // consequent digit counters
    int ena=0;                      // enabled consequent digit counters? after first nonzero digit
    #endif

    // here you should handle NaN and Inf cases

    // handle sign
    s="+";
    if (x<0.0)  { x=-x; s="-"; }
    // integer part
    y=floor(x);
    if (y) for (;y>0.0;)        // until number is nonzero
        {
        a=y; y=floor(y*_b);     // the same as y/=10 on integers
        a-=y*b;                 // the same as a=y%10 on integers
        i=int(a);
        s+=digit[i];
        #ifdef _enable_rounding
        ena|=i;
        #endif
        }
    else s+='0';                // handle y=0 separately
    // reverse string skipping +/- sign (beware AnsiString is indexed from 1 up to its length included!!!)
    for (i=2,j=s.Length();i<j;i++,j--){ c=s[i]; s[i]=s[j]; s[j]=c; }
    // fractional part
    y=x-floor(x);
    if (y) for (s+='.';y>0.0;)      // until number is nonzero here you can limit to number of digits
        {
        y*=b;
        a=floor(y);
        y-=a;
        i=int(a);
        s+=digit[i];
        #ifdef _enable_rounding
        ena|=i;
        // detect consequent rounding digits
        if (ena)
            {
                 if (i==  0){ cnt0++; cnt1=0; }
            else if (i==B-1){ cnt1++; cnt0=0; }
            else            { cnt0=0; cnt1=0; }
            }
        // round down .???00000000 by cut of zeros
        if (cnt0>=round_digits)
            {
            s=s.SubString(1,s.Length()-cnt0);       // by cut of zeros
            break;
            }
        // round up .???999999999 by increment and cut of zeros (only base 10) !!!
        if (cnt1>=round_digits)
            {
            s=s.SubString(1,s.Length()-cnt1);       // cut off nines
            for (j=1,i=s.Length();(i>=2)&&(j);i--)
                {
                c=s[i];
                if (c=='.') continue;
                if (c=='9'){ s[i]='0'; continue; }
                j=0; s[i]++;
                }
            if (j) s=s.Insert("1",i+1); // overflow -> insert "1" after sign
            if (s[s.Length()]=='.')     // cut off decimal point if no fractional part left
             s=s.SubString(1,s.Length()-1);
            break;
            }
        #endif
        }
    return s;
    }

You can select base B=<2,16>. You can enable disable the rounding by using/commenting the #define _enable_rounding. Beware the rounding routine works only for base 10 as for different bases the increment routine would have a bit different code/constants and too lazy to do it universally (it would be longer and less comprehend-able code). The round_digits constant is a threshold of how many consequent zeros or nines are triggering the rounding.




回答2:


(I don't have enough reputation to comment, so am resorting to using an answer...)

I notice that you ran the precision out to 300+ digits, well beyond the precision of the floating point numbers, hence the imprecise result. If you're looking for a means to have high precision calculations, you can probably resort to BigInt, and scale up the numbers accordingly. (I say "probably", because BigInt can be coerced into fixed precision calculations, not floating point, and thus depending on one's goal, BigInt might not meet the requirement.)

Eg, calculating 1000 / 17 to 100 significant digits can be handled via the following function, which essentially scales up 1000 and 17 to ensure 100 significant digits. (Note that this is just a concept function for handling high precision division between two integers, but can be the basis for non-integers by scaling up the dividend and divisor until they're integers, and adjusting digits accordingly. Plus you might need to fudge in some extra "hidden" digits of precision to handle rounding)...

function divideN(dividend, divisor, digits) {
  dividend = dividend * 10n ** (BigInt(digits) * 2n);
  divisor = divisor * 10n ** BigInt(digits);
  var s = (dividend/divisor).toString();
  if (s.length < digits) {
    s = "0".repeat(digits - s.length) + s;
  }
  s = s.slice(0, s.length - digits) + "." + s.slice(-digits);
  return s;
}

BigInt requires numbers to end in "n", so the function needs to be called as follows...

divideN(1000n,17n,100)

...which in this case returns...

"58.8235294117647058823529411764705882352941176470588235294117647058823529411764705882352941176470588235"

Note in this case, 102 digits of precision are returned rather than 100 because of the relative size of the dividend (1000) to the divisor (17).



来源:https://stackoverflow.com/questions/56408934/how-do-libraries-programming-languages-convert-floats-to-strings

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!