What uncommon floating-point sizes exist in C++ compilers?

谎友^ 2021-01-14 17:59
The C++14 draft standard seems rather quiet about the specific requirements for float, double and long double, although these sizes seem to be common:

      
      
        
3 answers

庸人自扰 (original poster) 2021-01-14 18:29

If you're only asking about size in bits, then odd-sized types exist only on some older platforms whose bytes are not 8 bits (or another power of 2) wide, such as the Unisys ClearPath Dorado servers with a 36-bit float and a 72-bit double. That platform is still in active development; the last version was released in 2018. Mainframes and servers have very long lifetimes, so you can still see some PDP-10 and other architectures in use today, with modern compiler support.
If you care about the formats, then there are many standard-compliant 32-, 64- and 128-bit floating-point formats that aren't IEEE-754 binary formats, like the hex and decimal floating-point types in IBM z, the Cray formats and the VAX formats. In fact IBM z is one of the very rare modern platforms with decimal float hardware, although with GCC and some other compilers you can use their built-in software support for decimal float. IBM also uses the special double-double format, which is still the default for long double on PowerPC.
There are also some non-standard 24-bit floats in a few modern C/C++ compilers for microcontrollers.
Here's a summary of most of the available floating-point formats. See also "Do any real-world CPUs not use IEEE 754?". For more information, continue to the next section.

Types in C++ are generally mapped to hardware types for performance reasons. Therefore the floating-point types will be whatever is available on the CPU, if it has an FPU at all. In modern computers IEEE-754 is the dominant format in hardware, and because of the minimum range and precision that the C++ standard requires, float and double on such hardware end up mapped to IEEE-754 single and double precision respectively.
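To see what a given compiler actually maps these types to, you can query `std::numeric_limits`. The following is only a minimal sketch; `FloatInfo` and `describe` are names made up for this example, not part of any library.

```cpp
#include <climits>
#include <cstddef>
#include <limits>

// Summary of how one floating-point type is represented on this platform.
// FloatInfo and describe() are hypothetical helper names for illustration.
struct FloatInfo {
    std::size_t storage_bits;  // sizeof(T) * CHAR_BIT, including any padding
    int radix;                 // 2 for binary formats, 16 for hex float, 10 for decimal
    int mantissa_digits;       // number of radix digits in the significand
    bool is_iec559;            // true if the implementation claims IEEE-754 conformance
};

template <typename T>
constexpr FloatInfo describe() {
    return {sizeof(T) * CHAR_BIT,
            std::numeric_limits<T>::radix,
            std::numeric_limits<T>::digits,
            std::numeric_limits<T>::is_iec559};
}
```

Calling `describe<float>()`, `describe<double>()` and `describe<long double>()` and printing the fields is usually enough to tell an IEEE-754 platform from one of the unusual ones mentioned above.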
Hardware support for types with higher precision is not common, except on x86 and a few other rare platforms with 80-bit extended precision. Therefore long double is usually mapped to the same type as double on platforms without such hardware. However, long double is slowly being migrated to IEEE-754 quadruple precision in many compilers like GCC or Clang. Since that format is implemented with the built-in software library, performance is a lot worse. Depending on whether you favor faster execution or higher precision, you're still free to choose which type long double maps to. For example, on x86 GCC has the -mlong-double-64/80/128 and -m96/128bit-long-double options to set the format and padding of long double. Such options are also available on many other architectures like S/390 and zSeries.
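Since the format of long double depends on the target and on those compiler options, a program can check which one it got by looking at `LDBL_MANT_DIG` from `<cfloat>`. This is a rough sketch that assumes a binary radix and only knows the four formats discussed here; `long_double_format` is a hypothetical helper name.

```cpp
#include <cfloat>
#include <string>

// Guess the long double format from the number of significand bits.
// Assumes FLT_RADIX == 2; anything else is reported as "unknown".
std::string long_double_format(int mant_digits = LDBL_MANT_DIG) {
    switch (mant_digits) {
        case 53:  return "IEEE-754 binary64 (same as double)";
        case 64:  return "x87 80-bit extended precision";
        case 106: return "IBM double-double";
        case 113: return "IEEE-754 binary128 (quadruple precision)";
        default:  return "unknown";
    }
}
```

With no argument the function reports the format of the current build, so compiling the same file with different long double options should change the answer.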
PowerPC, on the other hand, by default uses a completely different 128-bit long double format that is implemented with double-double arithmetic and has the same range as IEEE-754 double precision. Its precision is slightly lower than quadruple precision, but it's a lot faster because it can use the hardware double arithmetic. As above, you can choose between the two formats with the -mabi=ibmlongdouble/ieeelongdouble options. The same trick is also used on some platforms where only a 32-bit float is supported, to get near-double precision.
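The idea behind double-double is that a value is stored as the unevaluated sum of two hardware doubles, a leading part and a small correction. The basic building block is an error-free addition (Knuth's two-sum). The sketch below only shows that building block, not a full long double implementation, and it assumes strict IEEE-754 double arithmetic (no -ffast-math, no excess precision). `DoubleDouble` and `two_sum` are names chosen for this example.

```cpp
// A value represented as hi + lo, where lo holds what hi cannot.
struct DoubleDouble {
    double hi;  // leading part, the ordinary rounded result
    double lo;  // rounding error of the leading part
};

// Knuth's two-sum: returns a + b exactly, split into a rounded sum
// and the rounding error that the plain addition discarded.
DoubleDouble two_sum(double a, double b) {
    double s = a + b;                        // rounded sum
    double bb = s - a;                       // the part of b that made it into s
    double err = (a - (s - bb)) + (b - bb);  // what was lost to rounding
    return {s, err};
}
```

For example, `two_sum(1.0, 1e-20)` keeps `1e-20` in `lo` even though `1.0 + 1e-20` rounds to `1.0` in plain double arithmetic. The range stays that of double because the exponent of `hi` is still an ordinary double exponent.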
IBM z mainframes traditionally use the IBM hex float formats and still use them today. But they also support IEEE-754 binary and decimal floating-point types in addition to that.

The format of floating-point numbers can be either base 16 S/390® hexadecimal format, base 2 IEEE-754 binary format, or base 10 IEEE-754 decimal format. The formats are based on three operand lengths for hexadecimal and binary: short (32 bits), long (64 bits), and extended (128 bits). The formats are also based on three operand lengths for decimal: _Decimal32 (32 bits), _Decimal64 (64 bits), and _Decimal128 (128 bits).
Floating-point numbers
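To make the hexadecimal format a bit more concrete, here is a sketch that decodes the short (32-bit) hex float into a double. It relies on the commonly documented layout of that format (1 sign bit, a 7-bit excess-64 exponent that is a power of 16, and a 24-bit fraction); `ibm32_to_double` is a hypothetical helper, not an IBM API.

```cpp
#include <cmath>
#include <cstdint>

// Decode a 32-bit IBM hexadecimal float given as a raw bit pattern.
// value = (-1)^sign * (fraction / 2^24) * 16^(exponent - 64)
double ibm32_to_double(std::uint32_t word) {
    int sign = (word >> 31) ? -1 : 1;
    int exponent = static_cast<int>((word >> 24) & 0x7F) - 64;  // power of 16
    std::uint32_t fraction = word & 0x00FFFFFFu;                // 24-bit fraction
    // 16^e == 2^(4e); the extra -24 scales the integer fraction to 0.f
    return sign * std::ldexp(static_cast<double>(fraction), 4 * exponent - 24);
}
```

For instance, the bit pattern `0x41100000` decodes to 1.0 and `0xC276A000` decodes to -118.625. Because the exponent counts powers of 16, the fraction can have up to three leading zero bits, which is why the effective precision of this format varies.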

Other architectures, like VAX or Cray, may have other floating-point formats. However, since those machines are still in use, their newer hardware versions also include support for IEEE-754, just as IBM did with its mainframes.
On modern platforms without an FPU, the floating-point types are usually IEEE-754 single and double precision for better interoperability and library support. However, on 8-bit microcontrollers even single precision is too costly, so some compilers support a non-standard mode where float is a 24-bit type. For example, the XC8 compiler uses a 24-bit floating-point format that is a truncated form of the 32-bit format, and NXP's MRK uses a different 24-bit float format.
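A truncated 24-bit float can be emulated on a desktop machine by clearing the low 8 significand bits of an IEEE-754 single. This is only an approximation for experimenting with the precision loss; it assumes float is IEEE-754 binary32 and that the 24-bit type keeps the upper 24 bits, and it does not try to reproduce the exact rounding of any particular compiler. `truncate_to_24_bits` is a hypothetical helper name.

```cpp
#include <cstdint>
#include <cstring>

// Keep the sign, the 8 exponent bits and the top 15 significand bits
// of a binary32 value; the low 8 significand bits are zeroed.
float truncate_to_24_bits(float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // well-defined way to read the bits
    bits &= 0xFFFFFF00u;                  // drop the low 8 bits (truncation)
    std::memcpy(&x, &bits, sizeof x);
    return x;
}
```

Values with few significant bits, such as -2.5f, pass through unchanged, while something like 0.1f comes back slightly smaller, which gives a feel for the roughly 4 to 5 decimal digits such a type can hold.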
Due to the rise of graphics and AI applications that need a narrower floating-point type, 16-bit float formats like IEEE-754 binary16 and Google's bfloat16 have also been introduced on many platforms, and compilers have some limited support for them, like __fp16 in GCC.
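bfloat16 is easy to experiment with even without compiler support, because it is simply the upper 16 bits of an IEEE-754 single (same sign and exponent, only 7 stored significand bits). The sketch below converts by plain truncation and assumes float is binary32; real implementations often round to nearest instead. The function names are made up for this example.

```cpp
#include <cstdint>
#include <cstring>

// Store a float as bfloat16 by keeping only its upper 16 bits (truncation).
std::uint16_t float_to_bfloat16_bits(float x) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return static_cast<std::uint16_t>(bits >> 16);
}

// Widen bfloat16 bits back to float by appending 16 zero bits.
float bfloat16_bits_to_float(std::uint16_t b) {
    std::uint32_t bits = static_cast<std::uint32_t>(b) << 16;
    float x;
    std::memcpy(&x, &bits, sizeof x);
    return x;
}
```

Round-tripping 3.14159f through these two functions gives 3.140625f, which shows how little precision is left, while the range is the same as that of float.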
    
             
                                                        
            
            
              
                