In C++ (or maybe only our compilers VC8 and VC10) 3.14 is a double literal and 3.14f is a float literal.
Now I have a colleague
I did a test.
I compiled this code:
float f1(float x) { return x*3.14; }
float f2(float x) { return x*3.14F; }
I used gcc 4.5.1 targeting i686, with optimization -O2.
This was the assembly code generated for f1:
pushl %ebp
movl %esp, %ebp
subl $4, %esp # Allocate 4 bytes on the stack
fldl .LC0 # Load a double-precision floating point constant
fmuls 8(%ebp) # Multiply by parameter
fstps -4(%ebp) # Store single-precision result on the stack
flds -4(%ebp) # Load single-precision result from the stack
leave
ret
And this is the assembly code generated for f2:
pushl %ebp
flds .LC2 # Load a single-precision floating point constant
movl %esp, %ebp
fmuls 8(%ebp) # Multiply by parameter
popl %ebp
ret
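So the interesting thing is that for f1, the compiler stored the value and reloaded it just to make sure that the result was rounded to single precision. The x87 registers hold more precision than a float, so the store to a 4-byte stack slot is what forces the rounding.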
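The reason the compiler has to do this is the type of the expression, which can be checked at compile time. Here is a minimal sketch, assuming a C++11 compiler (so it would not build on VC8); f1 and f2 are copied from the test above, and the static_assert messages are my own wording.

```cpp
#include <type_traits>

// Same two functions as in the test above.
float f1(float x) { return x * 3.14; }   // 3.14 is a double literal
float f2(float x) { return x * 3.14F; }  // 3.14F is a float literal

// float * double: the float operand is converted to double,
// so the product has type double and is narrowed only on return.
static_assert(std::is_same<decltype(1.0f * 3.14), double>::value,
              "float * double literal should have type double");

// float * float: the product stays in single precision.
static_assert(std::is_same<decltype(1.0f * 3.14F), float>::value,
              "float * float literal should have type float");
```

If both assertions compile, the narrowing in f1 happens at the return statement, which is the conversion the store and reload implement.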
If we use the -ffast-math option, then this difference is significantly reduced (f1 first, then f2):
pushl %ebp
fldl .LC0 # Load double-precision constant
movl %esp, %ebp
fmuls 8(%ebp) # multiply by parameter
popl %ebp
ret
pushl %ebp
flds .LC2 # Load single-precision constant
movl %esp, %ebp
fmuls 8(%ebp) # multiply by parameter
popl %ebp
ret
But there is still the difference between loading a single-precision and a double-precision constant.
These are the results with gcc 5.2.1 for x86-64 with optimization -O2:
f1:
cvtss2sd %xmm0, %xmm0 # Convert arg to double precision
mulsd .LC0(%rip), %xmm0 # Double-precision multiply
cvtsd2ss %xmm0, %xmm0 # Convert to single-precision
ret
f2:
mulss .LC2(%rip), %xmm0 # Single-precision multiply
ret
With -ffast-math, the results are the same.
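To connect the x86-64 listing back to the source, the three instructions in f1 can be written out as explicit conversions. This is only a sketch: the names f1_explicit, f2_explicit and results_differ are hypothetical, and the instruction names in the comments are taken from the listing above, not from anything a compiler is guaranteed to emit.

```cpp
// Hypothetical spelled-out version of f1: widen, multiply, narrow.
float f1_explicit(float x) {
    double wide = static_cast<double>(x);  // cvtss2sd in the listing
    double product = wide * 3.14;          // mulsd with the double constant
    return static_cast<float>(product);    // cvtsd2ss in the listing
}

// Hypothetical spelled-out version of f2: one single-precision multiply.
float f2_explicit(float x) {
    return x * 3.14F;                      // mulss with the float constant
}

// Helper for probing whether the two versions give different floats
// for a given input. No claim is made here about how often they do.
bool results_differ(float x) {
    return f1_explicit(x) != f2_explicit(x);
}
```

Calling results_differ over a range of inputs is one way to see whether the extra conversions change the result for the values you care about, in addition to costing two instructions.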