I am aware of a similar question, but I would like people's opinions on my algorithm for summing floating-point numbers as accurately as possible at a practical cost.
My guess is that your binary decomposition will work almost as well as Kahan summation.
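For reference, here is a minimal sketch of the Kahan summation I am comparing against. The function name `kahan_sum` and the use of `std::vector` are my own choices for illustration, not something from the question. The temporaries are declared volatile for the same reason as in my test code: so the compiler neither keeps them in extended precision nor simplifies the compensation away.

```cpp
#include <cstddef>
#include <vector>

// Kahan (compensated) summation: the rounding error of each addition
// is carried along in a separate compensation term c.
// kahan_sum is a hypothetical helper name used only for this sketch.
float kahan_sum(const std::vector<float>& values)
{
    volatile float sum = 0.0f;
    volatile float c = 0.0f;   // running compensation for lost low-order bits
    for (std::size_t i = 0; i < values.size(); ++i) {
        volatile float y = values[i] - c;  // apply the pending correction
        volatile float t = sum + y;        // low-order bits of y may be lost here
        c = (t - sum) - y;                 // recover what was lost, with opposite sign
        sum = t;
    }
    return sum;
}
```

With an input such as 1.0f followed by many copies of 1e-8f, a plain float accumulator should stay stuck at 1.0f, because each term is below half an ULP of 1.0f, while the compensated version keeps track of the small terms.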
Here is an example to illustrate it:
#include <algorithm>  // std::max, std::min
#include <cstddef>    // size_t
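void sumpair( float *a, float *b)
{
    // volatile forces each intermediate to be stored as a float
    volatile float sum = *a + *b;
    // part of the other operand that actually made it into sum
    // (exact when std::max returns the larger-magnitude operand)
    volatile float small = sum - std::max(*a,*b);
    // rounding error: what was lost from the other operand
    volatile float residue = std::min(*a,*b) - small;
    *a = sum;      // rounded sum
    *b = residue;  // correction term
}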
void sumpairs( float *a,size_t size, size_t stride)
{
if (size <= stride*2 ) {
if( stride
I declared my operands volatile and compiled with -ffloat-store to avoid extra precision on the x86 architecture:
g++ -ffloat-store -Wl,-stack_size,0x20000000 test_sum.c
and I get the following (0.03125 is 1 ULP):
naive sum=-373226.25
dble prec sum=-373223.03
1st approx sum=-373223
2nd approx sum=-373223.06
3rd approx sum=-373223.06
This deserves a little explanation.