Why is __muldc3 called, when two std::complex are multiplied?

情到浓时终转凉″ 提交于 2019-12-01 20:41:41

问题


I naively assumed, that the complex number multiplication would be inlined by the compiler, for example for this function:

#include <complex>

void mult(std::complex<double> &a, std::complex<double> &b){
    a*=b;
}

However, when complied by gcc (with -O2), the resulting assembler is surprising (at least for me):

mult(std::complex<double>&, std::complex<double>&):
        pushq   %rbx
        movsd   8(%rdi), %xmm3
        movsd   (%rdi), %xmm2
        movq    %rdi, %rbx
        movsd   8(%rsi), %xmm1
        movsd   (%rsi), %xmm0
        call    __muldc3
        movsd   %xmm0, (%rbx)
        movsd   %xmm1, 8(%rbx)
        popq    %rbx
        ret

There is a call to this function __multdc3, which somehow replaced the call to the operator*= (its mangled name would be _ZNSt7complexIdEmLIdEERS0_RKS_IT_E and the complex number would be passed per reference).

However, there seems to be nothing special in the implementation of the operator*= which would explain the magic:

// 26.2.5/13
  // XXX: This is a grammar school implementation.
  template<typename _Tp>
    template<typename _Up>
    complex<_Tp>&
    complex<_Tp>::operator*=(const complex<_Up>& __z)
    {
      const _Tp __r = _M_real * __z.real() - _M_imag * __z.imag();
      _M_imag = _M_real * __z.imag() + _M_imag * __z.real();
      _M_real = __r;
      return *this;
}

I must be missing something, thus my question: What is the reason for the resulting assembler?


回答1:


You should note that it is, strictly speaking, "wrong" to implement the complex floating point multiplication by the formula

(a+i*b)*(c + i*d) = a*c - b*d + i*(b*c + a*d)

I write wrong in quotes, because the C++ standard does not actually require correct implementation. C does specify it in the some appendix.

The simple implementation does not give correct results with Inf and/or NaN in the input.

consider (Inf + 0*i)*(Inf + 0*i): Clearly, for consistent behavior, the result should be the same as for real floating point, namely Inf, or (Inf + 0*i), respectively. However, the formula above gives Inf + i*NaN.

So I could imagine that __muldc3 is a longer function that handles these cases correctly.

When the call goes away with -ffast-math then this is most likely the explanation.




回答2:


At least for plain C, gcc follows the somewhat crazy C99 annex f (?) rules for complex multiply/divide; perhaps for c++ as well. Try -fast-math or -fcx-fortran-rules.




回答3:


The answer of Andreas H. give the direction, in this answer I just try to connect the dots.

First, I was wrong about the definition of the operator*=: there is a specialization for floating types, for example for doubles this operator defined as:

  template<typename _Tp>
    complex&
    operator*=(const complex<_Tp>& __z)
{
  _ComplexT __t;
  __real__ __t = __z.real();
  __imag__ __t = __z.imag();
  _M_value *= __t;
  return *this;
}

With _ComplexT defined as

typedef __complex__ double _ComplexT;

and _M_value defined as

private:
   _ComplexT _M_value;

That means, the school-formula is used for such types as std::complex<int> but for the floating types it boils down to C-multiplication (__complex__ is a gcc-extension), which is a part of the standard since C99, appendix G (most important parts are at the end of the answer).

One of the problematic cases would be:

(0.0+1.0*j)*(inf+inf*j) = (0.0*inf-1*inf)+(0.0*inf+1.0*inf)j
                        =  nan + nan*j

Due to the convention (G.3.1) the first factor is a nonzero, finite and the second is infinite, thus due to (G.5.1.4) the result must be finite, which is not the case for nan+nan*j.

The strategy of the calculation in __multdc3 is simple: use the school formula, and if this doesn't work (both are nan) a more complicated approach will be used - which is just too complex to be inlined. (By the way, clang inlines the school formula and calls __multdc3 if this formula didn't work).

Another problem could be that one product overflow/underflow but the sum of two products doesn't. This is probably covered by the formulation "may raise spurious floating-point exceptions", but I'm not 100% sure.


The section G.3.1 states:

A complex or imaginary value with at least one infinite part is regarded as an infinity (even if its other part is a NaN).

A complex or imaginary value is a finite number if each of its parts is a finite number (neither infinite nor NaN).

A complex or imaginary value is a zero if each of its parts is a zero.

Section G.5 is concerned with binary operators and states:

For most operand types, the value of the result of a binary operator with an imaginary or complex operand is completely determined, with reference to real arithmetic, by the usual mathematical formula. For some operand types, the usual mathematical formula is problematic because of its treatment of infinities and because of undue overflow or underflow; in these cases the result satisfies certain properties (specified in G.5.1), but is not completely determined.

The section G.5.1 is concerned with multiplicative operators and states:

  1. The * and / operators satisfy the following infinity properties for all real, imaginary,and complex operands:

    —if one operand is an infinity and the other operand is a nonzero finite number or an infinity,then the result of the operator is an infinity;

...

  1. If both operands of the * operator are complex or if the second operand of the / operator is complex, the operator raises floating-point exceptions if appropriate for the calculation of the parts of the result, and may raise spurious floating-point exceptions.



回答4:


Possibly because software floating point is on by default. As per the man page for gcc:

-msoft-float
This option ignored; it is provided for compatibility purposes only.    
software floating point code is emitted by default,
and this default can overridden by FPX options; mspfp, mspfp-compact, 
or mspfp-fast for single precision, and mdpfp, mdpfp-compact, or mdpfp-fast 
for double precision.

Try compiling with -mdpfp or any of the other alternatives.



来源:https://stackoverflow.com/questions/49438158/why-is-muldc3-called-when-two-stdcomplex-are-multiplied

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!