How can I improve performance via a high-level approach when implementing long equations in C++?

孤街浪徒 2021-01-30 19:34

I am developing some engineering simulations. This involves implementing some long equations, such as this equation to calculate stress in a rubber-like material:



        
10 answers
  •  野性不改
    2021-01-30 19:52

    David Hammen's answer is good, but still far from optimal. Let's continue with his last expression (at the time of writing):

    auto l123 = l1 * l2 * l3;
    auto cbrt_l123 = cbrt(l123);
    T = mu/(3.0*l123)*(  pow(l1/cbrt_l123,a)*(2.0*N1-N2-N3)
                       + pow(l2/cbrt_l123,a)*(2.0*N2-N3-N1)
                       + pow(l3/cbrt_l123,a)*(2.0*N3-N1-N2))
      + K*(l123-1.0)*(N1+N2+N3);
    

    This can be optimised further. In particular, we can avoid the call to cbrt() and one of the calls to pow() by exploiting some mathematical identities. Let's do this step by step.

    // step 1 eliminate cbrt() by taking the exponent into pow()
    auto l123 = l1 * l2 * l3;
    auto athird = 0.33333333333333333 * a; // avoid division
    T = mu/(3.0*l123)*(  (N1+N1-N2-N3)*pow(l1*l1/(l2*l3),athird)
                       + (N2+N2-N3-N1)*pow(l2*l2/(l1*l3),athird)
                       + (N3+N3-N1-N2)*pow(l3*l3/(l1*l2),athird))
      + K*(l123-1.0)*(N1+N2+N3);
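
    As a sketch of why step 1 is valid, assuming l1, l2 and l3 are all positive (so the fractional powers are well defined), the cube root can be folded into the exponent as follows. The subscripted symbols are just the code's l1, l2, l3 written in math notation.

```latex
\left(\frac{l_1}{\sqrt[3]{l_1 l_2 l_3}}\right)^{a}
  = \left(\frac{l_1^{3}}{l_1 l_2 l_3}\right)^{a/3}
  = \left(\frac{l_1^{2}}{l_2 l_3}\right)^{a/3}
```

    The same manipulation with the indices permuted gives the arguments of the other two pow() calls.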
    

    Note that I have also optimised 2.0*N1 to N1+N1 etc. Next, we can make do with only two calls to pow().

    // step 2  eliminate one call to pow
    auto l123 = l1 * l2 * l3;
    auto athird = 0.33333333333333333 * a;
    auto pow_l1l2_athird = pow(l1/l2,athird);
    auto pow_l1l3_athird = pow(l1/l3,athird);
    auto pow_l2l3_athird = pow_l1l3_athird/pow_l1l2_athird;
    T = mu/(3.0*l123)*(  (N1+N1-N2-N3)* pow_l1l2_athird*pow_l1l3_athird
                       + (N2+N2-N3-N1)* pow_l2l3_athird/pow_l1l2_athird
                       + (N3+N3-N1-N2)/(pow_l1l3_athird*pow_l2l3_athird))
      + K*(l123-1.0)*(N1+N2+N3);
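
    Before relying on a rewrite like this, it may be worth checking numerically that the two-pow() form agrees with the starting expression. The following is a minimal sketch of such a check. The struct StressInputs and the function names are hypothetical, and N1, N2, N3 are treated as plain scalars here, which is an assumption made only to keep the example small.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for the inputs (not from the original post).
// N1, N2, N3 are treated as scalars purely for illustration.
struct StressInputs {
    double l1, l2, l3;  // assumed strictly positive
    double N1, N2, N3;
    double a, mu, K;
};

// Starting expression: one cbrt() call and three pow() calls.
double stress_reference(const StressInputs& p) {
    assert(p.l1 > 0.0 && p.l2 > 0.0 && p.l3 > 0.0);
    const double l123 = p.l1 * p.l2 * p.l3;
    const double c = std::cbrt(l123);
    return p.mu / (3.0 * l123)
               * (  std::pow(p.l1 / c, p.a) * (2.0 * p.N1 - p.N2 - p.N3)
                  + std::pow(p.l2 / c, p.a) * (2.0 * p.N2 - p.N3 - p.N1)
                  + std::pow(p.l3 / c, p.a) * (2.0 * p.N3 - p.N1 - p.N2))
         + p.K * (l123 - 1.0) * (p.N1 + p.N2 + p.N3);
}

// Step 2 form: two pow() calls and no cbrt().
double stress_two_pow(const StressInputs& p) {
    assert(p.l1 > 0.0 && p.l2 > 0.0 && p.l3 > 0.0);
    const double l123 = p.l1 * p.l2 * p.l3;
    const double athird = p.a / 3.0;
    const double p12 = std::pow(p.l1 / p.l2, athird);
    const double p13 = std::pow(p.l1 / p.l3, athird);
    const double p23 = p13 / p12;  // equals (l2/l3)^(a/3)
    return p.mu / (3.0 * l123)
               * (  (p.N1 + p.N1 - p.N2 - p.N3) * p12 * p13
                  + (p.N2 + p.N2 - p.N3 - p.N1) * p23 / p12
                  + (p.N3 + p.N3 - p.N1 - p.N2) / (p13 * p23))
         + p.K * (l123 - 1.0) * (p.N1 + p.N2 + p.N3);
}

// Difference between the two forms, relative to max(1, |reference|).
double relative_difference(const StressInputs& p) {
    const double ref = stress_reference(p);
    const double opt = stress_two_pow(p);
    return std::fabs(ref - opt) / std::fmax(1.0, std::fabs(ref));
}
```

    Calling relative_difference() on a handful of representative inputs should give values near machine precision. The two forms round differently, so exact bitwise equality should not be expected.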
    

    Since the calls to pow() are by far the most costly operations here, it is worth reducing them as far as possible (the next most costly operation was the call to cbrt(), which we eliminated).

    If a happens to be an integer, the calls to pow() can be replaced by calls to cbrt() (plus integer powers); if athird is a half-integer, we can use sqrt() (plus integer powers). Furthermore, if l1==l2, l1==l3 or l2==l3, one or both calls to pow() can be eliminated. So it is worth treating these as special cases if they can realistically occur.
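
    The integer and half-integer cases could be sketched roughly as follows. The helper names ipow, pow_third_integer and pow_half_integer are hypothetical and not part of any library, and the arguments are assumed to be positive. Whether this actually beats pow() on a given platform is something to measure rather than assume.

```cpp
#include <cassert>
#include <cmath>

// Integer power by repeated squaring; negative exponents give the reciprocal.
double ipow(double x, int n) {
    const bool negative = n < 0;
    // Magnitude of n computed in unsigned arithmetic to avoid overflow.
    unsigned int m = negative ? 0u - static_cast<unsigned int>(n)
                              : static_cast<unsigned int>(n);
    double result = 1.0;
    while (m != 0u) {
        if (m & 1u) result *= x;
        x *= x;
        m >>= 1;
    }
    return negative ? 1.0 / result : result;
}

// x^(a/3) for integer a: one cbrt() followed by an integer power.
double pow_third_integer(double x, int a) {
    assert(x > 0.0);
    return ipow(std::cbrt(x), a);
}

// x^(k/2) for integer k, covering the case where athird is a half-integer
// (k = 2*athird): one sqrt() followed by an integer power.
double pow_half_integer(double x, int k) {
    assert(x > 0.0);
    return ipow(std::sqrt(x), k);
}
```

    For example, with a = 5 the call pow(l1/l2, athird) would become pow_third_integer(l1/l2, 5). For equal stretches, say l1==l2, the ratio l1/l2 is exactly 1, so the corresponding power is 1 and that call can be skipped altogether.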
