Why does an inline function have lower efficiency than an in-built function?

挽巷 2021-01-03 20:59

I was trying a question on arrays in InterviewBit. In this question I made an inline function returning the absolute value of an integer. But I was told that my algorithm wa

3 answers
  •  夕颜 (original poster)
    2021-01-03 21:19

    Your solution might arguably be "cleaner" by the textbook if you used the standard library version, but I think the evaluation is wrong. There is no truly good, justifiable reason for your code being rejected.

    This is one of those cases where someone is formally correct (by the textbook), but insists on knowing the only correct solution in a sheer stupid way rather than accepting an alternate solution and saying "...but this here would be best practice, you know".

    Technically, it's a correct, practical approach to say "use the standard library, that's what it is for, and it's likely optimized as much as can be", even though the "optimized as much as can be" part can, in some situations, very well be wrong due to constraints that the standard puts on certain algorithms and/or containers.

    Now, opinions, best practice, and religion aside. Factually, if you compare the two approaches...

    int main(int argc, char**)
    {
      40f360:       53                      push   %rbx
      40f361:       48 83 ec 20             sub    $0x20,%rsp
      40f365:       89 cb                   mov    %ecx,%ebx
      40f367:       e8 a4 be ff ff          callq  40b210 <__main>
    return std::abs(argc);
      40f36c:       89 da                   mov    %ebx,%edx
      40f36e:       89 d8                   mov    %ebx,%eax
      40f370:       c1 fa 1f                sar    $0x1f,%edx
      40f373:       31 d0                   xor    %edx,%eax
      40f375:       29 d0                   sub    %edx,%eax
    //}
    
    int main(int argc, char**)
    {
      40f360:       53                      push   %rbx
      40f361:       48 83 ec 20             sub    $0x20,%rsp
      40f365:       89 cb                   mov    %ecx,%ebx
      40f367:       e8 a4 be ff ff          callq  40b210 <__main>
    return (argc > 0) ? argc : -argc;
      40f36c:       89 da                   mov    %ebx,%edx
      40f36e:       89 d8                   mov    %ebx,%eax
      40f370:       c1 fa 1f                sar    $0x1f,%edx
      40f373:       31 d0                   xor    %edx,%eax
      40f375:       29 d0                   sub    %edx,%eax
    //}
    

    ... they result in exactly the same, identical instructions.
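    To make the `sar`/`xor`/`sub` sequence in the listings easier to read, here is a minimal sketch that writes the same bit hack out by hand next to the ternary one-liner. The names `abs_ternary` and `abs_bithack` are made up for illustration, and the shift trick assumes a 32-bit integer with an arithmetic right shift (what GCC and Clang do on x86-64, but only guaranteed by the language since C++20).

```cpp
#include <cstdint>

// The textbook one-liner, as in the listings above.
constexpr std::int32_t abs_ternary(std::int32_t v) noexcept {
    return (v >= 0) ? v : -v;
}

// Hypothetical hand-written form of the sar/xor/sub sequence.
// Assumption: >> on a negative signed value is an arithmetic shift.
constexpr std::int32_t abs_bithack(std::int32_t v) noexcept {
    // v >> 31 is 0 for v >= 0 and -1 (all bits set) for v < 0   (sar $0x1f)
    // v ^ mask leaves v alone or flips every bit                (xor)
    // subtracting mask then adds 0 or 1, i.e. finishes the
    // two's complement negation                                 (sub)
    return (v ^ (v >> 31)) - (v >> 31);
}
```

    Like `std::abs`, both versions are undefined for the most negative value, whose magnitude does not fit in the type.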

    But even if the compiler did use a compare followed by a conditional move (which it may do in more complicated "branching assignments" and which it will do for example in the case of min/max), that's maybe one CPU cycle or so slower than the bit hacks, so unless you do this several million times, the statement "not efficient" is kinda doubtful anyway.
    One cache miss, and you have a hundred times the penalty of a conditional move.
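    As a rough illustration of the min/max case mentioned above, this is the kind of "branching assignment" in question. The name `min_ternary` is hypothetical. Whether a given compiler actually emits a compare plus conditional move for it depends on the target and optimization level, so treat that as something to check in the disassembly rather than a guarantee.

```cpp
// Hypothetical helper: a ternary "branching assignment".
// On x86-64 an optimizing compiler will often lower this to
// cmp + cmov instead of a real branch, but that is not guaranteed.
constexpr int min_ternary(int a, int b) noexcept {
    return (b < a) ? b : a;
}
```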

    There are valid arguments for and against either approach, which I won't discuss at length. My point is, turning down the OP's solution as "totally wrong" because of such a petty, unimportant detail is rather narrow-minded.

    EDIT:

    (Fun trivia)

    I just tried, for fun and no profit, on my Linux Mint box which uses a somewhat older version of GCC (5.4 as compared to 7.1 above).

    Due to me including <cmath> without much of a thought (hey, a function like abs very clearly belongs to math, doesn't it!) rather than <cstdlib>, which hosts the integer overload, the result was, well... surprising. Calling the library function was much inferior to the single-expression wrapper.

    Now, in defense of the standard library, if you include <cstdlib>, then, again, the produced output is exactly identical in either case.

    For reference, the test code looked like:

    #ifdef DRY
      #include <cstdlib>  // declares the int overload of std::abs
      int main(int argc, char**)
      {
         return std::abs(argc);
      }
    #else
      int abs(int v) noexcept { return (v >= 0) ? v : -v; }
      int main(int argc, char**)
      {
         return abs(argc);
      }
    #endif
    

    ...resulting in

    4004f0: 89 fa                   mov    %edi,%edx
    4004f2: 89 f8                   mov    %edi,%eax
    4004f4: c1 fa 1f                sar    $0x1f,%edx
    4004f7: 31 d0                   xor    %edx,%eax
    4004f9: 29 d0                   sub    %edx,%eax
    4004fb: c3                      retq 
    

    Now, it is apparently quite easy to fall into the trap of unwittingly using the wrong standard library function (I demonstrated how easy it is myself!). And all that without the slightest warning from the compiler, such as "hey, you know, you're using a double overload on an integer value" (well, obviously there's no warning, it's a valid conversion).

    With that in mind, there may be yet another "justification" why the OP providing his own one-liner wasn't all that terribly bad and wrong. After all, he could have made that same mistake.
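    One way to guard against the silent `double` overload is to let the compiler check the result type. This is only a sketch: the wrapper name `checked_abs` is hypothetical, and it assumes `<cstdlib>` is the header in use, since that is where the integer overloads of `std::abs` are declared.

```cpp
#include <cstdlib>      // std::abs(int), std::abs(long), std::abs(long long)
#include <type_traits>  // std::is_same

// Hypothetical wrapper: refuses to compile if std::abs applied to an int
// resolves to anything other than an int-returning overload.
inline int checked_abs(int v) noexcept {
    static_assert(std::is_same<decltype(std::abs(v)), int>::value,
                  "std::abs(int) did not resolve to the integer overload");
    return std::abs(v);
}
```

    If a floating-point result is what you actually want, `std::fabs` from `<cmath>` says so explicitly.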
