Why does an inline function have lower efficiency than an in-built function?


挽巷 2021-01-03 20:59

I was trying a question on arrays in InterviewBit. In this question I made an inline function returning the absolute value of an integer. But I was told that my algorithm wa

3 answers

夕颜 (thread starter)

2021-01-03 21:19
Your solution might arguably be "cleaner" by the textbook if you used the standard library version, but I think the evaluation is wrong. There is no truly good, justifiable reason for your code being rejected.

This is one of those cases where someone is formally correct (by the textbook), but insists on knowing the only correct solution in a sheer stupid way rather than accepting an alternate solution and saying "...but this here would be best practice, you know".

Technically, it's a correct, practical approach to say "use the standard library, that's what it is for, and it's likely optimized as much as can be". Even though the "optimized as much as can be" part can, in some situations, very well be wrong due to constraints that the standard puts on certain algorithms and/or containers.

Now, opinions, best practice, and religion aside. Factually, if you compare the two approaches...
```
int main(int argc, char**)
{
  40f360:       53                      push   %rbx
  40f361:       48 83 ec 20             sub    $0x20,%rsp
  40f365:       89 cb                   mov    %ecx,%ebx
  40f367:       e8 a4 be ff ff          callq  40b210 <__main>
return std::abs(argc);
  40f36c:       89 da                   mov    %ebx,%edx
  40f36e:       89 d8                   mov    %ebx,%eax
  40f370:       c1 fa 1f                sar    $0x1f,%edx
  40f373:       31 d0                   xor    %edx,%eax
  40f375:       29 d0                   sub    %edx,%eax
//}

int main(int argc, char**)
{
  40f360:       53                      push   %rbx
  40f361:       48 83 ec 20             sub    $0x20,%rsp
  40f365:       89 cb                   mov    %ecx,%ebx
  40f367:       e8 a4 be ff ff          callq  40b210 <__main>
return (argc > 0) ? argc : -argc;
  40f36c:       89 da                   mov    %ebx,%edx
  40f36e:       89 d8                   mov    %ebx,%eax
  40f370:       c1 fa 1f                sar    $0x1f,%edx
  40f373:       31 d0                   xor    %edx,%eax
  40f375:       29 d0                   sub    %edx,%eax
//}
```
... they result in exactly the same, identical instructions.
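For readers who don't read disassembly daily: the `sar` / `xor` / `sub` sequence is the classic branchless absolute value. Below is a minimal sketch of what it computes, written out by hand next to the ternary form. The function names are made up for illustration, and the bit hack assumes a 32-bit `int` and an arithmetic right shift on negative values (guaranteed since C++20, implementation-defined but typical before that).

```cpp
#include <cassert>
#include <cstdlib>

// Ternary form, equivalent to the hand-written one-liner.
int abs_ternary(int v) { return (v >= 0) ? v : -v; }

// Hand-written equivalent of the sar/xor/sub sequence in the listing.
// Assumption: 32-bit int, arithmetic right shift for negative values.
int abs_bithack(int v) {
    const int mask = v >> 31;   // 0 for v >= 0, -1 (all bits set) for v < 0
    return (v ^ mask) - mask;   // v unchanged, or (~v) + 1, which is -v
}

// Checks that both forms agree with std::abs on a small range.
// INT_MIN is left out on purpose: its absolute value is not representable.
void check_abs_variants() {
    for (int v = -1000; v <= 1000; ++v) {
        assert(abs_ternary(v) == std::abs(v));
        assert(abs_bithack(v) == std::abs(v));
    }
}
```

There is normally no reason to write the bit hack yourself; as the listings show, the compiler already produces it from either source form.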

But even if the compiler did use a compare followed by a conditional move (which it may do in more complicated "branching assignments" and which it will do for example in the case of min/max), that's maybe one CPU cycle or so slower than the bit hacks, so unless you do this several million times, the statement "not efficient" is kinda doubtful anyway.
One cache miss, and you have a hundred times the penalty of a conditional move.
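If you want to see for yourself whether the difference matters, a rough timing harness along the following lines is enough. This is only a sketch: the helper names are hypothetical, the timings will be noisy, and an optimizing compiler may vectorize or otherwise transform both loops, so treat the numbers as an indication at best.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper: fills a vector with values cycling through -1000..1000.
std::vector<int> make_data(std::size_t n) {
    std::vector<int> v(n);
    for (std::size_t i = 0; i < n; ++i) v[i] = static_cast<int>(i % 2001) - 1000;
    return v;
}

// Hypothetical helper: sums f(x) over the data and stores the elapsed
// time in microseconds in `micros`.
template <typename F>
std::int64_t timed_sum(const std::vector<int>& data, F f, long long& micros) {
    const auto start = std::chrono::steady_clock::now();
    std::int64_t sum = 0;
    for (int x : data) sum += f(x);
    const auto stop = std::chrono::steady_clock::now();
    micros = std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
    return sum;
}

// Runs the library call and the one-liner over the same data.
// Returns true if both produce the same checksum.
bool compare_abs_variants(std::size_t n, long long& lib_us, long long& own_us) {
    const std::vector<int> data = make_data(n);
    const std::int64_t a = timed_sum(data, [](int v) { return std::abs(v); }, lib_us);
    const std::int64_t b = timed_sum(data, [](int v) { return v >= 0 ? v : -v; }, own_us);
    return a == b;
}
```

Call `compare_abs_variants` with a few million elements and compare `lib_us` with `own_us`; returning the checksum keeps the compiler from discarding the loops entirely.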

There are valid arguments for and against either approach, which I won't discuss at length. My point is, turning down the OP's solution as "totally wrong" because of such a petty, unimportant detail is rather narrow-minded.

EDIT:

(Fun trivia)

I just tried, for fun and no profit, on my Linux Mint box which uses a somewhat older version of GCC (5.4 as compared to 7.1 above).

Due to me including `<cmath>` without much of a thought (hey, a function like abs very clearly belongs to math, doesn't it!) rather than `<cstdlib>`, which hosts the integer overload, the result was, well... surprising. Calling the library function was much inferior to the single-expression wrapper.

Now, in defense of the standard library, if you include `<cstdlib>`, then, again, the produced output is exactly identical in either case.

For reference, the test code looked like:
```
#ifdef DRY
  #include <cstdlib>
  int main(int argc, char**)
  {
     return std::abs(argc);
  }
#else
  int abs(int v) noexcept { return (v >= 0) ? v : -v; }
  int main(int argc, char**)
  {
     return abs(argc);
  }
#endif
```
...resulting in
```
4004f0: 89 fa                   mov    %edi,%edx
4004f2: 89 f8                   mov    %edi,%eax
4004f4: c1 fa 1f                sar    $0x1f,%edx
4004f7: 31 d0                   xor    %edx,%eax
4004f9: 29 d0                   sub    %edx,%eax
4004fb: c3                      retq 
```
Now, it is apparently quite easy to fall into the trap of unwittingly using the wrong standard library function (I demonstrated how easy it is myself!). And all that without the slightest warning from the compiler, such as "hey, you know, you're using a double overload on an integer value" (well, obviously there's no warning, it's a valid conversion).

With that in mind, there may be yet another "justification" why the OP providing his own one-liner wasn't all that terribly bad and wrong. After all, he could have made that same mistake.
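
One way to guard against that trap, sketched here as a suggestion rather than anything from the original test: ask the compiler which overload it actually picked. The wrapper name `checked_abs` is hypothetical. With `<cstdlib>` included, `std::abs(int)` returns `int`, so the assertion passes; if some header combination made the call resolve to a floating-point overload, the build would fail instead of silently converting.

```cpp
#include <cstdlib>      // declares the integer overloads of std::abs
#include <type_traits>

// Compile-time guard: fails the build if std::abs on an int does not
// resolve to an overload returning int (e.g. a double overload was chosen).
static_assert(std::is_same<decltype(std::abs(0)), int>::value,
              "std::abs(int) did not resolve to the int overload; include <cstdlib>");

// Hypothetical wrapper, relying on the guard above.
int checked_abs(int v) { return std::abs(v); }
```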