compiler-optimization

Get GCC To Use Carry Logic For Arbitrary Precision Arithmetic Without Inline Assembly?

喜夏-厌秋 submitted on 2019-12-03 06:49:08
When working with arbitrary-precision arithmetic (e.g. 512-bit integers), is there any way to get GCC to use ADC and similar instructions without resorting to inline assembly? A first glance at GMP's source code shows that they simply provide assembly implementations for every supported platform. Here is the test code I wrote, which adds two 128-bit numbers from the command line and prints the result (inspired by mini-gmp's add_n):

    #include <stdio.h>
    #include <stdint.h>
    #include <stdlib.h>

    int main (int argc, char **argv)
    {
        uint32_t a[4];
        uint32_t b[4];
        uint32_t c[4];
        uint32_t carry = 0;
        for (int i = 0; …
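A portable way to write the limb loop without inline assembly is GCC/Clang's __builtin_add_overflow; whether the compiler actually lowers the carry chain to ADC depends on compiler version, target, and flags, so the sketch below (my addition, not code from the post) is only a starting point to inspect on Compiler Explorer.

```cpp
// Sketch, not from the original post: carry propagation expressed with
// __builtin_add_overflow (available in GCC >= 5 and Clang).  Check the
// generated code to see whether your compiler turns this into an ADC chain.
#include <cstdint>

uint32_t add_n(uint32_t *r, const uint32_t *a, const uint32_t *b, int n)
{
    uint32_t carry = 0;
    for (int i = 0; i < n; ++i) {
        uint32_t t;
        bool c1 = __builtin_add_overflow(a[i], b[i], &t);
        bool c2 = __builtin_add_overflow(t, carry, &r[i]);
        carry = c1 | c2;   // at most one of the two partial adds can carry
    }
    return carry;          // carry out of the most significant limb
}
```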

Does C/C++ offer any guarantee on minimal execution time?

╄→尐↘猪︶ㄣ submitted on 2019-12-03 06:29:09
Question: Why do compilers seem to be polite toward loops that do nothing, and not eliminate them? Does the C standard require loops to take some time? For example, the following code:

    void foo(void) {
        while(1) {
            for (int k = 0; k < 1000000000; ++k);
            printf("Foo\n");
        }
    }

runs slower than this one:

    void foo(void) {
        while(1) {
            for (int k = 0; k < 1000; ++k);
            printf("Foo\n");
        }
    }

even at the -O3 optimization level. I would expect removing empty loops to be allowed, and thus both versions to run at the same speed. Is "time …
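If the intent is a busy-wait that the optimizer is not allowed to drop, the usual trick is to give the loop an observable side effect; a minimal sketch (my addition, not from the original question) uses a volatile counter, since accesses to a volatile object must be preserved:

```cpp
// Sketch: a delay loop the compiler may not remove, because every ++k is a
// volatile read-modify-write and therefore an observable side effect.
#include <cstdio>

void foo_delay(void) {
    while (1) {
        for (volatile long k = 0; k < 1000000000L; ++k)
            ;                     // kept even at -O3
        std::printf("Foo\n");
    }
}
```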

Why doesn't GCC optimize out deletion of null pointers in C++?

℡╲_俬逩灬. submitted on 2019-12-03 06:28:37
Question: Consider a simple program: int main() { int* ptr = nullptr; delete ptr; } With GCC (7.2) there is a call instruction to operator delete in the resulting program; with Clang and Intel compilers there is no such instruction, and the null-pointer deletion is completely optimized out (-O2 in all cases). You can test it here: https://godbolt.org/g/JmdoJi. I wonder whether such an optimization can somehow be turned on with GCC? (My broader motivation stems from a problem of custom swap vs …
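Deleting a null pointer is already guaranteed to be a no-op, so a common workaround (my illustration, not from the original post) is to make the check explicit: when the pointer is statically known to be null the branch folds away entirely, and otherwise the call to the replaceable operator delete is skipped at run time.

```cpp
// Sketch: the explicit check is redundant by the language rules, but it gives
// GCC a branch it can constant-fold instead of a call it feels obliged to keep.
struct Widget { int v; };

void destroy(Widget* p) {
    if (p)          // folds to nothing when p is known to be null
        delete p;
}
```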

Why doesn't 'd /= d' throw a division by zero exception when d == 0?

霸气de小男生 submitted on 2019-12-03 06:26:14
Question: I don't quite understand why I don't get a division-by-zero exception here: int d = 0; d /= d; I expected a division-by-zero exception, but instead I get d == 1. Why doesn't d /= d throw a division-by-zero exception when d == 0?

Answer 1: C++ does not have a "Division by Zero" exception to catch. The behavior you're observing is the result of compiler optimizations:

- The compiler assumes undefined behavior doesn't happen.
- Division by zero in C++ is undefined behavior.
- Therefore, code which can cause a …
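Because integer division by zero is undefined behavior, the compiler is entitled to assume the divisor is nonzero and fold d / d to 1 at compile time; if a defined outcome is needed, the divisor has to be checked explicitly. A minimal sketch (my addition, not from the original thread):

```cpp
// Sketch: report the error instead of performing an undefined division.
#include <optional>

std::optional<int> safe_div(int a, int b) {
    if (b == 0)
        return std::nullopt;   // caller decides how to handle the error
    return a / b;
}
```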

Does const allow for (theoretical) optimization here?

泪湿孤枕 submitted on 2019-12-03 06:16:48
Consider this snippet:

    void foo(const int&);
    int bar();

    int test1() {
        int x = bar();
        int y = x;
        foo(x);
        return x - y;
    }

    int test2() {
        const int x = bar();
        const int y = x;
        foo(x);
        return x - y;
    }

In my understanding of the standard, neither x nor y may be changed by foo in test2, whereas both could be changed by foo in test1 (e.g. with a const_cast that removes const from the const int&, because the referenced objects aren't actually const in test1). Now, neither gcc nor clang nor MSVC seems to optimize test2 to foo(bar()); return 0;, and I can understand that they do not want to …
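For concreteness, here is a possible definition of foo (my illustration, not from the original post) that is legal when called from test1 but has undefined behavior when called from test2, because in test2 the referenced object really is const; this is the loophole that would, in principle, let a compiler rewrite test2 as foo(bar()); return 0;.

```cpp
// Sketch: modifying x through the const reference is fine for test1's
// non-const object, but undefined behavior for test2's const object.
void foo(const int& r) {
    const_cast<int&>(r) += 1;
}
```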

Are compilers allowed to optimize-out exception throws?

核能气质少年 submitted on 2019-12-03 05:49:59
We have been discussing this topic at work today, and none of us could come up with a definitive answer to the question. Consider the following situation:

    int foo() {
        int err;
        err = some_call(1);
        if (err != 0) return err;
        err = some_call(2);
        if (err != 0) return err;
        err = some_call(3);
        if (err != 0) return err;
        err = some_call(4);
        if (err != 0) return err;
        bar();
        return err;
    }

There is a lot of code repetition. Obviously this could be factored out with a macro, but sadly not with a template (because of the return statement), or at least not directly. Now the question is: if we were to replace …
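The excerpt is cut off here; presumably it goes on to ask what happens if the early returns are replaced by exceptions. A sketch of such a rewrite (my illustration, with a hypothetical helper check(), not text from the original post) might look like this, and the question is whether the optimizer can turn the throw/catch back into plain early returns:

```cpp
// Hypothetical exception-based version of foo() for comparison.
int some_call(int);
void bar();

static void check(int err) {
    if (err != 0) throw err;     // abort the sequence on the first failure
}

int foo() {
    try {
        check(some_call(1));
        check(some_call(2));
        check(some_call(3));
        check(some_call(4));
        bar();
        return 0;
    } catch (int err) {
        return err;              // convert the exception back to an error code
    }
}
```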

Rewriting as a practical optimization technique in GHC: Is it really needed?

心不动则不痛 submitted on 2019-12-03 05:39:48
Question: I was reading the paper by Simon Peyton Jones et al. titled "Playing by the Rules: Rewriting as a practical optimization technique in GHC". In the second section, "The basic idea", they write: Consider the familiar map function, that applies a function to each element of a list. Written in Haskell, map looks like this:

    map f []     = []
    map f (x:xs) = f x : map f xs

Now suppose that the compiler encounters the following call of map:

    map f (map g xs)

We know that this expression is …
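For reference, the fusion the authors have in mind is expressed in GHC with a RULES pragma; the sketch below reproduces the rule from memory, so treat the exact wording as an assumption and see the paper or the GHC user's guide for the authoritative form.

```haskell
module MapFusion where

-- A rewrite rule telling GHC that two traversals of a list can be fused into
-- one.  GHC applies RULES during simplification when compiling with -O.
{-# RULES
      "map/map"  forall f g xs.  map f (map g xs) = map (f . g) xs
  #-}
```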

Will C++ linker automatically inline functions (without “inline” keyword, without implementation in header)?

感情迁移 submitted on 2019-12-03 05:19:14
Will the C++ linker automatically inline "pass-through" functions that are NOT defined in a header and NOT explicitly requested to be inlined via the inline keyword? For example, the following happens so often, and should always benefit from inlining, that it seems every compiler vendor should handle it "automatically" through the linker (in those cases where it is possible):

    //FILE: MyA.hpp
    class MyA {
    public:
        int foo(void) const;
    };

    //FILE: MyB.hpp
    class MyB {
    private:
        MyA my_a_;
    public:
        int foo(void) const;
    };

    //FILE: MyB.cpp
    // PLEASE SAY THIS …
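In current toolchains this kind of cross-translation-unit inlining is done not by the plain linker but by link-time optimization (-flto in GCC and Clang). The sketch below is my guess at the truncated MyB.cpp, a plain pass-through body, together with build commands that would let the optimizer inline it at link time:

```cpp
// MyB.cpp -- hypothetical pass-through implementation; the original file is
// truncated above, so this body is an assumption.
//
// Build with LTO so the definition of MyA::foo is visible when MyB::foo is
// optimized:
//   g++ -O2 -flto -c MyA.cpp MyB.cpp
//   g++ -O2 -flto MyA.o MyB.o -o app
#include "MyB.hpp"

int MyB::foo(void) const {
    return my_a_.foo();   // candidate for cross-TU inlining under -flto
}
```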

Why doesn't a compiler optimize floating-point *2 into an exponent increment?

房东的猫 submitted on 2019-12-03 05:13:56
I've often noticed gcc converting multiplications into shifts in the executable. Something similar might happen when multiplying an int and a float. For example, 2 * f might simply increment the exponent of f by 1, saving some cycles. Do compilers in general do this, perhaps when asked to (e.g. via -ffast-math)? Are compilers generally smart enough for this, or do I need to do it myself using the scalb*() or ldexp()/frexp() function family?

Answer 1: "For example, 2 * f might simply increment the exponent of f by 1, saving some cycles." This simply isn't true. First you have …
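For comparison, here is what the two spellings look like in code (my addition, not from the original post): multiplying by 2.0f is already a single hardware multiply and is exact, while std::ldexp scales by a power of two through the exponent and typically compiles to a library call unless the compiler folds it.

```cpp
// Sketch: both functions return the same value; neither needs -ffast-math.
#include <cmath>

float twice_mul(float f)   { return 2.0f * f; }           // one multiply instruction
float twice_ldexp(float f) { return std::ldexp(f, 1); }    // f * 2^1 via the exponent
```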

Why isn't this unused variable optimised away?

扶醉桌前 submitted on 2019-12-03 04:13:47
I played around with Godbolt's Compiler Explorer. I wanted to see how good certain optimizations are. My minimal working example is:

    #include <vector>

    int foo() {
        std::vector<int> v {1, 2, 3, 4, 5};
        return v[4];
    }

The generated assembly (clang 5.0.0, -O2 -std=c++14):

    foo():                                  # @foo()
        push rax
        mov edi, 20
        call operator new(unsigned long)
        mov rdi, rax
        call operator delete(void*)
        mov eax, 5
        pop rcx
        ret

As one can see, clang knows the answer but does quite a lot of work before returning. It seems that the vector is even created, given the calls to operator new/delete. Can anyone explain to me …
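One way to see the difference the heap allocation makes (my addition, not from the original post) is to use std::array instead: the data stays on the stack, there is no operator new/delete for the optimizer to reason about, and clang and gcc at -O2 typically reduce the function to a single constant return.

```cpp
// Sketch: with std::array the whole function usually folds to "mov eax, 5; ret".
#include <array>

int foo_arr() {
    std::array<int, 5> a {1, 2, 3, 4, 5};
    return a[4];
}
```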