compiler-optimization

Get GCC To Use Carry Logic For Arbitrary Precision Arithmetic Without Inline Assembly?

喜夏-厌秋 submitted on 2019-12-03 06:49:08
When working with arbitrary-precision arithmetic (e.g. 512-bit integers), is there any way to get GCC to use ADC and similar instructions without resorting to inline assembly? A first glance at GMP's source code shows that they simply provide assembly implementations for every supported platform. Here is the test code I wrote, which adds two 128-bit numbers from the command line and prints the result (inspired by mini-gmp's add_n):

    #include <stdio.h>
    #include <stdint.h>
    #include <stdlib.h>

    int main (int argc, char **argv)
    {
        uint32_t a[4];
        uint32_t b[4];
        uint32_t c[4];
        uint32_t carry = 0;
        for (int i = 0; …
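A portable way to write the limb loop without inline assembly is GCC/Clang's __builtin_add_overflow; whether the compiler actually lowers the carry chain to ADC depends on compiler version, target, and flags, so the sketch below (my addition, not code from the post) is only a starting point to inspect on Compiler Explorer.

```cpp
// Sketch, not from the original post: carry propagation expressed with
// __builtin_add_overflow (available in GCC >= 5 and Clang).  Check the
// generated code to see whether your compiler turns this into an ADC chain.
#include <cstdint>

uint32_t add_n(uint32_t *r, const uint32_t *a, const uint32_t *b, int n)
{
    uint32_t carry = 0;
    for (int i = 0; i < n; ++i) {
        uint32_t t;
        bool c1 = __builtin_add_overflow(a[i], b[i], &t);
        bool c2 = __builtin_add_overflow(t, carry, &r[i]);
        carry = c1 | c2;   // at most one of the two partial adds can carry
    }
    return carry;          // carry out of the most significant limb
}
```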

Does C/C++ offer any guarantee on minimal execution time?

╄→尐↘猪︶ㄣ submitted on 2019-12-03 06:29:09
Question: Why do compilers seem to be polite toward loops that do nothing, and not eliminate them? Does the C standard require loops to take some time? For example, the following code:

    void foo(void) {
        while(1) {
            for (int k = 0; k < 1000000000; ++k);
            printf("Foo\n");
        }
    }

runs slower than this one:

    void foo(void) {
        while(1) {
            for (int k = 0; k < 1000; ++k);
            printf("Foo\n");
        }
    }

even at the -O3 optimization level. I would expect removing empty loops to be allowed, and thus both versions to run at the same speed. Is "time …
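If the intent is a busy-wait that the optimizer is not allowed to drop, the usual trick is to give the loop an observable side effect; a minimal sketch (my addition, not from the original question) uses a volatile counter, since accesses to a volatile object must be preserved:

```cpp
// Sketch: a delay loop the compiler may not remove, because every ++k is a
// volatile read-modify-write and therefore an observable side effect.
#include <cstdio>

void foo_delay(void) {
    while (1) {
        for (volatile long k = 0; k < 1000000000L; ++k)
            ;                     // kept even at -O3
        std::printf("Foo\n");
    }
}
```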

Why doesn't GCC optimize out deletion of null pointers in C++?

℡╲_俬逩灬. submitted on 2019-12-03 06:28:37
Question: Consider a simple program: int main() { int* ptr = nullptr; delete ptr; } With GCC (7.2) there is a call instruction to operator delete in the resulting program; with Clang and Intel compilers there is no such instruction, and the null-pointer deletion is completely optimized out (-O2 in all cases). You can test it here: https://godbolt.org/g/JmdoJi. I wonder whether such an optimization can somehow be turned on with GCC? (My broader motivation stems from a problem of custom swap vs …
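Deleting a null pointer is already guaranteed to be a no-op, so a common workaround (my illustration, not from the original post) is to make the check explicit: when the pointer is statically known to be null the branch folds away entirely, and otherwise the call to the replaceable operator delete is skipped at run time.

```cpp
// Sketch: the explicit check is redundant by the language rules, but it gives
// GCC a branch it can constant-fold instead of a call it feels obliged to keep.
struct Widget { int v; };

void destroy(Widget* p) {
    if (p)          // folds to nothing when p is known to be null
        delete p;
}
```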

Why doesn't 'd /= d' throw a division by zero exception when d == 0?

霸气de小男生 submitted on 2019-12-03 06:26:14
Question: I don't quite understand why I don't get a division-by-zero exception here: int d = 0; d /= d; I expected a division-by-zero exception, but instead I get d == 1. Why doesn't d /= d throw a division-by-zero exception when d == 0?

Answer 1: C++ does not have a "Division by Zero" exception to catch. The behavior you're observing is the result of compiler optimizations:

- The compiler assumes undefined behavior doesn't happen.
- Division by zero in C++ is undefined behavior.
- Therefore, code which can cause a …
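Because integer division by zero is undefined behavior, the compiler is entitled to assume the divisor is nonzero and fold d / d to 1 at compile time; if a defined outcome is needed, the divisor has to be checked explicitly. A minimal sketch (my addition, not from the original thread):

```cpp
// Sketch: report the error instead of performing an undefined division.
#include <optional>

std::optional<int> safe_div(int a, int b) {
    if (b == 0)
        return std::nullopt;   // caller decides how to handle the error
    return a / b;
}
```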

Does const allow for (theoretical) optimization here?

泪湿孤枕 submitted on 2019-12-03 06:16:48
Consider this snippet:

    void foo(const int&);
    int bar();

    int test1() {
        int x = bar();
        int y = x;
        foo(x);
        return x - y;
    }

    int test2() {
        const int x = bar();
        const int y = x;
        foo(x);
        return x - y;
    }

In my understanding of the standard, neither x nor y may be changed by foo in test2, whereas both could be changed by foo in test1 (e.g. with a const_cast that removes const from the const int&, because the referenced objects aren't actually const in test1). Now, neither gcc nor clang nor MSVC seems to optimize test2 to foo(bar()); return 0;, and I can understand that they do not want to …
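For concreteness, here is a possible definition of foo (my illustration, not from the original post) that is legal when called from test1 but has undefined behavior when called from test2, because in test2 the referenced object really is const; this is the loophole that would, in principle, let a compiler rewrite test2 as foo(bar()); return 0;.

```cpp
// Sketch: modifying x through the const reference is fine for test1's
// non-const object, but undefined behavior for test2's const object.
void foo(const int& r) {
    const_cast<int&>(r) += 1;
}
```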

Are compilers allowed to optimize-out exception throws?

核能气质少年 submitted on 2019-12-03 05:49:59
We have been discussing this topic at work today, and none of us could come up with a definitive answer to the question. Consider the following situation:

    int foo() {
        int err;
        err = some_call(1);
        if (err != 0) return err;
        err = some_call(2);
        if (err != 0) return err;
        err = some_call(3);
        if (err != 0) return err;
        err = some_call(4);
        if (err != 0) return err;
        bar();
        return err;
    }

There is a lot of code repetition. Obviously this could be factored out with a macro, but sadly not with a template (because of the return statement), or at least not directly. Now the question is: if we were to replace …
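The excerpt is cut off here; presumably it goes on to ask what happens if the early returns are replaced by exceptions. A sketch of such a rewrite (my illustration, with a hypothetical helper check(), not text from the original post) might look like this, and the question is whether the optimizer can turn the throw/catch back into plain early returns:

```cpp
// Hypothetical exception-based version of foo() for comparison.
int some_call(int);
void bar();

static void check(int err) {
    if (err != 0) throw err;     // abort the sequence on the first failure
}

int foo() {
    try {
        check(some_call(1));
        check(some_call(2));
        check(some_call(3));
        check(some_call(4));
        bar();
        return 0;
    } catch (int err) {
        return err;              // convert the exception back to an error code
    }
}
```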

Rewriting as a practical optimization technique in GHC: Is it really needed?

心不动则不痛 submitted on 2019-12-03 05:39:48
Question: I was reading the paper by Simon Peyton Jones et al. titled "Playing by the Rules: Rewriting as a practical optimization technique in GHC". In the second section, "The basic idea", they write: Consider the familiar map function, that applies a function to each element of a list. Written in Haskell, map looks like this:

    map f []     = []
    map f (x:xs) = f x : map f xs

Now suppose that the compiler encounters the following call of map:

    map f (map g xs)

We know that this expression is …
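For reference, the fusion the authors have in mind is expressed in GHC with a RULES pragma; the sketch below reproduces the rule from memory, so treat the exact wording as an assumption and see the paper or the GHC user's guide for the authoritative form.

```haskell
module MapFusion where

-- A rewrite rule telling GHC that two traversals of a list can be fused into
-- one.  GHC applies RULES during simplification when compiling with -O.
{-# RULES
      "map/map"  forall f g xs.  map f (map g xs) = map (f . g) xs
  #-}
```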

Will C++ linker automatically inline functions (without “inline” keyword, without implementation in header)?

感情迁移 submitted on 2019-12-03 05:19:14
Will the C++ linker automatically inline "pass-through" functions that are NOT defined in a header and NOT explicitly requested to be inlined via the inline keyword? For example, the following happens so often, and should always benefit from inlining, that it seems every compiler vendor should handle it "automatically" through the linker (in those cases where it is possible):

    //FILE: MyA.hpp
    class MyA {
    public:
        int foo(void) const;
    };

    //FILE: MyB.hpp
    class MyB {
    private:
        MyA my_a_;
    public:
        int foo(void) const;
    };

    //FILE: MyB.cpp
    // PLEASE SAY THIS …
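In current toolchains this kind of cross-translation-unit inlining is done not by the plain linker but by link-time optimization (-flto in GCC and Clang). The sketch below is my guess at the truncated MyB.cpp, a plain pass-through body, together with build commands that would let the optimizer inline it at link time:

```cpp
// MyB.cpp -- hypothetical pass-through implementation; the original file is
// truncated above, so this body is an assumption.
//
// Build with LTO so the definition of MyA::foo is visible when MyB::foo is
// optimized:
//   g++ -O2 -flto -c MyA.cpp MyB.cpp
//   g++ -O2 -flto MyA.o MyB.o -o app
#include "MyB.hpp"

int MyB::foo(void) const {
    return my_a_.foo();   // candidate for cross-TU inlining under -flto
}
```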

Why doesn't a compiler optimize floating-point *2 into an exponent increment?

房东的猫 submitted on 2019-12-03 05:13:56
I've often noticed gcc converting multiplications into shifts in the executable. Something similar might happen when multiplying an int and a float. For example, 2 * f might simply increment the exponent of f by 1, saving some cycles. Do compilers in general do this, perhaps when asked to (e.g. via -ffast-math)? Are compilers generally smart enough for this, or do I need to do it myself using the scalb*() or ldexp()/frexp() function family?

Answer 1: "For example, 2 * f might simply increment the exponent of f by 1, saving some cycles." This simply isn't true. First you have …
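For comparison, here is what the two spellings look like in code (my addition, not from the original post): multiplying by 2.0f is already a single hardware multiply and is exact, while std::ldexp scales by a power of two through the exponent and typically compiles to a library call unless the compiler folds it.

```cpp
// Sketch: both functions return the same value; neither needs -ffast-math.
#include <cmath>

float twice_mul(float f)   { return 2.0f * f; }           // one multiply instruction
float twice_ldexp(float f) { return std::ldexp(f, 1); }    // f * 2^1 via the exponent
```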

Why isn't this unused variable optimised away?

扶醉桌前 submitted on 2019-12-03 04:13:47
I played around with Godbolt's Compiler Explorer. I wanted to see how good certain optimizations are. My minimal working example is:

    #include <vector>

    int foo() {
        std::vector<int> v {1, 2, 3, 4, 5};
        return v[4];
    }

The generated assembly (clang 5.0.0, -O2 -std=c++14):

    foo():                                  # @foo()
        push rax
        mov edi, 20
        call operator new(unsigned long)
        mov rdi, rax
        call operator delete(void*)
        mov eax, 5
        pop rcx
        ret

As one can see, clang knows the answer but does quite a lot of work before returning. It seems that the vector is even created, given the calls to operator new/delete. Can anyone explain to me …
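One way to see the difference the heap allocation makes (my addition, not from the original post) is to use std::array instead: the data stays on the stack, there is no operator new/delete for the optimizer to reason about, and clang and gcc at -O2 typically reduce the function to a single constant return.

```cpp
// Sketch: with std::array the whole function usually folds to "mov eax, 5; ret".
#include <array>

int foo_arr() {
    std::array<int, 5> a {1, 2, 3, 4, 5};
    return a[4];
}
```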