compiler-optimization

Is there a (Linux) g++ equivalent to the /fp:precise and /fp:fast flags used in Visual Studio?

混江龙づ霸主 submitted on 2019-12-18 04:02:57
Question: Background: Many years ago, I inherited a codebase that was using the Visual Studio (VC++) flag '/fp:fast' to produce faster code in a particular calculation-heavy library. Unfortunately, '/fp:fast' produced results that were slightly different from those of the same library under a different compiler (Borland C++). As we needed to produce exactly the same results, I switched to '/fp:precise', which worked fine, and everything has been peachy ever since. However, now I'm compiling the same library with
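
A minimal sketch of how the two models are usually compared on g++, using a toy summation in a hypothetical sum.cpp; GCC's default floating-point model is the closest analogue of /fp:precise, while -ffast-math is the usual analogue of /fp:fast:

    // Build roughly like /fp:precise:  g++ -O2 sum.cpp          (GCC's default model)
    // Build roughly like /fp:fast:     g++ -O2 -ffast-math sum.cpp
    // On 32-bit x86, adding -msse2 -mfpmath=sse also avoids x87 excess precision.
    #include <cstdio>

    int main() {
        double sum = 0.0;
        for (int i = 1; i <= 1000000; ++i)
            sum += 1.0 / i;              // -ffast-math may reassociate or vectorize this loop
        std::printf("%.17g\n", sum);     // the last digits can differ between the two builds
    }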

Default variables' values vs initialization with default

血红的双手。 submitted on 2019-12-18 03:56:52
Question: We all know that, according to JLS7 §4.12.5, every instance variable is initialized with a default value. E.g. (1):

    public class Test {
        private Integer a; // == null
        private int b;     // == 0
        private boolean c; // == false
    }

But I always thought that such a class implementation (2):

    public class Test {
        private Integer a = null;
        private int b = 0;
        private boolean c = false;
    }

is absolutely equal to example (1). I expected that a sophisticated Java compiler would see that all these initialization values in
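
One way to check the asker's expectation directly, sketched below, is to compile each variant and compare the constructor bytecode with javap; the class name Test is taken from the question:

    // Compile and inspect:  javac Test.java && javap -c Test
    // Comparing the disassembly of variants (1) and (2) shows whether the
    // explicit default-value initializers add putfield instructions in <init>.
    public class Test {
        private Integer a;          // implicitly null per the JLS defaults
        private int b;              // implicitly 0
        private boolean c;          // implicitly false

        // Variant (2) to compare against:
        // private Integer a = null;
        // private int b = 0;
        // private boolean c = false;
    }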

Optimization of raw new[]/delete[] vs std::vector

心不动则不痛 submitted on 2019-12-18 03:32:15
Question: Let's mess around with very basic dynamically allocated memory. We take a vector of 3 elements, set its elements, and return the sum of the vector. In the first test case I used a raw pointer with new[]/delete[]. In the second I used std::vector:

    #include <vector>

    int main() {
        //int *v = new int[3];          // (1)
        auto v = std::vector<int>(3);   // (2)
        for (int i = 0; i < 3; ++i)
            v[i] = i + 1;
        int s = 0;
        for (int i = 0; i < 3; ++i)
            s += v[i];
        //delete[] v;                   // (1)
        return s;
    }

Assembly of (1) ( new[] / delete[
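
A sketch of the two variants split into separate functions, which makes the codegen easier to diff (e.g. with g++ -O2 -S -o - test.cpp, or on Compiler Explorer); the outcomes noted in the comments are typical, not guaranteed:

    #include <vector>

    int sum_raw() {
        int *v = new int[3];            // (1) raw allocation
        for (int i = 0; i < 3; ++i) v[i] = i + 1;
        int s = 0;
        for (int i = 0; i < 3; ++i) s += v[i];
        delete[] v;
        return s;                       // optimizers commonly fold this whole body to "return 6"
    }

    int sum_vec() {
        std::vector<int> v(3);          // (2) zero-initializes first, then assigns
        for (int i = 0; i < 3; ++i) v[i] = i + 1;
        int s = 0;
        for (int i = 0; i < 3; ++i) s += v[i];
        return s;                       // the allocation inside std::vector may survive optimization
    }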

GCC -Wuninitialized / -Wmaybe-uninitialized issues

こ雲淡風輕ζ submitted on 2019-12-18 03:15:10
Question: I am experiencing a very strange issue using gcc-4.7 (Ubuntu/Linaro 4.7.2-11precise2) 4.7.2. I am unable to compile the following valid code without a warning:

    extern void dostuff(void);

    int test(int arg1, int arg2)
    {
        int ret;
        if (arg1)
            ret = arg2 ? 1 : 2;
        dostuff();
        if (arg1)
            return ret;
        return 0;
    }

Compile options and output:

    $ gcc-4.7 -o test.o -c -Os test.c -Wall
    test.c: In function ‘test’:
    test.c:5:6: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]
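
The excerpt ends at the warning; a common workaround, sketched below under the assumption that a default of 0 is acceptable here, is to give ret a defensive initial value so the data flow is obvious to the analyzer:

    extern void dostuff(void);

    int test(int arg1, int arg2)
    {
        int ret = 0;                 /* defensive initialization silences -Wmaybe-uninitialized */
        if (arg1)
            ret = arg2 ? 1 : 2;
        dostuff();
        if (arg1)
            return ret;
        return 0;
    }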

When to use volatile to counteract compiler optimizations in C#

我与影子孤独终老i submitted on 2019-12-17 19:37:40
Question: I have spent an extensive number of weeks doing multithreaded coding in C# 4.0. However, there is one question that remains unanswered for me. I understand that the volatile keyword prevents the compiler from storing variables in registers, thus avoiding inadvertently reading stale values. Writes are always volatile in .NET, so any documentation stating that it also avoids stale writes is redundant. I also know that compiler optimization is somewhat "unpredictable". The following code
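
The canonical illustration of the stale-read problem the asker describes, sketched below; without volatile, the JIT is permitted to hoist the flag read out of the loop:

    // Minimal sketch: a worker loop polling a stop flag.
    class Worker
    {
        private volatile bool _stop;   // volatile forces a fresh read on every iteration

        public void Run()
        {
            while (!_stop)
            {
                // do work; without volatile the read of _stop could be
                // cached in a register and the loop might never exit
            }
        }

        public void Stop() { _stop = true; }
    }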

May compiler optimizations be inhibited by multi-threading?

佐手、 submitted on 2019-12-17 18:57:10
Question: A few times I have parallelized portions of programs with OpenMP, only to notice that in the end, despite the good scalability, most of the foreseen speed-up was lost due to the poor performance of the single-threaded case (compared to the serial version). The usual explanation that appears on the web for this behavior is that the code generated by compilers may be worse in the multi-threaded case. Anyhow, I am not able to find anywhere a reference that explains why the assembly
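
One way to observe the effect the question asks about, sketched with a made-up loop (the original code is not in the excerpt): compile the same function with and without -fopenmp and diff the generated assembly (e.g. g++ -O2 -S):

    #include <cstddef>

    void scale(double *x, std::size_t n, double a)
    {
        #pragma omp parallel for
        for (std::size_t i = 0; i < n; ++i)
            x[i] *= a;   // with -fopenmp, GCC outlines this body into a separate
                         // helper function, which can inhibit some optimizations
    }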

Limits of Nat type in Shapeless

时间秒杀一切 submitted on 2019-12-17 17:52:50
Question: In shapeless, the Nat type represents a way to encode natural numbers at the type level. This is used, for example, for fixed-size lists. You can even do calculations at the type level, e.g. append a list of N elements to a list of K elements and get back a list that is known at compile time to have N+K elements. Is this representation capable of representing large numbers, e.g. 1000000 or 2^53, or will this cause the Scala compiler to give up?

Answer 1: I will attempt one myself. I will gladly accept a
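
A small sketch of why large values are problematic (shapeless 2.x assumed): Nat is a unary encoding, so a number n is a type nested n constructors deep:

    import shapeless.nat._   // predefined literals _0 .. _22

    // _3 is the type Succ[Succ[Succ[_0]]]; every increment adds one more
    // nested constructor, so encoding 1000000 would need a type a million
    // levels deep. That depth, not the value, is what strains the compiler.
    val three: _3 = _3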

Is it possible to guarantee code doing memory writes is not optimized away in C++?

蹲街弑〆低调 submitted on 2019-12-17 16:30:22
Question: C++ compilers are allowed to optimize away writes into memory:

    {
        //all this code can be eliminated
        char buffer[size];
        std::fill_n(buffer, size, 0);
    }

When dealing with sensitive data, the typical approach is using volatile* pointers to ensure that the memory writes are emitted by the compiler. Here's how the SecureZeroMemory() function in the Visual C++ runtime library is implemented (WinNT.h):

    FORCEINLINE PVOID RtlSecureZeroMemory(
        __in_bcount(cnt) PVOID ptr,
        __in SIZE_T cnt
    )
    {
        volatile char *vptr =
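
The excerpt cuts off mid-implementation; below is a portable sketch of the same volatile-pointer technique, not the Windows code itself:

    #include <cstddef>

    void secure_zero(void *p, std::size_t n)
    {
        volatile unsigned char *vp = static_cast<volatile unsigned char *>(p);
        while (n--)
            *vp++ = 0;   // accesses through a volatile lvalue must be emitted,
                         // so the compiler cannot elide the zeroing loop
    }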

Do C++11 compilers turn local variables into rvalues when they can during code optimization?

给你一囗甜甜゛ submitted on 2019-12-17 16:03:09
Question: Sometimes it's wise to split complicated or long expressions into multiple steps, for example (the second version isn't clearer, but it's just an example):

    return object1(object2(object3(x)));

can be written as:

    object3 a(x);
    object2 b(a);
    object1 c(b);
    return c;

Assuming all three classes implement constructors that take an rvalue as a parameter, the first version might be faster, because temporaries are passed and can be moved. I'm assuming that in the second version, the local variables are
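
A self-contained sketch of the usual remedy, with toy definitions standing in for object1/object2/object3 from the question: std::move makes the named locals eligible for the same move constructors the nested temporaries get:

    #include <utility>
    #include <vector>

    struct object3 { std::vector<int> d; explicit object3(std::vector<int> v) : d(std::move(v)) {} };
    struct object2 { object3 d;          explicit object2(object3 v)          : d(std::move(v)) {} };
    struct object1 { object2 d;          explicit object1(object2 v)          : d(std::move(v)) {} };

    object1 make(std::vector<int> x)
    {
        object3 a(std::move(x));   // each step moves instead of copying
        object2 b(std::move(a));
        object1 c(std::move(b));
        return c;                  // NRVO or implicit move applies on return
    }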