compiler-optimization

Do modern compilers optimize multiplication by 1 and -1

Submitted by 二次信任 on 2020-01-02 14:01:23
Question: If I write

template<int sign>
inline int add_sign(int x) {
    return sign * x;
}

template int add_sign<-1>(int x);
template int add_sign<1>(int x);

are most C++ compilers smart enough to optimize the multiplication by 1 or -1 into some faster operation (a no-op or a negation)?

Answer 1: Yes. This is part of a class of simple optimizations known as arithmetic local optimizations. For example, 1 * x can be simplified statically to x; likewise, -1 * x can be simplified to -x. Production compilers all do

How could this Java code be sped up?

Submitted by 半城伤御伤魂 on 2020-01-02 07:58:32
Question: I am trying to benchmark how fast Java can do a simple task: read a huge file into memory and then perform some meaningless calculations on the data. All types of optimizations count, whether it's rewriting the code differently, using a different JVM, or tricking the JIT. The input file is a 500-million-long list of 32-bit integer pairs separated by commas, like this:

44439,5023
33140,22257
...

This file takes 5.5 GB on my machine. The program can't use more than 8 GB of RAM and can use only a single

Will a static variable always use up memory?

Submitted by 北城以北 on 2020-01-02 07:42:13
Question: Based on this discussion, I was wondering whether a function-scope static variable always uses memory, or whether the compiler is allowed to optimize it away. To illustrate the question, assume a function such as:

void f() {
    static const int i = 3;
    int j = i + 1;
    printf("%d", j);
}

The compiler will very likely inline the value of i and probably do the calculation 3 + 1 at compile time, too. Since this is the only place the value of i is used, there is no need for any static memory to be allocated.

C/C++ Indeterminate Values: Compiler optimization gives different output (example)

Submitted by 心已入冬 on 2020-01-02 05:50:11
Question: It seems that C/C++ compilers (clang, gcc, etc.) produce different output depending on the optimization level. You can check the online link included in this post: http://cpp.sh/5vrmv (change the optimization from none to -O3 to see the differences). Based on the following piece of code, could someone explain a few questions I have:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *p = (int *)malloc(sizeof(int));
    free(p);
    int *q = (int *)malloc(sizeof(int));
    if (p == q) {
        *p = 10;

When compiling programs to run inside a VM, what should march and mtune be set to?

Submitted by 浪子不回头ぞ on 2020-01-02 05:26:06
Question: With a VM being slave to whatever the host machine provides, what compiler flags should be given to gcc? I would normally think -march=native is what you would use when compiling for a dedicated box, but the level of fine detail that -march=native goes to, as shown in this article, makes me extremely wary of using it. So what should -march and -mtune be set to inside a VM? For a specific example: my specific case right now is compiling Python (and more) in a Linux guest inside a

Java optimizer and redundant array evaluations

Submitted by 房东的猫 on 2020-01-02 03:34:07
Question: This is a very basic question about Java optimization. Suppose you have a simple for loop iterating over an array and you use array.length in the loop header, rather than evaluating it beforehand so that it is done only once (which is what I almost always do):

for (int i = 0; i < array.length; i++) {
    ...
}

Can the statement be optimized so that the JVM knows whether the array changes during the loop, so that it does not re-evaluate array.length on every iteration?

Answer 1: If another thread is

Visual C++ Compiler Optimization Flags: Difference Between /O2 and /Ot

Submitted by 送分小仙女□ on 2020-01-02 02:53:08
Question: What's the difference between the /Ot flag ("favor fast code") and the /O2 flag ("maximize speed")? (Ditto for /Os and /O1.)

Answer 1: /O1 and /O2 each bundle together a number of options aimed at a larger goal. /O1 makes a number of code-generation choices that favour size; /O2 does the same thing but favours speed. /O1 includes /Os as well as other options; /O2 includes /Ot as well as other options. Some optimisations are enabled by both /O1 and /O2. And, depending on your program's paging

Does undefined behavior really help modern compilers to optimize generated code?

Submitted by 拜拜、爱过 on 2020-01-01 17:27:07
Question: Aren't modern compilers smart enough to generate code that is fast and safe at the same time? Look at the code below:

std::vector<int> a(100);
for (int i = 0; i < 50; i++) {
    a.at(i) = i;
}
...

It is obvious that an out-of-range error can never happen here, so a smart compiler could generate the following code instead:

std::vector<int> a(100);
for (int i = 0; i < 50; i++) {
    a[i] = i;  // operator[] doesn't check for out of range
}
...

Now let's check this code:

std::vector<int> a(unknown

GCC 5.1 Loop unrolling

Submitted by 强颜欢笑 on 2020-01-01 09:42:01
Question: Given the following code

#include <stdio.h>

int main(int argc, char **argv) {
    int k = 0;
    for (k = 0; k < 20; ++k) {
        printf("%d\n", k);
    }
}

GCC 5.1 or later with

-x c -std=c99 -O3 -funroll-all-loops --param max-completely-peeled-insns=1000 --param max-completely-peel-times=10000

performs partial loop unrolling: it unrolls the loop ten times and then does a conditional jump.

.LC0:
        .string "%d\n"
main:
        pushq   %rbx
        xorl    %ebx, %ebx
.L2:
        movl    %ebx, %esi
        movl    $.LC0, %edi
        xorl    %eax, %eax
        call

What does “sibling calls” mean?

Submitted by 十年热恋 on 2020-01-01 04:02:10
Question: In the GCC manual, for -foptimize-sibling-calls: "Optimize sibling and tail recursive calls." I know tail recursive calls, for example:

int sum(int n) {
    return n == 1 ? 1 : n + sum(n - 1);
}

However, what do sibling calls mean?

Answer 1: The compiler considers two functions to be siblings if they share the same structural equivalence of return types, as well as matching space requirements for their arguments. http://www.drdobbs.com/tackling-c-tail-calls/184401756

Answer 2: It must be something like this:

int