compiler-optimization

How to conditionally set compiler optimization for template headers

Submitted by 早过忘川 on 2019-12-23 07:32:22
Question: I found a question somewhat interesting and set out to answer it. The author wants to compile one source file (which relies on template libraries) with AVX optimizations, and the rest of the project without them. So, to see what would happen, I created a test project like this: main.cpp #include <iostream> #include <string> #include "fn_normal.h" #include "fn_avx.h" int main(int argc, char* argv[]) { int number = 10; // this will come from input, but let's keep it simple for …
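One approach often suggested for this situation (not part of the excerpt above, and GCC/Clang specific) is to enable AVX per function with the target attribute instead of per translation unit; the function and file names below are illustrative only, a minimal sketch rather than the asker's code:

```cpp
// Per-function AVX code generation with GCC/Clang on x86-64: only this
// function is compiled with AVX enabled; the rest of the translation unit
// keeps the project-wide flags.
#include <immintrin.h>
#include <cstddef>

__attribute__((target("avx")))
void add_avx(float* dst, const float* a, const float* b, std::size_t n) {
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(dst + i, _mm256_add_ps(va, vb));
    }
    for (; i < n; ++i)      // scalar tail for the remaining elements
        dst[i] = a[i] + b[i];
}
```

Template code instantiated from headers is harder to isolate this way, which is part of what the original question explores.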

Disabling -msse

Submitted by 自闭症网瘾萝莉.ら on 2019-12-22 18:36:37
Question: I am trying to run various benchmark tests using CPU2006 to see what various optimizations do in terms of speed on gcc. I am familiar with -O1, -O2, and -O3, but have heard that -msse is a decent optimization. What exactly is -msse? I've also seen that -msse is the default on a 64-bit architecture, so how do I disable it to compare the difference between using it and not using it? Answer 1: http://www.justskins.com/forums/gcc-option-msse-and-128289.html SSE (http://it.wikipedia.org/wiki/Streaming_SIMD …
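As added context (not from the original post): GCC and Clang predefine the __SSE__ macro when SSE code generation is enabled, and it is controlled with -msse / -mno-sse. Note that on x86-64 the floating-point calling convention relies on SSE registers, so fully disabling it there is usually impractical. A small sketch for checking the setting at compile time:

```cpp
// Compile-time check for SSE code generation (GCC/Clang define __SSE__ when
// SSE is enabled, which is the default on x86-64).
#include <cstdio>

int main() {
#if defined(__SSE__)
    std::puts("SSE code generation is enabled");
#else
    std::puts("SSE code generation is disabled (e.g. -mno-sse on 32-bit x86)");
#endif
    return 0;
}
```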

Avoid .NET Native bugs

Submitted by 泪湿孤枕 on 2019-12-22 10:57:19
Question: I spent the last year (part time) migrating my existing (and successful) Windows 8.1 app to Windows 10 UWP. Now, just before releasing it to the store, I tested the app in the "Release" build mode (which triggers .NET Native). Everything seemed to work until I noticed, by chance, a subtle but serious, data-compromising bug. It took me two days to reduce it to these three lines of code: var array1 = new int[1, 1]; var array2 = (int[,])array1.Clone(); array2[0, 0] = 666; if (array1[0, …

In VC++, what is the #pragma equivalent of the /O2 compiler option (optimize for speed)?

Submitted by 夙愿已清 on 2019-12-22 10:27:49
Question: According to MSDN, /O2 (Maximize Speed) is equivalent to /Og/Oi/Ot/Oy/Ob2/Gs/GF/Gy, and according to MSDN again, the following pragma #pragma optimize( "[optimization-list]", {on | off} ) uses the same letters in its "optimization-list" as the /O compiler option. The available letters for the pragma are: g - enable global optimizations; p - improve floating-point consistency; s or t - specify short or fast sequences of machine code; y - generate frame pointers on the program stack. Which ones …
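For reference, a hedged sketch (not a definitive mapping): the pragma accepts only a subset of the letters behind /O2, so the closest per-function approximation looks something like the following, with the remaining /O2 components (/Oi, /Ob2, /GF, /Gy, ...) left to whatever the command line specifies:

```cpp
// MSVC-only sketch: approximate /O2 for the functions that follow.
// "g" corresponds to /Og, "t" to /Ot, "y" to /Oy; there are no pragma
// letters for the rest of the /O2 bundle.
#pragma optimize("gty", on)
int hot_function(int x) {
    return x * x + 1;
}
#pragma optimize("", on)   // restore the optimization settings from the command line
```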

Why does the compiler optimize away shared memory reads due to strncmp() even if volatile keyword is used?

Submitted by 好久不见. on 2019-12-22 09:57:09
Question: Here is a program, foo.c, that writes data to shared memory. #include <stdio.h> #include <stdlib.h> #include <errno.h> #include <string.h> #include <stdint.h> #include <unistd.h> #include <sys/ipc.h> #include <sys/shm.h> int main() { key_t key; int shmid; char *mem; if ((key = ftok("ftok", 0)) == -1) { perror("ftok"); return 1; } if ((shmid = shmget(key, 100, 0600 | IPC_CREAT)) == -1) { perror("shmget"); return 1; } printf("key: 0x%x; shmid: %d\n", key, shmid); if ((mem = shmat(shmid, NULL, 0)) …
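The usual explanation, sketched here rather than taken from the post: passing the shared-memory pointer to strncmp() casts away the volatile qualifier, so the compiler may assume the bytes cannot change between iterations. Copying the bytes out through the volatile-qualified pointer first forces a fresh load every time. A minimal sketch (the "READY" marker and function name are placeholders, not from the question):

```cpp
// Busy-wait on a shared-memory marker without letting the compiler cache
// the reads: each byte is loaded through a volatile-qualified lvalue before
// the non-volatile buffer is handed to strncmp().
#include <cstring>

int wait_for_marker(volatile const char* shm) {
    char buf[8];
    for (;;) {
        for (std::size_t i = 0; i < sizeof buf; ++i)
            buf[i] = shm[i];                     // volatile read on every iteration
        if (std::strncmp(buf, "READY", 5) == 0)  // compare a plain local copy
            return 0;
    }
}
```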

Does source code amalgamation really increase the performances of a C or C++ program? [closed]

Submitted by ╄→尐↘猪︶ㄣ on 2019-12-22 09:50:34
Question (closed as opinion-based 3 years ago; not accepting answers): Code amalgamation consists of copying the entire source code into a single file. For instance, SQLite does this to reduce compile time and improve the performance of the resulting executable; there it results in a single file of 184K lines of code. My question is not …
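For concreteness (file names below are hypothetical, not from the question): an amalgamation, sometimes called a unity build, is simply one translation unit that textually includes all the others, which gives the optimizer cross-file visibility without relying on link-time optimization:

```cpp
// amalgamation.cpp -- hypothetical unity-build translation unit.
// Compiling only this file builds the whole program as a single translation
// unit, so the compiler can inline and optimize across what would otherwise
// be separate object files.
#include "lexer.cpp"
#include "parser.cpp"
#include "codegen.cpp"
#include "main.cpp"
```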

Does the Swift compiler/linker automatically remove unused methods/classes/extensions, etc.?

Submitted by 陌路散爱 on 2019-12-22 08:26:04
Question: We have a lot of code that is usable in any iOS application we write, such as: custom/common controls; extensions on common objects like UIView, UIImage, and UIViewController; global utility functions; global constants; and related sets of files that make up common 'features', like a picker screen that can be used with anything that can be enumerated. For reasons unrelated to this question, we cannot use static or dynamic libraries; these must be included in the project as actual source files. …

What's the advantage of compiler instruction scheduling compared to dynamic scheduling? [closed]

Submitted by 会有一股神秘感。 on 2019-12-22 07:56:05
Question (closed as off-topic 5 years ago): Nowadays, superscalar RISC CPUs usually support out-of-order execution, with branch prediction and speculative execution; they schedule work dynamically. What is the advantage of compiler instruction scheduling compared to an out-of-order CPU's dynamic scheduling? Does compile-time static scheduling matter at …

Can compiler optimization eliminate a function repeatedly called in a for-loop's conditional?

Submitted by 柔情痞子 on 2019-12-22 06:49:08
Question: I was reading about hash functions (I'm an intermediate CS student) and came across this: int hash (const string & key, int tableSize) { int hashVal = 0; for (int i = 0; i < key.length(); i++) hashVal = 37 * hashVal + key[i]; ..... return hashVal; } Looking at this code, I noticed that it would be faster if, instead of calling key.length() on every iteration of the for loop, we did this: int n = key.length(); for (int i = 0; i < n; i++) My question is, since this is such an obvious way to …
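For reference, a sketch of the hand-hoisted variant the excerpt describes (the part after the loop is elided in the original, so tableSize is left unused here); a compiler can usually perform this hoisting itself once std::string::length() is inlined and it can prove that key is not modified inside the loop:

```cpp
// Manually hoisted version: key.length() is evaluated once, before the loop,
// instead of in the loop condition on every iteration.
#include <string>
#include <cstddef>

int hash(const std::string& key, int tableSize) {
    int hashVal = 0;
    const std::size_t n = key.length();   // hoisted out of the loop condition
    for (std::size_t i = 0; i < n; ++i)
        hashVal = 37 * hashVal + key[i];
    // (how tableSize is applied is elided in the original excerpt)
    return hashVal;
}
```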

Is Clang really this smart?

Submitted by 天大地大妈咪最大 on 2019-12-22 04:24:07
Question: If I compile the following code with Clang 3.3 using -O3 -fno-vectorize, I get the same assembly output even if I remove the commented line. The code type-puns all possible 32-bit integers to floats and counts the ones in the [0, 1] range. Is Clang's optimizer actually smart enough to realize that 0xFFFFFFFF, when punned to float, is not in the range [0, 1], and therefore to ignore the second call to fn entirely? GCC produces different code when the second call is removed. #include <limits> #include <cstring> …
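A minimal reconstruction of the kind of code the question describes (names and structure are guesses, not the asker's exact source): each 32-bit pattern is reinterpreted as a float via memcpy, and 0xFFFFFFFF decodes to a NaN, which can never compare inside [0, 1], so a call covering only that value can legitimately be folded away:

```cpp
// Hedged sketch, not the original program: count how many 32-bit bit patterns
// in [lo, hi] represent a float inside [0, 1]. All-ones (0xFFFFFFFF) is a NaN
// pattern, so it never satisfies the comparison.
#include <cstdint>
#include <cstring>

std::uint64_t count_in_unit_range(std::uint32_t lo, std::uint32_t hi) {
    std::uint64_t count = 0;
    for (std::uint64_t bits = lo; bits <= hi; ++bits) {  // 64-bit counter avoids overflow at UINT32_MAX
        std::uint32_t b = static_cast<std::uint32_t>(bits);
        float f;
        std::memcpy(&f, &b, sizeof f);        // well-defined type punning
        if (f >= 0.0f && f <= 1.0f)
            ++count;
    }
    return count;
}
```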