compiler-optimization

How to conditionally set compiler optimization for template headers

Submitted by 早过忘川 on 2019-12-23 07:32:22
Question: I found a question somewhat interesting and set out to answer it. The author wants to compile one source file (which relies on template libraries) with AVX optimizations, and the rest of the project without them. So, to see what would happen, I created a test project like this: main.cpp #include <iostream> #include <string> #include "fn_normal.h" #include "fn_avx.h" int main(int argc, char* argv[]) { int number = 10; // this will come from input, but let's keep it simple for …
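One approach often suggested for this situation (not part of the excerpt above, and GCC/Clang specific) is to enable AVX per function with the target attribute instead of per translation unit; the function and file names below are illustrative only, a minimal sketch rather than the asker's code:

```cpp
// Per-function AVX code generation with GCC/Clang on x86-64: only this
// function is compiled with AVX enabled; the rest of the translation unit
// keeps the project-wide flags.
#include <immintrin.h>
#include <cstddef>

__attribute__((target("avx")))
void add_avx(float* dst, const float* a, const float* b, std::size_t n) {
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(dst + i, _mm256_add_ps(va, vb));
    }
    for (; i < n; ++i)      // scalar tail for the remaining elements
        dst[i] = a[i] + b[i];
}
```

Template code instantiated from headers is harder to isolate this way, which is part of what the original question explores.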

Disabling -msse

Submitted by 自闭症网瘾萝莉.ら on 2019-12-22 18:36:37
Question: I am trying to run various benchmark tests using CPU2006 to see what various optimizations do in terms of speed on gcc. I am familiar with -O1, -O2, and -O3, but have heard that -msse is a decent optimization. What exactly is -msse? I've also seen that -msse is the default on a 64-bit architecture, so how do I disable it to compare the difference between using it and not using it? Answer 1: http://www.justskins.com/forums/gcc-option-msse-and-128289.html SSE (http://it.wikipedia.org/wiki/Streaming_SIMD …
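As added context (not from the original post): GCC and Clang predefine the __SSE__ macro when SSE code generation is enabled, and it is controlled with -msse / -mno-sse. Note that on x86-64 the floating-point calling convention relies on SSE registers, so fully disabling it there is usually impractical. A small sketch for checking the setting at compile time:

```cpp
// Compile-time check for SSE code generation (GCC/Clang define __SSE__ when
// SSE is enabled, which is the default on x86-64).
#include <cstdio>

int main() {
#if defined(__SSE__)
    std::puts("SSE code generation is enabled");
#else
    std::puts("SSE code generation is disabled (e.g. -mno-sse on 32-bit x86)");
#endif
    return 0;
}
```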

Avoid .NET Native bugs

Submitted by 泪湿孤枕 on 2019-12-22 10:57:19
Question: I spent the last year (part time) migrating my existing (and successful) Windows 8.1 app to Windows 10 UWP. Now, just before releasing it to the store, I tested the app in the "Release" build mode (which triggers .NET Native). Everything seemed to work until I noticed, by chance, a subtle but serious, data-compromising bug. It took me two days to reduce it to these three lines of code: var array1 = new int[1, 1]; var array2 = (int[,])array1.Clone(); array2[0, 0] = 666; if (array1[0, …

In VC++, what is the #pragma equivalent of the /O2 compiler option (optimize for speed)?

Submitted by 夙愿已清 on 2019-12-22 10:27:49
Question: According to MSDN, /O2 (Maximize Speed) is equivalent to /Og/Oi/Ot/Oy/Ob2/Gs/GF/Gy, and according to MSDN again, the following pragma #pragma optimize( "[optimization-list]", {on | off} ) uses the same letters in its "optimization-list" as the /O compiler option. The available letters for the pragma are: g - enable global optimizations; p - improve floating-point consistency; s or t - specify short or fast sequences of machine code; y - generate frame pointers on the program stack. Which ones …
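For reference, a hedged sketch (not a definitive mapping): the pragma accepts only a subset of the letters behind /O2, so the closest per-function approximation looks something like the following, with the remaining /O2 components (/Oi, /Ob2, /GF, /Gy, ...) left to whatever the command line specifies:

```cpp
// MSVC-only sketch: approximate /O2 for the functions that follow.
// "g" corresponds to /Og, "t" to /Ot, "y" to /Oy; there are no pragma
// letters for the rest of the /O2 bundle.
#pragma optimize("gty", on)
int hot_function(int x) {
    return x * x + 1;
}
#pragma optimize("", on)   // restore the optimization settings from the command line
```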

Why does the compiler optimize away shared memory reads due to strncmp() even if volatile keyword is used?

Submitted by 好久不见. on 2019-12-22 09:57:09
Question: Here is a program, foo.c, that writes data to shared memory. #include <stdio.h> #include <stdlib.h> #include <errno.h> #include <string.h> #include <stdint.h> #include <unistd.h> #include <sys/ipc.h> #include <sys/shm.h> int main() { key_t key; int shmid; char *mem; if ((key = ftok("ftok", 0)) == -1) { perror("ftok"); return 1; } if ((shmid = shmget(key, 100, 0600 | IPC_CREAT)) == -1) { perror("shmget"); return 1; } printf("key: 0x%x; shmid: %d\n", key, shmid); if ((mem = shmat(shmid, NULL, 0)) …
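The usual explanation, sketched here rather than taken from the post: passing the shared-memory pointer to strncmp() casts away the volatile qualifier, so the compiler may assume the bytes cannot change between iterations. Copying the bytes out through the volatile-qualified pointer first forces a fresh load every time. A minimal sketch (the "READY" marker and function name are placeholders, not from the question):

```cpp
// Busy-wait on a shared-memory marker without letting the compiler cache
// the reads: each byte is loaded through a volatile-qualified lvalue before
// the non-volatile buffer is handed to strncmp().
#include <cstring>

int wait_for_marker(volatile const char* shm) {
    char buf[8];
    for (;;) {
        for (std::size_t i = 0; i < sizeof buf; ++i)
            buf[i] = shm[i];                     // volatile read on every iteration
        if (std::strncmp(buf, "READY", 5) == 0)  // compare a plain local copy
            return 0;
    }
}
```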

Does source code amalgamation really increase the performances of a C or C++ program? [closed]

Submitted by ╄→尐↘猪︶ㄣ on 2019-12-22 09:50:34
Question (closed as opinion-based 3 years ago; not accepting answers): Code amalgamation consists of copying the entire source code into a single file. For instance, SQLite does this to reduce compile time and improve the performance of the resulting executable; there it results in a single file of 184K lines of code. My question is not …
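For concreteness (file names below are hypothetical, not from the question): an amalgamation, sometimes called a unity build, is simply one translation unit that textually includes all the others, which gives the optimizer cross-file visibility without relying on link-time optimization:

```cpp
// amalgamation.cpp -- hypothetical unity-build translation unit.
// Compiling only this file builds the whole program as a single translation
// unit, so the compiler can inline and optimize across what would otherwise
// be separate object files.
#include "lexer.cpp"
#include "parser.cpp"
#include "codegen.cpp"
#include "main.cpp"
```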

Does the Swift compiler/linker automatically remove unused methods/classes/extensions, etc.?

Submitted by 陌路散爱 on 2019-12-22 08:26:04
Question: We have a lot of code that is usable in any iOS application we write, such as: custom/common controls; extensions on common objects like UIView, UIImage, and UIViewController; global utility functions; global constants; and related sets of files that make up common 'features', like a picker screen that can be used with anything that can be enumerated. For reasons unrelated to this question, we cannot use static or dynamic libraries; these must be included in the project as actual source files. …

What's the advantage of compiler instruction scheduling compared to dynamic scheduling? [closed]

Submitted by 会有一股神秘感。 on 2019-12-22 07:56:05
Question (closed as off-topic 5 years ago): Nowadays, superscalar RISC CPUs usually support out-of-order execution, with branch prediction and speculative execution; they schedule work dynamically. What is the advantage of compiler instruction scheduling compared to an out-of-order CPU's dynamic scheduling? Does compile-time static scheduling matter at …

Can compiler optimization eliminate a function repeatedly called in a for-loop's conditional?

Submitted by 柔情痞子 on 2019-12-22 06:49:08
Question: I was reading about hash functions (I'm an intermediate CS student) and came across this: int hash (const string & key, int tableSize) { int hashVal = 0; for (int i = 0; i < key.length(); i++) hashVal = 37 * hashVal + key[i]; ..... return hashVal; } Looking at this code, I noticed that it would be faster if, instead of calling key.length() on every iteration of the for loop, we did this: int n = key.length(); for (int i = 0; i < n; i++) My question is, since this is such an obvious way to …
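For reference, a sketch of the hand-hoisted variant the excerpt describes (the part after the loop is elided in the original, so tableSize is left unused here); a compiler can usually perform this hoisting itself once std::string::length() is inlined and it can prove that key is not modified inside the loop:

```cpp
// Manually hoisted version: key.length() is evaluated once, before the loop,
// instead of in the loop condition on every iteration.
#include <string>
#include <cstddef>

int hash(const std::string& key, int tableSize) {
    int hashVal = 0;
    const std::size_t n = key.length();   // hoisted out of the loop condition
    for (std::size_t i = 0; i < n; ++i)
        hashVal = 37 * hashVal + key[i];
    // (how tableSize is applied is elided in the original excerpt)
    return hashVal;
}
```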

Is Clang really this smart?

Submitted by 天大地大妈咪最大 on 2019-12-22 04:24:07
Question: If I compile the following code with Clang 3.3 using -O3 -fno-vectorize, I get the same assembly output even if I remove the commented line. The code type-puns all possible 32-bit integers to floats and counts the ones in the [0, 1] range. Is Clang's optimizer actually smart enough to realize that 0xFFFFFFFF, when punned to float, is not in the range [0, 1], and therefore to ignore the second call to fn entirely? GCC produces different code when the second call is removed. #include <limits> #include <cstring> …
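A minimal reconstruction of the kind of code the question describes (names and structure are guesses, not the asker's exact source): each 32-bit pattern is reinterpreted as a float via memcpy, and 0xFFFFFFFF decodes to a NaN, which can never compare inside [0, 1], so a call covering only that value can legitimately be folded away:

```cpp
// Hedged sketch, not the original program: count how many 32-bit bit patterns
// in [lo, hi] represent a float inside [0, 1]. All-ones (0xFFFFFFFF) is a NaN
// pattern, so it never satisfies the comparison.
#include <cstdint>
#include <cstring>

std::uint64_t count_in_unit_range(std::uint32_t lo, std::uint32_t hi) {
    std::uint64_t count = 0;
    for (std::uint64_t bits = lo; bits <= hi; ++bits) {  // 64-bit counter avoids overflow at UINT32_MAX
        std::uint32_t b = static_cast<std::uint32_t>(bits);
        float f;
        std::memcpy(&f, &b, sizeof f);        // well-defined type punning
        if (f >= 0.0f && f <= 1.0f)
            ++count;
    }
    return count;
}
```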