compiler-optimization | 易学教程

Do compilers optimize switches differently than long if-then-else chains?

Question: Suppose I have N different integral values known at compile time, V_1 through V_N. Consider the following structures: const int x = foo(); switch(x) { case V_1: { /* commands for V_1 which don't change x */ } break; case V_2: { /* commands for V_2 which don't change x */ } break; /* ... */ case V_N: { /* commands for V_N which don't change x */ } break; } versus const int x = foo(); if (x == V_1) { /* commands for V_1 which don't change x */ } else if (x == V_2) { /* commands for V_2 which
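
The two structures in the question can be made concrete with a small sketch. The case values `V_1` to `V_3` and the function names below are hypothetical stand-ins (the question leaves them abstract), and returning a small integer stands in for "commands for V_i". Both functions compute the same result, so compiling the file with something like `g++ -O2 -S` and reading the generated assembly is one way to see whether a given compiler treats the two forms differently.

```cpp
// Hypothetical compile-time constants standing in for V_1 ... V_N.
constexpr int V_1 = 10;
constexpr int V_2 = 20;
constexpr int V_3 = 30;

// Switch form: each case returns a distinct marker instead of running commands.
int dispatch_switch(int x) {
    switch (x) {
        case V_1: return 1;
        case V_2: return 2;
        case V_3: return 3;
        default:  return 0;  // x matched none of the known values
    }
}

// If-then-else chain form with the same observable behavior.
int dispatch_if_chain(int x) {
    if (x == V_1) {
        return 1;
    } else if (x == V_2) {
        return 2;
    } else if (x == V_3) {
        return 3;
    }
    return 0;  // x matched none of the known values
}
```

Keeping the bodies identical means any difference in the emitted code comes from the control-flow form alone, not from the work done in each branch.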

Which AVX and march should be specified on a cluster with different architectures?

Question: I'm currently trying to compile software for use on an HPC cluster using Intel compilers. The login node, which is where I compile and prepare the computations, uses Intel Xeon Gold 6148 processors, while the compute nodes use either Haswell (Intel Xeon E5-2660 v3 / Intel Xeon Processor E5-2680 v3) or Skylake processors (Intel Xeon Gold 6138). As far as I understand from the links above, my login node supports Intel SSE4.2, Intel AVX, Intel AVX2, as well as Intel AVX-512 but my compute

Allowing struct field to overflow to the next field

Question: Consider the following simple example: struct __attribute__ ((__packed__)) { int code[1]; int place_holder[100]; } s; void test(int n) { int i; for (i = 0; i < n; i++) { s.code[i] = 1; } } The for-loop writes to the field code, which is of size 1. The next field after code is place_holder. I would expect that in the case of n > 1, the write to the code array would overflow and 1 would be written to place_holder. However, when compiling with -O2 (on gcc 4.9.4 but probably on other versions as
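
For comparison, here is a minimal sketch of a layout in which the writes the question expects are ordinary in-bounds accesses. It is not the asker's code: the `Storage` type, the `cells` member, and the helpers `fill_ones` and `place_holder_at` are hypothetical names introduced for illustration. The assumption is that the only goal is for a loop starting at the first `code` slot to continue into the placeholder region.

```cpp
#include <cstddef>

// Hypothetical sizes mirroring the question: 1 code slot, 100 placeholder slots.
constexpr std::size_t kCodeLen = 1;
constexpr std::size_t kPlaceHolderLen = 100;
constexpr std::size_t kTotalLen = kCodeLen + kPlaceHolderLen;

// One flat array replaces the two adjacent fields, so every index below
// kTotalLen refers to an element of the same array object.
struct Storage {
    int cells[kTotalLen];
};

Storage s{};  // zero-initialized

// Writes 1 to the first n cells, clamping n to the buffer size.
void fill_ones(std::size_t n) {
    if (n > kTotalLen) {
        n = kTotalLen;
    }
    for (std::size_t i = 0; i < n; ++i) {
        s.cells[i] = 1;
    }
}

// Reads slot i of the logical place_holder region (caller keeps i < kPlaceHolderLen).
int place_holder_at(std::size_t i) {
    return s.cells[kCodeLen + i];
}
```

With this layout, a call such as `fill_ones(3)` sets the single code slot and the first two placeholder slots, without indexing past the end of any array.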

Does clang offer anything similar to GCC 6.x's function multi-versioning (target_clones)?

Question: I've read this LWN article with great interest. Executive summary: GCC 6.x supports something called function multi-versioning, which builds multiple versions of the same function, optimized for different instruction sets. Let's say you have a machine with AVX2 support and one without. It's possible to run the same binary on both, with function foo() existing in two versions, one of which uses AVX2 instructions. The version with the AVX2 instructions is, however, only called if the CPU

Will my compiler ignore useless code?

Question: I've been through a few questions across the network about this subject, but I didn't find an answer to my question: either it's for another language or it doesn't answer it completely (dead code is not useless code). So here's my question: Is (explicit or not) useless code ignored by the compiler? For example, in this code: double[] TestRunTime = SomeFunctionThatReturnDoubles; // A bit of code skipped int i = 0; for (int j = 0; j < TestRunTime.Length; j++) { } double prevSpec_OilCons = 0; will the for

Can the compiler optimize from heap to stack allocation?

Question: As far as compiler optimizations go, is it legal and/or possible to change a heap allocation to a stack allocation? Or would that break the as-if rule? For example, say this is the original version of the code: { Foo* f = new Foo(); f->do_something(); delete f; } Would a compiler be able to change this to the following: { Foo f{}; f.do_something(); } I wouldn't think so, because that would have implications if the original version was relying on things like custom allocators. Does the standard
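
The two snippets in the question can be turned into a compilable sketch for side-by-side inspection. The definition of `Foo` below is an assumption (the question does not show it), as are the function names and the `calls` counter, which exists only so each version returns something observable. The sketch does not settle what the standard permits; it simply gives both forms the same visible result so the code a particular compiler generates for each can be compared.

```cpp
// Hypothetical Foo: the question does not show its definition.
struct Foo {
    int calls = 0;
    void do_something() { ++calls; }
};

// Original form from the question: allocate on the heap, use, delete.
int heap_version() {
    Foo* f = new Foo();
    f->do_something();
    int result = f->calls;  // read the value before the object is destroyed
    delete f;
    return result;
}

// Proposed transformed form: automatic (stack) storage.
int stack_version() {
    Foo f{};
    f.do_something();
    return f.calls;
}
```

Both functions return 1 here. Whether the heap version still contains a call to the allocation function after optimization is the part to look for in the compiler output.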