compiler-optimization

Weird behaviour of C# compiler due to caching delegates

与世无争的帅哥 submitted on 2019-12-05 21:53:22
Question: Suppose I have the following program:

    static void SomeMethod(Func<int, int> otherMethod)
    {
        otherMethod(1);
    }

    static int OtherMethod(int x)
    {
        return x;
    }

    static void Main(string[] args)
    {
        SomeMethod(OtherMethod);
        SomeMethod(x => OtherMethod(x));
        SomeMethod(x => OtherMethod(x));
    }

I cannot understand the compiled IL code (it contains a lot of extra code). Here is a simplified version:

    class C
    {
        public static C c;
        public static Func<int, int> foo;
        public static Func<int, int> foo1;

        static C()
        {
            c = new C();
        }

        C(){

How do I stop GCC from optimizing this byte-for-byte copy into a memcpy call?

江枫思渺然 submitted on 2019-12-05 21:41:00
Question: I have this code for memcpy as part of my implementation of the standard C library, which copies memory from src to dest one byte at a time:

    void *memcpy(void *restrict dest, const void *restrict src, size_t len)
    {
        char *dp = (char *restrict)dest;
        const char *sp = (const char *restrict)src;
        while( len-- )
        {
            *dp++ = *sp++;
        }
        return dest;
    }

With gcc -O2, the code generated is reasonable:

    memcpy:
    .LFB0:
        movq %rdi, %rax
        testq %rdx, %rdx
        je .L2
        xorl %ecx, %ecx
    .L3:
        movzbl (%rsi,%rcx), %r8d
        movb
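
The excerpt is cut off before any answer. One option that GCC documents is -fno-tree-loop-distribute-patterns, which turns off the pass that recognizes copy loops like this one and replaces them with library calls. Below is a minimal sketch, assuming GCC, that applies the option to a single function through the optimize attribute. The names my_memcpy and NO_LOOP_PATTERNS are hypothetical and are not from the question; the function is renamed so the sketch can be compiled next to a hosted C library.

```c
#include <stddef.h>

/* GCC only: disable loop-pattern recognition for the annotated function.
   On other compilers the macro expands to nothing (an assumption of this sketch). */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_LOOP_PATTERNS \
    __attribute__((optimize("no-tree-loop-distribute-patterns")))
#else
#define NO_LOOP_PATTERNS
#endif

/* my_memcpy is a hypothetical stand-in for the question's memcpy. */
NO_LOOP_PATTERNS
void *my_memcpy(void *restrict dest, const void *restrict src, size_t len)
{
    char *dp = dest;
    const char *sp = src;

    /* Plain byte-for-byte copy, as in the question. */
    while (len--) {
        *dp++ = *sp++;
    }
    return dest;
}
```

Passing -fno-tree-loop-distribute-patterns on the command line for the file that defines the function should have the same effect for the whole translation unit. The GCC manual describes the optimize attribute as intended mainly for debugging, so the command-line form may be the safer choice.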

What can I assume about C/C++ compiler optimisations?

老子叫甜甜 submitted on 2019-12-05 21:06:49
Question: I would like to know how to avoid wasting my time and risking typos by re-hashing source code when I'm integrating legacy code, library code or sample code into my own codebase. If I give a simple example, based on an image processing scenario, you might see what I mean. It's actually not unusual to find I'm integrating a code snippet like this:

    for (unsigned int y = 0; y < uHeight; y++)
    {
        for (unsigned int x = 0; x < uWidth; x++)
        {
            // do something with this pixel
            ....
            uPixel = pPixels[y *
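
The snippet is cut off at `pPixels[y *`, so the rest of the indexing expression is unknown. As a rough illustration of the kind of hand rewriting the question seems to be about, here are two equivalent forms of a pixel loop: one that recomputes the index on every iteration and one that hoists the row address out of the inner loop. All names (sum_indexed, sum_hoisted, stride) are hypothetical. Whether a given compiler generates the same code for both is exactly what the question asks, and is not claimed here.

```c
#include <stddef.h>

/* Form A: full index arithmetic inside the inner loop. */
unsigned long sum_indexed(const unsigned char *pixels,
                          unsigned int width, unsigned int height,
                          size_t stride)
{
    unsigned long total = 0;
    for (unsigned int y = 0; y < height; y++) {
        for (unsigned int x = 0; x < width; x++) {
            total += pixels[(size_t)y * stride + x];
        }
    }
    return total;
}

/* Form B: the row address is computed once per row by hand. */
unsigned long sum_hoisted(const unsigned char *pixels,
                          unsigned int width, unsigned int height,
                          size_t stride)
{
    unsigned long total = 0;
    for (unsigned int y = 0; y < height; y++) {
        const unsigned char *row = pixels + (size_t)y * stride;
        for (unsigned int x = 0; x < width; x++) {
            total += row[x];
        }
    }
    return total;
}
```

Comparing the assembly of both forms at the optimization level actually in use (for example with gcc -O2 -S) is one way to check whether the manual rewrite is worth the risk of typos.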

Why does the compiler optimize away shared memory reads due to strncmp() even if volatile keyword is used?

爷,独闯天下 submitted on 2019-12-05 20:14:58
Here is a program foo.c that writes data to shared memory.

    #include <stdio.h>
    #include <stdlib.h>
    #include <errno.h>
    #include <string.h>
    #include <stdint.h>
    #include <unistd.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main()
    {
        key_t key;
        int shmid;
        char *mem;

        if ((key = ftok("ftok", 0)) == -1) {
            perror("ftok");
            return 1;
        }

        if ((shmid = shmget(key, 100, 0600 | IPC_CREAT)) == -1) {
            perror("shmget");
            return 1;
        }

        printf("key: 0x%x; shmid: %d\n", key, shmid);

        if ((mem = shmat(shmid, NULL, 0)) == (void *) -1) {
            perror("shmat");
            return 1;
        }

        sprintf(mem, "hello");
        sleep(10);
        sprintf(mem, "exit");
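
The reader side of the program is not included in this excerpt. As the title suggests, the difficulty is that strncmp() takes plain const char * arguments, so a volatile-qualified pointer has to be cast before the call, and inside the call the reads are no longer volatile accesses. One possible workaround, sketched here on the assumption that the reader polls the segment for a string such as "exit", is to do the comparison byte by byte through the volatile pointer. The helper name volatile_str_equal is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the first n characters of mem and s
   are equal (stopping early at a shared terminating NUL), else 0.
   Every byte of mem is read through a volatile-qualified pointer. */
int volatile_str_equal(const volatile char *mem, const char *s, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        char c = mem[i];            /* volatile load, performed each call */
        if (c != s[i]) {
            return 0;               /* mismatch */
        }
        if (c == '\0') {
            return 1;               /* both strings ended together */
        }
    }
    return 1;                       /* first n characters matched */
}
```

A polling loop could then be written as `while (!volatile_str_equal(mem, "exit", 4)) sleep(1);`. Note that volatile only forces the loads to happen; it does not by itself provide atomicity or ordering between processes.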

Avoid .NET Native bugs

浪子不回头ぞ submitted on 2019-12-05 20:00:24
I spent the last year (part time) migrating my existing (and successful) Windows 8.1 app to Windows 10 UWP. Now, just before releasing it to the store, I tested the app in the "Release" build mode (which triggers .NET Native). Everything seemed to work until I, by chance, noticed a subtle but serious (because data-compromising) bug. It took me two days to reduce it to these three lines of code:

    var array1 = new int[1, 1];
    var array2 = (int[,])array1.Clone();
    array2[0, 0] = 666;
    if (array1[0, 0] != array2[0, 0]) {
        ApplicationView.GetForCurrentView().Title = "OK.";
    } else {
        ApplicationView

Why do common C compilers include the source filename in the output?

删除回忆录丶 submitted on 2019-12-05 19:18:16
Question: I have learnt from this recent answer that gcc and clang include the source filename somewhere in the binary as metadata, even when debugging is not enabled. I can't really understand why this should be a good idea. Besides the tiny privacy risks, this also happens when one optimizes for the size of the resulting binary (-Os), which looks inefficient. Why do the compilers include this information?

Answer 1: The reason why GCC includes the filename is mainly for debugging purposes, because it

My C++ object file is too big

我的未来我决定 submitted on 2019-12-05 17:29:09
I am working on a C++ program and the compiled object code from a single 1200-line file (which initializes a rather complex state machine) comes out to nearly a megabyte. What could be making the file so large? Is there a way I can find what takes space inside the object file?

There can be several reasons why object files are bigger than they have to be at minimum:

- statically including dependent libraries
- building with debug information
- building with profiling information
- creating (extremely) complex data structures using templates (maybe recursive boost-structures)
- not turning on optimizing

How to compile and run an optimized Rust program with overflow checking enabled

僤鯓⒐⒋嵵緔 submitted on 2019-12-05 16:04:00
Question: I'm writing a program that's quite compute heavy, and it's annoyingly slow to run in debug mode. My program is also plagued by integer overflows, because I'm reading data from u8 arrays and the u8 type spreads to unexpected places via type inference, and Rust prefers to overflow rather than to promote integers to larger types. Building in release mode disables overflow checks:

    cargo run --release

How can I build a Rust executable with optimizations and runtime overflow checks enabled as well?

Answer 1:

Extent of GHC's optimization

六月ゝ 毕业季﹏ submitted on 2019-12-05 15:17:17
Question: I am not very familiar with the degree to which Haskell/GHC can optimize code. Below I have a pretty "brute-force" (in the declarative sense) implementation of the n queens problem. I know it can be written more efficiently, but that's not my question. It's that this got me thinking about GHC's optimization capabilities and limits. I have expressed it in what I consider a pretty straightforward declarative sense. Filter permutations of [1..n] that fulfill the predicate For all indices i,j s.t j

passing rvalue to non-ref parameter, why can't the compiler elide the copy?

爷,独闯天下 submitted on 2019-12-05 14:38:31

    struct Big { int a[8]; };
    void foo(Big a);
    Big getStuff();
    void test1() { foo(getStuff()); }

compiles (using clang 6.0.0 for x86_64 on Linux so System V ABI, flags: -O3 -march=broadwell) to

    test1():                                # @test1()
        sub rsp, 72
        lea rdi, [rsp + 40]
        call getStuff()
        vmovups ymm0, ymmword ptr [rsp + 40]
        vmovups ymmword ptr [rsp], ymm0
        vzeroupper
        call foo(Big)
        add rsp, 72
        ret

If I am reading this correctly, this is what is happening: getStuff is passed a pointer to foo's stack (rsp + 40) to use for its return value, so after getStuff returns rsp + 40 through to rsp + 71 contains the result of