intel

How do I get a full assembly code from c file?

♀尐吖头ヾ submitted on 2019-12-20 05:30:23
Question: I'm currently trying to figure out how to produce equivalent assembly code from a C source file. (Pardon my English; I'm not fluent yet.) I've been using C for several years, but have little experience with assembly language. I was able to output the assembly code using the -S option in gcc. However, the resulting assembly code contained call instructions that jump to other functions such as _exp. This is not what I wanted,
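A note on why the call shows up: gcc -S translates only the current source file, so the body of a library function such as exp lives in the math library and appears in the output only as a call. The sketch below assumes a hypothetical helper, my_exp (a truncated Taylor series that is not from the question and is much less accurate than the real exp for large arguments), defined in the same file so that the compiler has its body available.

```c
/* Sketch: keep the callee's body in the same translation unit so that
 * `gcc -S -O2 file.c` can emit (or inline) it instead of calling the
 * math library. my_exp is a hypothetical stand-in, not the library's exp. */

/* Approximates e^x with the first 30 terms of its Taylor series. */
static inline double my_exp(double x)
{
    double term = 1.0; /* holds x^n / n!, starting at n = 0 */
    double sum = 1.0;
    for (int n = 1; n < 30; ++n) {
        term *= x / n;
        sum += term;
    }
    return sum;
}

/* The function whose assembly we want to inspect. */
double scaled_exp(double x)
{
    return 2.0 * my_exp(x);
}
```

With optimization enabled the helper is likely to be inlined, so the generated .s file should contain the loop rather than a call to _exp. Without optimization a call to my_exp probably remains, but its body is then emitted in the same file.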

Intel compiler (C++) issue with OpenMP reduction on std::vector

╄→尐↘猪︶ㄣ submitted on 2019-12-19 19:51:54
Question: Since OpenMP 4.0, user-defined reductions are supported. So I defined the reduction on std::vector in C++ exactly as given here. It works fine with GNU/5.4.0 and GNU/6.4.0, but it returns random values for the reduction with intel/2018.1.163. This is the example: #include <iostream> #include <vector> #include <algorithm> #include "omp.h" #pragma omp declare reduction(vec_double_plus : std::vector<double> : \ std::transform(omp_out.begin(), omp_out.end(), omp_in.begin(), omp_out.begin(), std::plus

How can I write a QuadWord from AVX512 register zmm26 to the rax register?

孤街浪徒 submitted on 2019-12-19 17:36:30
Question: I wish to perform integer arithmetic operations on quadword elements of the zmm0-31 register set and preserve the carry bit resulting from those operations. It appears this is only possible if the data is worked on in the general-purpose register set. Thus I would like to copy information from one of the zmm0-31 registers to one of the general-purpose registers. After working on the 64-bit data in the general-purpose register, I would like to return the data to the original zmm0-31 register in
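A minimal sketch of the scalar half of this idea, written in portable C rather than AVX-512 code. It assumes the eight quadwords have already been copied out of the vector register into an ordinary array; the helper names add_with_carry and add_512 are hypothetical. The carry that the vector instructions do not expose is recovered with an unsigned comparison.

```c
#include <stdint.h>

/* Adds b and carry_in (0 or 1) to a. Stores the 64-bit sum through `sum`
 * and returns the carry out (0 or 1). */
unsigned add_with_carry(uint64_t a, uint64_t b, unsigned carry_in,
                        uint64_t *sum)
{
    uint64_t partial = a + b;        /* wraps modulo 2^64 */
    unsigned carry = partial < a;    /* wrapped if the result got smaller */
    uint64_t total = partial + carry_in;
    carry |= total < partial;        /* carry produced by adding carry_in */
    *sum = total;
    return carry;
}

/* Treats two 8-lane arrays as 512-bit little-endian integers and computes
 * x += y in place. Returns the final carry out. */
unsigned add_512(uint64_t x[8], const uint64_t y[8])
{
    unsigned carry = 0;
    for (int i = 0; i < 8; ++i)
        carry = add_with_carry(x[i], y[i], carry, &x[i]);
    return carry;
}
```

On x86-64 a compiler may turn this pattern into an add followed by adc instructions, which is the general-purpose-register behavior the question is after. Moving the lanes between the zmm register and the array is left out here.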

perf-report show value of CPU register

穿精又带淫゛_ submitted on 2019-12-19 10:16:07
Question: I am following this document and using perf record with --intr-regs=ax,bx,r15 to log additional CPU register information with the PEBS record. But how do I view that information from perf.data? The original command is perf report, and it only shows a few fields such as overhead, command, shared object and symbol. Is there any way to show the CPU registers' values? Answer 1: Try the perf script data-dumping command with the iregs field: perf script -F ip,sym,iregs. All -F fields are documented with source code of

perf-report show value of CPU register

旧时模样 submitted on 2019-12-19 10:15:30
Question: I am following this document and using perf record with --intr-regs=ax,bx,r15 to log additional CPU register information with the PEBS record. But how do I view that information from perf.data? The original command is perf report, and it only shows a few fields such as overhead, command, shared object and symbol. Is there any way to show the CPU registers' values? Answer 1: Try the perf script data-dumping command with the iregs field: perf script -F ip,sym,iregs. All -F fields are documented with source code of

What is the impact SFENCE and LFENCE to caches of neighboring cores?

安稳与你 submitted on 2019-12-19 10:03:58
Question: From Herb Sutter's talk, in the figure on page 2 of the slides: https://skydrive.live.com/view.aspx?resid=4E86B0CF20EF15AD!24884&app=WordPdf&wdo=2&authkey=!AMtj_EflYn2507c The figure shows a separate L1 cache (L1S) and Store Buffer (SB). 1. In Intel x86 processors, are the L1 cache and the Store Buffer the same thing? As the next slide shows, on x86 only the following reordering is possible. Was: MOV eax, [memory1] // read MOV [memory2], edx // write ... // MOV, MFENCE, ADD ...
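A small sketch of where a full fence matters, assuming the usual description of the x86 memory model in which a store may be delayed in the store buffer past a later load from a different address. It uses C11 atomics rather than raw MFENCE instructions, and the flag and function names are hypothetical.

```c
#include <stdatomic.h>

/* One flag per thread; both start at 0 (static storage). */
atomic_int flag0;
atomic_int flag1;

/* Thread 0: announce intent, then check the other thread's flag.
 * Returns 1 if thread 1 has not announced intent yet. */
int thread0_try_enter(void)
{
    atomic_store_explicit(&flag0, 1, memory_order_relaxed);
    /* Full fence between the store and the load. Without it, the store
     * could still be waiting in the store buffer when the load executes.
     * On x86, compilers typically emit MFENCE or a LOCK-prefixed
     * instruction for this fence. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&flag1, memory_order_relaxed) == 0;
}

/* Thread 1: the mirror image of thread 0. */
int thread1_try_enter(void)
{
    atomic_store_explicit(&flag1, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&flag0, memory_order_relaxed) == 0;
}
```

With the fences in place, two threads running these functions concurrently cannot both get 1 back. SFENCE and LFENCE alone are not described as ordering a store against a later load, which is why the sketch relies on a full fence.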

making mistake in inline assembler in gcc [duplicate]

时光怂恿深爱的人放手 submitted on 2019-12-19 08:17:23
Question: This question already has answers here: How to get the CPU cycle count in x86_64 from C++? (4 answers) Closed last year. I have successfully written some inline assembler in gcc to rotate right by one bit, following some nice instructions: http://www.cs.dartmouth.edu/~sergey/cs108/2009/gcc-inline-asm.pdf Here's an example: static inline int ror(int v) { asm ("ror %0;" :"=r"(v) /* output */ :"0"(v) /* input */ ); return v; } However, I want code to count clock cycles, and have seen some in the
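For comparison, the one-bit rotate from the snippet can also be written without inline assembly. This is a sketch, not the question's code: it uses uint32_t instead of int to avoid shifting a signed value, and the name ror1 is hypothetical. GCC and Clang generally recognize this idiom and can emit a single rotate instruction for it.

```c
#include <stdint.h>

/* Rotates a 32-bit value right by one bit: the lowest bit moves to the
 * top, every other bit moves one position down. */
uint32_t ror1(uint32_t v)
{
    return (v >> 1) | (v << 31);
}
```

Because the compiler sees ordinary C here, it can still constant-fold and schedule around the operation, which it cannot do with an opaque asm statement.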

The inner workings of Spectre (v2)

谁说胖子不能爱 submitted on 2019-12-19 08:12:35
Question: I have done some reading about Spectre v2 and obviously you get the non-technical explanations. Peter Cordes has a more in-depth explanation but it doesn't fully address a few details. Note: I have never performed a Spectre v2 attack so I do not have hands-on experience. I have only read up about the theory. My understanding of Spectre v2 is that you make an indirect branch mispredict for instance if (input < data.size) . If the Indirect Target Array (which I'm not too sure of the

The inner workings of Spectre (v2)

删除回忆录丶 submitted on 2019-12-19 08:12:08
Question: I have done some reading about Spectre v2 and obviously you get the non-technical explanations. Peter Cordes has a more in-depth explanation but it doesn't fully address a few details. Note: I have never performed a Spectre v2 attack so I do not have hands-on experience. I have only read up about the theory. My understanding of Spectre v2 is that you make an indirect branch mispredict for instance if (input < data.size) . If the Indirect Target Array (which I'm not too sure of the

Understanding %rip register in intel assembly

余生颓废 submitted on 2019-12-19 06:02:25
Question: Consider the following small piece of code, which was shown in another post about the size of a structure and the possible ways to align data correctly: struct { char Data1; short Data2; int Data3; char Data4; } x; unsigned fun ( void ) { x.Data1=1; x.Data2=2; x.Data3=3; x.Data4=4; return(sizeof(x)); } I get the corresponding disassembly (64-bit): 0000000000000000 <fun>: 0: 55 push %rbp 1: 48 89 e5 mov %rsp,%rbp 4: c6 05 00 00 00 00 01 movb $0x1,0x0(%rip) # b <fun+0xb> b: 66 c7 05 00
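A short sketch of the layout behind that disassembly, assuming a typical x86-64 ABI where short is 2-byte aligned and int is 4-byte aligned. The struct tag sample is added here only so the type can be named; it is not in the question. Since fun sits at address 0, the listing is most likely of an unlinked object file, in which case the 0x0(%rip) displacements are placeholders that the linker later patches with the distance from the next instruction to each member of x.

```c
#include <stddef.h>

/* Same members as the question's anonymous struct. The offsets in the
 * comments are what a typical x86-64 ABI produces, not a C guarantee. */
struct sample {
    char  Data1;  /* offset 0, followed by 1 byte of padding */
    short Data2;  /* offset 2 */
    int   Data3;  /* offset 4 */
    char  Data4;  /* offset 8, followed by 3 bytes of tail padding */
};

struct sample x;

/* Mirrors the question's function: writes each member and returns the
 * size of the struct, including padding. */
unsigned fun(void)
{
    x.Data1 = 1;
    x.Data2 = 2;
    x.Data3 = 3;
    x.Data4 = 4;
    return (unsigned)sizeof(x);
}
```

Under those alignment assumptions fun returns 12, and after linking each store should address x plus the member offset, expressed relative to %rip.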