x86-64

How can I change the device on which OpenCL code will be executed with UMat in OpenCV?

Submitted by 邮差的信 on 2019-11-29 05:22:33
As is known, OpenCV 3.0 supports the new class cv::UMat, which provides a Transparent API (TAPI) that uses OpenCL automatically when it can: http://code.opencv.org/projects/opencv/wiki/Opencv3#tapi There are two introductions to cv::UMat and TAPI: Intel: https://software.intel.com/en-us/articles/opencv-30-architecture-guide-for-intel-inde-opencv AMD: http://developer.amd.com/community/blog/2014/10/15/opencv-3-0-transparent-api-opencl-acceleration/ But what if I have: an Intel Core i5 CPU (Haswell) with 4 cores (OpenCL on Intel CPUs with SSE 4.1, SSE 4.2 or AVX support) and Intel Integrated HD Graphics, which supports OpenCL …
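The usual answer is that the device is bound before the first OpenCL call, typically via the OPENCV_OPENCL_DEVICE environment variable (e.g. ":GPU:0" or "Intel:CPU:0"). A minimal sketch for checking which device the T-API ended up with — the environment-variable format is as documented for OpenCV 3.x, so verify it against your build:

    #include <opencv2/core.hpp>
    #include <opencv2/core/ocl.hpp>
    #include <iostream>

    int main() {
        // Set OPENCV_OPENCL_DEVICE (e.g. ":GPU:0") in the environment
        // before this point; the first OpenCL call binds the device.
        if (!cv::ocl::haveOpenCL()) {
            std::cout << "OpenCL is not available\n";
            return 0;
        }
        cv::ocl::setUseOpenCL(true);  // enable the transparent API
        cv::ocl::Device dev = cv::ocl::Device::getDefault();
        std::cout << "Using: " << dev.name()
                  << " (" << dev.vendorName() << ")\n";
        return 0;
    }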

Why does printf print a random value with a float argument and an integer format specifier?

Submitted by 心不动则不痛 on 2019-11-29 05:08:48
I wrote a simple program on a 64-bit machine: int main() { printf("%d", 2.443); } So, this is how the compiler will behave. It will identify the second argument to be a double, hence it will push 8 bytes on the stack, or possibly just use registers across calls to access the variables. %d expects a 4-byte integer value, hence it prints some garbage value. What is interesting is that the value printed changes every time I execute this program. So what is happening? I expected it to print the same garbage value every time, not a different one on each run. It's undefined behaviour, of course, to pass …
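On x86-64 the run-to-run variation has a concrete mechanism: in the System V calling convention a double argument travels in an XMM register, while %d makes printf read a general-purpose register, so it prints whatever stale value happens to sit there (which shifts with ASLR and environment layout). A sketch of the well-defined alternatives:

    #include <cstdio>

    int main() {
        double d = 2.443;
        // std::printf("%d", d);      // undefined: %d reads an integer
        //                            // register, but d went in %xmm0
        std::printf("%f\n", d);       // match the specifier to the type
        std::printf("%d\n", (int)d);  // or convert explicitly first
        return 0;
    }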

How to get a “backtrace” (like gdb) using only ptrace (Linux, x86/x86_64)

Submitted by 一个人想着一个人 on 2019-11-29 05:06:31
I want to get backtrace-like output, as gdb produces, but I want to do this via ptrace() directly. My platform is Linux, x86 (and, later, x86_64). For now I only want to read return addresses from the stack, without converting them into symbol names. So, for a test program compiled with -O0 by gcc-4.5:

    int g() { kill(getpid(), SIGALRM); }
    int f() { int a; int b; a = g(); b = a; return a+b; }
    int e() { int c; c = f(); }
    main() { return e(); }

I will start my program and attach with ptrace to the test program at the very beginning. Then I will do PTRACE_CONT and wait for the signal. When the test program does …
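At -O0 every function keeps a frame pointer, so a minimal sketch (error handling abbreviated) is to read RIP/RBP after the tracee stops, then follow the saved-RBP chain, fetching the return address stored just above each saved frame pointer:

    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/user.h>
    #include <cerrno>
    #include <cstdio>

    void print_backtrace(pid_t pid) {
        struct user_regs_struct regs;
        ptrace(PTRACE_GETREGS, pid, 0, &regs);
        std::printf("pc: %#llx\n", (unsigned long long)regs.rip);

        unsigned long long rbp = regs.rbp;
        while (rbp != 0) {
            // [rbp] = caller's saved rbp, [rbp+8] = return address (x86_64)
            errno = 0;
            long ret  = ptrace(PTRACE_PEEKDATA, pid, (void *)(rbp + 8), 0);
            long next = ptrace(PTRACE_PEEKDATA, pid, (void *)rbp, 0);
            if (errno != 0) break;        // walked off the mapped stack
            std::printf("return address: %#lx\n", ret);
            rbp = (unsigned long long)next;
        }
    }

On 32-bit x86 the offsets are 4 instead of 8 and the registers are eip/ebp.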

What is long double on x86-64?

Submitted by 心不动则不痛 on 2019-11-29 04:26:37
Someone told me that: "Under x86-64, FP arithmetic is done with SSE, and therefore long double is 64 bits." But the x86-64 ABI says:

    C type      | sizeof | alignment | AMD64 architecture
    long double | 16     | 16        | 80-bit extended (IEEE-754)

See: amd64-abi.pdf. And gcc says sizeof(long double) is 16 and gives DBL_MAX = 1.79769e+308 and LDBL_MAX = 1.18973e+4932. So I'm confused: how is long double 64 bits? I thought it was an 80-bit representation. "Under x86-64, FP arithmetic is done with SSE, and therefore long double is 64 bits." That's what usually happens under x86-64 (where the presence of SSE …
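The two claims reconcile like this: float and double arithmetic does go through SSE, but long double is still the 80-bit x87 extended type, stored in 16 bytes purely for alignment (6 bytes are padding). A quick check:

    #include <cfloat>
    #include <cstdio>

    int main() {
        // 80-bit x87 extended format: 64-bit mantissa, padded to 16 bytes
        std::printf("sizeof(long double) = %zu\n", sizeof(long double));
        std::printf("LDBL_MANT_DIG = %d (64 => x87 extended)\n", LDBL_MANT_DIG);
        std::printf("DBL_MAX  = %e\n",  DBL_MAX);
        std::printf("LDBL_MAX = %Le\n", LDBL_MAX);
        return 0;
    }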

x86_64: Is it possible to “in-line substitute” PLT/GOT references?

Submitted by 风格不统一 on 2019-11-29 02:26:02
I'm not sure what a good subject line for this question is, but here we go... In order to force code locality / compactness for a critical section of code, I'm looking for a way to call a function in an external (dynamically-loaded) library through a "jump slot" (an ELF R_X86_64_JUMP_SLOT relocation) directly at the call site - what the linker ordinarily puts into the PLT / GOT, but inlined right at the call site. If I emulate the call like:

    #include <stdio.h>
    int main(int argc, char **argv) {
        asm ("push $1f\n\t"
             "jmp *0f\n\t"
             "0: .quad %P0\n"
             "1:\n\t"
             : : "i"(printf), "D"("Hello, …
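For what it's worth, toolchains now have a supported switch for much of this: building with -fno-plt makes gcc/clang call through the GOT entry at each call site instead of bouncing through a PLT stub. A sketch (the exact assembly shown is my reading of gcc's output and may vary by version):

    // $ g++ -O2 -fno-plt -S noplt.cpp
    // the call then looks like:
    //     call *printf@GOTPCREL(%rip)   // instead of: call printf@PLT
    #include <cstdio>

    int main() {
        std::printf("Hello, world\n");
        return 0;
    }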

Python ctypes and function calls

Submitted by 主宰稳场 on 2019-11-29 02:20:15
My friend produced a small proof-of-concept assembler that worked on x86. I decided to port it to x86_64 as well, but I immediately hit a problem. I wrote a small program in C, then compiled and objdumped the code. After that I inserted it into my Python script, so the x86_64 code should be correct:

    from ctypes import cast, CFUNCTYPE, c_char_p, c_long
    buffer = ''.join(map(chr, [
        # 0000000000000000 <add>:
        0x55,                    # push %rbp
        0x48, 0x89, 0xe5,        # mov  %rsp,%rbp
        0x48, 0x89, 0x7d, 0xf8,  # mov  %rdi,-0x8(%rbp)
        0x48, 0x8b, 0x45, 0xf8,  # mov  -0x8(%rbp),%rax
        0x48, 0x83, 0xc0, 0x0a,  # add  $0xa,%rax
        …
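The usual failure mode with this approach is that the bytes sit in ordinary non-executable memory, so the call faults once NX is enforced; the fix is to place the code in a page mapped with PROT_EXEC. A sketch of the same technique written directly against mmap — the trailing pop/ret epilogue is my assumed completion of the truncated dump:

    #include <cstdio>
    #include <cstring>
    #include <sys/mman.h>

    typedef long (*add_fn)(long);

    int main() {
        unsigned char code[] = {
            0x55,                    // push %rbp
            0x48, 0x89, 0xe5,        // mov  %rsp,%rbp
            0x48, 0x89, 0x7d, 0xf8,  // mov  %rdi,-0x8(%rbp)
            0x48, 0x8b, 0x45, 0xf8,  // mov  -0x8(%rbp),%rax
            0x48, 0x83, 0xc0, 0x0a,  // add  $0xa,%rax
            0x5d,                    // pop  %rbp   (assumed epilogue)
            0xc3                     // ret
        };
        void *mem = mmap(NULL, sizeof(code),
                         PROT_READ | PROT_WRITE | PROT_EXEC,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (mem == MAP_FAILED) return 1;
        std::memcpy(mem, code, sizeof(code));
        add_fn add = (add_fn)mem;
        std::printf("add(32) = %ld\n", add(32));  // prints 42
        munmap(mem, sizeof(code));
        return 0;
    }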

Is incrementing an integer atomic on x86? [duplicate]

Submitted by 本秂侑毒 on 2019-11-29 02:18:28
Question: This question already has answers here: Can num++ be atomic for 'int num'? (13 answers). Closed 3 years ago. On a multicore x86 machine, say a thread executing on core 1 increments an integer variable a at the same time a thread on core 2 also increments it. Given that the initial value of a was 0, would it always be 2 in the end, or could it have some other value? Assume that a is declared volatile and that we are not using atomic variables (such as C++'s atomic<> or the built-in atomic operations) …
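The short answer is that it can end up as 1: a++ is a load-modify-store, so both cores can read 0 and both write back 1; volatile does not change that, only a lock-prefixed read-modify-write does. A sketch of the guaranteed-atomic version:

    #include <atomic>
    #include <cstdio>
    #include <thread>

    int main() {
        // std::atomic forces a lock-prefixed RMW (e.g. lock xadd) on x86,
        // so concurrent increments cannot be lost.
        std::atomic<int> a{0};
        std::thread t1([&] { a.fetch_add(1); });
        std::thread t2([&] { a.fetch_add(1); });
        t1.join(); t2.join();
        std::printf("a = %d\n", a.load());  // always 2
        return 0;
    }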

How to get `gcc` to generate `bts` instruction for x86-64 from standard C?

Submitted by 倾然丶 夕夏残阳落幕 on 2019-11-29 01:23:52
Inspired by a recent question, I'd like to know if anyone knows how to get gcc to generate the x86-64 bts instruction (bit test and set) on Linux x86-64 platforms, without resorting to inline assembly or to nonstandard compiler intrinsics. Related questions: Why doesn't gcc do this for a simple |= operation where the right-hand side has exactly 1 bit set? How to get bts using compiler intrinsics or the asm directive. Portability is more important to me than bts, so I won't use an asm directive, and if there's another solution, I'd prefer not to use compiler intrinsics. EDIT: The C source …
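One pattern that tends to produce bts is masking the shift count so the standard-C expression has exactly the semantics of the instruction. With a reasonably recent gcc at -O2 this compiles to a register bts on x86-64 in my experience, but it is worth verifying with gcc -O2 -S, since older versions emit shift+or instead (which is what prompted the question):

    #include <cstdint>

    // word | (1 << n) with the count masked to 0..63, matching bts semantics
    uint64_t set_bit(uint64_t word, unsigned n) {
        return word | (1ULL << (n & 63));
    }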

Basic OS X Assembly and the Mach-O format

Submitted by 六月ゝ 毕业季﹏ on 2019-11-29 00:39:34
I am interested in programming in x86-64 assembly on the Mac OS X platform. I came across this page about creating a 248B Mach-O program, which led me to Apple's own Mach-O format reference. After that I thought I'd build that same simple C program in Xcode and check out the generated assembly. This was the code:

    int main(int argc, const char * argv[]) {
        return 42;
    }

But the generated assembly was 334 lines, containing (judging by the 248B model) a lot of excess content. Firstly, why is so much DWARF debug info included in the Release build of a C executable? Secondly, I notice the Mach-O …
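Most of those 334 lines are assembler directives and debug metadata rather than instructions. A quick way to see the difference is to compile without debug info and strip the result (the flags below are a sketch; exact output and sizes depend on the toolchain):

    // $ clang -Os -fno-asynchronous-unwind-tables -S main.c   // far shorter .s
    // $ clang -Os main.c -o main && strip main                // drop symbols
    int main(int argc, const char *argv[]) {
        return 42;  // the same program from the question
    }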

Find which assembly instruction caused an Illegal Instruction error without debugging

Submitted by 送分小仙女□ on 2019-11-28 21:05:27
Question: While running a program I've written in assembly, I get an Illegal Instruction error. Is there a way to know which instruction is causing the error without debugging? The machine I'm running on does not have a debugger or any development system. In other words, I compile on one machine and run on another. I cannot test my program on the machine I compile on, because it doesn't support SSE4.2; the machine I'm running the program on does support SSE4.2 instructions.
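Two debugger-free options: enable core dumps (ulimit -c unlimited) and inspect the dump later on the build machine, or install a SIGILL handler that reports the faulting address, which can then be matched against an objdump -d listing. A minimal sketch of the handler approach:

    #include <csignal>
    #include <cstdio>
    #include <cstdlib>

    static void on_sigill(int, siginfo_t *info, void *) {
        // si_addr is the address of the offending instruction; look it
        // up in `objdump -d ./program` to find the exact opcode.
        std::fprintf(stderr, "SIGILL at %p\n", info->si_addr);
        std::_Exit(1);
    }

    int main() {
        struct sigaction sa = {};
        sa.sa_sigaction = on_sigill;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGILL, &sa, nullptr);
        // ... run the suspect (e.g. SSE4.2) code here ...
        return 0;
    }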