x86-64

My (AT&T) assembly (x86-64) code should increment but doesn't

◇◆丶佛笑我妖孽 submitted on 2019-12-29 09:05:04
Question: I'm trying to write a small program in assembly (AT&T syntax). I want to read an integer from the user, increment it, and then output the incremented value. However, the value doesn't increment. I've spent the last few hours trying everything I could come up with, but it still doesn't work, so I suspect I'm misunderstanding some assembly concept, which is why I can't spot the mistake. This is my code: hiString: .asciz "Hi\n" formatstr: .asciz "

How to interpret this address -0x80(%rbp,%rax,4)

空扰寡人 submitted on 2019-12-29 08:03:38
Question: I'm currently trying to learn assembly language (and the effects of different compiler options) by analyzing simple C code snippets. Now I stumbled across the following instruction: mov %edx,-0x80(%rbp,%rax,4) What I do not understand is the expression for the target address, -0x80(%rbp,%rax,4). The instruction assigns a value to a local array in a loop. Answer 1: The instruction copies the contents of %edx to the address given by %rbp + 4 * %rax - 0x80. It seems %rax is holding the index

Atomically clearing lowest non-zero bit of an unsigned integer

纵饮孤独 submitted on 2019-12-29 07:42:22
Question: I'm looking for the best way to clear the lowest non-zero bit of an unsigned atomic like std::atomic_uint64_t in a thread-safe fashion, without using an extra mutex or the like. In addition, I also need to know which bit got cleared. Example: Let's say the current value stored is 0b0110; I want to know that the lowest non-zero bit is bit 1 (0-indexed) and set the variable to 0b0100. The best version I came up with is this: #include <atomic> #include <cstdint> inline uint64_t with

How to optimize function return values in C and C++ on x86-64?

馋奶兔 submitted on 2019-12-29 06:43:20
Question: The x86-64 ABI specifies two return registers, rax and rdx, both 64 bits (8 bytes) in size. Assuming that x86-64 is the only targeted platform, which of these two functions: uint64_t f(uint64_t * const secondReturnValue) { /* Calculate a and b. */ *secondReturnValue = b; return a; } std::pair<uint64_t, uint64_t> g() { /* Calculate a and b, same as in f() above. */ return { a, b }; } would yield better performance, given the current state of C/C++ compilers targeting x86-64? Are there any
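A self-contained version of the two functions from the question, with placeholder values for `a` and `b` (the original elides the calculation). Under the System V x86-64 ABI, a trivially-copyable 16-byte struct such as `std::pair<uint64_t, uint64_t>` is classified for register return, so with optimization `g()` typically returns both values in rdx:rax, while `f()` must store one of them through the pointer:

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Out-parameter style: one value in rax, the other via a store.
uint64_t f(uint64_t* const secondReturnValue) {
    uint64_t a = 1, b = 2;  // placeholder calculation
    *secondReturnValue = b;
    return a;
}

// Pair style: both values are eligible for the rdx:rax pair.
std::pair<uint64_t, uint64_t> g() {
    uint64_t a = 1, b = 2;  // placeholder calculation
    return {a, b};
}

int main() {
    uint64_t b;
    uint64_t a = f(&b);
    auto p = g();
    assert(a == p.first && b == p.second);
}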

How can I change the device on which OpenCL code will be executed with UMat in OpenCV?

耗尽温柔 submitted on 2019-12-29 06:32:32
Question: As is known, OpenCV 3.0 supports the new class cv::UMat, which provides a Transparent API (TAPI) to use OpenCL automatically when it can: http://code.opencv.org/projects/opencv/wiki/Opencv3#tapi There are two introductions to cv::UMat and TAPI: Intel: https://software.intel.com/en-us/articles/opencv-30-architecture-guide-for-intel-inde-opencv AMD: http://developer.amd.com/community/blog/2014/10/15/opencv-3-0-transparent-api-opencl-acceleration/ But if I have: Intel CPU Core i5 (Haswell) 4xCores
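One way to steer the T-API toward a specific device without touching the code is OpenCV's `OPENCV_OPENCL_DEVICE` environment variable; the exact accepted format can vary by OpenCV version, so treat the values below as illustrative:

```shell
# Select which OpenCL device OpenCV's Transparent API uses.
# Rough format: <platform name fragment>:<CPU|GPU>:<device index or name>
export OPENCV_OPENCL_DEVICE='Intel:GPU:0'   # e.g. the Intel iGPU
# export OPENCV_OPENCL_DEVICE=':GPU:1'      # e.g. a second GPU, any platform
```

Setting the variable to `disabled` (or calling `cv::ocl::setUseOpenCL(false)` in code) falls back to the plain CPU path.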

What is long double on x86-64?

别说谁变了你拦得住时间么 submitted on 2019-12-29 06:21:41
Question: Someone told me that: Under x86-64, FP arithmetic is done with SSE, and therefore long double is 64 bits. But the x86-64 ABI says: C type long double | sizeof 16 | alignment 16 | AMD64 architecture: 80-bit extended (IEEE-754). See: amd64-abi.pdf. And gcc says sizeof(long double) is 16 and gives DBL_MAX = 1.79769e+308 and LDBL_MAX = 1.18973e+4932. So I'm confused: how is long double 64 bits? I thought it was an 80-bit representation. Answer 1: Under x86-64, FP arithmetic is done with SSE

x64: Why does this piece of code give me “Address boundary error”

穿精又带淫゛_ submitted on 2019-12-29 01:52:53
Question: Why does the following x64 assembly give me an "Address boundary error"? It only happens when I add code after call _print_string. I assume that some of the registers have been modified, but aren't they supposed to be restored once the _print_string function returns? I am using Mac OS X. obj_size = 8 .data hello_world: .asciz "hello world!" .text .globl _main _main: pushq %rbp movq %rsp, %rbp leaq hello_world(%rip), %rdi callq _print_string subq obj_size, %rsp movq 1, %rax movq %rax, obj_size(%rsp

x86-64 Big Integer Representation?

♀尐吖头ヾ submitted on 2019-12-29 01:16:07
Question: How do high-performance native big-integer libraries on x86-64 represent a big integer in memory? (Or does it vary? Is there a most common way?) Naively, I was thinking about storing them as 0-terminated strings of digits in base 2^64. For example, suppose X is in memory as: [8 bytes] Dn ... [8 bytes] D2 [8 bytes] D1 [8 bytes] D0 [8 bytes] 0. Let B = 2^64. Then X = Dn*B^n + ... + D2*B^2 + D1*B^1 + D0. The empty string (i.e. 8 bytes of zero) means zero. Is this a reasonable way? What are

When should I use size directives in x86?

假装没事ソ submitted on 2019-12-29 00:07:14
Question: When to use size directives in x86 seems a bit ambiguous. This x86 assembly guide says the following: In general, the intended size of the data item at a given memory address can be inferred from the assembly code instruction in which it is referenced. For example, in all of the above instructions, the size of the memory regions could be inferred from the size of the register operand. When we were loading a 32-bit register, the assembler could infer that the region of memory we were

Can you enter x64 32-bit “long compatibility sub-mode” outside of kernel mode?

≡放荡痞女 submitted on 2019-12-28 13:34:12
Question: This might be an exact duplicate of "Is it possible to execute 32-bit code in a 64-bit process by doing mode-switching?", but that question is from a year ago and has only one answer, which doesn't give any source code. I'm hoping for more detailed answers. I'm running 64-bit Linux (Ubuntu 12.04, if it matters). Here's some code that allocates a page, writes some 64-bit code into it, and executes that code. #include <assert.h> #include <malloc.h> #include <stdio.h> #include <sys/mman.h> // mprotect