x86-64 | 易学教程

Meaning of double underscore in the beginning

In the standard library (glibc) I see functions defined with leading double underscores, such as __mmap in sys/mman.h. What is the purpose? And how can we still call a function mmap, which doesn't seem to be declared anywhere? I mean, we include sys/mman.h for that, but sys/mman.h doesn't declare mmap; it declares only __mmap. From GNU's manual: In addition to the names documented in this manual, reserved names include all external identifiers (global functions and variables) that begin with an underscore (‘_’) and all identifiers regardless of use that begin with either two underscores or an
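
One mechanism that can make an internal implementation reachable under a plain public name is a symbol alias; glibc's source tree uses a weak_alias macro for this kind of pairing. The sketch below illustrates the idea only. The names impl_round_up and round_up are hypothetical, the attribute syntax is a GCC/Clang extension that assumes an ELF target such as Linux, and user code deliberately avoids the reserved double-underscore spelling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "internal" implementation. A C library would give this a
   reserved name such as __round_up; user code must not, so it is spelled
   impl_round_up here. */
size_t impl_round_up(size_t n, size_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

/* Public name: a weak alias that resolves to the same code.
   Being weak, it can be overridden by another definition at link time,
   while internal callers keep using impl_round_up. */
extern __typeof__(impl_round_up) round_up
    __attribute__((weak, alias("impl_round_up")));
```

Calling round_up(5, 4) runs the body of impl_round_up. The library itself can keep calling the internal name, so it is unaffected if a program defines its own function with the public name.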

Repeated integer division by a runtime constant value

At some point in my program I compute an integer divisor d. From that point onward d is going to be constant. Later in the code I will divide by that d several times, performing an integer division each time, since the value of d is not a compile-time known constant. Given that integer division is a relatively slow process compared to other kinds of integer arithmetic, I would like to optimize it. Is there some alternative format that I could store d in, so that the division process would perform faster? Maybe a reciprocal of some form? I do not need the value of d for anything else. The value of d is
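
A minimal sketch of the "reciprocal of some form" idea, assuming unsigned 32-bit operands and a compiler that provides unsigned __int128 (GCC or Clang on x86-64). It precomputes ceil(2^64 / d) once and then replaces each division with a 64x64-bit multiply and a shift. The names fastdiv_t, fastdiv_init, and fastdiv are hypothetical; libraries such as libdivide offer a more general version of the same technique.

```c
#include <assert.h>
#include <stdint.h>

/* Precomputed form of the divisor (hypothetical type for this sketch). */
typedef struct {
    uint64_t magic;  /* ceil(2^64 / d); wraps to 0 when d == 1 */
    uint32_t d;      /* kept only to special-case d == 1 */
} fastdiv_t;

/* Do the one expensive division up front. */
static fastdiv_t fastdiv_init(uint32_t d)
{
    assert(d != 0);
    fastdiv_t f = { UINT64_MAX / d + 1, d };
    return f;
}

/* n / d computed as the high 64 bits of magic * n. */
static uint32_t fastdiv(uint32_t n, fastdiv_t f)
{
    if (f.d == 1)
        return n;  /* 2^64 does not fit in 64 bits, so handle d == 1 directly */
    return (uint32_t)(((unsigned __int128)f.magic * n) >> 64);
}
```

Typical use is to call fastdiv_init(d) once after d is known and fastdiv(n, f) wherever n / d appeared. Whether this beats the hardware divider depends on the CPU, so it is worth benchmarking on the target machine.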

Examining C/C++ Heap memory statistics in gdb

I'm trying to investigate the state of the C/C++ heap from within gdb on Linux amd64. Is there a nice way to do this? One approach I've tried is to "call mallinfo()", but unfortunately I can't then extract the values I want, since gdb doesn't deal with the return value properly. I'm not easily able to write a function to be compiled into the binary for the process I am attached to, so I can't simply implement my own function to extract the values by calling mallinfo() in my own code. Is there perhaps a clever trick that will allow me to do this on-the-fly? Another option could be to

CPU Cycle count based profiling in C/C++ Linux x86_64

I am using the following code to profile my operations, so that I can optimize the CPU cycles taken in my functions.

    static __inline__ unsigned long GetCC(void)
    {
        unsigned a, d;
        asm volatile("rdtsc" : "=a" (a), "=d" (d));
        return ((unsigned long)a) | (((unsigned long)d) << 32);
    }

I don't think it is the best, since even two consecutive calls give me a difference of "33". Any suggestions?

I personally think the rdtsc instruction is great and usable for a variety of tasks. I do not think that using cpuid is necessary to prepare for rdtsc. Here is how I reason around rdtsc: Since I use the Watcom compiler I
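
A nonzero difference between two back-to-back reads is expected, because reading the counter itself takes time. One common approach, sketched here under assumptions, is to measure that fixed cost and subtract it from later measurements. The helper names tsc_now and tsc_overhead are hypothetical, the intrinsics come from x86intrin.h (GCC/Clang on x86-64), and the lfence fencing is one possible design choice rather than a requirement.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc, _mm_lfence */

/* Read the time-stamp counter. The fences keep earlier and later
   instructions from being reordered around the read. */
static inline uint64_t tsc_now(void)
{
    _mm_lfence();
    uint64_t t = __rdtsc();
    _mm_lfence();
    return t;
}

/* Smallest observed cost of an empty measurement over `trials` attempts.
   Returns UINT64_MAX if trials <= 0. Taking the minimum filters out
   interrupts and other one-off disturbances. */
static uint64_t tsc_overhead(int trials)
{
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < trials; i++) {
        uint64_t t0 = tsc_now();
        uint64_t t1 = tsc_now();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}
```

To time a region, read tsc_now() before and after it and subtract both the start value and tsc_overhead(1000). On most recent CPUs the counter ticks at a constant reference rate, so the result is in reference ticks rather than actual core clock cycles.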

How to use gdb with LD_PRELOAD

Question: I run a program with a specific library preloaded through LD_PRELOAD, like this:

    LD_PRELOAD=./my.so ./my_program

How do I run this program with gdb?

Answer 1: Do the following.

    gdb your_program
    (gdb) set environment LD_PRELOAD ./yourso.so
    (gdb) start

Answer 2: Posting because we ran into a case where set environment didn't work. From the GDB documentation:

    set exec-wrapper wrapper
    show exec-wrapper
    unset exec-wrapper

When ‘exec-wrapper’ is set, the specified wrapper is used to launch programs for debugging. gdb starts

What is the difference between x64 and IA-64?

I was on Microsoft's website and noticed two different installers, one for x64 and one for IA-64. Reference: Installing the .NET Framework 4.5, 4.5.1. My understanding is that IA-64 is a subclass of x64, so I'm curious why it would have a separate installer.

x64 is used as shorthand for the 64-bit extensions of the "classical" x86 architecture; almost any "normal" PC produced in recent years has a processor based on this architecture. AMD invented the AMD64 extensions; Intel was more or less forced to implement them, and called them first IA-32e, then EM64T and finally Intel 64 (actually,

What causes page faults?

According to Wikipedia: A page fault is a trap to the software raised by the hardware when a program accesses a page that is mapped in the virtual address space, but not loaded in physical memory. (emphasis mine) Okay, that makes sense. But if that's the case, why is it that whenever the process information in Process Hacker is refreshed, I see about 15 page faults? Or in other words, why is any memory getting paged out? (I have no idea if it's user or kernel memory.) I have no page file, and the RAM usage is about 1.2 GB out of 4 GB, which is after a clean reboot. There's no shortage of any
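
A page fault does not have to mean that something was paged out to disk. Freshly allocated memory is usually mapped lazily, so the first touch of each page can fault even when plenty of RAM is free. The sketch below illustrates this on a POSIX system (the question is about Windows, so this is an analogy, not a reproduction). The function name minor_faults_touching is hypothetical, and the exact count it returns depends on the kernel and its configuration.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Map `pages` fresh anonymous pages, touch each one once, and return how
   many minor (no disk I/O) page faults the process recorded while doing so.
   Returns -1 if the mapping could not be created. */
static long minor_faults_touching(size_t pages)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = pages * page;
    volatile unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    struct rusage before, after;
    getrusage(RUSAGE_SELF, &before);
    for (size_t i = 0; i < len; i += page)
        p[i] = 1;  /* first write to each page triggers demand allocation */
    getrusage(RUSAGE_SELF, &after);

    munmap((void *)p, len);
    return after.ru_minflt - before.ru_minflt;
}
```

On a typical Linux system, minor_faults_touching(256) reports a count on the order of the number of pages touched, with no swap activity involved.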

Intel x86 vs x64 system call

Question: I'm reading about the difference in assembly between x86 and x64. On x86, the system call number is placed in eax, then int 80h is executed to generate a software interrupt. But on x64, the system call number is placed in rax, then syscall is executed. I'm told that syscall is lighter and faster than generating a software interrupt. Why is it faster on x64 than on x86, and can I make a system call on x64 using int 80h?

Answer 1: General part. EDIT: Linux irrelevant parts removed. While not totally
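
The x64 convention described in the question can be sketched with GCC/Clang inline assembly. This assumes Linux on x86-64, where the call number goes in rax, the result comes back in rax, and the syscall instruction overwrites rcx and r11. The helper name raw_syscall0 is hypothetical and handles only calls that take no arguments.

```c
#include <sys/syscall.h>  /* SYS_getpid */
#include <unistd.h>       /* getpid, used only for comparison */

/* Issue a zero-argument system call directly with the syscall instruction. */
static long raw_syscall0(long nr)
{
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)                /* result returned in rax */
                      : "a"(nr)                  /* call number passed in rax */
                      : "rcx", "r11", "memory"); /* clobbered by syscall */
    return ret;
}
```

For example, raw_syscall0(SYS_getpid) should return the same value as the libc wrapper getpid().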

To learn assembly - should I start with 32 bit or 64 bit?

I really want to learn assembly. I'm pretty good at C/C++, but want a better understanding of what's going on at a lower level. I realize that assembly-related questions have been asked before, but I'm just looking for some direction that's particular to my situation: I'm running Windows 7, and am confused about how I should start working with assembly. Do I have to start with x64 because I'm running Windows 7? Some people have said 'start with 32 bit first'. How do I go about doing this? What does my operating system have to do with my ability to write assembly for '32' or '64' bit? In

Why would introducing useless MOV instructions speed up a tight loop in x86_64 assembly?

Background: While optimizing some Pascal code with embedded assembly language, I noticed an unnecessary MOV instruction, and removed it. To my surprise, removing the unnecessary instruction caused my program to slow down. I found that adding arbitrary, useless MOV instructions increased performance even further. The effect is erratic, and changes based on execution order: the same junk instructions transposed up or down by a single line produce a slowdown. I understand that the CPU does all kinds of optimizations and streamlining, but this seems more like black magic. The data: A version