x86-64

How to tell length of an x86-64 instruction opcode using CPU itself?

亡梦爱人 submitted on 2020-06-08 12:19:13
Question: I know that there are libraries that can "parse" binary machine code / opcodes to tell the length of an x86-64 CPU instruction. But I'm wondering, since the CPU has internal circuitry to determine this, is there a way to use the processor itself to tell the instruction size from binary code? (Maybe even a hack?)
Answer 1: The Trap Flag (TF) in EFLAGS/RFLAGS makes the CPU single-step, i.e. take an exception after running one instruction. So if you write a debugger, you can use the CPU's single-stepping

How to execute a call instruction with a 64-bit absolute address?

旧城冷巷雨未停 submitted on 2020-06-08 06:14:05
Question: I am trying to call a function from machine code; the function should have an absolute address once compiled and linked. I am creating a function pointer to the desired function and trying to pass that to the call instruction, but I noticed that the call instruction takes at most a 16- or 32-bit address. Is there a way to call an absolute 64-bit address? I am targeting the x86-64 architecture and using NASM to generate the machine code. I could work with a 32-bit address if I could be
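One common way around the limit described in the question is to load the 64-bit address into a register and make an indirect call through it (in NASM syntax, `mov rax, target` followed by `call rax`). The sketch below only illustrates the byte encoding of that pair. The helper name `emit_call_abs64` and the choice of `rax` as the scratch register are assumptions made for this example, not something taken from the truncated excerpt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (name chosen for this sketch): writes the encoding of
 *     mov  rax, imm64
 *     call rax
 * into buf and returns the number of bytes written (always 12).
 * buf must have room for at least 12 bytes.
 */
static size_t emit_call_abs64(uint8_t *buf, uint64_t target)
{
    size_t n = 0;
    int i;

    assert(buf != NULL);

    buf[n++] = 0x48;                 /* REX.W prefix: 64-bit operand size   */
    buf[n++] = 0xB8;                 /* opcode B8+rd with rd = rax          */
    for (i = 0; i < 8; i++)          /* 64-bit immediate, little-endian     */
        buf[n++] = (uint8_t)(target >> (8 * i));

    buf[n++] = 0xFF;                 /* opcode FF /2: indirect near call    */
    buf[n++] = 0xD0;                 /* ModRM 11 010 000: register rax      */
    return n;
}
```

The sequence overwrites `rax`, so any value the caller still needs in that register has to be saved first. To actually run the bytes, they would also have to sit in memory mapped as executable.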

Is an extra move somehow faster when doing division-by-multiplication?

霸气de小男生 submitted on 2020-05-26 17:24:11
Question: Consider this function:

    unsigned long f(unsigned long x) { return x / 7; }

With -O3, Clang turns the division into a multiplication, as expected:

    f:                                      # @f
        movabs  rcx, 2635249153387078803
        mov     rax, rdi
        mul     rcx
        sub     rdi, rdx
        shr     rdi
        lea     rax, [rdi + rdx]
        shr     rax, 2
        ret

GCC does basically the same thing, except for using rdx where Clang uses rcx. But they both appear to be doing an extra move. Why not this instead?

    f:
        movabs  rax, 2635249153387078803
        mul     rdi
        sub     rdi, rdx
        shr     rdi
        lea     rax, [rdi + rdx
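For readers who want to follow the arithmetic in the Clang listing, here is a rough C transcription of the same steps. It is only a sketch: the helper names `div7_by_mul` and `div7_self_check` are made up for this example, and it relies on the `unsigned __int128` extension available in GCC and Clang to get the high half of the 64x64-bit product that `mul` leaves in `rdx`.

```c
#include <assert.h>
#include <stdint.h>

/* Magic constant copied from the compiler output (0x2492492492492493). */
#define DIV7_MAGIC UINT64_C(2635249153387078803)

/* Hypothetical helper: x / 7 using the same steps as the Clang listing. */
static uint64_t div7_by_mul(uint64_t x)
{
    /* mul rcx: the high 64 bits of the 128-bit product end up in rdx. */
    uint64_t hi = (uint64_t)(((unsigned __int128)x * DIV7_MAGIC) >> 64);

    /* sub rdi, rdx ; shr rdi */
    uint64_t t = (x - hi) >> 1;

    /* lea rax, [rdi + rdx] ; shr rax, 2 */
    return (t + hi) >> 2;
}

/* Hypothetical self-check: compares against the ordinary division. */
static void div7_self_check(void)
{
    const uint64_t samples[] = {
        0, 6, 7, 13, 14, 100, UINT64_MAX - 1, UINT64_MAX
    };
    unsigned i;

    for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(div7_by_mul(samples[i]) == samples[i] / 7);
}
```

The subtract, halve and add steps are needed because the full multiplier for 7 does not fit in 64 bits; the constant above is its low 64 bits, and the extra steps account for the missing top bit.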

Convert float to int64_t while preserving ordering

。_饼干妹妹 submitted on 2020-05-13 23:41:07
Question: My question is similar to this question, which deals with positive floating-point values. In my case, I'm dealing with both positive and negative float values, and want to store them in an int64_t type. NOTE: I wish to use memcpy rather than relying on a union (which is UB in C++).
Answer 1: As described in my comment on the linked question about a 32-bit variant: ...basically you can either use a signed int32 and invert the low 31 bits if the sign bit is set. A similar approach works if you want
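The signed-int32 variant quoted in the answer can be sketched roughly as follows, using memcpy as the question asks. The function name `float_to_ordered_i64` is hypothetical, widening the result to int64_t by plain sign extension is an assumption of this sketch, and NaN inputs are not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: maps a float to an integer key so that comparing
 * keys as signed integers gives the same order as comparing the floats.
 * Assumes a 32-bit IEEE-754 float; NaNs are outside the scope of the sketch.
 */
static int64_t float_to_ordered_i64(float f)
{
    int32_t bits;

    assert(sizeof bits == sizeof f);
    memcpy(&bits, &f, sizeof bits);   /* copy the raw bit pattern */

    /*
     * A negative float has the sign bit set, so its pattern is already a
     * negative integer, but a larger magnitude gives a larger integer.
     * Inverting the low 31 bits reverses the order among negatives.
     */
    if (bits < 0)
        bits ^= INT32_C(0x7FFFFFFF);

    return (int64_t)bits;             /* sign-extend into the wider type */
}
```

With this mapping -0.0f and 0.0f get different keys (-1 and 0), even though the two floats compare equal.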

Why does Linux favor 0x7f mappings?

陌路散爱 submitted on 2020-05-13 06:30:08
Question: By running a simple less /proc/self/maps, I see that most mappings start with 55 and 7F. I have also noticed these ranges being used whenever I debug any binary. In addition, this comment here suggests that the kernel does indeed have some range preference. Why is that? Is there some deeper technical reason for the above ranges? Will there be a problem if I manually mmap pages outside of these prefixes?
Answer 1: First and foremost, assuming that you are talking about x86-64, we can see that the virtual

Linux Kernel: manually modify page table entry flags

时光总嘲笑我的痴心妄想 submitted on 2020-05-13 05:33:48
Question: I am trying to manually mark a certain memory region of a userspace process as non-cacheable (for educational purposes, not intended to be used in production code) by setting a flag in the respective page table entries. I have an Ubuntu 14.04 system (ASLR disabled) with a 4.4 Linux kernel running on an x86_64 Intel Skylake processor. In my kernel module I have the following function:

    /*
     * Set memory region [start,end], excluding 'addr', of process with PID 'pid' as uncacheable.
     */
    ssize_t set

Find which pages are no longer shared with copy-on-write

£可爱£侵袭症+ submitted on 2020-05-11 01:02:35
Question: Say I have a process in Linux from which I fork() another identical process. After forking, as the original process starts writing to memory, the Linux copy-on-write mechanism will give the process unique physical memory pages which are different from the ones used by the forked process. How can I, at some point of execution, know which pages of the original process have been copied-on-write? I don't want to use SIGSEGV signal handler and give read only access to all the pages in the