x86-64

Meaning of REX.w prefix before AMD64 jmp (FF25)

我们两清 submitted on 2019-12-01 01:41:35
Question: While tracking down a bug I came across a difference between the import jump tables of two Win64 DLLs. The 64-bit version of kernel32.dll uses a plain FF25 jmp instruction in its import jump tables. The 64-bit version of advapi32.dll, on the other hand, uses 48FF25, which indicates a REX.W=1 prefix before the jmp opcode. However, both seem to have a 32-bit operand specifying a RIP+offset address. Is there any meaning to the REX.W prefix on this specific opcode? I don't work with machine code often, so please excuse any
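
To make the two encodings concrete, here is a minimal sketch in C that recognizes an optional REX prefix followed by FF 25 and extracts the 32-bit RIP-relative displacement. The helper name `decode_jmp_rip` is hypothetical and not part of any real disassembler API. It relies only on the documented encoding: a REX prefix is a byte of the form 0100WRXB, and W is bit 3. As far as I know, a near indirect jmp in 64-bit mode already defaults to a 64-bit operand size, so the W bit should not change where the jump goes, but treat that as a statement to verify against the vendor manuals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A REX prefix is any byte of the form 0100WRXB, i.e. 0x40 through 0x4F. */
static bool is_rex(uint8_t b) { return (b & 0xF0) == 0x40; }

/*
 * Hypothetical helper: recognizes "[REX] FF 25 disp32", the RIP-relative
 * indirect jmp used in the import jump tables described above.
 * Returns the instruction length in bytes, or 0 if the bytes do not match.
 */
static size_t decode_jmp_rip(const uint8_t *code, size_t len,
                             int32_t *disp, bool *has_rex_w)
{
    assert(code != NULL && disp != NULL && has_rex_w != NULL);
    size_t i = 0;
    *has_rex_w = false;
    if (len > 0 && is_rex(code[0])) {
        *has_rex_w = (code[0] & 0x08) != 0;  /* W is bit 3 of the prefix */
        i = 1;
    }
    /* FF /4 with ModRM 0x25: mod=00, reg=100, rm=101 (RIP + disp32). */
    if (len < i + 6 || code[i] != 0xFF || code[i + 1] != 0x25)
        return 0;
    /* The displacement is stored little-endian. */
    uint32_t raw = (uint32_t)code[i + 2]
                 | (uint32_t)code[i + 3] << 8
                 | (uint32_t)code[i + 4] << 16
                 | (uint32_t)code[i + 5] << 24;
    *disp = (int32_t)raw;
    return i + 6;
}
```

Fed the byte sequences 48 FF 25 ... and FF 25 ... with the same displacement bytes, the sketch reports the same displacement for both. Only the instruction length and the REX.W flag differ.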

Constraining r10 register in gcc inline x86_64 assembly

末鹿安然 submitted on 2019-12-01 00:51:53
Question: I'm having a go at writing a very lightweight libc replacement library so that I can better understand the kernel-application interface. The first task is clearly getting some system call wrappers in place. I've successfully got the 1 to 3 argument wrappers working, but I'm struggling with a 4-argument variant. Here's my starting point: long _syscall4(long type, long a1, long a2, long a3, long a4) { long ret; asm ( "syscall" : "=a"(ret) : "a"(type), "D"(a1), "S"(a2), "d"(a3), "r10"(a4) : "c",
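
One common way around the missing constraint is sketched below, assuming Linux on x86-64 and GCC or Clang. GCC has no constraint letter for r10, so "r10" is not a valid constraint string. The usual workaround is a local register variable pinned to r10 and passed with the generic "r" constraint. The clobber list reflects that the syscall instruction overwrites rcx and r11. This is one possible wrapper, not necessarily the one the original answer settled on.

```c
/* Assumes Linux on x86-64, compiled with GCC or Clang. */
long _syscall4(long type, long a1, long a2, long a3, long a4)
{
    long ret;
    /* No constraint letter names r10, so pin a local variable to it. */
    register long r10 __asm__("r10") = a4;

    __asm__ volatile (
        "syscall"
        : "=a"(ret)                      /* result comes back in rax */
        : "a"(type), "D"(a1), "S"(a2),   /* rax, rdi, rsi */
          "d"(a3), "r"(r10)              /* rdx, r10 */
        : "rcx", "r11", "memory"         /* syscall clobbers rcx and r11 */
    );
    return ret;
}
```

The fourth argument goes in r10 rather than rcx because the syscall instruction itself uses rcx to save the return address.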

Cassandra Startup Error 1.2.6 on Linux x86_64

随声附和 submitted on 2019-12-01 00:48:42
Trying to install Cassandra on Linux from the latest stable release (1.2.6, http://cassandra.apache.org/download/). I have modified cassandra.yaml to point to a custom directory instead of /var, since I do not have write access to /var. I am seeing this error on startup and have not found any answers on Google yet, since the release seems relatively new. Just posting it here in case it's a silly mistake on my side. The same distribution file worked fine on my macOS x86_64 machine. INFO 19:24:35,513 Not using multi-threaded compaction java.lang.reflect.InvocationTargetException at sun.reflect

Precompiled headers and compiling universal objects on OSX

本小妞迷上赌 submitted on 2019-12-01 00:46:27
Question: We are using precompiled headers with GCC for our project and build them like this: gcc $(CFLAGS) precompiledcommonlib.h Now I'm building the project on OSX 10.6 and trying to use the nifty feature of building for all architectures at the same time, like this: gcc $(CFLAGS) -c -arch i386 -arch x86_64 commonlib.c However, this does not seem to work for the precompiled headers: gcc $(CFLAGS) -arch i386 -arch x86_64 precompiledcommonlib.h Undefined symbols for architecture i386: "_main",

IBM Mobile First - Json Store not working on Samsung Galaxy S6

落花浮王杯 submitted on 2019-12-01 00:20:24
We're building a hybrid app with IBM Mobile First Platform (7.0) for iOS and Android. We're using JSONStore to save non-confidential user data (we're not encrypting the stored data). When we deploy the application to a Samsung Galaxy S6 (model SM-G920I), we get this error in the init method of JSONStore: Error code: -11 OPERATION_FAILED_ON_SPECIFIC_DOCUMENT IBM Mobile First Platform - JSONStore errors Error details: "dlopen failed: "/data/data/com.MyMobileApp/files/libcrypto.so.1.0.0" is 32-bit instead of 64-bit" After some research, we cannot figure out something else

What is the compatible subset of Intel's and AMD's x86-64 implementations?

▼魔方 西西 submitted on 2019-11-30 22:15:10
While learning x86-64 assembly, I came across my first incompatibility between the Intel 64 and AMD64 implementations of "x86-64": Why does syscall compile in NASM 32 bit output while popa does not compile in 64 bit? syscall is valid in the compatibility mode of one but not the other. Is there a better way of finding those incompatibilities than carefully reading both manuals and comparing them, which is error-prone and duplicates my manual-reading effort when aiming for portability? For example, it would be much easier if there were either: a standard subset which both Intel and AMD claim

Under what conditions do I need to set up SEH unwind info for an x86-64 assembly function?

谁都会走 submitted on 2019-11-30 20:44:47
The 64-bit Windows ABI defines a generalized exception handling mechanism, which I believe is shared across C++ exceptions and the structured exceptions available even in other languages such as C. If I'm writing an x86-64 assembly routine to be assembled with NASM and linked into a C or C++ library, what accommodations do I need to make on Windows in terms of generating unwind info and so on? I'm not planning on generating any exceptions directly in the assembly code, although I suppose it is possible that the code may get an access violation if a user-supplied buffer is invalid, etc. I'd like the

SIMD versions of SHLD/SHRD instructions

你离开我真会死。 submitted on 2019-11-30 20:44:18
SHLD/SHRD are assembly instructions used to implement multi-precision shifts. Consider the following problem: uint64_t array[4] = {/*something*/}; left_shift(array, 172); right_shift(array, 172); What is the most efficient way to implement left_shift and right_shift, two functions that shift an array of four 64-bit unsigned integers as if it were one big 256-bit unsigned integer? Is using the SHLD/SHRD instructions the most efficient way to do that, or are there better instructions (such as SIMD versions) on modern architectures? In this answer I'm only going to talk about
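
As a reference point for what left_shift and right_shift have to compute, here is a portable scalar sketch in C. It assumes the limbs are stored least significant first (array[0] holds bits 0 to 63), which the question does not state, and that the shift count is below 256. It makes no claim to being the fastest approach. It only pins down the semantics that a SHLD/SHRD or SIMD version would need to reproduce.

```c
#include <assert.h>
#include <stdint.h>

#define LIMBS 4  /* four 64-bit limbs form one 256-bit integer */

/* Assumption: a[0] is the least significant limb. */
void left_shift(uint64_t a[LIMBS], unsigned n)
{
    assert(n < 64 * LIMBS);
    int limbs = (int)(n / 64);   /* whole-limb part of the shift */
    unsigned bits = n % 64;      /* remaining bit shift within a limb */

    /* Walk from the top limb down so sources are read before being overwritten. */
    for (int i = LIMBS - 1; i >= 0; --i) {
        int src = i - limbs;
        uint64_t v = 0;
        if (src >= 0) {
            v = a[src] << bits;
            /* Guard bits != 0: shifting a 64-bit value by 64 is undefined in C. */
            if (bits != 0 && src >= 1)
                v |= a[src - 1] >> (64 - bits);
        }
        a[i] = v;
    }
}

void right_shift(uint64_t a[LIMBS], unsigned n)
{
    assert(n < 64 * LIMBS);
    int limbs = (int)(n / 64);
    unsigned bits = n % 64;

    /* Walk from the bottom limb up for the same reason. */
    for (int i = 0; i < LIMBS; ++i) {
        int src = i + limbs;
        uint64_t v = 0;
        if (src < LIMBS) {
            v = a[src] >> bits;
            if (bits != 0 && src + 1 < LIMBS)
                v |= a[src + 1] << (64 - bits);
        }
        a[i] = v;
    }
}
```

The two-limb combine inside each loop (shift one limb, OR in the spill from its neighbour) is the operation that SHLD and SHRD perform in a single instruction.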

Issue storing a byte into a register x86-64 assembly

丶灬走出姿态 submitted on 2019-11-30 20:21:43
Question: I am trying to write a function that determines the length of a string given as the first argument, so %rdi will contain char *ptr. When I use movb (%rdi),%rcx to move the character pointed to by %rdi into %rcx, I get the following error: incorrect register '%rdx' used with 'b' suffix As I understand it, only certain registers can hold a byte in x86-64, so which ones can I use to move the byte into? Or is the method I am using to extract the character at each byte in the string incorrect?
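
The b suffix requires a byte-sized destination, so the load has to target an 8-bit register such as %cl, or use a zero-extending move such as movzbl into a 32-bit register. The sketch below shows both forms as GCC-style inline assembly in C, assuming x86-64 and GCC or Clang. The function names `load_byte`, `load_byte_zx`, and `my_strlen` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumes x86-64 with GCC or Clang (AT&T-syntax inline assembly). */

/* movb into an 8-bit register: the compiler picks a byte register such as %cl. */
static unsigned char load_byte(const char *ptr)
{
    unsigned char c;
    __asm__("movb (%1), %0" : "=q"(c) : "r"(ptr) : "memory");
    return c;
}

/*
 * movzbl zero-extends the byte into a 32-bit register (%k0 prints the 32-bit
 * name of operand 0); writing a 32-bit register also clears the upper half
 * of the corresponding 64-bit register.
 */
static unsigned long load_byte_zx(const char *ptr)
{
    unsigned long c;
    __asm__("movzbl (%1), %k0" : "=r"(c) : "r"(ptr) : "memory");
    return c;
}

/* String length built on the byte load above. */
static size_t my_strlen(const char *ptr)
{
    assert(ptr != NULL);
    size_t len = 0;
    while (load_byte(ptr + len) != 0)
        ++len;
    return len;
}
```

In hand-written assembly the equivalent fixes are movb (%rdi), %cl or movzbl (%rdi), %ecx. The zero-extending form is often preferred because it does not leave stale upper bits in the full register.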

Can counting byte matches between two strings be optimized using SIMD?

三世轮回 submitted on 2019-11-30 19:26:47
Profiling suggests that this function is a real bottleneck for my application: static inline int countEqualChars(const char* string1, const char* string2, int size) { int r = 0; for (int j = 0; j < size; ++j) { if (string1[j] == string2[j]) { ++r; } } return r; } Even with -O3 and -march=native, G++ 4.7.2 does not vectorize this function (I checked the assembler output). Now, I'm not an expert with SSE and friends, but I think that comparing more than one character at once should be faster. Any ideas on how to speed things up? Target architecture is x86-64. Sam Compiler flags for
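
One way to compare 16 characters at a time is sketched below using SSE2 intrinsics, which are available on every x86-64 target. The function name `count_equal_chars_sse2` is hypothetical, and `__builtin_popcount` is a GCC/Clang builtin, so this assumes one of those compilers. It is a straightforward baseline, not a tuned implementation. Accumulating the comparison results in vector registers instead of calling popcount on every iteration would likely be faster still.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Assumes x86-64 (SSE2 is part of the baseline) and GCC or Clang. */
static inline int count_equal_chars_sse2(const char *string1,
                                         const char *string2, int size)
{
    assert(size >= 0);
    int r = 0;
    int j = 0;

    /* Main loop: compare 16 bytes per iteration. */
    for (; j + 16 <= size; j += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(string1 + j));
        __m128i b = _mm_loadu_si128((const __m128i *)(string2 + j));
        /* Each equal byte becomes 0xFF; movemask packs the top bits into 16 bits. */
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(a, b));
        r += __builtin_popcount((unsigned)mask);
    }

    /* Scalar tail for the remaining size % 16 bytes. */
    for (; j < size; ++j) {
        if (string1[j] == string2[j])
            ++r;
    }
    return r;
}
```

The unaligned loads mean the inputs need no particular alignment, and the scalar tail keeps the function from reading past the end of either buffer.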