intel | 易学教程

OpenCL Intel Iris Integrated Graphics exits with Abort Trap 6: Timeout Issue

Question: I am attempting to write a program that executes Monte Carlo simulations using OpenCL. I have run into an issue involving exponentials. When the value of the variable steps becomes large, approximately 20000, the calculation of the exponent fails unexpectedly, and the program quits with "Abort Trap: 6". This seems to be a bizarre error given that steps should not affect memory allocation. I have tried setting normal, alpha, and beta to 0 but this does not resolve the problem however

VEX prefixes encoding and SSE/AVX MOVUP(D/S) instructions

Question: I'm trying to understand the VEX prefix encoding for the SSE/AVX instructions, so please bear with me if I ask something simple. I have the following related questions. Let's take the MOVUP(D/S) instruction (0F 10). If I follow the 2-byte VEX prefix encoding correctly, the following two instruction encodings produce the same result:

    db 0fh, 10h, 00000000b ; movups xmm0,xmmword ptr [rax]
    db 0c5h, 11111000b, 10h, 00000000b ; vmovups xmm0,xmmword ptr [rax]

As these two:

    db 066h, 0fh, 10h,
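To make the second byte of the 2-byte VEX prefix in the question easier to read, here is a minimal Python sketch that splits it into its fields. The function name `decode_vex2` and the dictionary keys are made up for this illustration. It assumes the usual layout of the byte after C5h: bit 7 is R (stored inverted), bits 6-3 are vvvv (stored inverted), bit 2 is L, and bits 1-0 are pp, which stands in for a legacy 66h/F3h/F2h prefix. The 2-byte form always implies the 0Fh opcode map.

```python
# pp field -> legacy SIMD prefix byte it replaces (None means no prefix)
SIMD_PREFIX = {0b00: None, 0b01: 0x66, 0b10: 0xF3, 0b11: 0xF2}


def decode_vex2(prefix: bytes) -> dict:
    """Split a 2-byte VEX prefix (C5h xx) into its fields (hypothetical helper)."""
    if len(prefix) != 2 or prefix[0] != 0xC5:
        raise ValueError("expected a 2-byte VEX prefix starting with C5h")
    b = prefix[1]
    length_bit = (b >> 2) & 1
    return {
        "R": ((b >> 7) & 1) ^ 1,        # stored inverted in the byte
        "vvvv": ~(b >> 3) & 0xF,        # stored inverted; 1111b in the byte decodes to 0
        "L": length_bit,                # 0 = 128-bit, 1 = 256-bit
        "vector_bits": 256 if length_bit else 128,
        "simd_prefix": SIMD_PREFIX[b & 0b11],
    }


# The prefix from the vmovups encoding quoted in the question.
fields = decode_vex2(bytes([0xC5, 0b11111000]))
print(fields)
```

For the byte 11111000b this reports R = 0, vvvv = 0, a 128-bit vector length and no implied legacy prefix, which is consistent with the question's claim that the VEX form matches plain `0F 10`. Setting the low two bits to 01b would select the 66h prefix instead.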

Enable/Disable Hardware Lock Elision

Question: I am using glibc version 2.24. It includes a lock elision path in the pthread_mutex_lock implementation, using Transactional Synchronization Extensions such as _xbegin() and _xend(). The hardware is supposed to support lock elision, since I believe the hle CPU flag stands for Hardware Lock Elision. The processor I am using is an Intel(R) Xeon(R) Gold 6130 with the Skylake architecture. First I wanted to disable lock elision, but when I run the program that uses pthread_mutex_lock with perf stat -T to monitor

AVX/SSE round floats down and return vector of ints?

Question: Is there a way using AVX/SSE to take a vector of floats, round down, and produce a vector of ints? All the floor intrinsic methods seem to produce a final vector of floating point, which is odd because rounding produces an integer!

Answer 1: SSE has conversion from FP to integer with your choice of truncation (towards zero) or the current rounding mode (normally the IEEE default mode, nearest with tiebreaks rounding to even, like nearbyint() and unlike round(), where the tiebreak is away from 0). If
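The three behaviours mentioned here (truncation towards zero, nearest with ties to even, and rounding down) are easy to mix up, so a small scalar comparison may help. This is a Python sketch of the arithmetic only, not SSE/AVX code; the helper names are invented for the illustration. It relies on Python's built-in `round()` using ties-to-even, which differs from C's `round()`.

```python
import math


def trunc_toward_zero(x: float) -> int:
    """Drop the fractional part, as a truncating conversion does."""
    return int(x)


def nearest_ties_even(x: float) -> int:
    """Round to nearest with ties going to the even integer (IEEE default mode)."""
    return round(x)


def round_down(x: float) -> int:
    """Round towards negative infinity, which is what the question asks for."""
    return math.floor(x)


samples = [-1.5, -0.5, 0.5, 1.5, 2.5]
truncated = [trunc_toward_zero(x) for x in samples]
nearest = [nearest_ties_even(x) for x in samples]
floored = [round_down(x) for x in samples]

print("truncate:", truncated)   # [-1, 0, 0, 1, 2]
print("nearest: ", nearest)     # [-2, 0, 0, 2, 2]
print("floor:   ", floored)     # [-2, -1, 0, 1, 2]
```

The negative inputs show why truncation alone does not answer the question: -1.5 truncates to -1 but floors to -2.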

Matching the intel codes to disassembly output

Question: I'm starting to use the Intel reference page to look up and learn about the opcodes (instead of asking everything on SO). I'd like to make sure that my understanding is OK and ask a few questions about how the output of a basic asm program relates to the Intel instruction codes. Here is the program I use to compare various mov instructions into the rax-ish register (is there a better way to say "rax" and its 32-, 16-, and 8-bit components?):

    .globl _start
    _start:
    movq $1, %rax # move immediate into

Would having the call stack grow upward make buffer overruns safer?

Question: Each thread has its own stack to store local variables. But stacks are also used to store return addresses when calling a function. In x86 assembly, esp points to the most recently allocated end of the stack. Today, on most CPUs the stack grows negatively (towards lower addresses). This behavior enables arbitrary code execution by overflowing a buffer and overwriting the saved return address. If the stack were to grow positively, such attacks would not be feasible. Is it safer to have the call stack grow upwards? Why
