intel

OpenCL Intel Iris Integrated Graphics exits with Abort Trap 6: Timeout Issue

两盒软妹~` submitted on 2021-02-07 19:19:16
Question: I am attempting to write a program that executes Monte Carlo simulations using OpenCL. I have run into an issue involving exponentials. When the value of the variable steps becomes large, approximately 20000, the calculation of the exponent fails unexpectedly, and the program quits with "Abort Trap: 6". This seems to be a bizarre error, given that steps should not affect memory allocation. I have tried setting normal, alpha, and beta to 0, but this does not resolve the problem however

VEX prefixes encoding and SSE/AVX MOVUP(D/S) instructions

Deadly submitted on 2021-02-07 13:50:22
Question: I'm trying to understand the VEX prefix encoding for the SSE/AVX instructions, so please bear with me if I ask something simple. I have the following related questions. Let's take the MOVUP(D/S) instruction (0F 10). If I follow the 2-byte VEX prefix encoding correctly, the following two instruction encodings produce the same result:
    db 0fh, 10h, 00000000b ; movups xmm0,xmmword ptr [rax]
    db 0c5h, 11111000b, 10h, 00000000b ; vmovups xmm0,xmmword ptr [rax]
As these two: db 066h, 0fh, 10h,
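The field layout behind the 0c5h, 11111000b pair in the excerpt above can be sketched in Python. This is a minimal illustration, not an assembler or disassembler: the names Vex2, decode_vex2, legacy_equivalent and PP_TO_PREFIX are hypothetical, and the sketch only handles the simple case where R, vvvv and L are all unused.

```python
from typing import NamedTuple

# Implied legacy prefix selected by the two-bit pp field.
PP_TO_PREFIX = {0b00: None, 0b01: 0x66, 0b10: 0xF3, 0b11: 0xF2}


class Vex2(NamedTuple):
    r: int     # REX.R equivalent (stored inverted in the prefix byte)
    vvvv: int  # extra source register (stored inverted; 0 means unused)
    l: int     # vector length: 0 = 128-bit, 1 = 256-bit
    pp: int    # implied legacy prefix selector


def decode_vex2(payload: int) -> Vex2:
    """Split the byte that follows 0xC5 into its fields."""
    return Vex2(
        r=((payload >> 7) & 1) ^ 1,         # bit 7, un-inverted
        vvvv=((payload >> 3) & 0xF) ^ 0xF,  # bits 6..3, un-inverted
        l=(payload >> 2) & 1,               # bit 2
        pp=payload & 0b11,                  # bits 1..0
    )


def legacy_equivalent(encoding: bytes) -> bytes:
    """Rebuild the legacy SSE bytes for a simple C5-prefixed encoding."""
    if len(encoding) < 3 or encoding[0] != 0xC5:
        raise ValueError("expected a 2-byte VEX prefix starting with 0xC5")
    fields = decode_vex2(encoding[1])
    if fields.r or fields.vvvv or fields.l:
        raise ValueError("this sketch only handles R=0, vvvv unused, L=0")
    prefix = PP_TO_PREFIX[fields.pp]
    head = bytes([prefix]) if prefix is not None else b""
    # The 2-byte VEX form always implies the 0F opcode map.
    return head + b"\x0f" + encoding[2:]


vex = bytes([0xC5, 0b11111000, 0x10, 0b00000000])
print(decode_vex2(vex[1]))
print(legacy_equivalent(vex).hex(" "))
```

One caveat on "the same result": a VEX-encoded 128-bit instruction zeroes the destination register's bits above 127, while the legacy SSE form leaves them unchanged, so the two loads match only in the low 128 bits.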

Enable/Disable Hardware Lock Elision

蹲街弑〆低调 submitted on 2021-02-07 13:34:13
Question: I am using glibc version 2.24. It includes a lock elision path in the pthread_mutex_lock implementation that uses Transactional Synchronization Extensions intrinsics such as _xbegin() and _xend(). The hardware should support lock elision, since I believe the hle CPU flag stands for Hardware Lock Elision. The processor I am using is an Intel(R) Xeon(R) Gold 6130 with the Skylake architecture. First I wanted to disable lock elision, but when I run the program that uses pthread_mutex_lock with perf stat -T to monitor

Enable/Disable Hardware Lock Elision

杀马特。学长 韩版系。学妹 submitted on 2021-02-07 13:33:24
Question: I am using glibc version 2.24. It includes a lock elision path in the pthread_mutex_lock implementation that uses Transactional Synchronization Extensions intrinsics such as _xbegin() and _xend(). The hardware should support lock elision, since I believe the hle CPU flag stands for Hardware Lock Elision. The processor I am using is an Intel(R) Xeon(R) Gold 6130 with the Skylake architecture. First I wanted to disable lock elision, but when I run the program that uses pthread_mutex_lock with perf stat -T to monitor

AVX/SSE round floats down and return vector of ints?

拟墨画扇 submitted on 2021-02-07 08:20:53
Question: Is there a way using AVX/SSE to take a vector of floats, round down, and produce a vector of ints? All the floor intrinsics seem to produce a final vector of floating-point values, which is odd because rounding produces an integer!
Answer 1: SSE has conversion from FP to integer with your choice of truncation (towards zero) or the current rounding mode (normally the IEEE default mode, nearest with tiebreaks rounding to even. Like nearbyint(), unlike round() where the tiebreak is away-from-0. If
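The three behaviours the answer contrasts can be modelled on scalars in Python. This is only a sketch of the arithmetic, not SIMD code, and the function names are hypothetical. The truncating model corresponds to what an intrinsic such as _mm_cvttps_epi32 does per element, and the nearest-even model to _mm_cvtps_epi32 under the default rounding mode. The floor variant uses a truncate-then-adjust fix-up, which is one possible approach and not necessarily the one the answer goes on to recommend; it assumes the inputs fit in the integer range.

```python
import math


def convert_truncate(x: float) -> int:
    """Model of a truncating conversion: round toward zero."""
    return int(x)


def convert_nearest_even(x: float) -> int:
    """Model of conversion in the default mode: nearest, ties to even."""
    return round(x)  # Python 3 round() also breaks ties to even


def convert_floor(x: float) -> int:
    """Floor via truncation plus a fix-up for negative non-integers."""
    t = int(x)
    # Truncation rounds negative fractions up, so step back down by one.
    return t - 1 if t > x else t


values = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
print([convert_truncate(v) for v in values])
print([convert_nearest_even(v) for v in values])
print([convert_floor(v) for v in values])
```

Truncation and floor agree for non-negative inputs and differ by one for negative non-integers, which is why a plain truncating conversion is not a drop-in replacement for floor.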

AVX/SSE round floats down and return vector of ints?

▼魔方 西西 submitted on 2021-02-07 08:17:27
Question: Is there a way using AVX/SSE to take a vector of floats, round down, and produce a vector of ints? All the floor intrinsics seem to produce a final vector of floating-point values, which is odd because rounding produces an integer!
Answer 1: SSE has conversion from FP to integer with your choice of truncation (towards zero) or the current rounding mode (normally the IEEE default mode, nearest with tiebreaks rounding to even. Like nearbyint(), unlike round() where the tiebreak is away-from-0. If

Matching the Intel codes to disassembly output

試著忘記壹切 submitted on 2021-02-05 08:18:52
Question: I'm starting to use the Intel reference page to look up and learn about the opcodes (instead of asking everything on SO). I'd like to make sure that my understanding is OK and ask a few questions about how the output of a basic asm program maps to the Intel instruction codes. Here is the program I have to compare various mov instructions into the rax-ish register (is there a better way to say "rax" and its 32-, 16-, and 8-bit components?):
    .globl _start
    _start:
    movq $1, %rax # move immediate into

Matching the Intel codes to disassembly output

大兔子大兔子 submitted on 2021-02-05 08:18:49
Question: I'm starting to use the Intel reference page to look up and learn about the opcodes (instead of asking everything on SO). I'd like to make sure that my understanding is OK and ask a few questions about how the output of a basic asm program maps to the Intel instruction codes. Here is the program I have to compare various mov instructions into the rax-ish register (is there a better way to say "rax" and its 32-, 16-, and 8-bit components?):
    .globl _start
    _start:
    movq $1, %rax # move immediate into

Would having the call stack grow upward make buffer overruns safer?

亡梦爱人 submitted on 2021-02-05 04:56:16
Question: Each thread has its own stack to store local variables. But stacks are also used to store return addresses when calling a function. In x86 assembly, esp points to the most recently allocated end of the stack. Today, most CPUs have the stack grow negatively. This behavior enables arbitrary code execution by overflowing a buffer and overwriting the saved return address. If the stack were to grow positively, such attacks would not be feasible. Is it safer to have the call stack grow upwards? Why

Would having the call stack grow upward make buffer overruns safer?

情到浓时终转凉″ submitted on 2021-02-05 04:54:06
Question: Each thread has its own stack to store local variables. But stacks are also used to store return addresses when calling a function. In x86 assembly, esp points to the most recently allocated end of the stack. Today, most CPUs have the stack grow negatively. This behavior enables arbitrary code execution by overflowing a buffer and overwriting the saved return address. If the stack were to grow positively, such attacks would not be feasible. Is it safer to have the call stack grow upwards? Why