simd | 易学教程

What is the avx2 instruction to store 8 integers?

Question: I want to store the 8 integers from a __m256i variable to an array of 8 x 32-bit ints. I thought the intrinsic for that would be _mm256_store_epi32, but I get an error that it doesn't even exist!

Answer 1: Have a look at the Intel Intrinsics Guide. Use _mm256_store_si256 if your destination is 32-byte aligned, or _mm256_storeu_si256 if it may be unaligned.

Source: https://stackoverflow.com/questions/43304021/what-is-the-avx2-instruction-to-store-8-integers
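The two store intrinsics named in the answer can be sketched as follows. This is a minimal illustration, not code from the original answer: the function names are hypothetical, and it assumes GCC or Clang on x86-64, where the target attribute enables AVX for a single function without a compiler flag. Both stores only require AVX, and the CPU must support it at run time.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: store 0..7 to any int32_t buffer of 8 elements.
   Assumption: GCC/Clang; target("avx") enables the AVX intrinsics here. */
__attribute__((target("avx")))
void store_eight_ints_unaligned(int32_t *dst)
{
    /* Build a vector holding 0..7, lowest element first. */
    __m256i v = _mm256_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7);

    /* Unaligned store: valid for any destination address. */
    _mm256_storeu_si256((__m256i *)dst, v);
}

/* Hypothetical helper: same data, but dst MUST be 32-byte aligned,
   otherwise the aligned store may fault. */
__attribute__((target("avx")))
void store_eight_ints_aligned(int32_t *dst)
{
    __m256i v = _mm256_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7);

    /* Aligned store: requires a 32-byte aligned destination. */
    _mm256_store_si256((__m256i *)dst, v);
}
```

A caller would pass a plain int32_t out[8] to the unaligned version, and a buffer declared with _Alignas(32) (C11) to the aligned one.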

Why floating point registers are different than general purpose ones

Question: Most architectures have different sets of registers for storing regular integers and floating-point values. From a binary storage point of view, it shouldn't matter where things are stored, right? It's just 1s and 0s, so couldn't they pipe the same general-purpose registers into floating-point ALUs? SIMD registers (xmm in x64) are capable of storing both floating-point values and regular integers, so why doesn't the same concept apply to regular registers?

Answer 1: For practical processor design, there are a ...

How to simulate pcmpgtq on sse2?

Question: PCMPGTQ was introduced in SSE4.2, and it provides a signed greater-than comparison for 64-bit numbers that yields a mask. How does one support this functionality on instruction sets predating SSE4.2? Update: The same question applies to ARMv7 with NEON, which also lacks a 64-bit comparator. The sister question to this is found here: What is the most efficient way to support CMGT with 64bit signed comparisons on ARMv7a with Neon?

Answer 1: __m128i pcmpgtq_sse2 (__m128i a, __m128i b) { __m128i r = ...
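Since the answer above is cut off, here is a separate sketch of one known way to build a signed 64-bit greater-than from SSE2 operations only. It is not claimed to be the answer's exact code, and the function names are hypothetical. The idea: the high 32-bit halves decide the result when they differ (signed compare); when they are equal, the borrow out of the 64-bit subtraction b - a tells whether the low half of a is larger (unsigned).

```c
#include <emmintrin.h>
#include <stdint.h>

/* Hypothetical name; sketch of a signed 64-bit a > b mask using SSE2 only. */
static __m128i cmpgt_epi64_sse2_sketch(__m128i a, __m128i b)
{
    /* Per 32-bit lane: all-ones where the halves are equal. */
    __m128i eq = _mm_cmpeq_epi32(a, b);

    /* 64-bit b - a. If the high halves are equal, the high dword of the
       result is all-ones exactly when a's low half exceeds b's (borrow). */
    __m128i diff = _mm_sub_epi64(b, a);

    /* High dword: borrow result, kept only where the high halves match. */
    __m128i r = _mm_and_si128(eq, diff);

    /* High dword: also true where a's high half is greater (signed). */
    r = _mm_or_si128(r, _mm_cmpgt_epi32(a, b));

    /* Broadcast each high dword over its whole 64-bit lane. */
    return _mm_shuffle_epi32(r, _MM_SHUFFLE(3, 3, 1, 1));
}

/* Hypothetical array wrapper: out[i] is -1 if a[i] > b[i], else 0. */
void cmpgt_i64x2(const int64_t a[2], const int64_t b[2], int64_t out[2])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, cmpgt_epi64_sse2_sketch(va, vb));
}
```

Only the high dword of each intermediate lane is meaningful; the final shuffle discards the low dwords, which is why the signed 32-bit compare on the low halves does no harm.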

What are the 128-bit to 512-bit registers used for?

Question: After looking at a table of registers in the x86/x64 architecture, I noticed that there's a whole section of 128-, 256-, and 512-bit registers that I've never seen used in assembly or in decompiled C/C++ code: XMM(0-15) for 128, YMM(0-15) for 256, ZMM(0-31) for 512. After doing a bit of digging, what I've gathered is that you have to use two 64-bit operations in order to perform math on a 128-bit number, instead of using the generic add, sub, mul, div operations. If this is the case, then what ...
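The excerpt ends before any answer, but the typical use of these wide registers is SIMD: one instruction operates on several smaller elements packed side by side, rather than on a single 128-bit number. A minimal sketch, assuming x86-64 with SSE2 (the function name is hypothetical), adds four 32-bit integers with a single packed add on XMM registers:

```c
#include <emmintrin.h>
#include <stdint.h>

/* Hypothetical helper: out[i] = a[i] + b[i] for four 32-bit integers,
   using one packed add instead of four scalar adds. */
void add_four_ints(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
    /* Load 4 x 32-bit integers into each 128-bit register. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* One instruction (PADDD) performs four independent 32-bit additions. */
    __m128i sum = _mm_add_epi32(va, vb);

    /* Write the four results back to memory. */
    _mm_storeu_si128((__m128i *)out, sum);
}
```

The same pattern extends to YMM and ZMM registers, which hold 8 and 16 such 32-bit elements respectively on CPUs that support AVX2 or AVX-512.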
