c++ SSE SIMD framework [closed]

浪子不回头ぞ submitted on 2019-12-20 08:40:57

Question


Does anyone know an open-source C++ x86 SIMD intrinsics library?

Intel supplies exactly what I need in their integrated performance primitives library, but I can't use that because of the copyrights all over the place.

EDIT

I already know the intrinsics provided by the compilers. What I need is a convenient interface to use them.


Answer 1:


Take a look at libsimdpp, a header-only C++ SIMD wrapper library.

The library supports several instruction sets via a single interface: SSE2, SSE3, SSSE3, SSE4.1, AVX, AVX2, AVX512F, XOP, FMA3/4, NEON, NEONv2, Altivec. Clang, GCC, MSVC and ICC are all supported.

Any differences between instruction sets are resolved by implementing the missing instructions as a combination of supported ones. As a bonus, it's possible to compile the same code for several instruction sets, link the resulting object files to a single executable and use a convenient dynamic dispatch mechanism to run the implementation most tailored to the current processor.
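
To make the dispatch idea concrete, here is a minimal sketch of runtime selection between two builds of the same routine. It does not use libsimdpp's own dispatch mechanism; it assumes the GCC/Clang builtins `__builtin_cpu_init` and `__builtin_cpu_supports`, and the names `sum_baseline`, `sum_avx2`, `select_sum` and `sum` are hypothetical. Both kernels are plain scalar loops here so the sketch compiles anywhere; in a real project each would sit in its own object file built with different instruction-set flags.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical kernels. In a real build each would be compiled separately
// (for example with -msse2 and -mavx2); here both are scalar placeholders.
static float sum_baseline(const float* p, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

static float sum_avx2(const float* p, std::size_t n) {
    // Placeholder body; a real version would use AVX2 intrinsics.
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

using sum_fn = float (*)(const float*, std::size_t);

// Choose an implementation based on what the running CPU reports.
static sum_fn select_sum() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) return sum_avx2;
#endif
    return sum_baseline;  // safe default on every other target
}

// Public entry point: the choice is made once, on the first call.
float sum(const float* p, std::size_t n) {
    static const sum_fn impl = select_sum();
    assert(impl != nullptr);
    return impl(p, n);
}
```

Callers only ever see `sum`, so adding another instruction set means adding one more kernel and one more check in `select_sum`.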




Answer 2:


There are several libraries that have emerged in recent years to abstract explicit SIMD programming. The most important ones:

  • Vc
  • boost::simd (not actually in boost - part of NT²)
  • Prof. Agner Fog's Vectorclass library

The most important thing to look for is to have a usable set of types that correctly abstract the best available SIMD registers and instructions for a given target. And, obviously, full portability to systems without SIMD support.
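
As a rough illustration of what such a type can look like, here is a minimal sketch of a four-lane float wrapper that uses SSE when it is available and falls back to plain scalar code otherwise. The name `float4` and its interface are hypothetical and are not taken from Vc, boost::simd or Vectorclass; those libraries are far more complete.

```cpp
#include <cassert>

#if defined(__SSE__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define FLOAT4_USE_SSE 1
#else
#define FLOAT4_USE_SSE 0
#endif

// Hypothetical four-lane float type with an SSE path and a scalar fallback.
struct float4 {
#if FLOAT4_USE_SSE
    __m128 v;
    float4(float a, float b, float c, float d) : v(_mm_setr_ps(a, b, c, d)) {}
    explicit float4(__m128 r) : v(r) {}
    friend float4 operator+(const float4& a, const float4& b) {
        return float4(_mm_add_ps(a.v, b.v));
    }
    friend float4 operator*(const float4& a, const float4& b) {
        return float4(_mm_mul_ps(a.v, b.v));
    }
    void store(float* out) const { _mm_storeu_ps(out, v); }
#else
    float v[4];
    float4(float a, float b, float c, float d) : v{a, b, c, d} {}
    friend float4 operator+(const float4& a, const float4& b) {
        return float4(a.v[0] + b.v[0], a.v[1] + b.v[1],
                      a.v[2] + b.v[2], a.v[3] + b.v[3]);
    }
    friend float4 operator*(const float4& a, const float4& b) {
        return float4(a.v[0] * b.v[0], a.v[1] * b.v[1],
                      a.v[2] * b.v[2], a.v[3] * b.v[3]);
    }
    void store(float* out) const {
        for (int i = 0; i < 4; ++i) out[i] = v[i];
    }
#endif
    // Read a single lane; goes through memory, so keep it out of hot loops.
    float operator[](int i) const {
        assert(i >= 0 && i < 4);
        float tmp[4];
        store(tmp);
        return tmp[i];
    }
};
```

User code such as `float4 c = a * b + a;` is then identical on both paths, which is the portability property described above.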




Answer 3:


I wrote a GLSL-style library that compiles to near-perfect assembly code.

A very common operation - cross product:

vec4 cross(const vec4 &a, const vec4 &b)
{
    return a.yzxw * b.zxyw - a.zxyw * b.yzxw;
}

would be converted to this assembly code using glsl-sse2:

_Z5crossRK4vec4S1_:
    movaps    (%rsi), %xmm1
    movaps    (%rdx), %xmm2
    pshufd    $201, %xmm1, %xmm5
    pshufd    $210, %xmm2, %xmm0
    pshufd    $210, %xmm1, %xmm4
    pshufd    $201, %xmm2, %xmm3
    mulps     %xmm0, %xmm5
    mulps     %xmm3, %xmm4
    subps     %xmm4, %xmm5
    movaps    %xmm5, (%rdi)
    ret

Please note the library isn't perfect yet, and it most likely has undiscovered bugs as it is still new.
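
For comparison, the same cross product can be sketched directly with SSE intrinsics, without the swizzle syntax. This is not glsl-sse2 code; the function names `cross_sse` and `lane` are hypothetical, and the sketch assumes an x86 target with `<xmmintrin.h>`. The shuffle masks match the immediates in the listing above: `_MM_SHUFFLE(3, 0, 2, 1)` is 201 (the `yzxw` swizzle) and `_MM_SHUFFLE(3, 1, 0, 2)` is 210 (the `zxyw` swizzle).

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE: __m128, _mm_shuffle_ps, _mm_mul_ps, _mm_sub_ps

// Same formula as the vec4 version: a.yzxw * b.zxyw - a.zxyw * b.yzxw.
// Lanes are ordered (x, y, z, w); the w lane works out to a.w*b.w - a.w*b.w.
inline __m128 cross_sse(__m128 a, __m128 b) {
    const __m128 a_yzxw = _mm_shuffle_ps(a, a, _MM_SHUFFLE(3, 0, 2, 1));
    const __m128 a_zxyw = _mm_shuffle_ps(a, a, _MM_SHUFFLE(3, 1, 0, 2));
    const __m128 b_yzxw = _mm_shuffle_ps(b, b, _MM_SHUFFLE(3, 0, 2, 1));
    const __m128 b_zxyw = _mm_shuffle_ps(b, b, _MM_SHUFFLE(3, 1, 0, 2));
    return _mm_sub_ps(_mm_mul_ps(a_yzxw, b_zxyw),
                      _mm_mul_ps(a_zxyw, b_yzxw));
}

// Helper to read one lane back for inspection.
inline float lane(__m128 v, int i) {
    assert(i >= 0 && i < 4);
    float tmp[4];
    _mm_storeu_ps(tmp, v);
    return tmp[i];
}
```

Calling `cross_sse(_mm_setr_ps(1, 0, 0, 0), _mm_setr_ps(0, 1, 0, 0))` gives the unit z vector, as expected for x cross y. A wrapper library mostly saves you from writing and checking the shuffle masks by hand.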




Answer 4:


Have a look at AMD's SSEPlus project; it might be what you're after.




Answer 5:


Microsoft has just released its new "DirectXMath" library. It includes support for SSE2 and NEON intrinsics. Documentation looks decent too.

The DirectXMath API provides SIMD-friendly C++ types and functions for common linear algebra and graphics math operations common to DirectX applications. The library provides optimized versions for Windows 32-bit (x86), Windows 64-bit (x64), and Windows on ARM through SSE2 and ARM-NEON intrinsics support in the Visual Studio compiler.




Answer 6:


Vc is another C++ library that implements vector classes and allows writing vectorized code that is independent from the actual instruction set that is used.




Answer 7:


You might want to look at macstl - although it was originally developed for the Mac (and PowerPC) it now works on Linux and x86 too.

Also, if you're working with images then look at OpenCV - this has SSE-optimised routines for many common image processing tasks and has C and C++ APIs.




Answer 8:


Which compiler? The Visual Studio C++ compiler supports a set of SSE, SSE2 and MMX intrinsic functions.



Source: https://stackoverflow.com/questions/4953121/c-sse-simd-framework
