C++ SSE SIMD framework [closed]

Posted by 柔情痞子 on 2019-12-02 15:15:41

Take a look at libsimdpp, a header-only C++ SIMD wrapper library.

The library supports several instruction sets through a single interface: SSE2, SSE3, SSSE3, SSE4.1, AVX, AVX2, AVX512F, XOP, FMA3/4, NEON, NEONv2, Altivec. Clang, GCC, MSVC and ICC are all supported.

Any differences between instruction sets are resolved by implementing the missing instructions as a combination of supported ones. As a bonus, it's possible to compile the same code for several instruction sets, link the resulting object files to a single executable and use a convenient dynamic dispatch mechanism to run the implementation most tailored to the current processor.
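To make the dispatch idea concrete, here is a minimal hand-rolled sketch. It does not use libsimdpp's own dispatch mechanism; the names `sum`, `sum_scalar`, `sum_avx` and `select_sum` are hypothetical, and the AVX path assumes GCC or Clang on x86, where `__attribute__((target))` and `__builtin_cpu_supports` are available. On any other compiler or CPU it simply falls back to the scalar loop.

```cpp
#include <cstddef>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1  // hypothetical macro, used only in this sketch
#endif

// Baseline version: plain scalar loop, valid on any CPU.
static float sum_scalar(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

#ifdef HAVE_X86_DISPATCH
// Compiled for AVX regardless of the command-line -m flags.
__attribute__((target("avx")))
static float sum_avx(const float* data, std::size_t n) {
    __m256 acc = _mm256_setzero_ps();
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8)               // 8 floats per iteration
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(data + i));
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);
    float total = 0.0f;
    for (float v : lanes) total += v;        // horizontal reduction
    for (; i < n; ++i) total += data[i];     // leftover elements
    return total;
}
#endif

using sum_fn = float (*)(const float*, std::size_t);

// Picks the best implementation the current processor can run.
static sum_fn select_sum() {
#ifdef HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx")) return sum_avx;
#endif
    return sum_scalar;
}

// Public entry point: the choice is made once, on the first call.
float sum(const float* data, std::size_t n) {
    static const sum_fn impl = select_sum();
    return impl(data, n);
}
```

Callers only ever see `sum`; which kernel runs behind it depends on the machine the binary is started on. A wrapper library can generate both the per-instruction-set kernels and the selection step for you.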

There are several libraries that have emerged in recent years to abstract explicit SIMD programming. The most important ones:

The most important thing to look for is a usable set of types that correctly abstract the best available SIMD registers and instructions for a given target, together with full portability to systems without SIMD support.
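As a rough illustration of what such a type can look like, here is a small sketch of a four-lane float wrapper. The name `float4` and the macro `F4_USE_SSE2` are made up for this example and do not come from any of the libraries mentioned here. It maps to an SSE register when SSE2 is enabled at compile time and to a plain array otherwise, so the calling code is the same in both cases.

```cpp
#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define F4_USE_SSE2 1  // hypothetical macro, used only in this sketch
#endif

// Hypothetical four-lane float vector with a scalar fallback.
struct float4 {
#ifdef F4_USE_SSE2
    __m128 v;
    float4(float x, float y, float z, float w) : v(_mm_setr_ps(x, y, z, w)) {}
    explicit float4(__m128 r) : v(r) {}
    void store(float* out) const { _mm_storeu_ps(out, v); }
#else
    float v[4];
    float4(float x, float y, float z, float w) : v{x, y, z, w} {}
    void store(float* out) const {
        for (int i = 0; i < 4; ++i) out[i] = v[i];
    }
#endif
};

// Lane-wise addition: one addps with SSE2, four scalar adds otherwise.
inline float4 operator+(const float4& a, const float4& b) {
#ifdef F4_USE_SSE2
    return float4(_mm_add_ps(a.v, b.v));
#else
    return float4(a.v[0] + b.v[0], a.v[1] + b.v[1],
                  a.v[2] + b.v[2], a.v[3] + b.v[3]);
#endif
}

// Lane-wise multiplication, same structure as operator+.
inline float4 operator*(const float4& a, const float4& b) {
#ifdef F4_USE_SSE2
    return float4(_mm_mul_ps(a.v, b.v));
#else
    return float4(a.v[0] * b.v[0], a.v[1] * b.v[1],
                  a.v[2] * b.v[2], a.v[3] * b.v[3]);
#endif
}
```

User code such as `(a + b).store(out)` never mentions an intrinsic, which is the property to check for when evaluating a wrapper library.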

I wrote a GLSL-style library whose code compiles to near-optimal assembly.

A very common operation - cross product:

vec4 cross(const vec4 &a, const vec4 &b)
{
    return a.yzxw * b.zxyw - a.zxyw * b.yzxw;
}

would be converted to this assembly code using glsl-sse2:

_Z5crossRK4vec4S1_:
    movaps    (%rsi), %xmm1
    movaps    (%rdx), %xmm2
    pshufd    $201, %xmm1, %xmm5
    pshufd    $210, %xmm2, %xmm0
    pshufd    $210, %xmm1, %xmm4
    pshufd    $201, %xmm2, %xmm3
    mulps     %xmm0, %xmm5
    mulps     %xmm3, %xmm4
    subps     %xmm4, %xmm5
    movaps    %xmm5, (%rdi)
    ret

Please note that the library isn't perfect yet and most likely has undiscovered bugs, as it is still new.
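For comparison, the shuffle, multiply, subtract pattern in the listing above can also be written by hand with SSE intrinsics. This is only a sketch for x86 targets: `cross_sse` is a hypothetical helper, not part of glsl-sse2, and it uses `_mm_shuffle_ps` where the compiler output above uses `pshufd`. The shuffle constants are the same values, 201 and 210, that appear in the assembly.

```cpp
#include <xmmintrin.h>  // SSE: __m128, _mm_shuffle_ps, _mm_mul_ps, _mm_sub_ps

// Hypothetical helper. Lanes are ordered (x, y, z, w), with x in the
// lowest lane of the register.
inline __m128 cross_sse(__m128 a, __m128 b) {
    // _MM_SHUFFLE(3, 0, 2, 1) == 201 selects lanes (y, z, x, w).
    // _MM_SHUFFLE(3, 1, 0, 2) == 210 selects lanes (z, x, y, w).
    __m128 a_yzxw = _mm_shuffle_ps(a, a, _MM_SHUFFLE(3, 0, 2, 1));
    __m128 b_zxyw = _mm_shuffle_ps(b, b, _MM_SHUFFLE(3, 1, 0, 2));
    __m128 a_zxyw = _mm_shuffle_ps(a, a, _MM_SHUFFLE(3, 1, 0, 2));
    __m128 b_yzxw = _mm_shuffle_ps(b, b, _MM_SHUFFLE(3, 0, 2, 1));
    // a.yzxw * b.zxyw - a.zxyw * b.yzxw
    return _mm_sub_ps(_mm_mul_ps(a_yzxw, b_zxyw),
                      _mm_mul_ps(a_zxyw, b_yzxw));
}
```

The w lane of the result is `a.w * b.w - a.w * b.w`, so it comes out as zero for any finite w.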

Have a look at AMD's SSEPlus project; it might be what you're after.

Microsoft has just released its new "DirectXMath" library. It includes support for SSE2 and NEON intrinsics. Documentation looks decent too.

The DirectXMath API provides SIMD-friendly C++ types and functions for common linear algebra and graphics math operations common to DirectX applications. The library provides optimized versions for Windows 32-bit (x86), Windows 64-bit (x64), and Windows on ARM through SSE2 and ARM-NEON intrinsics support in the Visual Studio compiler.

Vc is another C++ library that implements vector classes and allows writing vectorized code that is independent from the actual instruction set that is used.

You might want to look at macstl - although it was originally developed for the Mac (and PowerPC) it now works on Linux and x86 too.

Also, if you're working with images then look at OpenCV - this has SSE-optimised routines for many common image processing tasks and has C and C++ APIs.
