How would you transpose a binary matrix?

-上瘾入骨i 2020-12-16 02:13

I have binary matrices in C++ that I represent with a vector of 8-bit values.

For example, the following matrix:

1 0 1 0 1 0 1
0 1 1 0 0 1 1
0 0 0 1 1         


        
7 answers
  •  时光取名叫无心
    2020-12-16 02:58

    Here's what I posted on GitHub (mischasan/sse2/ssebmx.src). Changing INP() and OUT() to use induction variables saves an IMUL each. A 256-bit AVX2 version does it twice as fast. AVX512 is not an option, because there is no _mm512_movemask_epi8().

    #include <stdint.h>     // uint8_t, uint16_t
    #include <emmintrin.h>  // SSE2: __m128i, _mm_slli_epi64, _mm_movemask_epi8
    
    #define INP(x,y) inp[(x)*ncols/8 + (y)/8]
    #define OUT(x,y) out[(y)*nrows/8 + (x)/8]
    
    void ssebmx(char const *inp, char *out, int nrows, int ncols)
    {
        int rr, cc, i, h;
        union { __m128i x; uint8_t b[16]; } tmp;
    
        // Do the main body in [16 x 8] blocks:
        for (rr = 0; rr <= nrows - 16; rr += 16)
            for (cc = 0; cc < ncols; cc += 8) {
                for (i = 0; i < 16; ++i)
                    tmp.b[i] = INP(rr + i, cc);
                for (i = 8; i--; tmp.x = _mm_slli_epi64(tmp.x, 1))
                    *(uint16_t*)&OUT(rr, cc + i) = _mm_movemask_epi8(tmp.x);
            }
    
        if (rr == nrows) return;
    
        // The remainder is one strip of 8 rows: zero or more [8 x 16] blocks,
        // followed by at most one [8 x 8] block.
    
        //  Do the [8 x 16] blocks:
        for (cc = 0; cc <= ncols - 16; cc += 16) {
            for (i = 8; i--;)
                tmp.b[i] = h = *(uint16_t const*)&INP(rr + i, cc),
                tmp.b[i + 8] = h >> 8;
            for (i = 8; i--; tmp.x = _mm_slli_epi64(tmp.x, 1))
                OUT(rr, cc + i) = h = _mm_movemask_epi8(tmp.x),
                OUT(rr, cc + i + 8) = h >> 8;
        }
    
        if (cc == ncols) return;
    
        //  Do the remaining [8 x 8] block:
        for (i = 8; i--;)
            tmp.b[i] = INP(rr + i, cc);
        for (i = 8; i--; tmp.x = _mm_slli_epi64(tmp.x, 1))
            OUT(rr, cc + i) = _mm_movemask_epi8(tmp.x);
    }
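
For checking the SSE2 routine against something simple, a plain bit-by-bit transpose is handy as a reference. The sketch below is not from the repository: `get_bit`, `set_bit`, and `transpose_ref` are hypothetical names. It assumes the layout that the INP()/OUT() macros imply, namely row-major storage with ncols/8 bytes per row and column c held in bit (c % 8) of its byte, and it assumes both dimensions are multiples of 8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers (not part of ssebmx). Assumed layout: row-major,
// ncols/8 bytes per row, column c stored in bit (c % 8) of its byte.
inline bool get_bit(const std::vector<uint8_t>& m, int ncols, int r, int c) {
    return (m[r * (ncols / 8) + c / 8] >> (c % 8)) & 1;
}

inline void set_bit(std::vector<uint8_t>& m, int ncols, int r, int c) {
    m[r * (ncols / 8) + c / 8] |= static_cast<uint8_t>(1u << (c % 8));
}

// Slow reference transpose: copies one bit at a time.
// Input is nrows x ncols, result is ncols x nrows.
std::vector<uint8_t> transpose_ref(const std::vector<uint8_t>& inp,
                                   int nrows, int ncols) {
    assert(nrows % 8 == 0 && ncols % 8 == 0);
    assert(inp.size() == static_cast<std::size_t>(nrows) * (ncols / 8));
    std::vector<uint8_t> out(static_cast<std::size_t>(ncols) * (nrows / 8), 0);
    for (int r = 0; r < nrows; ++r)
        for (int c = 0; c < ncols; ++c)
            if (get_bit(inp, ncols, r, c))
                set_bit(out, nrows, c, r);  // the output has nrows columns
    return out;
}
```

Comparing the output of ssebmx() byte for byte with `transpose_ref()` on random inputs is one way to validate the blocked code, provided both use the same bit order.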
    

    HTH.
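
As a rough illustration of the induction-variable remark, here is a sketch of the [16 x 8] main body with the INP()/OUT() index multiplications replaced by running offsets. This is an assumption about what that change could look like, not the code from the repository. `ssebmx_main16` is a hypothetical name, and the sketch handles only the case where nrows is a multiple of 16 and ncols a multiple of 8. It targets little-endian x86 with SSE2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <emmintrin.h>  // SSE2

// Hypothetical variant of the main loop of ssebmx(): offsets are advanced
// by addition instead of being recomputed as row * stride on every access.
void ssebmx_main16(const uint8_t* inp, uint8_t* out, int nrows, int ncols) {
    assert(nrows % 16 == 0 && ncols % 8 == 0);
    const std::size_t in_stride = ncols / 8;   // bytes per input row
    const std::size_t out_stride = nrows / 8;  // bytes per output row

    std::size_t in_row = 0;  // offset of INP(rr, 0)
    for (int rr = 0; rr < nrows; rr += 16, in_row += 16 * in_stride) {
        std::size_t out_col = rr / 8;  // offset of OUT(rr, cc), starting at cc = 0
        for (std::size_t cb = 0; cb < in_stride; ++cb, out_col += 8 * out_stride) {
            // Gather the same byte column from 16 consecutive rows.
            alignas(16) uint8_t col[16];
            std::size_t src = in_row + cb;
            for (int i = 0; i < 16; ++i, src += in_stride)
                col[i] = inp[src];
            __m128i x = _mm_load_si128(reinterpret_cast<const __m128i*>(col));

            // movemask collects the top bit of each byte, so the highest
            // column of the block is written first; shift left for the next.
            std::size_t dst = out_col + 8 * out_stride;
            for (int i = 0; i < 8; ++i) {
                dst -= out_stride;
                const uint16_t bits = static_cast<uint16_t>(_mm_movemask_epi8(x));
                std::memcpy(out + dst, &bits, sizeof bits);  // unaligned-safe store
                x = _mm_slli_epi64(x, 1);
            }
        }
    }
}
```

The remaining multiplications are by compile-time constants (8 and 16), which a compiler can turn into shifts. The `std::memcpy` store stands in for the `*(uint16_t*)&OUT(...)` cast to avoid alignment and aliasing concerns.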
