char[] to hex string exercise

感情败类 2021-01-12 19:14

Below is my current char* to hex string function. I wrote it as an exercise in bit manipulation. It takes ~7 ms on an AMD Athlon MP 2800+ to hexify a 10-million-byte array. Is

16 answers
  •  情书的邮戳
    2021-01-12 19:54

    This assembly function (based on my previous post, though I had to modify the concept a bit to get it to actually work) processes 3.3 billion input characters per second (6.6 billion output characters) on one core of a 3 GHz Core 2 Conroe. Penryn is probably faster.

    %include "x86inc.asm"
    
    SECTION_RODATA
    pb_f0: times 16 db 0xf0
    pb_0f: times 16 db 0x0f
    pb_hex: db 48,49,50,51,52,53,54,55,56,57,65,66,67,68,69,70
    
    SECTION .text
    
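    ; int convert_string_to_hex( char *input, char *output, int len )
    ; As written, this needs len to be a positive multiple of 16 and output to be
    ; 16-byte aligned with room for 2*len bytes; no terminating NUL is written.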
    
    cglobal _convert_string_to_hex,3,3
        movdqa xmm6, [pb_f0 GLOBAL]
        movdqa xmm7, [pb_0f GLOBAL]
    .loop:
        movdqa xmm5, [pb_hex GLOBAL]
        movdqa xmm4, [pb_hex GLOBAL]
        movq   xmm0, [r0+r2-8]
        movq   xmm2, [r0+r2-16]
        movq   xmm1, xmm0
        movq   xmm3, xmm2
        pand   xmm0, xmm6 ;high bits
        pand   xmm2, xmm6
        psrlq  xmm0, 4
        psrlq  xmm2, 4
        pand   xmm1, xmm7 ;low bits
        pand   xmm3, xmm7
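        punpcklbw xmm0, xmm1 ;interleave: high nibble, then low nibble, per input byte
        punpcklbw xmm2, xmm3
        pshufb xmm4, xmm0 ;table lookup: each nibble selects its ASCII digit from pb_hex
        pshufb xmm5, xmm2
        movdqa [r1+r2*2-16], xmm4 ;aligned 16-byte stores of the hex digits
        movdqa [r1+r2*2-32], xmm5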
        sub r2, 16
        jg .loop
        REP_RET
    

    Note that it uses x264 assembly syntax, which makes it more portable (32-bit vs. 64-bit, etc.). Converting it to the syntax of your choice is trivial: r0, r1, and r2 are the function's three arguments, held in registers. It's a bit like pseudocode. Or you can just get common/x86/x86inc.asm from the x264 tree and include it to run the code natively.
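
For readers who don't work in assembly, the per-byte logic of the routine above can be sketched in plain C. This is only a scalar illustration of the same three steps (mask the nibbles, shift the high one down, look both up in a 16-entry table), not the author's code and not a fast implementation. The name `hexify_scalar` is hypothetical, and unlike the assembly this sketch writes a terminating NUL and accepts any length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from the original post.
   Writes 2*len uppercase hex digits plus a terminating NUL into out,
   so out must have room for 2*len + 1 bytes. */
static void hexify_scalar(const unsigned char *in, size_t len, char *out)
{
    /* Same digits as the pb_hex table in the assembly: '0'..'9', 'A'..'F'. */
    static const char hex[] = "0123456789ABCDEF";

    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < len; i++) {
        unsigned hi = (in[i] & 0xF0u) >> 4; /* mask with 0xf0, shift right by 4 */
        unsigned lo = in[i] & 0x0Fu;        /* mask with 0x0f */

        /* High nibble first, then low nibble: the same order the
           byte interleave produces in the SIMD version. */
        out[2 * i]     = hex[hi];
        out[2 * i + 1] = hex[lo];
    }
    out[2 * len] = '\0';
}
```

For example, the four input bytes 0x00, 0x7F, 0xFF, 0x41 come out as the string "007FFF41". The SIMD version does the same work on 16 input bytes per loop iteration, with the table lookup done by pshufb instead of an array index.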

    P.S. Stack Overflow, am I wrong for wasting time on such a trivial thing? Or is this awesome?
