Efficient Conversion of a Binary Number to Hexadecimal String [closed]

血红的双手。 提交于 2019-12-13 22:37:52

问题


I am writing a program that converts a binary value's hexadecimal representation to a regular string. So each character in the hex representation would convert to two hexadecimal characters in the string. This means the result will be twice the size; a hexadecimal representation of 1 byte would need two bytes in a string.

Hexadecimal Characters

0123456789                    ;0x30 - 0x39
ABCDEF                        ;0x41 - 0x46

Example

0xF05C1E3A                    ;hex
4032568890                    ;dec

would become

0x4630354331453341            ;hex
5057600944242766657           ;dec

Question?

Are there any elegant/alternative(/interesting) methods for converting between these states, other than a lookup table, (bitwise operations, shifts, modulo, etc)? I'm not looking for a function in a library, but rather how one would/should be implemented. Any ideas?


回答1:


Here's a solution with nothing but shifts, and/or, and add/subtract. No loops either.

uint64_t x, m;
x = 0xF05C1E3A;
x = ((x & 0x00000000ffff0000LL) << 16) | (x & 0x000000000000ffffLL);
x = ((x & 0x0000ff000000ff00LL) << 8)  | (x & 0x000000ff000000ffLL);
x = ((x & 0x00f000f000f000f0LL) << 4)  | (x & 0x000f000f000f000fLL);
x += 0x0606060606060606LL;
m = ((x & 0x1010101010101010LL) >> 4) + 0x7f7f7f7f7f7f7f7fLL;
x += (m & 0x2a2a2a2a2a2a2a2aLL) | (~m & 0x3131313131313131LL);

Above is the simplified version I came up with after a little time to reflect. Below is the original answer.

uint64_t x, m;
x = 0xF05C1E3A;
x = ((x & 0x00000000ffff0000LL) << 16) | (x & 0x000000000000ffffLL);
x = ((x & 0x0000ff000000ff00LL) << 8) | (x & 0x000000ff000000ffLL);
x = ((x & 0x00f000f000f000f0LL) << 4) | (x & 0x000f000f000f000fLL);
x += 0x3636363636363636LL;
m = (x & 0x4040404040404040LL) >> 6;
x += m;
m = m ^ 0x0101010101010101LL;
x -= (m << 2) | (m << 1);

See it in action: http://ideone.com/nMhJ2q




回答2:


Spreading out the nibbles to bytes is easy with pdep:

spread = _pdep_u64(raw, 0x0F0F0F0F0F0F0F0F);

Now we'd have to add 0x30 to bytes in the range 0-9 and 0x41 to higher bytes. This could be done by SWAR-subtracting 10 from every byte and then using the sign to select which number to add, such as (not tested)

H = 0x8080808080808080;
ten = 0x0A0A0A0A0A0A0A0A
cmp = ((spread | H) - (ten &~H)) ^ ((spread ^~ten) & H); // SWAR subtract
masks = ((cmp & H) >> 7) * 255;
// if x-10 is negative, take 0x30, else 0x41
add = (masks & 0x3030303030303030) | (~masks & 0x3737373737373737);
asString = spread + add;

That SWAR compare can probably be optimized since you shouldn't need a full subtract to implement it.

There are some different suggestions here, including SIMD: http://wm.ite.pl/articles/convert-to-hex.html




回答3:


A slightly simpler version based on Mark Ransom's:

uint64_t x = 0xF05C1E3A;
x = ((x & 0x00000000ffff0000LL) << 16) | (x & 0x000000000000ffffLL);
x = ((x & 0x0000ff000000ff00LL) << 8)  | (x & 0x000000ff000000ffLL);
x = ((x & 0x00f000f000f000f0LL) << 4)  | (x & 0x000f000f000f000fLL);
x =  (x + 0x3030303030303030LL) +
   (((x + 0x0606060606060606LL) & 0x1010101010101010LL) >> 4) * 7;

And if you want to avoid the multiplication:

uint64_t m, x = 0xF05C1E3A;
x = ((x & 0x00000000ffff0000LL) << 16) | (x & 0x000000000000ffffLL);
x = ((x & 0x0000ff000000ff00LL) << 8)  | (x & 0x000000ff000000ffLL);
x = ((x & 0x00f000f000f000f0LL) << 4)  | (x & 0x000f000f000f000fLL);
m =  (x + 0x0606060606060606LL) & 0x1010101010101010LL;
x =  (x + 0x3030303030303030LL) + (m >> 1) - (m >> 4);



回答4:


A bit more decent conversion from the the integer to the string any base from 2 to length of the digits

char *reverse(char *);

const char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char *convert(long long number, char *buff, int base)
{
    char *result = (buff == NULL || base > strlen(digits) || base < 2) ? NULL : buff;
    char sign = 0;

    if (number < 0)
    {
         sign = '-';
        number = -number;
    }
    if (result != NULL)
    {
        do
        {
            *buff++ = digits[number % base];
            number /= base;
        } while (number);
        if(sign) *buff++ = sign;
        *buff = 0;
        reverse(result);
    }
    return result;
}


char *reverse(char *str)
{
    char tmp;
    int len;

    if (str != NULL)
    {
        len = strlen(str);
        for (int i = 0; i < len / 2; i++)
        {
            tmp = *(str + i);
            *(str + i) = *(str + len - i - 1);
            *(str + len - i - 1) = tmp;

        }
    }
    return str;
}

example - counting from -50 to 50 decimal in base 23

-24     -23     -22     -21     -20     -1M     -1L     -1K     -1J     -1I     -1H     -1G     -1F     -1E     -1D
-1C     -1B     -1A     -19     -18     -17     -16     -15     -14     -13     -12     -11     -10     -M      -L
-K      -J      -I      -H      -G      -F      -E      -D      -C      -B      -A      -9      -8      -7      -6
-5      -4      -3      -2      -1      0       1       2       3       4       5       6       7       8       9
A       B       C       D       E       F       G       H       I       J       K       L       M       10      11
12      13      14      15      16      17      18      19      1A      1B      1C      1D      1E      1F      1G
1H      1I      1J      1K      1L      1M      20      21      22      23      24



回答5:


A LUT (lookup table) C++ variant. I didn't check the actual machine code produced, but I believe any modern C++ compiler can catch the idea and compile it well.

static const char nibble2hexChar[] { "0123456789ABCDEF" };
     // 17B in total, because I'm lazy to init it per char

void byteToHex(std::ostream & out, const uint8_t value) {
    out << nibble2hexChar[value>>4] << nibble2hexChar[value&0xF];
}

// this one is actually written more toward short+simple source, than performance
void dwordToHex(std::ostream & out, uint32_t value) {
    int i = 8;
    while (i--) {
        out << nibble2hexChar[value>>28];
        value <<= 4;
    }
}

EDIT: For C code you have just to switch from std::ostream to some other output means, unfortunately your question lacks any details, what you are actually trying to achieve and why you don't use the built-in printf family of C functions.

For example C like this can write to some char* output buffer, converting arbitrary amount of bytes:

/**
 * Writes hexadecimally formatted "n" bytes array "values" into "outputBuffer".
 * Make sure there's enough space in output buffer allocated, and add zero
 * terminator yourself, if you plan to use it as C-string.
 * 
 * @Returns: pointer after the last character written.
 */
char* dataToHex(char* outputBuffer, const size_t n, const unsigned char* values) {
    for (size_t i = 0; i < n; ++i) {
        *outputBuffer++ = nibble2hexChar[values[i]>>4];
        *outputBuffer++ = nibble2hexChar[values[i]&0xF];
    }
    return outputBuffer;
}

And finally, I did help once somebody on code review, as he had performance bottleneck exactly with hexadecimal formatting, but I did there the code variant conversion, without LUT, also the whole process and other answer + performance measuring may be instructional for you, as you may see that the fastest solution doesn't just blindly convert result, but actually mix up with the main operation, to achieve better performance overall. So that's why I'm wonder what you are trying to solve, as the whole problem may often allow for more optimal solution, if you just ask about conversion, printf("%x",..) is safe bet.

Here is that another approach for "to hex" conversion: fast C++ XOR Function




回答6:


  1. Decimal -> Hex

Just iterate throught string and every character convert to int, then you can do

printf("%02x", c);

or use sprintf for saving to another variable

  1. Hex -> Decimal

Code

printf("%c",16 * hexToInt('F') + hexToInt('0'));


int hexToInt(char c)
{
    if(c >= 'a' && c <= 'z')
        c = c - ('a' - 'A');

    int sum;

    sum = c / 16 - 3;
    sum *= 10;
    sum += c % 16;

    return (sum > 9) ? sum - 1 : sum;
}



回答7:


The articles below compare different methods of converting digits to string, hex numbers are not covered but it seems not a big problem to switch from dec to hex

Integers

Fixed and floating point

@EDIT Thank you for pointing that the answer above is not relevant. Common way with no LUT is to split integer into nibbles and map them to ASCII

#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define HI_NIBBLE(b) (((b) >> 4) & 0x0F)
#define LO_NIBBLE(b) ((b) & 0x0F)

void int64_to_char(char carr[], int64_t val){
    memcpy(carr, &val, 8);
}

uint64_t inp = 0xF05C1E3A;
char tmp_st[8];

int main()
{
    int64_to_char(tmp_st,inp);
    printf("Sample: %x\n", inp);
    printf("Result: 0x");
    for (unsigned int k = 8; k; k--){
        char tmp_ch = *(tmp_st+k-1);
        char hi_nib = HI_NIBBLE(tmp_ch);
        char lo_nib = LO_NIBBLE(tmp_ch);
        if (hi_nib || lo_nib){
            printf("%c%c",hi_nib+((hi_nib>9)?55:48),lo_nib+((lo_nib>9)?55:48));
        }
     }
     printf("\n");
    return 0;
}

Another way is to use Allison's Algorithm. I am total noob in ASM, so I post the code in the form I googled it.

Variant 1:

ADD AL,90h
DAA
ADC AL,40h
DAA

Variant 2:

CMP  AL, 0Ah
SBB  AL, 69h
DAS


来源:https://stackoverflow.com/questions/45598583/efficient-conversion-of-a-binary-number-to-hexadecimal-string

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!