What is the most efficient way to encode an arbitrary GUID into readable ASCII (33-127)?

自闭症患者 2020-11-30 22:08

The standard string representation of a GUID takes about 36 characters, which is very nice but also really wasteful. I am wondering how to encode it in the shortest possible

8 answers
  •  暖寄归人
    2020-11-30 22:21

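    You have 95 characters available, so each character carries more than 6 bits but not quite 7 (log2(95) is about 6.57). A 128-bit GUID therefore needs 128/log2(95) = about 19.48 characters, which rounds up to 20. (Code 127 is DEL, which is not printable; without it 94 characters remain, and 128/log2(94) = about 19.53 still rounds up to 20.) If saving 2 characters in the encoded form (presumably relative to the 22 characters of Base64) is worth the loss of readability to you, something like (pseudocode):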

    char encoded[21];
    unsigned __int128 guid;    // 128-bit value (GCC/Clang extension; long long is only 64 bits)

    for (int i = 0; i < 20; ++i) {
      encoded[i] = (char)(guid % 95 + 33);   // least significant digit first
      guid /= 95;
    }
    encoded[20] = '\0';
    

    which is basically the generic "encode a number in some base" code, except that there's no need to reverse the "digits", since the order is arbitrary as long as the decoder agrees with it (and little-endian is more direct and natural). Getting the guid back from the encoded string is, in a very similar way, the polynomial computation in base 95 (after subtracting 33 from each digit, of course):

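    guid = 0;

    // The encoder stores the least significant digit first, so Horner's
    // rule has to consume the digits from the last one down to the first.
    for (int i = 19; i >= 0; --i) {
      guid *= 95;
      guid += encoded[i] - 33;
    }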
    

    essentially using Horner's approach to polynomial evaluation.
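
    For anyone who wants to try the round trip, here is a minimal Python sketch of the same idea. It is not the code from the answer above: the names encode_guid, decode_guid, FIRST, BASE and LENGTH are made up for illustration, and it assumes the alphabet is codes 33 to 126 (94 symbols), leaving out DEL (127). The length still works out to 20 characters.

```python
import math
import uuid

# Assumption: printable, non-space ASCII is 33..126, i.e. 94 symbols.
# Code 127 (DEL) is a control character, so it is left out here.
FIRST = 33
BASE = 94
LENGTH = math.ceil(128 / math.log2(BASE))  # characters needed for 128 bits


def encode_guid(guid: uuid.UUID) -> str:
    """Encode a 128-bit GUID as LENGTH characters, least significant digit first."""
    value = guid.int
    digits = []
    for _ in range(LENGTH):
        value, digit = divmod(value, BASE)
        digits.append(chr(digit + FIRST))
    return "".join(digits)


def decode_guid(text: str) -> uuid.UUID:
    """Invert encode_guid with Horner's rule, most significant digit first."""
    if len(text) != LENGTH:
        raise ValueError(f"expected {LENGTH} characters, got {len(text)}")
    value = 0
    # The last character holds the most significant digit, so walk backwards.
    for ch in reversed(text):
        digit = ord(ch) - FIRST
        if not 0 <= digit < BASE:
            raise ValueError(f"character {ch!r} is outside the alphabet")
        value = value * BASE + digit
    # uuid.UUID raises ValueError if the value does not fit in 128 bits.
    return uuid.UUID(int=value)
```

    Something like decode_guid(encode_guid(uuid.uuid4())) should give back the original GUID. Python integers are arbitrary precision, so the 128-bit arithmetic needs no special type here.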
