Map incrementing integer range to six-digit base 26 max, but unpredictably

前端 未结 8 1387
时光说笑
时光说笑 2020-12-04 22:39

I want to design a URL shortener for a particular use case and type of end-user that I have targetted. I have decided that I want the URLs to be stored internally according

8条回答
  •  天命终不由人
    2020-12-04 23:15

    Why not just shuffle the bits around in a specific order before converting to the base 26 value? For example, bit 0 becomes bit 5, bit 1 becomes bit 2, etc. To decode, just do the reverse.

    Here's an example in Python. (Edited now to include converting the base too.)

    import random
    
    # generate a random bit order
    # you'll need to save this mapping permanently, perhaps just hardcode it
    # map how ever many bits you need to represent your integer space
    mapping = range(28)
    mapping.reverse()
    #random.shuffle(mapping)
    
    # alphabet for changing from base 10
    chars = 'abcdefghijklmnopqrstuvwxyz'
    
    # shuffle the bits
    def encode(n):
        result = 0
        for i, b in enumerate(mapping):
            b1 = 1 << i
            b2 = 1 << mapping[i]
            if n & b1:
                result |= b2
        return result
    
    # unshuffle the bits
    def decode(n):
        result = 0
        for i, b in enumerate(mapping):
            b1 = 1 << i
            b2 = 1 << mapping[i]
            if n & b2:
                result |= b1
        return result
    
    # change the base
    def enbase(x):
        n = len(chars)
        if x < n:
            return chars[x]
        return enbase(x/n) + chars[x%n]
    
    # go back to base 10
    def debase(x):
        n = len(chars)
        result = 0
        for i, c in enumerate(reversed(x)):
            result += chars.index(c) * (n**i)
        return result
    
    # test it out
    for a in range(200):
        b = encode(a)
        c = enbase(b)
        d = debase(c)
        e = decode(d)
        while len(c) < 7:
            c = ' ' + c
        print '%6d %6d %s %6d %6d' % (a, b, c, d, e)
    

    The output of this script, showing the encoding and decoding process:

       0            0       a            0    0
       1    134217728  lhskyi    134217728    1
       2     67108864  fqwfme     67108864    2
       3    201326592  qyoqkm    201326592    3
       4     33554432  cvlctc     33554432    4
       5    167772160  oddnrk    167772160    5
       6    100663296  imhifg    100663296    6
       7    234881024  ttztdo    234881024    7
       8     16777216  bksojo     16777216    8
       9    150994944  mskzhw    150994944    9
      10     83886080  hbotvs     83886080   10
      11    218103808  sjheua    218103808   11
      12     50331648  egdrcq     50331648   12
      13    184549376  pnwcay    184549376   13
      14    117440512  jwzwou    117440512   14
      15    251658240  veshnc    251658240   15
      16      8388608   sjheu      8388608   16
      17    142606336  mabsdc    142606336   17
      18     75497472  gjfmqy     75497472   18
      19    209715200  rqxxpg    209715200   19
    

    Note that zero maps to zero, but you can just skip that number.

    This is simple, efficient and should be good enough for your purposes. If you really needed something secure I obviously would not recommend this. It's basically a naive block cipher. There won't be any collisions.

    Probably best to make sure that bit N doesn't ever map to bit N (no change) and probably best if some low bits in the input get mapped to higher bits in the output, in general. In other words, you may want to generate the mapping by hand. In fact, a decent mapping would be simply reversing the bit order. (That's what I did for the sample output above.)

提交回复
热议问题