substitution cipher with different alphabet length

-上瘾入骨i 2020-12-20 18:33

I would like to implement a simple substitution cipher to mask private ids in URLs.

I know what my IDs will look like (a combination of uppercase ASCII letters, digits

3 answers
  • 2020-12-20 18:56

    It's not a substitution cipher at all, but your question is clear enough.

    Have a look at Base85: http://en.wikipedia.org/wiki/Ascii85

    For Java (as indirectly linked by the Wikipedia article):

    • http://java.freehep.org/freehep-io/apidocs/org/freehep/util/io/ASCII85InputStream.html
    • http://java.freehep.org/freehep-io/apidocs/org/freehep/util/io/ASCII85OutputStream.html
  • 2020-12-20 19:03

    I now have a working solution which you can find here:

    http://pastebin.com/Mctnidng

    The problem was that a) I was losing precision in long codes through this part:

    value = value.add(//
        BigInteger.valueOf((long) Math.pow(alphabet.length, i)) // error here
            .multiply(
                BigInteger.valueOf(ArrayUtils.indexOf(alphabet, c))));
    

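    (Math.pow returns a double, which cannot represent large powers exactly, and the cast to long cannot hold values beyond Long.MAX_VALUE; long just wasn't long enough)

    and b) whenever a text started with the character at offset 0 in the alphabet, that character was dropped, because it behaves like a leading zero. So I needed to add a length character (a single character is enough here, as my codes will never be as long as the alphabet).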
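
The two fixes described above can be sketched as follows. This is only an illustration under stated assumptions, not the code from the pastebin link: the class name `LengthPrefixedCodec`, its method names, the 37-symbol alphabet, and the most-significant-symbol-first digit order are all hypothetical. `BigInteger.pow` keeps every power exact, and one prepended length symbol keeps leading offset-0 symbols from being lost.

```java
import java.math.BigInteger;

final class LengthPrefixedCodec {
    // Hypothetical alphabet for illustration: A-Z, 0-9 and '_' (37 symbols).
    static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_";
    static final BigInteger BASE = BigInteger.valueOf(ALPHABET.length());

    // Exact BASE^i. (long) Math.pow(base, i) goes through double and
    // cannot represent large powers exactly.
    static BigInteger weight(int i) {
        return BASE.pow(i);
    }

    // Assumption: the first symbol is the most significant digit.
    static BigInteger toNumber(String text) {
        BigInteger value = BigInteger.ZERO;
        int n = text.length();
        for (int pos = 0; pos < n; pos++) {
            int digit = ALPHABET.indexOf(text.charAt(pos));
            if (digit < 0) {
                throw new IllegalArgumentException("Not in alphabet: " + text.charAt(pos));
            }
            value = value.add(weight(n - 1 - pos).multiply(BigInteger.valueOf(digit)));
        }
        return value;
    }

    // Inverse of toNumber, except that leading offset-0 symbols are lost.
    static String toText(BigInteger value) {
        StringBuilder sb = new StringBuilder();
        while (value.signum() > 0) {
            BigInteger[] qr = value.divideAndRemainder(BASE);
            sb.append(ALPHABET.charAt(qr[1].intValue()));
            value = qr[0];
        }
        return sb.reverse().toString();
    }

    // Prepends one symbol that encodes the length of the id. For ids of
    // length 1..36 that symbol is never the offset-0 symbol, so nothing is dropped.
    static BigInteger encode(String id) {
        if (id.isEmpty() || id.length() >= ALPHABET.length()) {
            throw new IllegalArgumentException("Length must be 1.." + (ALPHABET.length() - 1));
        }
        return toNumber(ALPHABET.charAt(id.length()) + id);
    }

    // Strips the length symbol again and checks it against the actual length.
    static String decode(BigInteger value) {
        String text = toText(value);
        if (text.isEmpty() || ALPHABET.indexOf(text.charAt(0)) != text.length() - 1) {
            throw new IllegalArgumentException("Length symbol does not match");
        }
        return text.substring(1);
    }
}
```

With this sketch, `toText(toNumber("AAB12"))` yields `"B12"` (the leading `A`s vanish), while `decode(encode("AAB12"))` returns the full `"AAB12"`.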

  • 2020-12-20 19:07

    Inexplicably Character.MAX_RADIX is only 36, but you can always write your own base conversion routine. The following implementation isn't high-performance, but it should be a good starting point:

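    import java.math.BigInteger;

    public class BaseConvert {
        // Parses s as a number in the given base; symbols.charAt(d) is digit d.
        static BigInteger fromString(String s, int base, String symbols) {
            BigInteger num = BigInteger.ZERO;
            BigInteger biBase = BigInteger.valueOf(base);
            for (char ch : s.toCharArray()) {
                int digit = symbols.indexOf(ch);
                if (digit < 0 || digit >= base) {
                    // indexOf returns -1 for unknown symbols, which would corrupt the sum
                    throw new IllegalArgumentException("Invalid symbol: " + ch);
                }
                num = num.multiply(biBase).add(BigInteger.valueOf(digit));
            }
            return num;
        }

        // Formats a non-negative num in the given base, most significant digit first.
        static String toString(BigInteger num, int base, String symbols) {
            if (num.signum() == 0) {
                // the loop below would produce an empty string for zero
                return String.valueOf(symbols.charAt(0));
            }
            StringBuilder sb = new StringBuilder();
            BigInteger biBase = BigInteger.valueOf(base);
            while (num.signum() != 0) {
                sb.append(symbols.charAt(num.mod(biBase).intValue()));
                num = num.divide(biBase);
            }
            return sb.reverse().toString();
        }

        // Builds the string of consecutive characters from 'from' to 'to', inclusive.
        static String span(char from, char to) {
            StringBuilder sb = new StringBuilder();
            for (char ch = from; ch <= to; ch++) {
                sb.append(ch);
            }
            return sb.toString();
        }
    }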
    

    Then you can add a main() test harness to BaseConvert like the following:

    public static void main(String[] args) {
        final String SYMBOLS_AZ09_ = span('A','Z') + span('0','9') + "_";
        final String SYMBOLS_09AZ = span('0','9') + span('A','Z');
        final String SYMBOLS_AZaz09 = span('A','Z') + span('a','z') + span('0','9');
    
        BigInteger n = fromString("GFZHFFFZFZTFZTF_24_F34", 37, SYMBOLS_AZ09_);
    
        // let's convert back to base 37 first...
        System.out.println(toString(n, 37, SYMBOLS_AZ09_));
        // prints "GFZHFFFZFZTFZTF_24_F34"
    
        // now let's see what it looks like in base 62...       
        System.out.println(toString(n, 62, SYMBOLS_AZaz09));
        // prints "ctJvrR5kII1vdHKvjA4"
    
        // now let's test with something we're more familiar with...
        System.out.println(fromString("CAFEBABE", 16, SYMBOLS_09AZ));
        // prints "3405691582"
    
        n = BigInteger.valueOf(3405691582L);
        System.out.println(toString(n, 16, SYMBOLS_09AZ));
        // prints "CAFEBABE"        
    }
    

    Some observations

    • BigInteger is probably the easiest option if the numbers can exceed the range of long
    • You can shuffle the characters in the symbol String; just stick to one "secret" permutation
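
One way to obtain such a permutation is to shuffle the symbol string once with a seeded `java.util.Random`. The sketch below is an illustration only: the class `ShuffledSymbols`, the method `shuffle`, and the seed value are hypothetical. Because the exact shuffle order could in principle depend on the library implementation, it is safer to generate the permutation once and store the resulting string as a constant. This only obscures the IDs; it is not encryption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class ShuffledSymbols {
    // Returns a permutation of symbols; the same seed gives the same
    // permutation within one Java runtime.
    static String shuffle(String symbols, long seed) {
        List<Character> chars = new ArrayList<>();
        for (char ch : symbols.toCharArray()) {
            chars.add(ch);
        }
        Collections.shuffle(chars, new Random(seed));
        StringBuilder sb = new StringBuilder(chars.size());
        for (char ch : chars) {
            sb.append(ch);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical seed; print the permutation once and keep it as a constant.
        System.out.println(shuffle("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_", 20201220L));
    }
}
```

The shuffled string can then be passed as the symbols argument of a base conversion routine in place of the ordered one, for both encoding and decoding.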

    Note regarding "50% compression"

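    You can't generally expect the base 62 string to be around half as long as the base 36 string. The number of digits shrinks only with the logarithm of the base, so going from base 36 to base 62 saves roughly 13% (ln 36 / ln 62 ≈ 0.87). To see how slowly the length drops, here's Long.MAX_VALUE in base 10, 20, and 30: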

        System.out.format("%s%n%s%n%s%n",
            Long.toString(Long.MAX_VALUE, 10), // "9223372036854775807"
            Long.toString(Long.MAX_VALUE, 20), // "5cbfjia3fh26ja7"
            Long.toString(Long.MAX_VALUE, 30)  // "hajppbc1fc207"
        );
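
The same point can be checked numerically. The following sketch (the class `DigitCount` and its methods are hypothetical names) counts the digits of a value in a given radix and estimates the digit count of a number with a given bit length from the ratio of logarithms:

```java
import java.math.BigInteger;

final class DigitCount {
    // Exact number of digits of a positive value in radix 2..36.
    static int digits(BigInteger value, int radix) {
        return value.toString(radix).length();
    }

    // Estimated digits needed for a number of the given bit length:
    // ceil(bits * ln 2 / ln radix). Works for any radix >= 2, including 62.
    static int estimatedDigits(int bits, int radix) {
        return (int) Math.ceil(bits * Math.log(2) / Math.log(radix));
    }

    // Asymptotic length of a base-'to' string relative to a base-'from' string.
    static double lengthRatio(int from, int to) {
        return Math.log(from) / Math.log(to);
    }

    public static void main(String[] args) {
        BigInteger max = BigInteger.valueOf(Long.MAX_VALUE); // 63 bits
        System.out.println(digits(max, 10)); // prints 19
        System.out.println(digits(max, 20)); // prints 15
        System.out.println(digits(max, 30)); // prints 13
        System.out.println(estimatedDigits(63, 62));
        System.out.println(lengthRatio(36, 62));
    }
}
```

Tripling the base from 10 to 30 shortens the string only from 19 to 13 characters, so moving from 36 to 62 symbols cannot come close to halving it.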
    