I would like to implement a simple substitution cipher to mask private ids in URLs.
I know what my IDs will look like (combination of uppercase ASCII letters, digits
It's not a substitution cipher at all, but your question is clear enough.
Have a look at Base85: http://en.wikipedia.org/wiki/Ascii85
For Java (as indirectly linked by the Wikipedia article):
I now have a working solution which you can find here:
http://pastebin.com/Mctnidng
The problem was that a) I was losing precision in long codes through this part:
value = value.add(//
        BigInteger.valueOf((long) Math.pow(alphabet.length, i)) // error here
                .multiply(
                        BigInteger.valueOf(ArrayUtils.indexOf(alphabet, c))));
(long just wasn't long enough)
and b) whenever a text started with the character at offset 0 in the alphabet, that character was dropped, so I needed to add a length character. A single character is enough here, as my codes will never be as long as the alphabet.
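Both problems can be reproduced in isolation. The following is a minimal sketch, not the code from the pastebin: the class name PrecisionDemo, the 37-symbol alphabet, and the helpers weight, decode, and toText are all assumptions made for illustration. It computes the positional weight with BigInteger.pow instead of Math.pow, and restores a dropped leading symbol by padding to a known length.

```java
import java.math.BigInteger;

public class PrecisionDemo {
    // Hypothetical alphabet for illustration: A-Z, 0-9 and underscore (37 symbols).
    static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_";

    // Exact positional weight, replacing (long) Math.pow(alphabet.length, i).
    static BigInteger weight(int i) {
        return BigInteger.valueOf(ALPHABET.length()).pow(i);
    }

    // Decodes text as a number, most significant symbol first.
    static BigInteger decode(String text) {
        BigInteger value = BigInteger.ZERO;
        for (int i = 0; i < text.length(); i++) {
            int digit = ALPHABET.indexOf(text.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException("Invalid symbol: " + text.charAt(i));
            }
            int position = text.length() - 1 - i;
            value = value.add(weight(position).multiply(BigInteger.valueOf(digit)));
        }
        return value;
    }

    // Converts back to text, left-padding with the offset-0 symbol up to the known length.
    static String toText(BigInteger value, int length) {
        BigInteger base = BigInteger.valueOf(ALPHABET.length());
        StringBuilder sb = new StringBuilder();
        while (value.signum() > 0) {
            BigInteger[] qr = value.divideAndRemainder(base);
            sb.append(ALPHABET.charAt(qr[1].intValue()));
            value = qr[0];
        }
        while (sb.length() < length) {
            sb.append(ALPHABET.charAt(0));
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        // 37^13 does not fit in a long, so the cast saturates at Long.MAX_VALUE.
        System.out.println((long) Math.pow(37, 13) == Long.MAX_VALUE); // prints "true"
        // "A" is the offset-0 symbol, so "AB" and "B" decode to the same number.
        System.out.println(decode("AB").equals(decode("B"))); // prints "true"
        // Knowing the original length is enough to restore the dropped symbol.
        System.out.println(toText(decode("AB"), 2)); // prints "AB"
    }
}
```

Storing the length as one extra symbol, as described above, is one way to make that length available when decoding.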
Inexplicably Character.MAX_RADIX is only 36, but you can always write your own base conversion routine. The following implementation isn't high-performance, but it should be a good starting point:
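import java.math.BigInteger;

public class BaseConvert {
    // Parses s as a number in the given base; symbols.charAt(d) is the symbol for digit d.
    static BigInteger fromString(String s, int base, String symbols) {
        BigInteger num = BigInteger.ZERO;
        BigInteger biBase = BigInteger.valueOf(base);
        for (char ch : s.toCharArray()) {
            int digit = symbols.indexOf(ch);
            // indexOf returns -1 for unknown symbols, which would silently corrupt the result
            if (digit < 0 || digit >= base) {
                throw new IllegalArgumentException("Invalid symbol: " + ch);
            }
            num = num.multiply(biBase).add(BigInteger.valueOf(digit));
        }
        return num;
    }

    // Formats a non-negative num in the given base using the given symbols.
    static String toString(BigInteger num, int base, String symbols) {
        if (num.signum() == 0) {
            return String.valueOf(symbols.charAt(0)); // zero would otherwise yield ""
        }
        StringBuilder sb = new StringBuilder();
        BigInteger biBase = BigInteger.valueOf(base);
        while (!num.equals(BigInteger.ZERO)) {
            // digits come out least significant first, hence the reverse() below
            sb.append(symbols.charAt(num.mod(biBase).intValue()));
            num = num.divide(biBase);
        }
        return sb.reverse().toString();
    }

    // Builds the string of all characters from 'from' to 'to', inclusive.
    static String span(char from, char to) {
        StringBuilder sb = new StringBuilder();
        for (char ch = from; ch <= to; ch++) {
            sb.append(ch);
        }
        return sb.toString();
    }
}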
Then you can have a main() test harness like the following:
public static void main(String[] args) {
    final String SYMBOLS_AZ09_ = span('A','Z') + span('0','9') + "_";
    final String SYMBOLS_09AZ = span('0','9') + span('A','Z');
    final String SYMBOLS_AZaz09 = span('A','Z') + span('a','z') + span('0','9');

    BigInteger n = fromString("GFZHFFFZFZTFZTF_24_F34", 37, SYMBOLS_AZ09_);

    // let's convert back to base 37 first...
    System.out.println(toString(n, 37, SYMBOLS_AZ09_));
    // prints "GFZHFFFZFZTFZTF_24_F34"

    // now let's see what it looks like in base 62...
    System.out.println(toString(n, 62, SYMBOLS_AZaz09));
    // prints "ctJvrR5kII1vdHKvjA4"

    // now let's test with something we're more familiar with...
    System.out.println(fromString("CAFEBABE", 16, SYMBOLS_09AZ));
    // prints "3405691582"

    n = BigInteger.valueOf(3405691582L);
    System.out.println(toString(n, 16, SYMBOLS_09AZ));
    // prints "CAFEBABE"
}
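Because the conversion takes the symbol string as a parameter, the order of the symbols is arbitrary, and a privately chosen ordering makes the output harder to read back. Here is a rough, self-contained sketch of that idea. The class ShuffledSymbols, its helpers encode, decode, and shuffle, and the seed value are assumptions for illustration only. This is obfuscation rather than encryption, and java.util.Random is not a cryptographic generator.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class ShuffledSymbols {
    // Returns a permutation of symbols that is reproducible for a given seed.
    static String shuffle(String symbols, long seed) {
        List<Character> chars = new ArrayList<>();
        for (char ch : symbols.toCharArray()) {
            chars.add(ch);
        }
        Collections.shuffle(chars, new Random(seed));
        StringBuilder sb = new StringBuilder();
        for (char ch : chars) {
            sb.append(ch);
        }
        return sb.toString();
    }

    // Parses s as a number whose base is the length of the symbol string.
    static BigInteger decode(String s, String symbols) {
        BigInteger base = BigInteger.valueOf(symbols.length());
        BigInteger num = BigInteger.ZERO;
        for (char ch : s.toCharArray()) {
            int digit = symbols.indexOf(ch);
            if (digit < 0) {
                throw new IllegalArgumentException("Invalid symbol: " + ch);
            }
            num = num.multiply(base).add(BigInteger.valueOf(digit));
        }
        return num;
    }

    // Formats a non-negative number using the symbol string as the digit set.
    static String encode(BigInteger num, String symbols) {
        if (num.signum() == 0) {
            return String.valueOf(symbols.charAt(0));
        }
        BigInteger base = BigInteger.valueOf(symbols.length());
        StringBuilder sb = new StringBuilder();
        while (num.signum() > 0) {
            BigInteger[] qr = num.divideAndRemainder(base);
            sb.append(symbols.charAt(qr[1].intValue()));
            num = qr[0];
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        String plain = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_";
        String base62 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
        // The seed plays the role of the secret; 42L is a placeholder value.
        String secret = shuffle(base62, 42L);

        String id = "GFZHFFFZFZTFZTF_24_F34";
        String masked = encode(decode(id, plain), secret);
        String restored = encode(decode(masked, secret), plain);

        System.out.println(masked);
        System.out.println(restored.equals(id)); // prints "true"
    }
}
```

The same permutation must be used for masking and unmasking, so the seed (or the shuffled string itself) has to be stored wherever both operations run.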
BigInteger is probably easiest if the numbers can exceed long.
You can shuffle the chars in the symbol String; just stick to one "secret" permutation.
You can't generally expect the base 62 string to be around half as long as the base 36 string. Here's Long.MAX_VALUE in base 10, 20, and 30:
System.out.format("%s%n%s%n%s%n",
    Long.toString(Long.MAX_VALUE, 10), // "9223372036854775807"
    Long.toString(Long.MAX_VALUE, 20), // "5cbfjia3fh26ja7"
    Long.toString(Long.MAX_VALUE, 30)  // "hajppbc1fc207"
);
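The reason is that the number of digits grows with the logarithm of the value, so going from base b1 to base b2 scales the length by roughly log(b1) / log(b2). A small sketch to check this; the class DigitCount and its methods digits and lengthRatio are made up for this example:

```java
import java.math.BigInteger;

public class DigitCount {
    // Counts the digits of a non-negative n in the given base by repeated division.
    static int digits(BigInteger n, int base) {
        BigInteger b = BigInteger.valueOf(base);
        int count = 1;
        while (n.compareTo(b) >= 0) {
            n = n.divide(b);
            count++;
        }
        return count;
    }

    // Approximate factor by which the length changes when moving from one base to another.
    static double lengthRatio(int fromBase, int toBase) {
        return Math.log(fromBase) / Math.log(toBase);
    }

    public static void main(String[] args) {
        BigInteger max = BigInteger.valueOf(Long.MAX_VALUE);
        System.out.println(digits(max, 10)); // prints "19"
        System.out.println(digits(max, 20)); // prints "15"
        System.out.println(digits(max, 30)); // prints "13"
        System.out.println(digits(max, 36)); // prints "13"
        System.out.println(digits(max, 62)); // prints "11"
        // approximately 0.868: base 62 saves about 13% over base 36, not 50%
        System.out.println(lengthRatio(36, 62));
    }
}
```

Halving the length requires squaring the base, so base 36 would have to become base 1296.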