Below is my current char* to hex string function. I wrote it as an exercise in bit manipulation. It takes ~7ms on an AMD Athlon MP 2800+ to hexify a 10 million byte array. Is
I'm not sure processing more bytes at a time will be better... you'll probably just get tons of cache misses, which would slow it down significantly.
What you might try, though, is unrolling the loop: take larger steps and convert more characters each time through the loop, to remove some of the loop overhead.
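
To make the unrolling idea concrete, here is a rough sketch. It is not your function (I'm only guessing at its shape), and the name `to_hex_unrolled`, the lookup table, and the unroll factor of 4 are all my own assumptions. It handles four input bytes per iteration and finishes any leftover bytes in a plain loop:

```c
#include <stddef.h>

/* Hypothetical helper, not the original poster's code.
   Writes 2*len lowercase hex digits plus a terminating NUL into dst,
   so dst must have room for 2*len + 1 chars. */
void to_hex_unrolled(const unsigned char *src, size_t len, char *dst)
{
    static const char digits[] = "0123456789abcdef";
    size_t i = 0;

    /* Unrolled main loop: four input bytes (eight output chars) per pass,
       so the loop condition and increment run a quarter as often. */
    for (; i + 4 <= len; i += 4) {
        dst[0] = digits[src[i]     >> 4];
        dst[1] = digits[src[i]     & 0x0F];
        dst[2] = digits[src[i + 1] >> 4];
        dst[3] = digits[src[i + 1] & 0x0F];
        dst[4] = digits[src[i + 2] >> 4];
        dst[5] = digits[src[i + 2] & 0x0F];
        dst[6] = digits[src[i + 3] >> 4];
        dst[7] = digits[src[i + 3] & 0x0F];
        dst += 8;
    }

    /* Tail loop: the remaining 0 to 3 bytes, one at a time. */
    for (; i < len; ++i) {
        *dst++ = digits[src[i] >> 4];
        *dst++ = digits[src[i] & 0x0F];
    }

    *dst = '\0';
}
```

Whether this actually beats the simple loop depends on your compiler and flags, since an optimizing compiler may already unroll it for you, so time both versions before keeping it.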