I'm going to implement a tokenizer in Python and I was wondering if you could offer some style advice?
I've implemented a tokenizer before in C and in Java so I'm
"Is there a better alternative to just simply returning a list of tuples?"
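That's essentially the approach used by the standard library's "tokenize" module for tokenizing Python source code: it produces one tuple per token (as a generator of named tuples rather than a prebuilt list). Returning a simple list of tuples can work very well.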
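As a rough illustration of the tuple-based style, here is a minimal sketch of a regex-driven tokenizer. The token kinds, the `Token` fields, and the function names `iter_tokens` and `tokenize_to_list` are all made up for this example and are not taken from the "tokenize" module; adapt them to whatever language you are actually lexing.

```python
import re
from typing import Iterator, List, NamedTuple


class Token(NamedTuple):
    """A token is still a plain tuple, but its fields can be read by name."""
    kind: str
    text: str
    pos: int


# Hypothetical token grammar, for illustration only.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),      # whitespace, discarded
    ("MISMATCH", r"."),    # any other single character is an error
]
MASTER_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def iter_tokens(source: str) -> Iterator[Token]:
    """Yield tokens lazily, one at a time."""
    for match in MASTER_RE.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise SyntaxError(
                f"unexpected character {match.group()!r} at offset {match.start()}"
            )
        yield Token(kind, match.group(), match.start())


def tokenize_to_list(source: str) -> List[Token]:
    """Return all tokens at once as a list of tuples."""
    return list(iter_tokens(source))
```

Calling `tokenize_to_list("x = 42 + y")` gives a list whose items unpack like ordinary tuples (`kind, text, pos = tokens[0]`) and also support attribute access (`tokens[0].kind`). Keeping the generator version around costs little and lets callers process large inputs without building the whole list first.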