What techniques promote efficient opcode dispatch to make a fast interpreter? Are there some techniques that only work well on modern hardware and others that don\'t work we
Benchmarking is a good technique on making anything fast on given platform(s). Test, refine, test again, improve.
I don't think you can get any better answer. There's lot of techniques to make interpreters. But I give you a tip, do not do trade offs, just choose what you really need and pursue those goals.