I've got a few languages I've been building as interpreters. When I'm ready to take "that next step", what options are best for non-native compiled formats... what are
Code generation is my business :-)
Comments on a few options:
CLR:
LLVM:
C--:
C as target language:
Summary: anything except C is a reasonable choice. For the best combination of flexibility, quality, and expected longevity, I'd probably recommend LLVM.
Full disclosure: I am affiliated with the C-- project.
Pros/cons:
CLR:
LLVM:
C as target language:
Java ByteCode as target:
From all the above, I think targeting Java ByteCode would probably be best for you.
EDIT: this is actually an answer to a comment, but 300 chars are not enough.
JByteCode iffy: I agree (being a Smalltalker, I find JBytecode too limiting).
VM-wise, I think there is a relatively wide range of performance you can get from a JVM, starting at pure, slow bytecode interpreters and going up to sophisticated high-end JITting VMs (IBM). I guess CLR VMs will catch up, as MS is stealing and integrating all innovation sooner or later anyway, and the techniques to speed up dynamic translation are published (read the Self papers, for example). LLVM will probably progress a bit more slowly, but who knows. With C, you will benefit from better compilers for free, but things like dynamic retranslation are hard to implement with C as the target. My own system uses a mixture of precompiled and dynamically compiled code (having it all: a slow bytecode interpreter, a JITter, and precompiled static C code in one memory space).
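To make the "everything in one memory space" idea a bit more concrete, here is a minimal sketch in Python. It is not how my system (or any real VM) is built: the opcodes, the `Function` wrapper, and the `HOT_THRESHOLD` value are all made up for illustration, and the "JIT" just translates bytecode into a Python lambda instead of machine code. The point is only that a slow interpreter and dynamically translated code can sit behind the same call interface.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical bytecode: (opcode, operand). The operand is ignored by ARG/ADD/MUL.
Instr = Tuple[str, int]


def interpret(code: List[Instr], arg: int) -> int:
    """Slow path: a plain stack-based bytecode interpreter."""
    stack: List[int] = []
    for op, operand in code:
        if op == "ARG":
            stack.append(arg)
        elif op == "PUSH":
            stack.append(operand)
        elif op in ("ADD", "MUL"):
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if op == "ADD" else a * b)
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack.pop()


def translate(code: List[Instr]) -> Callable[[int], int]:
    """Fast path: build one expression from the bytecode and compile it once."""
    stack: List[str] = []
    for op, operand in code:
        if op == "ARG":
            stack.append("arg")
        elif op == "PUSH":
            stack.append(repr(operand))
        elif op in ("ADD", "MUL"):
            b, a = stack.pop(), stack.pop()
            sign = "+" if op == "ADD" else "*"
            stack.append(f"({a} {sign} {b})")
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return eval(f"lambda arg: {stack.pop()}")


class Function:
    """Starts out interpreted; switches to translated code once it is hot."""

    HOT_THRESHOLD = 3  # arbitrary value chosen for the sketch

    def __init__(self, code: List[Instr]) -> None:
        self.code = code
        self.calls = 0
        self.compiled: Optional[Callable[[int], int]] = None

    def __call__(self, arg: int) -> int:
        if self.compiled is not None:
            return self.compiled(arg)
        self.calls += 1
        if self.calls >= self.HOT_THRESHOLD:
            self.compiled = translate(self.code)
        return interpret(self.code, arg)


# One table holding both dynamically handled and "precompiled" entries.
table = {
    "twice_plus_one": Function(
        [("ARG", 0), ("PUSH", 2), ("MUL", 0), ("PUSH", 1), ("ADD", 0)]
    ),
    "square": lambda x: x * x,  # stands in for statically precompiled code
}
```

Callers just do `table["twice_plus_one"](5)` and never need to know which tier actually ran. Doing the same thing with C as the target is harder, because the retranslation step means invoking a C compiler and loading the result at runtime.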
LLVM seems promising. The team claims better runtime performance from gcc with their backend than with gcc's native backend. The ability to compile from the AST is really interesting (take a look at the tutorial). It can compile and optimize at runtime, which is a must for dynamic languages. It can also run as a pure interpreter.
I am considering using LLVM in a project that involves creating a Tcl-like language. Tcl is heavily dynamic, so I don't know what this implies at this stage, but I'm confident that I'll get better performance than with the current bytecode-based core.
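As a rough illustration of what "compiling from the AST" can look like, here is a small sketch that walks a toy expression tree and prints textual LLVM IR. It is not taken from the LLVM tutorial (which uses LLVM's C++ API); the `Num` and `BinOp` node classes and the `emit_function` helper are hypothetical names invented for this example, and only 32-bit integer arithmetic is handled.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str  # one of "+", "-", "*"
    left: "Expr"
    right: "Expr"


Expr = Union[Num, BinOp]

# Map source operators to LLVM integer instructions.
LLVM_OPS = {"+": "add", "-": "sub", "*": "mul"}


def emit_function(name: str, expr: Expr) -> str:
    """Return textual LLVM IR for a zero-argument function computing expr."""
    lines: List[str] = []
    counter = 0

    def emit(node: Expr) -> str:
        # Returns the operand text (a constant or a register) holding node's value.
        nonlocal counter
        if isinstance(node, Num):
            return str(node.value)
        lhs = emit(node.left)
        rhs = emit(node.right)
        reg = f"%t{counter}"
        counter += 1
        lines.append(f"  {reg} = {LLVM_OPS[node.op]} i32 {lhs}, {rhs}")
        return reg

    result = emit(expr)
    body = "\n".join(lines + [f"  ret i32 {result}"])
    return f"define i32 @{name}() {{\nentry:\n{body}\n}}\n"


# (1 + 2) * 3
print(emit_function("answer", BinOp("*", BinOp("+", Num(1), Num(2)), Num(3))))
```

Each AST node becomes at most one instruction writing a fresh register, which is about as simple as a code generator gets. The resulting text could be saved to a `.ll` file and handed to LLVM's own tools (for example `llc`), which then take care of optimization and native code generation.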