As a sort of follow-up to the question called Differences between MSIL and Java bytecode?, what are the (major) differences or similarities between how the Java Virtual Machine works and how the .NET Framework works?
It is not a virtual machine; the .NET Framework compiles assemblies into native code the first time they run:
In computing, just-in-time compilation (JIT), also known as dynamic translation, is a technique for improving the runtime performance of a computer program. JIT builds upon two earlier ideas in run-time environments: bytecode compilation and dynamic compilation. It converts code at runtime prior to executing it natively, for example bytecode into native machine code. The performance improvement over interpreters originates from caching the results of translating blocks of code, and not simply reevaluating each line or operand each time it is met (see Interpreted language). It also has advantages over statically compiling the code at development time, as it can recompile the code if this is found to be advantageous, and may be able to enforce security guarantees. Thus JIT can combine some of the advantages of interpretation and static (ahead-of-time) compilation.
Several modern runtime environments, such as Microsoft's .NET Framework, most implementations of Java, and most recently Actionscript 3, rely on JIT compilation for high-speed code execution.
Source: http://en.wikipedia.org/wiki/Just-in-time_compilation
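To make the caching point in the quote concrete, here is a minimal sketch in Java. It is not how the JVM or the .NET runtime is implemented: the two-instruction "bytecode", the `ToyJit` class, and every name in it are made up for illustration, and the "translation" step builds a composed Java closure as a stand-in for emitting real machine code. It only shows the idea of decoding a block once and reusing the result, versus re-decoding each instruction every time it is met.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

public class ToyJit {
    // Hypothetical "bytecode": each instruction transforms a single int.
    enum Op { ADD, MUL }

    static final class Insn {
        final Op op;
        final int operand;
        Insn(Op op, int operand) { this.op = op; this.operand = operand; }
    }

    // Counts how many times a block was translated (for demonstration only).
    static int translations = 0;

    // Cache of already-translated blocks, keyed by an assumed block name.
    static final Map<String, IntUnaryOperator> cache = new HashMap<>();

    // Plain interpreter: decodes every instruction again on every call.
    static int interpret(Insn[] code, int x) {
        for (Insn i : code) {
            switch (i.op) {
                case ADD: x += i.operand; break;
                case MUL: x *= i.operand; break;
            }
        }
        return x;
    }

    // "Translation": decode the block once and return a composed closure.
    static IntUnaryOperator translate(Insn[] code) {
        translations++;
        IntUnaryOperator f = IntUnaryOperator.identity();
        for (Insn i : code) {
            final int k = i.operand;
            IntUnaryOperator step;
            switch (i.op) {
                case ADD: step = v -> v + k; break;
                case MUL: step = v -> v * k; break;
                default: throw new IllegalStateException("unknown op");
            }
            f = f.andThen(step);
        }
        return f;
    }

    // Translate on first use, then reuse the cached result on later calls.
    static int run(String name, Insn[] code, int x) {
        return cache.computeIfAbsent(name, n -> translate(code)).applyAsInt(x);
    }

    public static void main(String[] args) {
        Insn[] body = { new Insn(Op.ADD, 2), new Insn(Op.MUL, 3) };
        System.out.println(interpret(body, 1)); // prints 9
        System.out.println(run("f", body, 1));  // prints 9
        System.out.println(run("f", body, 5));  // prints 21
        System.out.println(translations);       // prints 1
    }
}
```

The second call to `run("f", ...)` skips translation entirely, which is the saving the quoted passage attributes to JIT compilation over a pure interpreter.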
In addition, the .NET Framework contains a virtual machine (the Common Language Runtime), just like Java.