How does an interpreter interpret the code?

有刺的猬 2020-12-23 22:01

For simplicity, imagine this scenario: we have a 2-bit computer with a pair of 2-bit registers called r1 and r2, and it only works with immediate addressing.

Lets

4 answers
  •  萌比男神i
    2020-12-23 22:32

    The CPU architecture you describe is unfortunately too restricted to make this really clear with all the intermediate steps. Instead, I will write pseudo-C and pseudo-x86 assembler, hopefully in a way that is clear even if you are not terribly familiar with C or x86.

    The compiled JVM bytecode might look something like this:

    ldc 0 # push the first constant (== 1)
    ldc 1 # push the second constant (== 2)
    iadd # pop two integers and push their sum
    istore_0 # pop result and store in local variable
    

    The interpreter has (a binary encoding of) these instructions in an array, and an index referring to the current instruction. It also has an array of constants, and a memory region used as stack and one for local variables. Then the interpreter loop looks like this:

    while (true) {
        switch (instructions[pc]) {
        case LDC:
            sp += 1; // make space for constant
            stack[sp] = constants[instructions[pc+1]];
            pc += 2; // two-byte instruction
            break; // without this, C falls through into the next case
        case IADD:
            stack[sp-1] += stack[sp]; // add to first operand
            sp -= 1; // pop other operand
            pc += 1; // one-byte instruction
            break;
        case ISTORE_0:
            locals[0] = stack[sp];
            sp -= 1; // pop
            pc += 1; // one-byte instruction
            break;
        // ... other cases ...
        }
    }
    

    This C code is compiled into machine code and run. As you can see, it's highly dynamic: it inspects each bytecode instruction every time that instruction is executed, and all values go through the stack (i.e. RAM).
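
    As a minimal sketch, the loop can be turned into a self-contained C function. The opcode values, the OP_HALT instruction, the fixed stack size, and the names run, example_code and example_constants are all assumptions made for illustration; none of them are real JVM encodings or part of the original pseudo-code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-byte opcodes; the values are arbitrary, not real JVM encodings. */
enum { OP_LDC = 1, OP_IADD = 2, OP_ISTORE_0 = 3, OP_HALT = 4 };

enum { STACK_MAX = 16 }; /* assumed stack size for this sketch */

/* Interprets `code` until OP_HALT and returns the final value of local variable 0. */
int32_t run(const uint8_t *code, const int32_t *constants)
{
    int32_t stack[STACK_MAX];
    int32_t locals[1] = { 0 };
    int sp = -1;   /* index of the top of the stack; -1 means empty */
    size_t pc = 0; /* index of the current instruction */

    for (;;) {
        switch (code[pc]) {
        case OP_LDC:
            assert(sp + 1 < STACK_MAX);
            sp += 1;                             /* make space for the constant */
            stack[sp] = constants[code[pc + 1]]; /* operand byte is a pool index */
            pc += 2;                             /* two-byte instruction */
            break;
        case OP_IADD:
            assert(sp >= 1);
            stack[sp - 1] += stack[sp]; /* add to first operand */
            sp -= 1;                    /* pop other operand */
            pc += 1;
            break;
        case OP_ISTORE_0:
            assert(sp >= 0);
            locals[0] = stack[sp];
            sp -= 1;
            pc += 1;
            break;
        case OP_HALT:
            return locals[0];
        default:
            assert(0 && "unknown opcode");
            return 0;
        }
    }
}

/* The example program: ldc 0; ldc 1; iadd; istore_0; halt */
const uint8_t example_code[] = { OP_LDC, 0, OP_LDC, 1, OP_IADD, OP_ISTORE_0, OP_HALT };
const int32_t example_constants[] = { 1, 2 };
```

    Calling run(example_code, example_constants) walks the four instructions one at a time and returns 3, the value left in local variable 0. Note that the same compiled run function would execute any other program handed to it, which is exactly why it cannot assume anything about the operands.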

    While the actual addition itself probably happens in a register, the code surrounding the addition is rather different from what a Java-to-machine-code compiler would emit. Here's an excerpt from what a C compiler might turn the above into (pseudo-x86, destination operand first):

    .ldc:
    incl %esi # increment the variable pc, first half of pc += 2;
    movb %ecx, program(%esi) # load byte after instruction (the constant index)
    movl %eax, constants(,%ecx,4) # load constant from pool
    incl %edi # increment sp
    movl stack(,%edi,4), %eax # write constant onto stack
    incl %esi # other half of pc += 2
    jmp .EndOfSwitch
    
    .iadd:
    movl %eax, stack(,%edi,4) # load the operand on top of the stack
    decl %edi # sp -= 1;
    addl stack(,%edi,4), %eax # add it to the first operand, in place on the stack
    incl %esi # pc += 1;
    jmp .EndOfSwitch
    

    You can see that the operands for the addition come from memory instead of being hardcoded, even though for the purposes of the Java program they are constant. That's because for the interpreter, they are not constant. The interpreter is compiled once and then must be able to execute all sorts of programs, without generating specialized code.

    The purpose of the JIT compiler is to do just that: generate specialized code. A JIT can analyze the ways the stack is used to transfer data, the actual values of various constants in the program, and the sequence of calculations performed, to generate code that does the same thing more efficiently. In our example program, it would allocate local variable 0 to a register, replace the accesses to the constant table with moves of constants into registers (movl %eax, $1), and redirect the stack accesses to the right machine registers. Ignoring a few more optimizations (copy propagation, constant folding and dead code elimination) that would normally be done, it might end up with code like this:

    movl %ebx, $1 # ldc 0
    movl %ecx, $2 # ldc 1
    movl %eax, %ebx # (1/2) iadd
    addl %eax, %ecx # (2/2) iadd
    # no istore_0, local variable 0 == %eax, so we're done
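
    The "redirect the stack accesses to registers" step can be sketched in C as well. Because the stack depth at every instruction is known when the bytecode is translated, stack slot N can simply be renamed to a register rN. The sketch below emits a textual listing rather than machine code, and everything in it is an assumption for illustration: the opcode values, OP_HALT, the rN and local0 names, and the translate function. A real JIT does far more than this (register allocation, coalescing local0 with r0, and the optimizations mentioned above).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical one-byte opcodes; the values are arbitrary, not real JVM encodings. */
enum { OP_LDC = 1, OP_IADD = 2, OP_ISTORE_0 = 3, OP_HALT = 4 };

/*
 * Translates bytecode into a pseudo-register listing written to `out`.
 * The stack depth is tracked at translation time, so stack slot N becomes the
 * hypothetical register rN and no run-time stack is needed.
 * Returns the number of characters written, or -1 on error.
 */
int translate(const uint8_t *code, const int32_t *constants, char *out, size_t cap)
{
    int depth = 0; /* number of values on the (virtual) stack */
    size_t pc = 0, len = 0;

    if (cap == 0) return -1;
    out[0] = '\0';

    for (;;) {
        int n = 0;
        switch (code[pc]) {
        case OP_LDC: /* the constant is read now, not at run time */
            n = snprintf(out + len, cap - len, "r%d = %d\n",
                         depth, (int)constants[code[pc + 1]]);
            depth += 1;
            pc += 2;
            break;
        case OP_IADD:
            if (depth < 2) return -1;
            n = snprintf(out + len, cap - len, "r%d = r%d + r%d\n",
                         depth - 2, depth - 2, depth - 1);
            depth -= 1;
            pc += 1;
            break;
        case OP_ISTORE_0:
            if (depth < 1) return -1;
            n = snprintf(out + len, cap - len, "local0 = r%d\n", depth - 1);
            depth -= 1;
            pc += 1;
            break;
        case OP_HALT:
            return (int)len;
        default:
            return -1;
        }
        if (n < 0 || (size_t)n >= cap - len) return -1; /* output truncated */
        len += (size_t)n;
    }
}

/* The example program: ldc 0; ldc 1; iadd; istore_0; halt */
const uint8_t example_code[] = { OP_LDC, 0, OP_LDC, 1, OP_IADD, OP_ISTORE_0, OP_HALT };
const int32_t example_constants[] = { 1, 2 };
```

    For the example program, translate fills the buffer with four lines: r0 = 1, r1 = 2, r0 = r0 + r1, local0 = r0. The switch on the opcode and the constant-pool lookup happen once, during translation, instead of every time the program runs.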
    
