How are these Java byte offsets calculated?

Submitted by 余生颓废 on 2019-12-11 02:28:54

Question


I have the following Java code:

public int sign(int a) {
  if(a<0) return -1;
  else if (a>0) return 1;
  else return 0;
}

When compiled, it produces the following bytecode:

public int sign(int);
  Code:
     0: iload_1
     1: ifge          6
     4: iconst_m1
     5: ireturn
     6: iload_1
     7: ifle          12
    10: iconst_1
    11: ireturn
    12: iconst_0
    13: ireturn

I want to know how the byte offset (the first column) is calculated. In particular, why do the ifge and ifle instructions take up 3 bytes when all the other instructions are single-byte instructions?


Answer 1:


As already pointed out in the comments: the ifge and ifle instructions carry an additional branch offset as an operand.

The Java Virtual Machine Instruction Set specification for ifge and ifle contains the relevant hint here:

Format

if<cond>
branchbyte1
branchbyte2

This indicates that two additional bytes are associated with this instruction, namely the "branch bytes". These two bytes are combined into a single signed 16-bit (short) value that determines the offset, that is, how far the instruction pointer should "jump", relative to the address of the branch instruction itself, when the condition is satisfied.
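As a minimal sketch of how the two branch bytes turn into a jump target: the class and method names below are made up for illustration, and the byte values 0x00 0x05 are inferred from the javap listing in the question (target 6 minus instruction address 1), not read from an actual class file.

```java
class BranchOffsetDemo {
    // Combines the two branch bytes into a signed 16-bit offset:
    // (branchbyte1 << 8) | branchbyte2, interpreted as a short.
    static int branchOffset(int branchbyte1, int branchbyte2) {
        return (short) (((branchbyte1 & 0xFF) << 8) | (branchbyte2 & 0xFF));
    }

    // The jump target is relative to the address of the branch opcode itself.
    static int branchTarget(int instructionAddress, int branchbyte1, int branchbyte2) {
        return instructionAddress + branchOffset(branchbyte1, branchbyte2);
    }

    public static void main(String[] args) {
        // ifge at address 1 with an assumed encoded offset of 5
        System.out.println(branchTarget(1, 0x00, 0x05)); // prints 6
        // ifle at address 7 with an assumed encoded offset of 5
        System.out.println(branchTarget(7, 0x00, 0x05)); // prints 12
        // A backward jump: 0xFFFB is -5 as a signed 16-bit value
        System.out.println(branchOffset(0xFF, 0xFB));    // prints -5
    }
}
```

The cast to short is what makes the value signed, so backward jumps (as in loops) come out as negative offsets.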


Edit:

The comments made me curious: the offset is defined to be a signed 16-bit value, which limits jumps to the range of -32768 to +32767 bytes. This does not cover the whole range of a possible method, which may contain up to 65535 bytes of code according to the code_length item in the class file.

So I created a test class to see what happens. The class looks like this:

class FarJump
{
    public static void main(String args[])
    {
        call(0, 1);
    }

    public static void call(int x, int y)
    {
        if (x < y)
        {
            y++;
            y++;

            ... (10921 times) ...

            y++;
            y++;
        }
        System.out.println(y);
    }

}

Each of the y++ lines is translated into an iinc instruction, which is 3 bytes long. So the resulting bytecode is

public static void call(int, int);
    Code:
       0: iload_0
       1: iload_1
       2: if_icmpge     32768
       5: iinc          1, 1
       8: iinc          1, 1

       ...(10921 times) ...

    32762: iinc          1, 1
    32765: iinc          1, 1
    32768: getstatic     #3             // Field java/lang/System.out:Ljava/io/PrintStream;
    32771: iload_1
    32772: invokevirtual #4             // Method java/io/PrintStream.println:(I)V
    32775: return

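One can see that the compiler still uses an if_icmpge instruction here. Note that javap prints the absolute target address, 32768. The relative offset actually encoded in the branch bytes is 32766 (the target minus the instruction's own address, 2), which still fits in a signed 16-bit value.

After adding just one more y++ to the original code, the compiled code suddenly changes to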

public static void call(int, int);
    Code:
       0: iload_0
       1: iload_1
       2: if_icmplt     10
       5: goto_w        32781
      10: iinc          1, 1
      13: iinc          1, 1
      ....
    32770: iinc          1, 1
    32773: iinc          1, 1
    32776: goto_w        32781
    32781: getstatic     #3             // Field java/lang/System.out:Ljava/io/PrintStream;
    32784: iload_1
    32785: invokevirtual #4             // Method java/io/PrintStream.println:(I)V
    32788: return

So the compiler reverses the condition from if_icmpge to if_icmplt and handles the far jump with a goto_w instruction, which contains four branch bytes and can thus cover (more than) the full range of a method.
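The tipping point can be checked with a little arithmetic. The following is only a sketch: the class and constant names are hypothetical, and the count of 10921 iinc instructions is the one implied by the addresses in the first listing (the branch sits at address 2, the first iinc at 5, and the target at 32768).

```java
class FarJumpCheck {
    static final int IINC_SIZE = 3;      // opcode + local variable index + constant
    static final int BRANCH_SIZE = 3;    // opcode + two branch bytes
    static final int BRANCH_ADDRESS = 2; // if_icmpge follows iload_0 and iload_1

    // Relative offset from the branch to the first instruction after the iinc run.
    static int relativeOffset(int iincCount) {
        return BRANCH_SIZE + iincCount * IINC_SIZE;
    }

    // True if the offset can be encoded in the two branch bytes of if<cond>.
    static boolean fitsInShort(int offset) {
        return offset >= Short.MIN_VALUE && offset <= Short.MAX_VALUE;
    }

    public static void main(String[] args) {
        int near = relativeOffset(10921);
        System.out.println(near + " -> target " + (BRANCH_ADDRESS + near)
                + ", fits: " + fitsInShort(near)); // 32766 -> target 32768, fits: true

        int far = relativeOffset(10922);
        System.out.println(far + ", fits: " + fitsInShort(far)); // 32769, fits: false
    }
}
```

With one more iinc the required offset becomes 32769, which exceeds Short.MAX_VALUE (32767), and that matches the point at which the listing switches to goto_w.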




Answer 2:


The byte offsets can easily be calculated by summing up the size of each instruction before it. Instruction sizes are documented in the JVM specs.
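To illustrate the summing, here is a small sketch that reproduces the first column of the listing in the question. The class name is made up, and the instruction sizes are entered by hand (1 byte for the simple instructions, 3 bytes for ifge and ifle) rather than looked up programmatically.

```java
class OffsetCalculator {
    // Returns the starting byte offset of each instruction,
    // given the size in bytes of every instruction in order.
    static int[] offsets(int[] sizes) {
        int[] result = new int[sizes.length];
        int pc = 0;
        for (int i = 0; i < sizes.length; i++) {
            result[i] = pc;   // this instruction starts where the previous one ended
            pc += sizes[i];
        }
        return result;
    }

    public static void main(String[] args) {
        String[] names = {"iload_1", "ifge", "iconst_m1", "ireturn", "iload_1",
                          "ifle", "iconst_1", "ireturn", "iconst_0", "ireturn"};
        int[] sizes = {1, 3, 1, 1, 1, 3, 1, 1, 1, 1};

        int[] offs = offsets(sizes);
        for (int i = 0; i < names.length; i++) {
            System.out.println(offs[i] + ": " + names[i]);
        }
    }
}
```

The printed offsets are 0, 1, 4, 5, 6, 7, 10, 11, 12, 13, the same as in the javap output.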

The if<cond> instructions take up more space than the others because, in addition to the single-byte opcode, they have two extra bytes that specify the offset to jump to if the condition is true.

If you want to experiment further, you could for instance try using larger constants (like, say, 20) in your code. You'll see that the instructions to load those will also use up extra bytes to store the constant value. Commonly used small numbers have one-byte encodings (such as iconst_1) for efficiency.
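As a rough sketch of what to expect from that experiment, the following maps an int constant to the instruction that is typically used to load it and to that instruction's size. The class and method names are hypothetical, and the selection reflects the usual encodings in the instruction set (iconst_<i>, bipush, sipush, ldc), not a guarantee about what any particular compiler emits. In particular, a constant-pool load is assumed to use ldc with a one-byte index. The wide form ldc_w would take 3 bytes.

```java
class IntConstantSize {
    // Name of the instruction typically used to push the given int constant.
    static String mnemonic(int value) {
        if (value == -1) return "iconst_m1";
        if (value >= 0 && value <= 5) return "iconst_" + value;
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) return "bipush";
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) return "sipush";
        return "ldc"; // assumption: constant pool index fits in one byte
    }

    // Size in bytes of that instruction, including its operand bytes.
    static int size(int value) {
        String m = mnemonic(value);
        if (m.startsWith("iconst_")) return 1; // opcode only
        if (m.equals("bipush")) return 2;      // opcode + 1-byte value
        if (m.equals("sipush")) return 3;      // opcode + 2-byte value
        return 2;                              // ldc: opcode + 1-byte pool index
    }

    public static void main(String[] args) {
        int[] samples = {1, 20, 1000, 100000};
        for (int v : samples) {
            System.out.println(v + " -> " + mnemonic(v) + " (" + size(v) + " bytes)");
        }
    }
}
```

For the value 20 suggested above, this gives bipush at 2 bytes, compared with 1 byte for iconst_1.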



Source: https://stackoverflow.com/questions/30240139/how-are-these-java-byte-offsets-calculated
