Question
I have the following Java code:
public int sign(int a) {
    if (a < 0) return -1;
    else if (a > 0) return 1;
    else return 0;
}
which when compiled generated the following bytecode:
public int sign(int);
Code:
0: iload_1
1: ifge 6
4: iconst_m1
5: ireturn
6: iload_1
7: ifle 12
10: iconst_1
11: ireturn
12: iconst_0
13: ireturn
I want to know how the byte offset count (the first column) is calculated. In particular, why do the ifge and ifle instructions take 3 bytes when all the other instructions are single-byte instructions?
Answer 1:
As already pointed out in the comment: the ifge and ifle instructions carry an additional offset.
The Java Virtual Machine Instruction Set specification for ifge and ifle contains the relevant hint here:
Format
if<cond> branchbyte1 branchbyte2
This indicates that there are two additional bytes associated with this instruction, namely the "branch bytes". These bytes are combined into a single signed short value that determines the offset, namely how far the instruction pointer should "jump" when the condition is satisfied.
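To make the combination of the two branch bytes concrete, here is a minimal sketch in Java. The class and method names (BranchOffsets, decode, target) are made up for illustration, and the byte values are an assumption derived from the listing in the question: for "1: ifge 6", the target 6 implies branch bytes 0x00 0x05, since the offset is relative to the address of the branch opcode.

```java
// Illustrative sketch: these names are hypothetical, not part of any JDK API.
final class BranchOffsets {
    // Combine the two unsigned branch bytes into one signed 16-bit offset:
    // (branchbyte1 << 8) | branchbyte2, interpreted as a short.
    static int decode(int branchbyte1, int branchbyte2) {
        return (short) ((branchbyte1 << 8) | branchbyte2);
    }

    // The jump target is relative to the address of the branch opcode itself.
    static int target(int opcodeAddress, int branchbyte1, int branchbyte2) {
        return opcodeAddress + decode(branchbyte1, branchbyte2);
    }

    public static void main(String[] args) {
        // "1: ifge 6" from the question, assuming branch bytes 0x00 0x05.
        System.out.println(target(1, 0x00, 0x05));
        // A backward jump: 0xFF 0xFB is a negative offset.
        System.out.println(decode(0xFF, 0xFB));
    }
}
```

The cast to short is what makes the value signed, so the same two bytes can describe both forward and backward jumps.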
Edit:
The comments made me curious: the offset is defined to be a signed 16-bit value, limiting jumps to a range of roughly +/- 32k. This does not cover the whole range of a possible method, which may contain up to 65535 bytes according to the code_length item in the class file.
So I created a test class, to see what happens. This class looks like this:
class FarJump
{
    public static void main(String args[])
    {
        call(0, 1);
    }
    public static void call(int x, int y)
    {
        if (x < y)
        {
            y++;
            y++;
            ... (10921 times) ...
            y++;
            y++;
        }
        System.out.println(y);
    }
}
Each of the y++ lines is translated into an iinc instruction, which occupies 3 bytes. So the resulting bytecode is
public static void call(int, int);
Code:
0: iload_0
1: iload_1
2: if_icmpge 32768
5: iinc 1, 1
8: iinc 1, 1
...(10921 times) ...
32762: iinc 1, 1
32765: iinc 1, 1
32768: getstatic #3 // Field java/lang/System.out:Ljava/io/PrintStream;
32771: iload_1
32772: invokevirtual #4 // Method java/io/PrintStream.println:(I)V
32775: return
One can see that it still uses an if_icmpge instruction, with a target of 32768. (Edit: the value shown is the absolute address. The relative offset stored in the branch bytes is 32766.)
Adding a single additional y++ to the original code suddenly changes the compiled code to
public static void call(int, int);
Code:
0: iload_0
1: iload_1
2: if_icmplt 10
5: goto_w 32781
10: iinc 1, 1
13: iinc 1, 1
....
32770: iinc 1, 1
32773: iinc 1, 1
32776: goto_w 32781
32781: getstatic #3 // Field java/lang/System.out:Ljava/io/PrintStream;
32784: iload_1
32785: invokevirtual #4 // Method java/io/PrintStream.println:(I)V
32788: return
So the compiler reverses the condition from if_icmpge to if_icmplt and handles the far jump with a goto_w instruction, which contains four branch bytes and can thus cover (more than) a full method range.
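The tipping point can be checked with a little arithmetic. The sketch below uses hypothetical names and assumes the layout of the first listing: the conditional branch sits at address 2, the first iinc at address 5, and each iinc takes 3 bytes. The count 10921 is inferred from the listing's target of 32768. The second case only asks what offset a plain conditional branch would need with one more increment; it is not the layout javac actually emits.

```java
// Illustrative sketch: class and method names are hypothetical.
final class FarJumpCheck {
    static final int BRANCH_ADDRESS = 2; // address of if_icmpge in the listing
    static final int FIRST_IINC = 5;     // address of the first iinc
    static final int IINC_SIZE = 3;      // opcode + local index + constant

    // Relative offset a conditional branch would need to skip n iinc instructions.
    static int relativeOffset(int n) {
        int target = FIRST_IINC + IINC_SIZE * n;
        return target - BRANCH_ADDRESS;
    }

    // True if the offset fits into the two signed branch bytes.
    static boolean fitsInShort(int offset) {
        return offset >= Short.MIN_VALUE && offset <= Short.MAX_VALUE;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10921, 10922}) {
            int offset = relativeOffset(n);
            System.out.println(n + " increments: offset " + offset
                    + ", fits in 16 bits: " + fitsInShort(offset));
        }
    }
}
```

With 10921 increments the required offset is 32766, which still fits below the maximum of 32767. With 10922 it would be 32769, which no longer fits, which is consistent with the switch to goto_w.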
Answer 2:
The byte offsets can easily be calculated by summing up the size of each instruction before it. Instruction sizes are documented in the JVM specs.
The if<cond> instructions take up more space than the others because, in addition to the single-byte opcode, they have two extra bytes that specify the offset to jump to if the condition is true.
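As a rough illustration of that summing, the following Java sketch recomputes the first column of the question's listing. The class and method names are hypothetical; the sizes are 1 byte for each plain instruction and 3 bytes for ifge and ifle.

```java
// Illustrative sketch: class and method names are hypothetical.
final class OffsetTable {
    // Each instruction's offset is the running sum of the sizes before it.
    static int[] offsets(int[] sizes) {
        int[] result = new int[sizes.length];
        int address = 0;
        for (int i = 0; i < sizes.length; i++) {
            result[i] = address;
            address += sizes[i];
        }
        return result;
    }

    public static void main(String[] args) {
        String[] names = {"iload_1", "ifge", "iconst_m1", "ireturn", "iload_1",
                          "ifle", "iconst_1", "ireturn", "iconst_0", "ireturn"};
        int[] sizes = {1, 3, 1, 1, 1, 3, 1, 1, 1, 1};
        int[] offs = offsets(sizes);
        for (int i = 0; i < names.length; i++) {
            System.out.println(offs[i] + ": " + names[i]);
        }
    }
}
```

The printed offsets should match the first column of the javap output in the question: 0, 1, 4, 5, 6, 7, 10, 11, 12, 13.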
If you want to experiment further, you could for instance try using larger constants (like, say, 20) in your code. You'll see that the instructions to load those will also use up extra bytes to store the constant value. Commonly used small numbers have one-byte encodings (such as iconst_1) for efficiency.
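A small sketch of how the size grows with the constant, under the assumption that the compiler picks the shortest encoding the JVM instruction set offers for an int (which is what javac typically does). The class and method names are hypothetical.

```java
// Illustrative sketch: class and method names are hypothetical.
final class IntConstantSize {
    // Size in bytes of the shortest instruction that pushes the given int constant.
    static int pushSize(int value) {
        if (value >= -1 && value <= 5) {
            return 1; // iconst_m1 .. iconst_5: opcode only
        }
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            return 2; // bipush: opcode + 1 byte
        }
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return 3; // sipush: opcode + 2 bytes
        }
        // ldc: opcode + 1-byte constant pool index
        // (ldc_w would take 3 bytes if the index does not fit in one byte).
        return 2;
    }

    public static void main(String[] args) {
        for (int value : new int[] {1, 20, 1000, 100000}) {
            System.out.println(value + " -> " + pushSize(value) + " byte(s)");
        }
    }
}
```

So replacing the constant 1 in the question's method with 20 would turn the one-byte iconst_1 into a two-byte bipush 20, shifting every later offset by one.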
Source: https://stackoverflow.com/questions/30240139/how-are-these-java-byte-offsets-calculated