The code shown in the question is not a good performance measuring code:
The compiler might choose to optimize your code by reordering statements. Yes, it can do that. That means your entire test might fail. It can even choose to inline the method under test and reorder the measuring statements into the now-inlined code.
The hotspot might choose to reorder your statements, inline code, cache results, delay execution...
Even assuming the compiler/hotspot didn't trick you, what you measure is "wall time". What you should be measuring is CPU time (unless you use OS resources and want to include these as well or you measure lock contestation in a multi-threaded environment).
The solution? Use a real profiler. There are plenty around, both free profilers and demos / time-locked trials of commercials strength ones.