Why does the JVM require warmup?


Question


I understand that in the Java virtual machine (JVM), warm-up is potentially required because Java loads classes lazily, and as such you want to ensure that objects are initialized before you start the main transactions. I am a C++ developer and have not had to deal with similar requirements.

However, the parts I am not able to understand are the following:

  1. Which parts of the code should you warm up?
  2. Even if I warm up some parts of the code, how long does it remain warm (assuming this term only means how long your class objects remain in-memory)?
  3. How does it help if I have objects which need to be created each time I receive an event?

Consider, for example, an application that is expected to receive messages over a socket, where a transaction could be New Order, Modify Order, Cancel Order, or a transaction confirmation.

Note that the application is about High Frequency Trading (HFT) so performance is of extreme importance.


Answer 1:


Which parts of the code should you warm up?

Usually, you don't have to do anything. However, for a low-latency application you should warm up the critical path in your system. You should have unit tests, so I suggest you run those on start-up to warm up the code.

Even once your code is warmed up, you have to ensure your CPU caches stay warm as well. You can see a significant slow-down in performance for up to 50 microseconds after a blocking operation, e.g. network IO. Usually this is not a problem, but if you are trying to stay under, say, 50 microseconds most of the time, it will be a problem most of the time.

Note: warm-up can allow escape analysis to kick in and place some objects on the stack, which means such objects don't need to be optimised away by hand. It is better to memory-profile your application before optimising your code.

Even if I warm up some parts of the code, how long does it remain warm (assuming this term only means how long your class objects remain in-memory)?

There is no time limit. It depends on whether the JIT detects that an assumption it made when optimising the code turned out to be incorrect.

How does it help if I have objects which need to be created each time I receive an event?

If you want low latency, or high performance, you should create as few objects as possible. I aim to produce less than 300 KB/s of garbage. With this allocation rate you can have an Eden space large enough to minor-collect once a day.
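
As a rough worked example with that figure: 300 KB/s of garbage works out to about 300 KB × 86,400 s ≈ 25 GB per day, so an Eden space on the order of 24 GB or more can run for roughly a full day between minor collections. The exact sizing depends on your heap configuration and message rate.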

Consider, for example, an application that is expected to receive messages over a socket, where a transaction could be New Order, Modify Order, Cancel Order, or a transaction confirmation.

I suggest you re-use objects as much as possible, though if you are under your allocation budget it may not be worth worrying about.
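
As a sketch of what such re-use can look like (the OrderEvent/OrderHandler names and the wire format below are made up for illustration, not taken from any particular library): the handler keeps one mutable instance per thread and overwrites its fields for each incoming message instead of allocating a new object per event.

    // Hypothetical example of re-using a mutable event object instead of
    // allocating a new one per message; names and layout are illustrative only.
    final class OrderEvent {
        long orderId;
        long price;      // fixed-point price, avoids per-message BigDecimal allocation
        int quantity;
        char type;       // 'N'ew, 'M'odify, 'C'ancel
    }

    final class OrderHandler {
        // One instance re-used for every message on this (single) handler thread.
        private final OrderEvent reusable = new OrderEvent();

        void onMessage(java.nio.ByteBuffer msg) {
            reusable.type = (char) msg.get();
            reusable.orderId = msg.getLong();
            reusable.price = msg.getLong();
            reusable.quantity = msg.getInt();
            process(reusable);           // no per-message allocation on this path
        }

        private void process(OrderEvent e) { /* business logic */ }
    }

This only pays off if the handler is single-threaded or the instance is otherwise thread-confined; shared mutable state across threads would turn the optimisation into a bug.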

Note that the application is about High Frequency Trading (HFT) so performance is of extreme importance.

You might be interested in our open source software which is used for HFT systems at different Investment Banks and Hedge Funds.

http://chronicle.software/

My production application is used for high-frequency trading and every bit of latency can be an issue. It is fairly clear that if you don't warm up your application at start-up, the first transactions will see high latencies of a few milliseconds.

In particular you might be interested in https://github.com/OpenHFT/Java-Thread-Affinity as this library can help reduce scheduling jitter in your critical threads.

And also it is said that the critical sections of code which require warm-up should be run (with fake messages) at least 12K times for them to work in an optimised manner. Why and how does it work?

Code is compiled using background thread(s). This means that even though a method might be eligible for compiling to native code, it doesn't mean it has been compiled yet, especially on start-up when the compiler is pretty busy already. 12K iterations is not unreasonable, but it could be higher.
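
A minimal start-up warm-up sketch along those lines, re-using the hypothetical OrderHandler from the earlier snippet (the iteration count and message builder are assumptions, not fixed rules): push synthetic messages that resemble real traffic through the same code path live traffic will use, for more iterations than the compile threshold, and run with -XX:+PrintCompilation if you want to confirm that the hot methods were actually compiled before accepting real orders.

    // Warm-up sketch; OrderHandler is the hypothetical handler from the earlier
    // example and buildFakeMessage() produces synthetic but realistic messages.
    // Run the JVM with -XX:+PrintCompilation to watch methods being compiled.
    final class Warmup {
        static final int WARMUP_ITERATIONS = 20_000;  // comfortably above typical compile thresholds

        static void warmUp(OrderHandler handler) {
            for (int i = 0; i < WARMUP_ITERATIONS; i++) {
                handler.onMessage(buildFakeMessage(i));
            }
            // Discard or reset any state the fake traffic created before going live.
        }

        static java.nio.ByteBuffer buildFakeMessage(int i) {
            java.nio.ByteBuffer buf = java.nio.ByteBuffer.allocate(21);
            buf.put((byte) 'N').putLong(i).putLong(100_00L + i).putInt(10);
            buf.flip();
            return buf;
        }
    }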




Answer 2:


Warming refers to having a piece of code run enough times that the JVM stops interpreting it and compiles it to native code (at least for the first time). Generally that's not something you want to do yourself. The reason is that the JVM gathers statistics about the code in question that it uses during code generation (akin to profile-guided optimization). So if a chunk of code is "warmed" with fake data that has different properties than the real data, you could well be hurting performance.

EDIT: Since the JVM cannot perform whole-program static analysis (it can't know what code is going to be loaded by the application), it can instead make some guesses about types from the statistics it has gathered. As an example, when calling a virtual function (in C++ speak) at a particular call site, if it determines that all receivers have the same implementation, then the call is promoted to a direct call (or even inlined). If later that assumption is proven to be wrong, then the old code must be "uncompiled" (deoptimized) to behave properly. AFAIK HotSpot classifies call sites as monomorphic (single implementation), bimorphic (exactly two, transformed into if (type is impl1) { impl1 } else { impl2 }) and megamorphic (full virtual dispatch).
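
A hedged illustration of that call-site classification (the Pricer types below are invented): while the hot loop only ever sees one implementation, HotSpot can devirtualize and possibly inline the call; the moment a second implementation reaches the same call site, the optimistic code has to be deoptimized and recompiled.

    // Illustration only: a call site going from monomorphic to bimorphic.
    interface Pricer {
        long price(long qty);
    }

    final class LimitPricer implements Pricer {
        public long price(long qty) { return qty * 100; }
    }

    final class MarketPricer implements Pricer {
        public long price(long qty) { return qty * 101; }
    }

    class CallSiteDemo {
        static long total;

        // While only LimitPricer instances reach this call site, the JIT may
        // treat p.price(...) as a direct (or inlined) call to LimitPricer.price.
        static void handle(Pricer p, long qty) {
            total += p.price(qty);
        }

        public static void main(String[] args) {
            Pricer a = new LimitPricer();
            for (int i = 0; i < 100_000; i++) handle(a, i);   // monomorphic so far

            handle(new MarketPricer(), 1);  // second type observed: the optimistic compile may be deoptimized
            System.out.println(total);
        }
    }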

And there's another case in which recompiling occurs: when you have tiered compilation. The first tier spends less time trying to produce good code, and if the method turns out to be "hot" enough, the more expensive (but better-optimizing) code generator kicks in.




Answer 3:


Warm-up is rarely required. It's relevant when doing, for example, performance tests, to make sure that the JIT warm-up time doesn't skew the results.
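
A small sketch of why that matters in a hand-rolled performance test (sum() is just a stand-in workload): the first timed call includes interpretation and JIT compilation, while later calls measure compiled code, so warm-up iterations are discarded (or a benchmark harness that does this for you is used) to keep the numbers honest.

    // Naive micro-benchmark sketch: the cold measurement includes interpreter
    // and JIT-compilation time; the warm measurement mostly does not.
    class WarmupDemo {
        static long sum(int n) {                 // stand-in workload
            long s = 0;
            for (int i = 0; i < n; i++) s += i;
            return s;
        }

        static long timeOnce(int n) {
            long t0 = System.nanoTime();
            long result = sum(n);
            long t1 = System.nanoTime();
            if (result == 42) System.out.println();  // keep the result "used"
            return t1 - t0;
        }

        public static void main(String[] args) {
            System.out.println("cold run: " + timeOnce(1_000_000) + " ns");
            for (int i = 0; i < 20_000; i++) sum(1_000);     // warm-up, results discarded
            System.out.println("warm run: " + timeOnce(1_000_000) + " ns");
        }
    }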

In normal production code you rarely see code that's meant for warm-up. The JIT will warm up during normal processing, so there's very little advantage in introducing additional code just for that. In the worst case you might be introducing bugs, spending extra development time, and even harming performance.

Unless you know for certain that you need some kind of warm-up, don't worry about it. The example application you described certainly doesn't need it.




Answer 4:


Why does the JVM require warmup?

Modern (J)VMs gather statistics at runtime about which code is used most often and how it is used. One example (of hundreds if not thousands) is the optimization of calls to virtual functions (in C++ lingo) that have only one implementation. Those statistics can, by their very nature, only be gathered at run time.

Class loading itself is part of the warm-up as well, but it obviously happens automatically before the execution of code inside those classes, so there is not much to worry about.

Which parts of the code should you warmup?

The part that is crucial for the performance of your application. The important part is to "warm it up" just the same way as it is used during normal usage, otherwise the wrong optimizations will be done (and undone later on).

Even if I warmup some parts of the code, how long does it remain warm (assuming this term only means how long your class objects remain in-memory)?

This is really hard to say; basically the JIT compiler constantly monitors execution and performance. If some threshold is reached it will try to optimize things. It will then continue to monitor performance to verify that the optimization actually helps. If not, it might deoptimize the code. Things might also happen that invalidate optimizations, like the loading of new classes. I'd consider those things unpredictable, at least from a Stack Overflow answer, but there are tools that tell you what the JIT is doing: https://github.com/AdoptOpenJDK/jitwatch

How does it help if I have objects which need to be created each time I receive an event?

One simple example could be: you create objects inside a method, and because a reference leaves the scope of the method, those objects get stored on the heap and are eventually collected by the garbage collector. If the code using those objects is heavily used, it might end up getting inlined into a single big method, possibly reordered beyond recognition, until these objects only live inside this method. At that point they can be put on the stack and removed when the method exits. This can save huge amounts of garbage collection, and it only happens after some warm-up.
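
A hedged sketch of the kind of code this applies to (Vec is an invented throwaway type): once distanceSquared is hot, the two Vec objects never escape the method, so escape analysis can scalar-replace them and no heap allocation or garbage-collection work remains on that path.

    // Illustration: short-lived objects that never escape the method.
    // After JIT warm-up, escape analysis may eliminate these allocations entirely.
    final class Vec {
        final double x, y;
        Vec(double x, double y) { this.x = x; this.y = y; }
    }

    class EscapeDemo {
        static double distanceSquared(double ax, double ay, double bx, double by) {
            Vec a = new Vec(ax, ay);   // candidates for scalar replacement:
            Vec b = new Vec(bx, by);   // neither reference leaves this method
            double dx = a.x - b.x, dy = a.y - b.y;
            return dx * dx + dy * dy;
        }

        public static void main(String[] args) {
            double acc = 0;
            for (int i = 0; i < 1_000_000; i++) {
                acc += distanceSquared(i, i + 1, i + 2, i + 3);  // warm the method
            }
            System.out.println(acc);
        }
    }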

With all that said: I'm skeptical of the notion that one needs to do anything special for warming up. Just start your application and use it, and the JIT compiler will do its thing just fine. If you experience problems, then learn what the JIT does with your application and how to fine-tune that behaviour, or how to write your application so that it benefits the most.

The only case where I actually know of a need for warm-up is benchmarks, because if you neglect it there you are almost guaranteed to get bogus results.




Answer 5:


Which parts of the code should you warmup?

There is no answer to this question in general. It depends entirely on your application.

Even if I warmup some parts of the code, how long does it remain warm (assuming this term only means how long your class objects remain in-memory)?

Objects remain in memory for as long as your program has a reference to them, absent any special weak-reference use or something similar. Learning about when your program "has a reference" to something can be a little more obscure than you might think at first glance, but it is the basis for memory management in Java and worth the effort.

How does it help if I have objects which need to be created each time I receive an event?

This is entirely dependent on the application. There is no answer in general.

I encourage you to study and work with Java to understand things like classloading, memory management, and performance monitoring. It takes some amount of time to instantiate an object, in general it takes more time to load a class (which, of course, is usually done far less often). Usually, once a class is loaded, it stays in memory for the life of the program -- this is the sort of thing that you should understand, not just get an answer to.

There are also techniques to learn if you don't know them already. Some programs use "pools" of objects, instantiated before they're actually needed, then handed off to do processing once the need arises. This allows a time-critical portion of the program to avoid the time spent instantiating during the time-critical period. The pools maintain a collection of objects (10? 100? 1000? 10000?), and instantiate more if needed, etc. But managing the pools is a significant programming effort, and, of course, you occupy memory with the objects in the pools.
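
A minimal pool sketch under those caveats (the Message type, sizes and growth policy are arbitrary assumptions): objects are pre-instantiated up front, borrowed on the hot path and returned afterwards; the "significant programming effort" mentioned above is exactly this borrow/release bookkeeping plus the thread-safety and exhaustion policies omitted here.

    import java.util.ArrayDeque;

    // Minimal, single-threaded object-pool sketch; a real pool also needs a
    // policy for exhaustion and for thread safety, both omitted here.
    final class MessagePool {
        static final class Message {
            final byte[] payload = new byte[256];
            int length;
        }

        private final ArrayDeque<Message> free = new ArrayDeque<>();

        MessagePool(int size) {
            for (int i = 0; i < size; i++) free.push(new Message());  // pre-instantiate
        }

        Message borrow() {
            Message m = free.poll();
            return (m != null) ? m : new Message();   // grow lazily if exhausted
        }

        void release(Message m) {
            m.length = 0;     // reset before returning to the pool
            free.push(m);
        }
    }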

It would be entirely possible to use up enough memory to trigger garbage collection much more often, and SLOW THE SYSTEM YOU WERE INTENDING TO SPEED UP. This is why you need to understand how it works, not just "get an answer".

Another consideration -- by far most of the effort put into making programs faster is wasted, as in not needed. Without extensive experience with the application being considered, and/or measurement of the system, you simply do not know where (or whether) optimization will even be noticeable. System/program design that avoids pathological cases of slowness IS useful, and doesn't take nearly the time and effort of 'optimization'. Most of the time that is all any of us need.

-- edit -- add just-in-time compilation to the list of things to study and understand.




Answer 6:


It is all about the JIT compiler, which the JVM uses to optimize bytecode at runtime (because javac can't apply advanced or aggressive optimization techniques, due to the platform-independent nature of the bytecode).

  1. You can warm up the code that will process your messages. Actually, in most cases you don't need to do it with special warm-up cycles: just let the application start and process some of the first messages, and the JVM will try its best to analyse code execution and make optimizations :) Manual warm-up with fake samples can yield even worse results.

  2. The code will be optimized after some amount of time and stays optimized until some event in the program flow degrades the compiled state (after which the JIT compiler will try to optimize the code again; this process never ends).

  3. Short-lived objects are subject to optimization too, but generally warm-up helps your long-lived (tenured) message-processing code become more efficient.




Answer 7:


I always pictured it like the following:

As a C++ developer, you could imagine an automated iterative approach by which the JVM compiles/hot-swaps/replaces various bits and pieces with (the imaginary analog of) gcc -O0, -O1, -O2, -O3 variants (and sometimes reverts them if it deems it necessary).

I'm sure this is not strictly what happens, but it might be a useful analogy for a C++ dev.

On a standard JVM, the number of times a snippet must run before it is considered for JIT compilation is set by -XX:CompileThreshold, which is 1,500 by default for the client compiler (the server compiler defaults to 10,000; sources and JVM versions vary, but I think that's for JVM 8).

Further, a book I have at hand states under its Host Performance JIT chapter (p. 59) that the following optimizations are done during JIT compilation (a small lock-elimination sketch follows the list):

  • Inlining
  • Lock elimination
  • Virtual call elimination
  • Non-volatile memory write elimination
  • Native code generation
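
Of those, lock elimination is the easiest to sketch (the class below is purely illustrative): because the monitor object never escapes the method, the JIT can prove that no other thread can ever contend on it and, once the method is compiled, drop the synchronization entirely.

    // Illustration of lock elimination: the monitor object is method-local and
    // never escapes, so a compiled version of bump() can drop the lock.
    class LockElisionDemo {
        static int counter;

        static void bump() {
            Object lock = new Object();     // never visible to another thread
            synchronized (lock) {           // eligible for lock elimination
                counter++;
            }
        }

        public static void main(String[] args) {
            for (int i = 0; i < 1_000_000; i++) bump();  // warm it up
            System.out.println(counter);
        }
    }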

EDIT:

regarding comments

I think 1500 may be just enough to hint to the JIT that it should compile the code to native and stop interpreting. Would you agree?

I don't know if it's just a hint, but since OpenJDK is open source, let's look at the various limits and numbers in globals.hpp#l3559@ver-a801bc33b08c (for jdk8u).

(I'm not a JVM dev; this might be the completely wrong place to look.)

Compiling a code into native does not necessarily mean it is also optimized.

To my understanding, true; especially if you mean -Xcomp (force compile). This blog even states that it prevents the JVM from doing any profiling (and hence optimizing) if you do not run -Xmixed (the default).

So a timer kicks in to sample frequently accessed native code and optimize the same. Do you know how we can control this timer interval?

I really don't know the details, but the globals.hpp I linked indeed defines some frequency intervals.



Source: https://stackoverflow.com/questions/36198278/why-does-the-jvm-require-warmup
