Native code (typically C/C++) invoked from Java code may need to access the contents of Java objects. For example, compression code (zip library) and I/O code operate on byte[] arrays created by the Java code. When native code goes through the contents of a Java array, it is highly desirable from a performance standpoint for this code to read and write bytes directly in the heap, using raw machine addresses. However, that would break a fundamental GC assumption: that at any time, any Java object within the heap can be moved to a different location.
To address this problem, the JVM provides a mechanism that consists of critical sections and a piece of code called GCLocker. To get direct access to the contents of some Java object, native code has to enter a critical section by calling a JNI function such as GetPrimitiveArrayCritical(). When the code is done with the array, it should call ReleasePrimitiveArrayCritical(). While any code is in a critical section, GCLocker blocks some or all GC activity (the exact details depend on the GC type and JDK version).
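To make the mechanism concrete, here is a minimal sketch of a JNI function that reads a byte[] through a critical section. The class and function names (com.example.NativeSum, sumBytes) are hypothetical; GetPrimitiveArrayCritical() and ReleasePrimitiveArrayCritical() are the actual JNI calls:

```c
#include <jni.h>

/* Hypothetical native method: sums the bytes of a Java byte[] without copying it.
 * Between the Get/Release calls the thread is inside a critical section,
 * so GCLocker may hold back GC activity. */
JNIEXPORT jlong JNICALL
Java_com_example_NativeSum_sumBytes(JNIEnv *env, jclass cls, jbyteArray array) {
    jsize len = (*env)->GetArrayLength(env, array);
    jlong sum = 0;

    /* Enter the critical section: obtain a direct pointer into the Java heap
     * (or, in the worst case, a copy of the array). */
    jbyte *data = (jbyte *) (*env)->GetPrimitiveArrayCritical(env, array, NULL);
    if (data == NULL) {
        return 0; /* allocation failed; an exception is already pending */
    }

    for (jsize i = 0; i < len; i++) {
        sum += data[i];
    }

    /* Leave the critical section as soon as possible; JNI_ABORT because
     * nothing was written to the array. */
    (*env)->ReleasePrimitiveArrayCritical(env, array, data, JNI_ABORT);
    return sum;
}
```

The important design point is that the window between the two Critical calls should be as short as possible, since that is the window during which the GC may be held back.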
This mechanism works fine if critical sections are entered infrequently and each takes a very short time. It can, however, become a problem if many threads enter critical sections all the time, so that in the worst case, at any moment some thread is in a critical section. That may “paralyze” the GC for long periods of time, and may ultimately result in an OutOfMemoryError. If heap dump on OOM is enabled, you may observe a strange situation: the dump may be smaller than the maximum heap size (the -Xmx JVM option), and/or contain a lot of garbage. That’s the result of the GC being blocked and thus unable to free up space in memory for a new object. Further technical details about critical sections and GCLocker can be found here.
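As an illustration of that worst case, the sketch below (again with hypothetical names) shows the kind of native code that causes trouble: slow, potentially blocking work is performed while the thread is still inside the critical section, so with enough threads calling it concurrently there is almost always an open critical section somewhere:

```c
#include <jni.h>
#include <unistd.h>

/* Hypothetical anti-pattern: the thread stays inside the critical section
 * while doing slow I/O, keeping GCLocker engaged far longer than necessary. */
JNIEXPORT void JNICALL
Java_com_example_NativeWriter_writeBytes(JNIEnv *env, jclass cls,
                                         jint fd, jbyteArray array) {
    jsize len = (*env)->GetArrayLength(env, array);
    jbyte *data = (jbyte *) (*env)->GetPrimitiveArrayCritical(env, array, NULL);
    if (data == NULL) {
        return;
    }

    /* BAD: a potentially slow system call while holding the critical section.
     * Better: copy the bytes out (e.g. with GetByteArrayRegion), release the
     * critical section, and only then perform the I/O. */
    write(fd, data, (size_t) len);

    (*env)->ReleasePrimitiveArrayCritical(env, array, data, JNI_ABORT);
}
```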
Even when the GC is not completely blocked, the situation may get dire, at least with G1 GC. For some non-trivial reasons, when all the critical sections are finished and GCLocker initiates the overdue GC, it may invoke only a Young GC. That happens even when there are very few Young regions available for collection and most of the objects have already been promoted into the Old Gen, so that only a mixed (if possible) or Full GC has a chance to clean up any significant amount of memory. Nevertheless, the JVM may stubbornly repeat Young GCs and eventually throw an OOM. This is described well in this article, along with one possible remedy.
Thus, if after an OutOfMemoryError you see a strange heap dump with a lot of garbage, you can do the following to check whether the problem is due to GCLocker:
- Check gc.log for lines like the following:
[2024-04-24T21:03:59.271+0000][94283.865s][info ][gc ] GC(9242) Pause Young (Normal) (GCLocker Initiated GC) 175627M->173468M(177152M) 115.823ms
If there are many such lines, it means that the GC is often initiated by GCLocker as a “catch-up” attempt after some critical section has finished.
- Check the thread stack traces in the heap dump for threads that are running native code, such as zip/unzip.
If it looks like the problem is indeed due to native code running in critical sections, you can try the following:
- If your app runs on Java 8, switch to JDK 11 or 17 if possible. The implementation of G1 in newer JDKs is much more GCLocker-friendly, as we observed firsthand when we switched brooklin to it. After the switch, faster GC resulted in a nearly 40% throughput improvement, and OOMs disappeared.
- Consider avoiding code that uses critical sections – typically by disabling zip compression, switching to an implementation that uses Java rather than native code, or reducing the number of concurrent threads running native code.
- If you are in a situation where an OOM is thrown instead of the JVM performing a Full GC, you may try the remedy suggested in the article mentioned above, which consists of the following two JVM options:
-XX:+UnlockDiagnosticVMOptions -XX:GCLockerRetryAllocationCount=100
In addition, it is recommended to add the -XX:-G1UsePreventiveGC JVM option, which disables the so-called “preventive GCs”. Preventive GCs were a failed attempt to make GC pauses shorter at the possible expense of making them more frequent, but in practice they may block Full GCs, which in turn may be needed in the above situation.
- Increasing the heap might help to a certain extent, by giving the GC some “headroom” (while code runs in a critical section, objects can still be allocated – they just can’t be moved). However, it may not be wise to rely on this method in the long term, since an increased or changed workload can easily break it.
- Recent versions of some GCs support “pinning” the memory used in critical sections, which means that all other objects can still be moved around freely. That’s how this problem should have been addressed from the beginning. Pinning is implemented in Shenandoah GC since Java 11, in ZGC since Java 17, and in G1 GC since Java 22.