Java using much more memory than heap size (or size correctly Docker memory limit)
Virtual memory used by a Java process extends far beyond just the Java Heap. The JVM includes many subsystems: Garbage Collector, Class Loading, JIT compilers etc., and all of these subsystems require a certain amount of RAM to function.
The JVM is not the only consumer of RAM. Native libraries (including the standard Java Class Library) may also allocate native memory, and this won't even be visible to Native Memory Tracking. The Java application itself can also use off-heap memory by means of direct ByteBuffers.
So what takes memory in a Java process?
JVM parts (mostly shown by Native Memory Tracking)
- Java Heap
The most obvious part. This is where Java objects live. The Heap takes up to -Xmx amount of memory.
- Garbage Collector
GC structures and algorithms require additional memory for heap management. These structures include the Mark Bitmap, Mark Stack (for traversing the object graph), Remembered Sets (for recording inter-region references) and others. Some of them are directly tunable, e.g. -XX:MarkStackSizeMax, while others depend on the heap layout: e.g. the larger the G1 regions (-XX:G1HeapRegionSize), the smaller the remembered sets.
GC memory overhead varies between GC algorithms. -XX:+UseSerialGC and -XX:+UseShenandoahGC have the smallest overhead. G1 or CMS may easily use around 10% of the total heap size.
- Code Cache
Contains dynamically generated code: JIT-compiled methods, the interpreter and run-time stubs. Its size is limited by -XX:ReservedCodeCacheSize (240M by default). Turn off Tiered Compilation (-XX:-TieredCompilation) to reduce the amount of compiled code and thus the Code Cache usage.
- Compiler
The JIT compiler itself also requires memory to do its job. This can again be reduced by switching off Tiered Compilation or by reducing the number of compiler threads: -XX:CICompilerCount.
- Class loading
Class metadata (method bytecodes, symbols, constant pools, annotations etc.) is stored in an off-heap area called Metaspace. The more classes are loaded, the more Metaspace is used. Total usage can be limited by -XX:MaxMetaspaceSize (unlimited by default) and -XX:CompressedClassSpaceSize (1G by default).
- Symbol tables
Two main hashtables of the JVM: the Symbol table contains names, signatures, identifiers etc., and the String table contains references to interned strings. If Native Memory Tracking indicates significant memory usage by the String table, it probably means the application excessively calls String.intern.
- Threads
Thread stacks also take RAM. The stack size is controlled by -Xss. The default is 1M per thread, but fortunately things are not so bad: the OS allocates memory pages lazily, i.e. on first use, so the actual memory usage is much lower (typically 80-200 KB per thread stack). I wrote a script to estimate how much of RSS belongs to Java thread stacks.
There are other JVM parts that allocate native memory, but they do not usually play a big role in total memory consumption.
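Many of the areas above can be observed at run time through the standard java.lang.management API. Here is a minimal sketch (not from the original answer; exact pool names vary between JDK versions and GC choices) that enumerates the JVM's memory pools:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class PoolsDemo {
    public static void main(String[] args) {
        // Lists both heap pools (Eden, Old Gen, ...) and non-heap pools
        // (Metaspace, Compressed Class Space, CodeHeap segments, ...)
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.printf("%-35s %-8s %8d KB used%n",
                    pool.getName(), pool.getType(), pool.getUsage().getUsed() / 1024);
        }
    }
}
```

Note this only covers memory the JVM itself accounts for; native library and malloc usage discussed below will not appear here.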
Direct buffers
An application may explicitly request off-heap memory by calling ByteBuffer.allocateDirect. The default off-heap limit is equal to -Xmx, but it can be overridden with -XX:MaxDirectMemorySize. Direct ByteBuffers are included in the Other section of NMT output (or Internal before JDK 11).
The amount of used direct memory is visible through JMX, e.g. in JConsole or Java Mission Control:
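The same numbers are also accessible programmatically via BufferPoolMXBean. A minimal sketch (the 16 MB allocation is just an illustration):

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

public class DirectMemoryDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(16 * 1024 * 1024); // 16 MB off-heap
        // The "direct" pool reflects allocateDirect usage; "mapped" covers MappedByteBuffers
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.println(pool.getName() + ": " + pool.getMemoryUsed()
                    + " bytes in " + pool.getCount() + " buffer(s)");
        }
        buf.put(0, (byte) 1); // keep the buffer reachable until after the printout
    }
}
```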
Besides direct ByteBuffers there can be MappedByteBuffers - files mapped into the virtual memory of a process. NMT does not track them; however, MappedByteBuffers can also take physical memory, and there is no simple way to limit how much they can take. You can see the actual usage by looking at the process memory map: pmap -x <pid>
Address Kbytes RSS Dirty Mode Mapping
...
00007f2b3e557000 39592 32956 0 r--s- some-file-17405-Index.db
00007f2b40c01000 39600 33092 0 r--s- some-file-17404-Index.db
^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^
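For reference, such a mapping is created like this (a minimal sketch; the temp file and the 64 MB size are arbitrary). Pages become resident, and thus counted in RSS, only once they are touched:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedDemo {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-demo", ".bin");
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Maps 64 MB of the file into the process' virtual memory.
            // NMT does not count this; pmap and RSS will, once pages are touched.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE,
                    0, 64 * 1024 * 1024);
            buf.put(0, (byte) 42); // touching a page makes it resident
            System.out.println(buf.get(0));
        }
        Files.deleteIfExists(file);
    }
}
```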
Native libraries
JNI code loaded by System.loadLibrary can allocate as much off-heap memory as it wants, with no control from the JVM side. This also concerns the standard Java Class Library. In particular, unclosed Java resources may become a source of a native memory leak. Typical examples are ZipInputStream or DirectoryStream.
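For illustration, closing such resources promptly with try-with-resources releases their native memory deterministically instead of waiting for GC/finalization (a minimal sketch; the file name is arbitrary):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipDemo {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("zip-demo", ".zip");
        // The Deflater/Inflater inside these streams hold native memory;
        // try-with-resources frees it as soon as the block exits.
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(file))) {
            out.putNextEntry(new ZipEntry("entry.txt"));
            out.write("hello".getBytes());
            out.closeEntry();
        }
        try (ZipInputStream in = new ZipInputStream(Files.newInputStream(file))) {
            ZipEntry e = in.getNextEntry();
            System.out.println(e.getName());
        }
        Files.deleteIfExists(file);
    }
}
```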
JVMTI agents, in particular the jdwp debugging agent, can also cause excessive memory consumption.
This answer describes how to profile native memory allocations with async-profiler.
Allocator issues
A process typically requests native memory either directly from the OS (with the mmap system call) or by using malloc - the standard libc allocator. In turn, malloc requests big chunks of memory from the OS using mmap, and then manages these chunks according to its own allocation algorithm. The problem is that this algorithm can lead to fragmentation and excessive virtual memory usage.
jemalloc, an alternative allocator, often appears smarter than the regular libc malloc, so switching to jemalloc may result in a smaller footprint for free.
Conclusion
There is no guaranteed way to estimate full memory usage of a Java process, because there are too many factors to consider.
Total memory = Heap + Code Cache + Metaspace + Symbol tables +
Other JVM structures + Thread stacks +
Direct buffers + Mapped files +
Native Libraries + Malloc overhead + ...
It is possible to shrink or limit certain memory areas (like the Code Cache) with JVM flags, but many others are entirely out of the JVM's control.
One possible approach to setting Docker limits would be to watch the actual memory usage in a "normal" state of the process. There are tools and techniques for investigating issues with Java memory consumption: Native Memory Tracking, pmap, jemalloc, async-profiler.
Update
Here is a recording of my presentation Memory Footprint of a Java Process.
In this video, I discuss what may consume memory in a Java process, how to monitor and restrain the size of certain memory areas, and how to profile native memory leaks in a Java application.
Java memory settings in a docker container
whether, in general terms, setting the -Xmx setting is too high for a container whose memory limit is 20GB
It depends. An application can use less RAM than the specified -Xmx, as well as 2x or 3x more RAM than the specified -Xmx. I've seen many applications of both kinds.
See what takes memory in a Java process.
Instead of trying to guess an appropriate heap size based on the given container limit (the two can be completely unrelated), why not set -Xmx to a value that is comfortable enough for your particular application? I mean, if your application works fine with just an 8 GB heap, there is no need to give it more, even if the container permits.
Is it generally the case that you use either MaxRAM or -Xmx, or is there any benefit to setting both?
Setting both is meaningless, as -Xmx overrides MaxRAM.
If I were to use MaxRAM instead of -Xmx, how would java allocate memory to the heap? Is there a simple algorithm for this, for example, 50% of the MaxRAM setting?
See What is the difference between xmx and MaxRAM JVM parameters?
Will java manage memory any more efficiently doing it that way?
No. MaxRAM only affects the calculation of the heap size and the choice of the default garbage collector (when not explicitly specified).
How to determine which memory to use for a certain service heap memory?
The exact amount of memory needed for the heap is very specific to the application - you are better off trying multiple configurations and seeing for yourself.
However, needing 4 GB for something that consumes about 600 MB is quite unusual. What are the other things that consume the host memory?
Note that the JVM uses much more than just heap memory, and this again can vary a lot from application to application.
For an overview of various types of memory used by JVM I recommend this excellent answer: Java using much more memory than heap size (or size correctly Docker memory limit)
Finally, you can adjust the default allocation strategy (25% of available RAM) via the -XX:MaxRAMPercentage flag - I typically set this to 60%, but be sure to measure first!
Docker - Java container doesn't respect memory limits
To restrict memory for Docker containers, add the following to your docker-compose.yml file:
deploy:
  resources:
    limits:
      memory: 256M
Once you add the above, containers will be restricted to the given memory limit, and you can see the result in docker stats output. For more details about resource restriction, refer to the Docker documentation.
Note: if the container exceeds the given memory limit, the kernel's OOM killer will terminate it.
G1 GC allocated memory is more than heap size
These are your JVM parameters; the numbers are in bytes: 23923261440 bytes = 22815 MB ≈ 22.3 GB.
-XX:InitialHeapSize=23923261440
-XX:MaxHeapSize=23923261440
By default, the sizes of G1's Young Gen and Old Gen are not fixed. From the Interactive Graphs column of the GCeasy page you shared, I have organized the following information into a timeline.
GC Times | Young Gen (MB) | Old Gen (MB) | Total (MB)
---|---|---|---
First | 1232 | 21584 | 22816
Fifth | 1136 | 21680 | 22816
33rd | 5384 | 17432 | 22816
35th | 1200 | 21616 | 22816
53rd | 13696 | 9120 | 22816
Last | 13688 | 9128 | 22816
Java memory off heap inside container
should I use nonHeap+heap to calculate the total usage by my application.
No. Total physical memory used by a Java process includes many things not counted by the JVM.
See this answer and this presentation for details.
My main doubt is about "freePhysicalMemorySizeMB"; this value varies a lot between containers even when the committed memories (heap+nonHeap) have the same value.
Just don't look at getFreePhysicalMemorySize(). This value is rarely useful. It does not take into account the page cache and various reclaimable OS structures that can be automatically freed when the application requests more memory.
The most meaningful measure of a process' memory usage is RSS - the resident set size. An easy way to find the application's own RSS on Linux is to read /proc/self/stat.
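A minimal, Linux-only sketch of that (the 4 KB page size is an assumption; production code should query the OS for the real value):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class RssDemo {
    public static void main(String[] args) throws IOException {
        String stat = Files.readString(Path.of("/proc/self/stat"));
        // Field 2 (the command name) may contain spaces, so skip past its ')'
        String[] fields = stat.substring(stat.lastIndexOf(')') + 2).split(" ");
        long rssPages = Long.parseLong(fields[21]); // field 24 of stat = RSS in pages
        long pageSize = 4096;                       // assumption: typical x86 page size
        System.out.println("RSS: " + rssPages * pageSize / (1024 * 1024) + " MB");
    }
}
```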
Should I reserve more memory beyond the estimate of heap+NonHeap? If yes, is it possible to know how much?
An application may require more memory than Heap+NonHeap. It's not easy to guess the right number statically, so your best bet is to watch the actual RSS and adjust the limits correspondingly.
Is it the term "NonHeap" the same as "off-heap"?
Depends on the context. For example, in jconsole and some other tools, Non-Heap denotes certain JVM structures: Metaspace, Compressed Class Space, and Code Cache. Off-heap is usually a broader term, which basically means everything except the Java Heap.
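To see what those tools report as Non-Heap, one can query MemoryMXBean directly (a minimal sketch):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class NonHeapDemo {
    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        // "Non-Heap" here = Metaspace + Compressed Class Space + Code Cache,
        // NOT all off-heap memory (direct buffers, mapped files, malloc, ...)
        System.out.println("heap used:     "
                + mem.getHeapMemoryUsage().getUsed() / 1024 + " KB");
        System.out.println("non-heap used: "
                + mem.getNonHeapMemoryUsage().getUsed() / 1024 + " KB");
    }
}
```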