Java Heap Terminology: Young, Old and Permanent Generations

Java heap terminology: young, old and permanent generations?

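Treating the permanent generation as part of the heap seems to be a common misunderstanding. In Oracle's JVM, the permanent generation is not part of the heap. It's a separate space for class definitions and related data. In Java 6 and earlier, interned strings were also stored in the permanent generation. In Java 7, interned strings are stored in the main object heap. (Java 8 removed the permanent generation altogether and replaced it with Metaspace, which lives in native memory outside the heap.)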

Here is a good post on permanent generation.

I like the descriptions given for each space in Oracle's guide on JConsole:

For the HotSpot Java VM, the memory pools for serial garbage collection are the following.

  • Eden Space (heap): The pool from which memory is initially allocated for most objects.
  • Survivor Space (heap): The pool containing objects that have survived the garbage collection of the Eden space.
  • Tenured Generation (heap): The pool containing objects that have existed for some time in the survivor space.
  • Permanent Generation (non-heap): The pool containing all the reflective data of the virtual machine itself, such as class and method objects. With Java VMs that use class data sharing, this generation is divided into read-only and read-write areas.
  • Code Cache (non-heap): The HotSpot Java VM also includes a code cache, containing memory that is used for compilation and storage of native code.

Java uses generational garbage collection. This means that if you have an object foo (which is an instance of some class), the more garbage collection events it survives (if there are still references to it), the further it gets promoted. It starts in the young generation (which itself is divided into multiple spaces - Eden and Survivor) and would eventually end up in the tenured generation if it survived long enough.
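If you want to see which of these pools your own JVM reports, and whether each one counts as heap or non-heap, the standard `java.lang.management` API can list them. This is only a small sketch: the class name `MemoryPools` and the helper `describePools` are made up for illustration, and the pool names you get depend on the JVM version and the collector in use (a G1 JVM reports names such as "G1 Eden Space", and Java 8 and later report Metaspace rather than a permanent generation).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {

    /** Returns one "name (HEAP)" or "name (NON_HEAP)" line per memory pool of the running JVM. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getType() is MemoryType.HEAP or MemoryType.NON_HEAP; name() gives the enum constant.
            lines.add(pool.getName() + " (" + pool.getType().name() + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Pool names vary by collector and JVM version, so no fixed output is shown here.
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

Running it with different collectors (for example `-XX:+UseSerialGC` versus the default) is a quick way to see how the same conceptual spaces get different names.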

Spark memory fraction vs Young Generation/Old Generation java heap split

The short answer is that they're not really related beyond both having to do with the JVM heap.

The better way to think of this is that there are four buckets (numbered in no significant order):

  1. Spark memory in the young gen
  2. Spark memory in the old gen
  3. User memory in the young gen
  4. User memory in the old gen

(Technically there's also some system memory that's neither Spark nor User, and it can also be either old or young, but it is typically small enough not to worry about.)

Whether an object is classed as Spark or User is decided by Spark (I actually don't know if this is an eternal designation or if objects can change their categorization in this respect).

As for old vs. young, this is managed by the garbage collector and the GC can and will promote objects from young to old. In some GC algorithms, the sizes of the generations are dynamically adjusted (or they use fixed size regions and a given region can be old or young).

You have control over the aggregate capacity of 1+2 and 3+4 (through spark.memory.fraction) and of 1+3 and 2+4 (through the JVM's generation sizing options), but you don't really have control over the capacity of 1, 2, 3, or 4 individually. You probably don't want it either, because there's a lot of benefit in being able to use excess space in one category to temporarily get more space in another.
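To make the four aggregates concrete, here is a rough arithmetic sketch. It assumes the sizing rule from Spark's tuning documentation (Spark memory is `spark.memory.fraction` of the heap minus 300 MiB of reserved memory, with a default fraction of 0.6) and a collector that honours `-XX:NewRatio` (old:young = n:1). The 4 GiB heap, the class name `HeapBuckets` and its methods are hypothetical, and collectors that resize generations dynamically will not match the young/old numbers.

```java
class HeapBuckets {
    // Assumption: Spark reserves 300 MiB of the heap before applying spark.memory.fraction.
    static final long RESERVED_MIB = 300;

    /** Buckets 1+2: Spark memory = (heap - reserved) * spark.memory.fraction. */
    static long sparkMemoryMib(long heapMib, int memoryFractionPercent) {
        return (heapMib - RESERVED_MIB) * memoryFractionPercent / 100;
    }

    /** Buckets 3+4: user memory = the rest of the usable (non-reserved) heap. */
    static long userMemoryMib(long heapMib, int memoryFractionPercent) {
        return (heapMib - RESERVED_MIB) - sparkMemoryMib(heapMib, memoryFractionPercent);
    }

    /** Buckets 1+3: young generation under -XX:NewRatio=n, i.e. 1/(n+1) of the heap. */
    static long youngGenMib(long heapMib, int newRatio) {
        return heapMib / (newRatio + 1);
    }

    /** Buckets 2+4: old generation = whatever the young generation does not take. */
    static long oldGenMib(long heapMib, int newRatio) {
        return heapMib - youngGenMib(heapMib, newRatio);
    }

    public static void main(String[] args) {
        long heapMib = 4096;      // hypothetical executor heap (-Xmx4g)
        int fractionPercent = 60; // spark.memory.fraction = 0.6
        int newRatio = 2;         // -XX:NewRatio=2
        System.out.println("Spark (1+2): " + sparkMemoryMib(heapMib, fractionPercent) + " MiB"); // 2277
        System.out.println("User  (3+4): " + userMemoryMib(heapMib, fractionPercent) + " MiB");  // 1519
        System.out.println("Young (1+3): " + youngGenMib(heapMib, newRatio) + " MiB");           // 1365
        System.out.println("Old   (2+4): " + oldGenMib(heapMib, newRatio) + " MiB");             // 2731
    }
}
```

Note that nothing in this calculation yields the size of bucket 1, 2, 3 or 4 on its own: the two splits are independent, which is the point made above.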

Young, Tenured and Perm generation

The Java garbage collector is referred to as a Generational Garbage Collector. Objects in an application live for varying lengths of time depending on where they are created and how they are used. The key insight here is that using different garbage collection strategies for short lived and long lived objects allows the GC to be optimised specifically for each case.

Loosely speaking, as objects "survive" repeated garbage collections in the Young Generation, they are migrated to the Tenured Generation. The Permanent Generation is a special case: it contains objects that are needed by the JVM but are not necessarily represented in your program, for example objects that represent classes and methods.

Since the Young Generation will usually contain a lot of garbage, it is optimised for getting rid of a lot of unused objects at once. The Tenured Generation, since it contains longer-lived objects, is optimised for speedy garbage collection without wasting a lot of memory.

With improvements in garbage collection technology the details have become pretty complex and vary depending on your JVM and how it has been configured. You should read the documentation for the specific JVM you are using if you need to know exactly what is happening.

That said, there is a simple historical arrangement that is still useful at a conceptual level. Historically, the Young Generation would use a copy collector and the Tenured Generation a mark and sweep collector. A copy collector pays essentially no CPU cost for getting rid of garbage; most of the cost is in maintaining live objects. The price of this efficiency is heavier memory usage. A mark and sweep collector pays some CPU cost for both live and unused objects but utilizes memory more efficiently.
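The claim that a copy collector pays only for live objects can be illustrated with a toy model. The sketch below is not how any real JVM is implemented: `Obj`, `collect` and `demo` are invented names, and "spaces" are just Java lists. It copies everything reachable from the roots into a fresh to-space and never visits the unreachable objects, so the work done is proportional to the number of survivors rather than to the amount of garbage.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class CopyCollectorSketch {

    /** A toy heap object that only holds references to other objects. */
    static final class Obj {
        final List<Obj> refs = new ArrayList<>();
    }

    /** Copies everything reachable from the roots and returns the new to-space. */
    static List<Obj> collect(List<Obj> roots) {
        Map<Obj, Obj> forwarding = new IdentityHashMap<>(); // old object -> its copy
        Deque<Obj> pending = new ArrayDeque<>();            // copied objects whose refs are not fixed yet
        List<Obj> toSpace = new ArrayList<>();
        for (Obj root : roots) {
            forward(root, forwarding, pending, toSpace);
        }
        while (!pending.isEmpty()) {
            Obj old = pending.poll();
            Obj copy = forwarding.get(old);
            for (Obj ref : old.refs) {
                copy.refs.add(forward(ref, forwarding, pending, toSpace));
            }
        }
        return toSpace; // the old from-space is simply dropped, garbage included
    }

    /** Returns the copy of an object, creating it on first visit. */
    static Obj forward(Obj old, Map<Obj, Obj> forwarding, Deque<Obj> pending, List<Obj> toSpace) {
        Obj copy = forwarding.get(old);
        if (copy == null) {
            copy = new Obj();
            forwarding.put(old, copy);
            pending.add(old);
            toSpace.add(copy);
        }
        return copy;
    }

    /** Builds a from-space with a chain of live objects plus unreachable garbage; returns objects copied. */
    static int demo(int liveCount, int garbageCount) {
        List<Obj> fromSpace = new ArrayList<>();
        for (int i = 0; i < liveCount + garbageCount; i++) {
            fromSpace.add(new Obj());
        }
        for (int i = 0; i + 1 < liveCount; i++) {
            fromSpace.get(i).refs.add(fromSpace.get(i + 1)); // live chain 0 -> 1 -> 2 ...
        }
        for (int i = liveCount; i < liveCount + garbageCount; i++) {
            fromSpace.get(i).refs.add(fromSpace.get(0)); // garbage may point at live objects
        }
        List<Obj> roots = new ArrayList<>();
        roots.add(fromSpace.get(0));
        return collect(roots).size();
    }

    public static void main(String[] args) {
        System.out.println("copied with 1000 garbage objects: " + demo(3, 1000)); // 3
        System.out.println("copied with no garbage objects:   " + demo(3, 0));    // 3
    }
}
```

The memory cost mentioned above shows up here as the second list: while the collection runs, the survivors exist twice, once in each space.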

Memory Management: Young Gen Clarification

There are two kinds of garbage collection:

  1. Minor GC - this occurs when the young generation fills up.
  2. Full GC - this occurs when the tenured (old) generation fills up.

An OutOfMemoryError occurs when there is no space left on the heap to move objects into the old generation, even after a full GC.

You should read up more on the Java GC process. You can start with http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html

To read and analyze the GC logs you can refer to How to read a verbose:GC output?
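Besides GC logs, a running JVM exposes per-collector counters through `java.lang.management`, which is a quick way to see how often the young and old collectors have run. A minimal sketch, with `GcCounts` and `report` as made-up names; the collector names and the pools each one manages depend on the JVM and the collector selected (for example "Copy" and "MarkSweepCompact" with the serial collector, "G1 Young Generation" and "G1 Old Generation" with G1).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcCounts {

    /** One line per collector: its name, how often it ran, total time, and the pools it manages. */
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", time(ms)=").append(gc.getCollectionTime())
              .append(", pools=").append(String.join(",", gc.getMemoryPoolNames()))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Allocate some short-lived arrays. Whether this triggers a minor GC depends on
        // the heap size and on JIT optimisations, so no particular count is promised.
        long total = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
            total += junk.length;
        }
        System.out.println("allocated bytes: " + total);
        System.out.print(report());
    }
}
```

Running it with a small heap (for example `-Xmx16m`) makes it more likely that the young collector's count is non-zero.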

How GC knows if object in old heap references an object in young heap?

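A minor GC collects only the young generation, but that doesn't mean the GC can ignore references coming from the rest of the heap. A reference from the old generation to the young generation marks the object in the young generation as alive. HotSpot avoids scanning the whole old generation for such references by recording them as they are written (in a card table), so only the parts of the old generation that may point into the young generation need to be examined.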

This is described in Minor GC vs Major GC vs Full GC:

During a Minor GC event, Tenured generation is effectively ignored. References from tenured generation to young generation are considered de facto GC roots. References from young generation to Tenured generation are simply ignored during the markup phase.
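One way to picture "references from tenured to young are GC roots" is a toy card table. The sketch below is a simplified model and not HotSpot's actual implementation; `CardTableSketch`, `writeField`, `liveYoung`, `demo` and the card size of 4 slots are all assumptions made for illustration. Every write into an old object marks that object's card dirty, and the minor GC treats the fields of old objects on dirty cards as extra roots while never following references that lead back into the old generation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class CardTableSketch {
    static final int CARD_SIZE = 4; // hypothetical number of old-gen slots covered by one card

    /** A toy object with a name, a generation flag and a single reference field. */
    static final class Obj {
        final String name;
        final boolean young;
        Obj field;
        Obj(String name, boolean young) { this.name = name; this.young = young; }
    }

    final Obj[] oldGen;
    final boolean[] dirty;

    CardTableSketch(int oldSlots) {
        oldGen = new Obj[oldSlots];
        dirty = new boolean[(oldSlots + CARD_SIZE - 1) / CARD_SIZE];
    }

    /** Write barrier: store the reference and mark the card of the written old object dirty. */
    void writeField(int oldSlot, Obj target) {
        oldGen[oldSlot].field = target;
        dirty[oldSlot / CARD_SIZE] = true;
    }

    /** Minor GC marking: roots are the stack roots plus fields of old objects on dirty cards. */
    Set<String> liveYoung(List<Obj> stackRoots) {
        List<Obj> work = new ArrayList<>(stackRoots);
        for (int card = 0; card < dirty.length; card++) {
            if (!dirty[card]) continue; // clean cards are skipped without looking at their objects
            int end = Math.min(oldGen.length, (card + 1) * CARD_SIZE);
            for (int slot = card * CARD_SIZE; slot < end; slot++) {
                if (oldGen[slot] != null) work.add(oldGen[slot].field);
            }
        }
        Set<String> live = new TreeSet<>();
        while (!work.isEmpty()) {
            Obj o = work.remove(work.size() - 1);
            // Null fields, old objects (young -> old references) and already marked objects are skipped.
            if (o == null || !o.young || !live.add(o.name)) continue;
            work.add(o.field);
        }
        return live;
    }

    /** Eight old objects, three young ones; only old5 -> y1 -> y2 keeps young objects alive. */
    static Set<String> demo() {
        CardTableSketch heap = new CardTableSketch(8);
        for (int i = 0; i < 8; i++) heap.oldGen[i] = new Obj("old" + i, false);
        Obj y1 = new Obj("y1", true);
        Obj y2 = new Obj("y2", true);
        Obj y3 = new Obj("y3", true); // referenced by nothing, so it is garbage
        y1.field = y2;                // young -> young reference
        y3.field = heap.oldGen[0];    // young -> old reference, ignored by the minor GC
        heap.writeField(5, y1);       // old -> young reference, dirties card 1
        return heap.liveYoung(new ArrayList<>());
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [y1, y2]
    }
}
```

In this model card 0 (old0 to old3) is never inspected, which is the sense in which the tenured generation is "effectively ignored" during a minor GC.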

Is the permanent generation part of the heap, or does it lie in a separate space of its own in the JVM?

Original (perhaps mistaken) answer: If Wikipedia is to be believed, it's part of the heap.

Edit: I've looked into this further, including the site referenced in a comment by the OP. During this research I came across this SO question, which references this document, which indicates that for Sun Java (version 6), the permanent generation is actually outside the heap. That said, I'm no Java expert and wasn't previously aware of the memory management details at this level. If my reading is correct, the placement - or even the existence - of the permanent generation is a JVM implementation detail.


