String Constant Pool

Compiler behavior with String literals to Create String Constant Pool

  1. Does it happen exactly as described above?

Yes, conceptually. However, the constant pool and the string pool are different things.

The constant pool is a part of a .class file that contains all constants used in this class.

The string pool is a runtime concept - interned strings and string literals are stored here.

Here's the JVM specification on the constant pool. It is part of the section on the .class format.


  2. How does the compiler create objects? As far as I know, objects are created at run time, and the heap is a run-time memory area. So how and where are String objects created at compile time?

How/when exactly this happens, I believe, is a JVM implementation-specific detail (correct me if I am wrong), but the basic explanation is that whenever the JVM decides to load a class, any strings found in the constant pool are automatically placed into the runtime string pool, and any duplicates are made to refer to the same instance.
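A minimal sketch of the visible effect, assuming two hypothetical nested classes (First and Second) that each contain the same literal. The Java Language Specification guarantees that equal literals refer to one String instance, even across classes, so the identity comparison below holds no matter when the JVM actually populates the pool.

```java
class SharedLiteralDemo {
    // Hypothetical classes, each with its own constant pool entry for the literal.
    static class First {
        static String text() { return "pooled-literal"; }
    }

    static class Second {
        static String text() { return "pooled-literal"; }
    }

    // True when both classes hand back the very same String object.
    static boolean sameInstance() {
        return First.text() == Second.text();
    }

    public static void main(String[] args) {
        System.out.println(sameInstance()); // prints true
    }
}
```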

In one of the linked answers' comments, Paŭlo Ebermann says:

when the classes are loaded in the VM, the string constants will get copied to the heap, to a VM-wide string pool

so it seems this is at least how Sun's VM implemented the string pool.

Before JDK 7, HotSpot stored interned strings in the permanent generation; since JDK 7 they are stored in the main heap.


  3. Source code can be compiled on one machine and run on a different machine. Even on the same machine, it can be compiled and run at different times. How, then, are those objects (created at compile time) recovered?

Constants are stored in the compiled .class files, so they can be retrieved whenever the JVM loads the class.


  4. What happens when we intern a String?

This is answered here:

doing String.intern() on a series of strings will ensure that all strings having same contents share same memory

Need to know about String, String Constant pool and String intern method

I will assume that in each example below you load and execute the code exactly once, in a new JVM each time. (I will also assume that nowhere else in your code do you use the literal "Java" ... since that would complicate things.)


1) Say there are no strings in the String constant pool, and I say,

String s = "Java";

Then how many objects will be created?

One string is created and added to the pool when the class containing this code is loaded.


2) Now again nothing in the pool, and I say,

String s = new String("Java");

Now how many objects will be created?

One string is created and added to the pool when the class containing this code is loaded.

A second string is created by the new when the code is run, and it is NOT added to the pool.


3) Now again nothing in the pool, and I say,

String s = new String("Java");
s.intern();

What will the intern method do?

One string is created and added to the pool when the class containing this code is loaded.

A second string is created by the new, and it is NOT added to the pool.

The intern call returns the first string. (You don't keep the reference ...)


4) Now again nothing in the pool, and I say,

String s = new String("Java");
String s1 = s.intern();

What will happen now?

Same as example 3. Thus, s1 will hold a reference to the String object that represents the "Java" string literal.
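The cases above can be checked with reference comparisons. This is only a sketch: the class and method names are made up for illustration, and it observes which references are identical rather than when the objects are created.

```java
class NewVersusLiteralDemo {
    // Case 2: 'new' yields an object distinct from the pooled literal.
    static boolean newIsDistinct() {
        String s = new String("Java");
        return s != "Java" && s.equals("Java");
    }

    // Case 3: the result of intern() is discarded, so s still refers to the unpooled copy.
    static boolean discardedInternChangesNothing() {
        String s = new String("Java");
        s.intern();
        return s != "Java";
    }

    // Case 4: s1 keeps the reference returned by intern(), which is the literal's object.
    static boolean keptInternIsLiteral() {
        String s = new String("Java");
        String s1 = s.intern();
        return s1 == "Java" && s1 != s;
    }

    public static void main(String[] args) {
        System.out.println(newIsDistinct());                 // prints true
        System.out.println(discardedInternChangesNothing()); // prints true
        System.out.println(keptInternIsLiteral());           // prints true
    }
}
```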


I read in SCJP5 Kathy Sierra book, that when you create a String with new, then 2 objects are created, one on the heap and one in the pool.

I doubt that the book said that exactly. (You are paraphrasing, and I think you have paraphrased somewhat inaccurately.)

However, your paraphrasing is roughly correct, though (and this is important!) the string object representing the literal is created and added to the pool when the code fragment is loaded [1], not when it is executed.


And to address another point of confusion:

"What I actually meant was that from the answer that you gave, it seems that a String will always be added in the String constant pool."

That is incorrect. It is a false generalization.

While it is true for all 4 of the cases above, it will not be true for others. It depends on where the original string came from. In typical applications, most text data is read from a file, socket, or a user interface. When that happens, the strings are created from arrays of characters, either directly or via a library call.

Here is a simple (but unrealistic) example that shows creating a String from its component characters.

String s = new String(new char[]{'J', 'a', 'v', 'a'});

In the snippet above, only one String is created, and it is NOT in the string pool. If you want the resulting string to be in the string pool, you need to call intern explicitly, something like this:

String s = new String(new char[]{'J', 'a', 'v', 'a'});
s = s.intern();

... which will (if necessary) create a second string in the string pool [2].
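As a rough illustration of strings that never come from a literal, the sketch below builds the same text twice from a char array. The helper name build is hypothetical. Each call produces its own object, and only intern() maps both to one canonical instance.

```java
class RuntimeStringDemo {
    // Builds the text from its characters, so no string literal is involved.
    static String build() {
        return new String(new char[] {'J', 'a', 'v', 'a'});
    }

    // Two runtime-built strings are equal in content but are separate objects.
    static boolean distinctBeforeIntern() {
        String a = build();
        String b = build();
        return a != b && a.equals(b);
    }

    // Interning both yields the single pooled instance for that content.
    static boolean sameAfterIntern() {
        return build().intern() == build().intern();
    }

    public static void main(String[] args) {
        System.out.println(distinctBeforeIntern()); // prints true
        System.out.println(sameAfterIntern());      // prints true
    }
}
```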


[1] Apparently, in some JVMs the creation and interning of string literals is done lazily, so it is not possible to say with 100% certainty when it actually happens. However, it will only occur once (per class that references the literal), no matter how many times the code fragment is executed by the JVM.

[2] There is no way to new a string into the string pool; that would violate the JLS, which specifies that the new operation always creates a new object.

String pool vs Constant pool

My questions are,

  1. Does "string pool" refer to the pool of constant string objects in the constant pool?

No.

"Constant pool" refers to a specially formatted collection of bytes in a class file that has meaning to the Java class loader. The "strings" in it are serialized; they are not Java objects. It also holds many kinds of constants, not just strings.

See the JVM specification, section 4.4, on the constant_pool table:

Java Virtual Machine instructions do not rely on the run-time layout of classes, interfaces, class instances, or arrays. Instead, instructions refer to symbolic information in the constant_pool table.

In contrast, the "string pool" is used at runtime (not just during class loading), contains only strings, and the "strings" in it are Java objects.
The "string pool" is a thread-safe weak map from java.lang.String instances to java.lang.String instances, used to intern strings.

The Java Language Specification, section 3.10.5 (String Literals), says:

A string literal is a reference to an instance of class String (§4.3.1, §4.3.3).

Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.
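A short sketch of what "values of constant expressions" means in practice. The names below are illustrative only. Concatenations the compiler can fold at compile time refer to the pooled instance, while a concatenation involving a non-final variable produces a new object at run time.

```java
class ConstantExpressionDemo {
    // A final field initialized with a literal is a compile-time constant.
    static final String PREFIX = "Ja";

    // Both concatenations are constant expressions, so they refer to the pooled "Java".
    static boolean constantConcatIsPooled() {
        return ("Ja" + "va") == "Java" && (PREFIX + "va") == "Java";
    }

    // 'prefix' is not final, so the concatenation runs at run time and creates a new String.
    static boolean runtimeConcatIsNotPooled() {
        String prefix = "Ja";
        String joined = prefix + "va";
        return joined != "Java" && joined.intern() == "Java";
    }

    public static void main(String[] args) {
        System.out.println(constantConcatIsPooled());   // prints true
        System.out.println(runtimeConcatIsNotPooled()); // prints true
    }
}
```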

Why JVM is not seeing duplicate String value in String Pool memory?

TL;DR - the point of your confusion is how Java lays out Strings in memory, namely the Heap and the String Constant Pool areas.



Deep Dive into String memory model

Design Motivation

In Java, String is probably the most heavily used object. Because of this, Java maintains String objects with a special memory design strategy, holding them either in the Heap, in the isolated subset of the heap called String Constant Pool, or in both.

The String Constant Pool is a special area of heap memory that holds one String object per unique literal value. Whenever a string literal is evaluated, the JVM first checks whether an object with the same value is already in the pool. If it is, a reference to that object is returned; if it isn't, a new object is allocated in the pool. The same check happens for every other string literal.

The Constant Pool is a good idea for the reason its name suggests: it stores constant, immutable String objects. When your code uses the same literal content in many places, only one object per literal value is ever referenced, and no new objects are created for a literal that already exists in the pool.

Note that this is only possible because String is immutable by definition. Note also that the pool of strings, which is initially empty, is maintained privately by the class String.

Where does Java place String objects?

Now this is where things get interesting. An important point to bear in mind: whenever you create a String object with new String(...), you force Java to allocate a new object on the heap, outside the pool; if you use a string literal instead, the object lives in the String Constant Pool. As we've said, the String Constant Pool exists mainly to reduce memory usage and improve the re-use of existing String objects in memory.

So, if you write:

String s1 = "a";
String s2 = "a";
String s3 = new String("a");
  1. A String object is created in the String Constant Pool, and a reference to it is stored in s1.
  2. The String Constant Pool is searched, and because an object with the same literal value ("a") is already there, a reference to that same object is returned and stored in s2.
  3. A String object is explicitly created on the heap, and a reference to it is stored in s3.

Interning Strings

If you want the pooled counterpart of a String created with the new operator, call intern() on it (for example, s3.intern()) and use the returned reference. One of two things will happen:

  1. if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool will be returned;
  2. otherwise, this String object will be added to the pool and a reference to this String object will be returned.
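Both outcomes can be observed with a string that is very unlikely to be pooled already. The sketch below assumes a random UUID gives such a value, and the names are hypothetical. The first check relies on the documented behavior of String.intern(), which is what HotSpot does since JDK 7; older JVMs that copied interned strings into a separate area may behave differently.

```java
import java.util.UUID;

class InternCasesDemo {
    // Assumption: a random UUID makes the value unique, so it is not in the pool yet.
    static String freshValue() {
        return "demo-" + UUID.randomUUID();
    }

    // Case 2 from the list: nothing equal is pooled, so this very object is added and returned.
    static boolean firstInternReturnsSameObject() {
        String fresh = freshValue();
        return fresh.intern() == fresh;
    }

    // Case 1 from the list: an equal string is already pooled, so the pooled one is returned.
    static boolean laterInternReturnsPooledObject() {
        String first = freshValue();
        String pooled = first.intern();
        String copy = new String(first);
        return copy != pooled && copy.intern() == pooled;
    }

    public static void main(String[] args) {
        System.out.println(firstInternReturnsSameObject());
        System.out.println(laterInternReturnsPooledObject());
    }
}
```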


What happens in your code?

String s1 = "a";
String s2 = "a";

Both s1 and s2 will point to the same address in String pool and there will be only one object with value "a".

True. Initially, a String object is created and placed into the String Constant Pool. After that, because a String with the value "a" already exists, no new object is created for s2; the reference stored in s1 is stored in s2 as well.

Now, let's finally have a look at your question:

String s1 = "a";                  // "a" is a literal, so it refers to the pooled instance
s1 = s1.concat("b");              // concat() returns a new String on the heap, outside the pool
String s2 = "ab";                 // the literal "ab" refers to a separate, pooled instance
System.out.println(s1 == s2);     // prints false: s1 and s2 refer to two different objects

If, however, you execute

System.out.println(s1.intern() == s2);

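this will print true, and I hope that by now you understand why. intern() does not move the object referenced by s1; it finds that the pool already holds an equal string (the "ab" literal that s2 refers to) and returns that pooled instance, so both sides of == are the same object.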

How to implement our own string constant pool through a program in java?

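Here's a very simple implementation of an object pool:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ObjectPool<T> {
    // Maps each value to its canonical instance.
    private final ConcurrentMap<T, T> map = new ConcurrentHashMap<>();

    // Returns the pooled instance equal to object, adding object if none exists yet.
    public T get(T object) {
        T old = map.putIfAbsent(object, object);
        return old == null ? object : old;
    }
}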

Now, to create a pool of strings, use:

final ObjectPool<String> stringPool = new ObjectPool<>();

You can use it to deduplicate the strings in your program:

String deduplicatedStr = stringPool.get(str);
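A runnable sketch of that usage, with the pool repeated as a nested class so the example compiles on its own. The sample value "order-42" and the method names are made up. It also shows that such a pool is independent of the JVM's own string pool: it hands back whichever instance it saw first, not the interned literal.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class PoolUsageDemo {
    // Same idea as the ObjectPool above, repeated here to keep the example self-contained.
    static final class ObjectPool<T> {
        private final ConcurrentMap<T, T> map = new ConcurrentHashMap<>();

        T get(T object) {
            T old = map.putIfAbsent(object, object);
            return old == null ? object : old;
        }
    }

    // Equal strings are collapsed onto the first instance the pool saw.
    static boolean deduplicates() {
        ObjectPool<String> pool = new ObjectPool<>();
        String a = new String("order-42");
        String b = new String("order-42");
        return pool.get(a) == a && pool.get(b) == a;
    }

    // The custom pool holds 'a' itself, not the JVM-interned literal.
    static boolean independentOfJvmPool() {
        ObjectPool<String> pool = new ObjectPool<>();
        String a = new String("order-42");
        return pool.get(a) != "order-42";
    }

    public static void main(String[] args) {
        System.out.println(deduplicates());         // prints true
        System.out.println(independentOfJvmPool()); // prints true
    }
}
```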

