What Is String Pool in Java

What is the Java string pool and how is "s" different from new String("s")?

The string pool is the JVM's particular implementation of the concept of string interning:

In computer science, string interning
is a method of storing only one copy
of each distinct string value, which
must be immutable. Interning strings
makes some string processing tasks
more time- or space-efficient at the
cost of requiring more time when the
string is created or interned. The
distinct values are stored in a string
intern pool.

Basically, a string intern pool lets a runtime save memory by keeping immutable strings in a pool, so that different areas of the application can reuse one instance of a common string instead of creating multiple copies of it.

As an interesting side note, string interning is an example of the flyweight design pattern:

Flyweight is a software design
pattern. A flyweight is an object that
minimizes memory use by sharing as
much data as possible with other
similar objects; it is a way to use
objects in large numbers when a simple
repeated representation would use an
unacceptable amount of memory.
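To make the flyweight idea concrete, here is a minimal sketch of a toy intern pool built on a HashMap. The class name ToyInternPool and its methods are made up for illustration; the JVM's real string table is implemented natively and works differently in detail.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy pool for illustration only; not how the JVM implements interning.
public class ToyInternPool {
    private final Map<String, String> pool = new HashMap<>();

    // Returns the canonical instance for the given value, storing it on first sight.
    public String intern(String value) {
        String existing = pool.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }

    public int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        ToyInternPool toyPool = new ToyInternPool();
        String first = toyPool.intern(new String("flyweight"));
        String second = toyPool.intern(new String("flyweight"));
        System.out.println(first == second); // true: both callers share one instance
        System.out.println(toyPool.size());  // 1
    }
}
```

Every caller that asks for an equal value gets the same instance back, which is exactly the sharing the flyweight pattern describes.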

What is String pool in Java?

This prints true, even though we don't use the equals method (which is the correct way to compare strings):

    String s = "a" + "bc";
    String t = "ab" + "c";
    System.out.println(s == t);

When the compiler evaluates your string literals, it sees that both s and t have the same value, so only one String object is needed. This is safe because String is immutable in Java.

As a result, both s and t point to the same object and a little memory is saved.

The name 'string pool' comes from the idea that all already-defined strings are stored in a 'pool', and before a new String object is created for a literal, the pool is checked to see whether such a string is already defined.
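The same folding applies to any compile-time constant expression, not just to two literals joined by +. The sketch below (class and method names are illustrative) shows it for literals and for a final variable initialized with a literal, which the language treats as a constant.

```java
public class ConstantFoldingDemo {
    // Both expressions are compile-time constants with the value "abc".
    static boolean literalsShareOneObject() {
        String s = "a" + "bc";
        String t = "ab" + "c";
        return s == t;
    }

    // A final variable initialized with a literal is a constant too,
    // so prefix + "bc" is folded to "abc" at compile time.
    static boolean finalVariableIsFolded() {
        final String prefix = "a";
        String s = prefix + "bc";
        return s == "abc";
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject()); // true
        System.out.println(finalVariableIsFolded());  // true
    }
}
```

The == comparisons here are used only to inspect object identity; application code should still compare strings with equals.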

How does the Java String pool work with String concatenation?


"When the string is created by concatenation does java make something
different or simple == comparator have another behaviour?"

No, it does not change its behavior. What happens is this:

When you concatenate two string literals such as "a" + "b", the compiler joins the two values into a single constant at compile time. When that constant is loaded, the JVM checks the string pool, finds that the value already exists there, and simply assigns the existing reference to the String. Now in more detail:

Look at the compiled bytecode below of this simple program:

public class Test {
    public static void main(String... args) {
        String a = "hello world!";
        String b = "hello" + " world!";
        boolean compare = (a == b);
    }
}

[Image: compiled bytecode of the simple program]

First, the JVM loads the string "hello world!", pushes it to the string pool (in this case), and then loads it onto the stack (ldc = load constant) [see point 1 in Image].

Then it assigns the reference created in the pool to the local variable (astore_1) [see point 2 in Image].

Notice that the reference created in the string pool for this literal is #2 [See point 3 in Image]

The next operation is much the same: the concatenated string is looked up in the runtime constant pool (the string pool in this case), a literal with the same content is found to exist already, so the same reference (#2) is used and assigned to a local variable (astore_2).

Thus (a == b) is true, because both variables reference the same string pool entry #2, which is "hello world!".

Your example C is a bit different, though, because it uses the += operator. When compiled to bytecode, this concatenates the strings at run time (classically through a StringBuilder), which creates a new String object on the heap, so the variable points to a different reference than the pooled literal (string pool vs. heap object).
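A short sketch of the += case, with illustrative names: the string is built at run time, so it is equal to the literal but is not the same object until it is interned.

```java
public class RuntimeConcatDemo {
    // c is not a constant, so the concatenation happens at run time
    // and produces a new String object.
    static String appendAtRuntime() {
        String c = "hello";
        c += " world!";
        return c;
    }

    public static void main(String[] args) {
        String a = "hello world!";
        String c = appendAtRuntime();
        System.out.println(c == a);          // false: different objects
        System.out.println(c.equals(a));     // true: same contents
        System.out.println(c.intern() == a); // true: intern() returns the pooled literal
    }
}
```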

Java String Pool with String constructor and the intern function

You wrote

String c = new String("foo"); // Creates a new string in the heap

I read somewhere that even when using the constructor, the String Pool is being used. It
will insert the string into the String Pool and into the heap.

That’s somewhat correct, but you have to read the code correctly. Your code contains two String instances. First, you have the string literal "foo" that evaluates to a String instance, the one that will be inserted into the pool. Then, you are creating a new String instance explicitly, using new String(…) calling the String(String) constructor. Since the explicitly created object can’t have the same identity as an object that existed prior to its creation, two String instances must exist.

Why does java create this duplicate string? It seems completely redundant to me since the strings in java are immutable.

Well, it does so because you told it to. In theory, this construction could get optimized, skipping the intermediate step that you can’t perceive anyway. But the first assumption for a program’s behavior should be that it does precisely what you have written.

You could ask why there’s a constructor that allows such a pointless operation. In fact, this has been asked before and this answer addresses this. In short, it’s mostly a historical design mistake, but this constructor has been used in practice for other technical reasons; some do not apply anymore. Still, it can’t be removed without breaking compatibility.

String s = new String("Hello");
s = s.intern();

Will the garbage collector delete the string that is outside the String Pool from the heap?

Since the intern() call will evaluate to the instance that had been created for "Hello" and is distinct from the instance created via new String(…), the latter will definitely be unreachable after the second assignment to s. Of course, this doesn’t say whether the garbage collector will reclaim the string’s memory, only that it is allowed to do so. But keep in mind that the majority of the heap occupation will be the array that holds the character data, which will be shared between the two string instances (unless you use a very outdated JVM). This array will still be in use as long as either of the two strings is in use. Recent JVMs even have the String Deduplication feature, which may cause other strings with the same contents in the JVM to use this array (to allow collection of their formerly used arrays). So the lifetime of the array is entirely unpredictable.

How is the string pool measured in terms of buckets in Java?


So, what are buckets in String pool?

The String pool is basically a hash table. A hash table contains buckets or slots.

How are these comparable to the number of interned Strings?

It's implementation-defined (JVM-specific) and depends on how many entries a single bucket stores. Ideally, one bucket keeps one entry.

Is the concept similar to buckets in hashmaps?

Yes, it's the same idea.

Why is the default pool size growing? (my question)

The more buckets are allocated, the lower the load factor gets, which improves lookup performance. I guess the number of entries the table holds from the start keeps growing, so it's important to keep the load factor in check (at least at the same level).
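The relationship between buckets and load factor can be sketched with plain arithmetic. The helper names and the numbers below are made up for illustration, and the bucket index uses String.hashCode() only as a stand-in; the JVM's own string table uses its own hashing and sizing.

```java
public class BucketSketch {
    // Hypothetical helper: picks a bucket the way a simple hash table might.
    static int bucketIndex(String s, int bucketCount) {
        return Math.floorMod(s.hashCode(), bucketCount);
    }

    // Load factor = stored entries divided by available buckets.
    static double loadFactor(int entries, int bucketCount) {
        return (double) entries / bucketCount;
    }

    public static void main(String[] args) {
        // The same number of interned strings spread over few or many buckets.
        System.out.println(loadFactor(30_000, 1_000));  // 30.0 entries per bucket on average
        System.out.println(loadFactor(30_000, 60_000)); // 0.5 entries per bucket on average
    }
}
```

With more buckets, each bucket holds fewer entries on average, so a lookup has fewer candidates to compare.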

How many objects will be created in string pool?


  1. String s = "abc"; → one object, that goes into the string pool, as the literal "abc" is used;
  2. s = ""; → one empty string ("") object, and again - allocated in the string pool;
  3. String s2 = new String("mno"); → another object created with an explicit new keyword, and note, that it actually involves yet another literal object (again - created in the string pool) - "mno"; overall, two objects here;
  4. s2 = "pqr"; → yet another object, being stored into the string pool.

So, there are 5 objects in total; 4 in the string pool (a.k.a. "intern pool"), and one in the ordinary heap.

Remember that any time you use a string literal, the JVM first checks whether the same string object (according to String::equals(..)) exists in the string pool, and it then does one of the following:

  1. If the corresponding string does not exist, the JVM creates a string object and puts it in the string pool. That string object is then a candidate for reuse by the JVM any time an equal string literal (again, according to String::equals(..)) is referenced (without an explicit new);
  2. If the corresponding string exists, its reference is simply returned, without creating anything new.
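The lookup described above is JVM-wide, so equal literals that appear in different classes end up as one object. A minimal sketch, with illustrative class names:

```java
public class SharedLiteralDemo {
    // Two unrelated classes that each mention the literal "abc".
    static class First {
        static String value() {
            return "abc";
        }
    }

    static class Second {
        static String value() {
            return "abc";
        }
    }

    static boolean sameObject() {
        return First.value() == Second.value();
    }

    public static void main(String[] args) {
        System.out.println(sameObject()); // true: both literals resolve to one pooled object
    }
}
```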

Why is the JVM not seeing a duplicate String value in String Pool memory?

TL;DR - the point of your confusion is the Java memory model for Strings, namely the Heap and String Constant Pool areas of the memory.



Deep Dive into String memory model

Design Motivation

In Java, String is probably the most heavily used object. Because of this, Java maintains String objects with a special memory design strategy, holding them either in the Heap, in the isolated subset of the heap called String Constant Pool, or in both.

The String Constant Pool is a special space in the Heap memory that holds String objects with unique literal values. Any time you create a String from a literal, the JVM first checks whether an object with the same value is available in the String pool. If it is, a reference to that object is returned; if it isn't, a new object is allocated in the String Constant Pool. The same happens for every other String literal, again and again.

The reason the Constant Pool is a good idea is in the name: it stores constant, immutable String objects. This pays off whenever you create many String objects with the same literal content, because only one object per literal value is referenced each time and no new objects are created for an existing String literal.

Note that this is only possible because String is immutable by definition. Also note that a pool of strings, which is initially empty, is maintained privately by the class String.

Where does Java place String objects?

Now this is where things get interesting. An important point to bear in mind is that whenever you create a String object with a new String() instruction, you force Java to allocate the new object in the Heap; however, if you create a String object with a string literal, it gets allocated in the String Constant Pool. As we've said, the String Constant Pool exists mainly to reduce memory usage and improve the reuse of existing String objects in memory.

So, if you write:

String s1 = "a";
String s2 = "a";
String s3 = new String("a");

  1. A String object will be created in the String Constant Pool and a reference to that object will be stored in variable s1;
  2. The String Constant Pool will be looked up, and because an object with the same literal value ("a") is found in the pool, a reference to that same object will be returned and stored in variable s2;
  3. A String object will be explicitly created in the Heap area and its reference will be returned and stored in variable s3.
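The three steps above can be checked with identity comparisons. This is a small sketch with an illustrative class name:

```java
import java.util.Arrays;

public class PoolVersusHeapDemo {
    // Returns { s1 == s2, s1 == s3, s1.equals(s3) } for the three declarations above.
    static boolean[] compare() {
        String s1 = "a";
        String s2 = "a";
        String s3 = new String("a");
        return new boolean[] { s1 == s2, s1 == s3, s1.equals(s3) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare())); // [true, false, true]
    }
}
```

s1 and s2 are the same pooled object, while s3 is a separate object that merely has equal contents.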

Interning Strings

If you want a String object created with the new operator to be represented in the String Constant Pool, you can call its intern() method, and one of two things will happen:

  1. if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool will be returned;
  2. otherwise, this String object will be added to the pool and a reference to this String object will be returned.
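Both cases can be observed with strings that are built at run time. In this sketch the helper runtimeCopy and the text "pool-demo" are made up for illustration.

```java
public class InternCasesDemo {
    // Builds a String at run time, so the result is a fresh object rather than the literal.
    static String runtimeCopy(String text) {
        return new String(text.toCharArray());
    }

    // Two distinct but equal objects must intern to one canonical instance.
    static boolean equalStringsShareCanonical() {
        String a = runtimeCopy("pool-demo");
        String b = runtimeCopy("pool-demo");
        return a != b && a.intern() == b.intern();
    }

    public static void main(String[] args) {
        System.out.println(equalStringsShareCanonical());                  // true
        System.out.println(runtimeCopy("pool-demo").intern() == "pool-demo"); // true: the literal is the pooled instance
    }
}
```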


What happens in your code?

String s1 = "a";
String s2 = "a";

Both s1 and s2 will point to the same address in String pool and there will be only one object with value "a".

True. Initially, a String object will be created and placed into the String Constant Pool. After that, as there is already a String with value "a", no new object will be created for s2, and the reference stored in s1 will likewise be stored in s2.

Now, let's finally have a look at your question:

String s1 = "a"; // allocated in the String Constant Pool
s1 = s1.concat("b"); // concat() returns a new String object, allocated in the Heap
String s2 = "ab"; // "ab" does NOT yet exist in the String Constant Pool, so it gets allocated there
System.out.println(s1 == s2); // prints false, because s1 refers to the object in the Heap and s2 to a different object in the String Constant Pool

If, however, you execute

System.out.println(s1.intern() == s2);

this will print true, and I hope that by now you understand why: the pool already contains "ab" (the object referenced by s2), so intern() returns that pooled object instead of the Heap object referenced by s1.


