Is String Literal Pool a collection of references to the String Object, Or a collection of Objects
I think the main point to understand here is the distinction between a String Java object and its contents: the char[] held in the private value field. String is basically a wrapper around a char[] array, encapsulating it and making it impossible to modify so that the String can remain immutable. The String class also remembers which part of this array is actually used (see below). This means that you can have two different String objects (quite lightweight) pointing to the same char[].
I will show you a few examples, together with the identity hash code (System.identityHashCode()) of each String and the hashCode() of its internal char[] value field (I will call it text to distinguish it from string). Finally I'll show the javap -c -verbose output, together with the constant pool for my test class. Please do not confuse the class constant pool with the string literal pool. They are not quite the same. See also Understanding javap's output for the Constant Pool.
Prerequisites
For the purpose of testing I created a utility method that breaks String encapsulation:
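// Reads the private String.value field via reflection and returns the array's hash code.
// Written for the older char[] layout; newer JDKs store a byte[] and may need
// --add-opens java.base/java.lang=ALL-UNNAMED for setAccessible to succeed.
private int showInternalCharArrayHashCode(String s) throws ReflectiveOperationException {
    final Field value = String.class.getDeclaredField("value"); // java.lang.reflect.Field
    value.setAccessible(true);
    return value.get(s).hashCode();
}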
It returns the hashCode() of char[] value, effectively helping us understand whether this particular String points to the same char[] text or not.
Two string literals in a class
Let's start from the simplest example.
Java code
String one = "abc";
String two = "abc";
BTW if you simply write "ab" + "c", the Java compiler will perform the concatenation at compile time and the generated code will be exactly the same. This only works if all strings are known at compile time.
Class constant pool
Each class has its own constant pool - a list of constant values that can be reused if they occur several times in the source code. It includes common strings, numbers, method names, etc.
Here are the contents of the constant pool in our example above.
const #2 = String #38; // abc
//...
const #38 = Asciz abc;
The important thing to note is the distinction between the String constant object (#2) and the Unicode-encoded text "abc" (#38) that the string points to.
Byte code
Here is the generated byte code. Note that both the one and two references are assigned the same #2 constant pointing to the "abc" string:
ldc #2; //String abc
astore_1 //one
ldc #2; //String abc
astore_2 //two
Output
For each example I am printing the following values:
System.out.println(showInternalCharArrayHashCode(one));
System.out.println(showInternalCharArrayHashCode(two));
System.out.println(System.identityHashCode(one));
System.out.println(System.identityHashCode(two));
No surprise that both pairs are equal:
23583040
23583040
8918249
8918249
This means that not only do both objects point to the same char[] (the same text underneath), so the equals() test will pass, but one and two are the exact same reference! So one == two is true as well. Obviously, if one and two point to the same object, then one.value and two.value must be equal.
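The identity claims above can also be checked with == alone, without touching String internals. This is only a minimal sketch: the class name LiteralIdentityDemo and its method names are made up for illustration and are not part of the original test class.

```java
public class LiteralIdentityDemo {

    // Two occurrences of the same literal resolve to one pooled String object.
    static boolean sameLiteralIsSameObject() {
        String one = "abc";
        String two = "abc";
        return one == two;
    }

    // "ab" + "c" is a constant expression, so the compiler folds it into "abc".
    static boolean constantConcatIsSameObject() {
        String one = "abc";
        String two = "ab" + "c";
        return one == two;
    }

    // A non-final local variable is not a compile-time constant, so this
    // concatenation happens at run time and produces a new String object.
    static boolean runtimeConcatIsSameObject() {
        String prefix = "ab";
        String two = prefix + "c";
        return "abc" == two;
    }

    public static void main(String[] args) {
        System.out.println("same literal:         " + sameLiteralIsSameObject());
        System.out.println("constant concat:      " + constantConcatIsSameObject());
        System.out.println("run-time concat:      " + runtimeConcatIsSameObject());
    }
}
```

The first two checks should report true and the third false, which matches the remark that compile-time folding only applies when all operands are known at compile time.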
Literal and new String()
Java code
Now the example we have all been waiting for: one string literal and one new String using the same literal. How will this work?
String one = "abc";
String two = new String("abc");
The fact that the "abc" constant is used two times in the source code should give you some hint...
Class constant pool
Same as above.
Byte code
ldc #2; //String abc
astore_1 //one
new #3; //class java/lang/String
dup
ldc #2; //String abc
invokespecial #4; //Method java/lang/String."<init>":(Ljava/lang/String;)V
astore_2 //two
Look carefully! The first object is created the same way as above, no surprise. It just takes a constant reference to the already created String (#2) from the constant pool. The second object, however, is created via a normal constructor call. But! The first String is passed as an argument. This can be decompiled to:
String two = new String(one);
Output
The output is a bit surprising. The second pair, representing references to the String objects, is understandable: we created two String objects. One was created for us in the constant pool and the second one was created manually for two. But why on earth does the first pair suggest that both String objects point to the same char[] value array?!
41771
41771
8388097
16585653
It becomes clear when you look at how the String(String) constructor works (greatly simplified here):
public String(String original) {
    this.offset = original.offset;
    this.count = original.count;
    this.value = original.value;
}
See? When you create a new String object based on an existing one, it reuses the char[] value. Strings are immutable, so there is no need to copy a data structure that is known never to be modified.
I think this is the crux of your problem: even if you have two String objects, they might still point to the same contents. And as you can see, the String object itself is quite small.
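The reference part of this result can be reproduced without reflection. The sketch below compares references only, so it says nothing about whether the backing array is shared (that depends on the JDK version); the class name NewStringDemo is hypothetical.

```java
public class NewStringDemo {

    // Returns { one == two, one.equals(two), one == two.intern() }.
    static boolean[] compare() {
        String one = "abc";             // reference to the pooled literal
        String two = new String("abc"); // new always allocates a fresh object
        return new boolean[] {
            one == two,         // different objects
            one.equals(two),    // same contents
            one == two.intern() // intern() hands back the pooled literal
        };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println("one == two:          " + r[0]);
        System.out.println("one.equals(two):     " + r[1]);
        System.out.println("one == two.intern(): " + r[2]);
    }
}
```

Running it should report false, true, true: two distinct String objects with equal contents, of which only one is the pooled instance.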
Runtime modification and intern()
Java code
Let's say you initially used two different strings but after some modifications they are all the same:
String one = "abc";
String two = "?abc".substring(1); //also two = "abc"
The Java compiler (at least mine) is not clever enough to perform such operation at compile time, have a look:
Class constant pool
Suddenly we ended up with two constant strings pointing to two different constant texts:
const #2 = String #44; // abc
const #3 = String #45; // ?abc
const #44 = Asciz abc;
const #45 = Asciz ?abc;
Byte code
ldc #2; //String abc
astore_1 //one
ldc #3; //String ?abc
iconst_1
invokevirtual #4; //Method String.substring:(I)Ljava/lang/String;
astore_2 //two
The first string is constructed as usual. The second is created by first loading the constant "?abc" string and then calling substring(1) on it.
Output
No surprise here: we have two different strings, pointing to two different char[] texts in memory:
27379847
7615385
8388097
16585653
Well, the texts aren't really different; the equals() method will still yield true. We have two unnecessary copies of the same text.
Now we should run two exercises. First, try running:
two = two.intern();
before printing the hash codes. Not only do both one and two point to the same text, but they are the same reference!
11108810
11108810
15184449
15184449
This means both the one.equals(two) and one == two tests will pass. We also saved some memory, because the "abc" text appears only once in memory (the second copy will be garbage collected).
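The effect of intern() on the references can be observed directly. This is a small sketch under the same setup as the example above; the class name InternDemo is an assumption made for illustration.

```java
public class InternDemo {

    // Returns { same reference before intern(), equals(), same reference after intern() }.
    static boolean[] compare() {
        String one = "abc";
        String two = "?abc".substring(1); // computed at run time, not a pooled literal

        boolean sameBefore = (one == two);
        boolean equalContents = one.equals(two);

        two = two.intern();               // replace two with the pooled instance
        boolean sameAfter = (one == two);

        return new boolean[] { sameBefore, equalContents, sameAfter };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println("one == two before intern(): " + r[0]);
        System.out.println("one.equals(two):            " + r[1]);
        System.out.println("one == two after intern():  " + r[2]);
    }
}
```

The run-time copy is a different object until it is replaced by the result of intern(), after which both variables refer to the pooled literal.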
The second exercise is slightly different, check out this:
String one = "abc";
String two = "abc".substring(1);
Obviously one and two are two different objects, pointing to two different texts. But how come the output suggests that they both point to the same char[] array?!?
23583040
23583040
11108810
8918249
I'll leave the answer to you. It'll teach you how substring() works, what the advantages of such an approach are, and when it can lead to big trouble.
What happens to Strings inside String[] after String[] is garbage collected?
String literals have references in String Literal Pool and are not eligible of garbage collection, ever.
Actually, that is not strictly correct ... see below.
Will the 3 string objects still be on heap referenced from pool? or they will be eligible for garbage collection along with array object.
They will not be referenced "from the pool". The references in the pool are (in effect) weak references.
They will not be eligible for garbage collection.
What is actually going to happen is that the String objects (in the string pool) that correspond to string literals in the source code will be referenced by the code that uses the literals; i.e. there are hidden references in hidden objects that the JVM knows about. These references are what the JVM uses when you (for example) assign the string literal to something ...
It is those hidden references that mean the weak references in the pool don't break, and the corresponding String objects don't get garbage collected.
Now, if the code that defines the literals was dynamically loaded, and the application manages to unload the code, then the String objects may become unreachable. If that happens, they will eventually be garbage collected.
Need to know about String, String Constant pool and String intern method
I will assume that in each example below you load and execute the code exactly once, in a new JVM each time. (I will also assume that nowhere else in your code do you use the literal "Java" ... since that would complicate things.)
1) Say if there are no strings in the String constant pool, and if I say,
String s = "Java";
Then how many objects will be created?
One string is created and added to the pool when the method is loaded.
2) Now again nothing in the pool, and i say,
String s = new String("Java");
Now how many objects will be created.
One string is created and added to the pool when the method is loaded.
A second string is created by the new when the code is run, and it is NOT added to the pool.
3) Now again nothing in the pool, and i say,
String s = new String("Java");
s.intern();
What will the intern method do ?
One string is created and added to the pool when the method is loaded.
A second string is created by the new, and it is NOT added to the pool.
The intern call returns the first string. (You don't keep the reference ...)
4) Now again nothing in the pool, and i say,
String s = new String("Java");
String s1 = s.intern();
What will happen now?
Same as example 3. Thus, s1 will hold a reference to the String object that represents the "Java" string literal.
I read in SCJP5 Kathy Sierra book, that when you create a String with new, then 2 objects are created, one on the heap and one in the pool.
I doubt that the book said that exactly. (You are paraphrasing, and I think you have paraphrased somewhat inaccurately.)
However, your paraphrasing is roughly correct, though (and this is important!) the string object representing the literal is created and added to the pool when the code fragment is loaded [1], not when it is executed.
And to address another point of confusion:
"What i actually meant was that from the answer that you gave, it seems that a String will always be added in the String constant pool."
That is incorrect. It is a false generalization.
While it is true for all 4 of the cases above, it will not be true for others. It depends on where the original string came from. In typical applications, most text data is read from a file, socket, or a user interface. When that happens, the strings are created from arrays of characters, either directly or via a library call.
Here is a simple (but unrealistic) example that shows creating a String from its component characters.
String s = new String(new char[]{'J', 'a', 'v', 'a'});
In the snippet above, only one String is created, and it is NOT in the String pool. If you want the resulting string to be in the string pool, you need to explicitly call intern, something like this:
String s = new String(new char[]{'J', 'a', 'v', 'a'});
s = s.intern();
... which will (if necessary) create a second string in the string pool [2].
1 - Apparently, in some JVMs creation and interning string literals is done lazily, so it is not possible to say with 100% certainty when it actually happens. However, it will only occur once (per class that references the literal), no matter how many times the code fragment is executed by the JVM.
2 - There is no way to new a string into the string pool. It would actually be a violation of the JLS. The new operation is specified by the JLS as always creating a new object.
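The char[] case discussed above can be sketched as follows. To keep the outcome predictable, the example deliberately evaluates the literal "Java" first, so the pool already holds it before the second string is built; the class name CharArrayStringDemo is hypothetical.

```java
public class CharArrayStringDemo {

    // Returns { built == literal, built.equals(literal), built.intern() == literal }.
    static boolean[] compare() {
        String literal = "Java"; // pooled string for the literal

        // Built from its component characters: a new object that is not in the pool.
        String built = new String(new char[] {'J', 'a', 'v', 'a'});

        // intern() returns the pooled string with the same contents.
        String interned = built.intern();

        return new boolean[] {
            built == literal,
            built.equals(literal),
            interned == literal
        };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println("built == literal:          " + r[0]);
        System.out.println("built.equals(literal):     " + r[1]);
        System.out.println("built.intern() == literal: " + r[2]);
    }
}
```

This should report false, true, true: the string created with new is never the pooled one, and only the value returned by intern() is.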
What happens if String Pool runs out of memory?
First point first: the string pool does not contain the string literals themselves.
The string pool is a collection of references that point to String objects.
When you write String s = "hello", a String object "hello" is created on the heap and a reference to this object is placed in the string literal pool (provided the pool does not already reference a string with the same contents).
Point to note: "hello" is also added to the constant pool of the corresponding class. Therefore, it can be garbage collected only after the class is unloaded. So when the class is unloaded, that object can be garbage collected.
What will happen?
String pooling is done through a process called string canonicalisation, backed by a table that behaves like a WeakHashMap. A mapping is cleared automatically when there are no other references to the pooled string, i.e. the string can then be garbage collected by the JVM.
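Does it grow in size?
The size of the string pool's hash table (its number of buckets) is set when the JVM starts, not at compile time. The number of strings it references can still grow.
How can the table be made larger?
Specify -XX:StringTableSize=N when launching the JVM (it is a HotSpot run-time option, not a compile-time setting), where N is the string pool map size.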
And at last, your question:
What happens if String Pool runs out of memory?
Simplest answer: you get java.lang.OutOfMemoryError: Java heap space from Java 7 onwards, and java.lang.OutOfMemoryError: PermGen space in older versions of Java such as 6.
New String Object Creation - Allocate memory in Normal Memory and String Constant Pool? Both?
There is no such thing as a “normal memory” and “pool memory”.
All String instances live on the heap, which is, by definition, the memory holding all Java objects. There is a string pool, which is basically a kind of hash map, containing references to instances. There is no requirement for the String instance to be in a special memory region, to be referable by the string pool. Adding a string to the pool does not imply any memory movement.
In older JVMs, the instances created for string literals were placed into a special memory region, to accommodate the lower likelihood of being garbage collected. Since that memory region, called permanent generation had some drawbacks, this policy was abandoned and the memory region removed in Java 8. This old behavior might have created some confusion. But it never was a requirement for strings referred by the pool, to be in that memory region.
Besides that, it’s not clear what your question is aiming at. You have written code requesting Java to create two distinct String instances and Java will do so. The reason why it does so is because you told it to.
If you really want to go deeper into the technical details, this is, what will happen with your code:
- First, an uninitialized String instance is created for your new String(…) request
- Then, a String instance for your "abc" literal is created and added to the pool (unless the pool does already contain a string of that content)
- Last, the constructor for the String instance created by the first step is invoked, with the String instance of the second step as argument
  - within the constructor, the reference to the char[] array will be copied
At the end, you have two instances with the same contents, as you requested, both pointing to the same array (since Java 7u6), so the single array obviously can’t be in different memory regions for the two strings.
What's the point of creating and having an opportunity to create a String object out of string pool
If you look at it from the perspective of the new operator it makes sense.
new always allocates a new object. That goes for all classes across the board, no exception. It doesn't have any special case behavior for the String class, nor any other class. It is completely class agnostic.
I don't see any need for an optimization to be added, either. Writing new String("literal") is usually a mistake. Why bother speeding up a mistake?