Immutability of Strings in Java

str is not an object, it's a reference to an object. "Hello" and "Help!" are two distinct String objects. Thus, str points to a string. You can change what it points to, but not that which it points at.

Take this code, for example:

String s1 = "Hello";
String s2 = s1;
// s1 and s2 now point at the same string - "Hello"

Now, there is nothing [1] we could do to s1 that would affect the value of s2. They refer to the same object - the string "Hello" - but that object is immutable and thus cannot be altered.

If we do something like this:

s1 = "Help!";
System.out.println(s2); // still prints "Hello"

Here we see the difference between mutating an object, and changing a reference. s2 still points to the same object as we initially set s1 to point to. Setting s1 to "Help!" only changes the reference, while the String object it originally referred to remains unchanged.

If strings were mutable, we could do something like this:

String s1 = "Hello";
String s2 = s1;
s1.setCharAt(1, 'a'); // Fictional method that sets character at a given pos in string
System.out.println(s2); // Prints "Hallo"
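
The fictional call above cannot be written against String, but the same aliasing effect can be observed with a type that really is mutable. Here is a minimal sketch, assuming StringBuilder as the stand-in (its setCharAt method is real; the class and method names below are made up for illustration):

```java
class MutableAliasDemo {
    // Mutates a shared StringBuilder through one reference
    // and reads the result through the other reference.
    static String mutateThroughAlias() {
        StringBuilder sb1 = new StringBuilder("Hello");
        StringBuilder sb2 = sb1;   // both references point at the same object
        sb1.setCharAt(1, 'a');     // changes the object itself, not the reference
        return sb2.toString();     // the change is visible through sb2
    }

    public static void main(String[] args) {
        System.out.println(mutateThroughAlias()); // prints "Hallo"
    }
}
```

This is exactly the kind of "action at a distance" that String rules out by offering no mutating methods.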

Edit to respond to OP's edit:

If you look at the source code for String.replace(char,char) (also available in src.zip in your JDK installation directory -- a pro tip is to look there whenever you wonder how something really works) you can see that what it does is the following:

  • If there is one or more occurrences of oldChar in the current string, make a copy of the current string where all occurrences of oldChar are replaced with newChar.
  • If the oldChar is not present in the current string, return the current string.

So yes, "Mississippi".replace('i', '!') creates a new String object. Again, the following holds:

String s1 = "Mississippi";
String s2 = s1;
s1 = s1.replace('i', '!');
System.out.println(s1); // Prints "M!ss!ss!pp!"
System.out.println(s2); // Prints "Mississippi"
System.out.println(s1 == s2); // Prints "false" as s1 and s2 are two different objects

Your homework for now is to see what the above code does if you change s1 = s1.replace('i', '!'); to s1 = s1.replace('Q', '!'); :)


[1] Actually, it is possible to mutate strings (and other immutable objects). It requires reflection and is very, very dangerous and should never ever be used unless you're actually interested in destroying the program.

What is difference between mutable and immutable String in java

Case 1:

String str = "Good";
str = str + " Morning";

In the above code you create 3 String Objects.

  1. "Good" it goes into the String Pool.
  2. " Morning" it goes into the String Pool as well.
  3. "Good Morning" created by concatenating "Good" and " Morning". This guy goes on the Heap.

Note: Strings are always immutable. There is no such thing as a mutable String. str is just a reference which eventually points to "Good Morning". You are not working on one object; you have three distinct String objects.
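
One way to observe the pool/heap distinction from Case 1 is to compare references with == and to call intern(). The sketch below is only an illustration; the class and method names are invented, and it relies on the rule that a concatenation involving a non-constant variable produces a newly created String:

```java
class ConcatPoolDemo {
    // Builds "Good Morning" at run time, as in Case 1.
    static String concatAtRuntime() {
        String str = "Good";
        str = str + " Morning";   // str is not a compile-time constant, so a new object is created
        return str;
    }

    public static void main(String[] args) {
        String result = concatAtRuntime();
        String literal = "Good Morning";                 // a pooled literal with the same characters
        System.out.println(result.equals(literal));      // prints "true": same contents
        System.out.println(result == literal);           // prints "false": different objects
        System.out.println(result.intern() == literal);  // prints "true": intern() returns the pooled object
    }
}
```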


Case 2:

StringBuffer str = new StringBuffer("Good"); 
str.append(" Morning");

StringBuffer contains an array of characters. It is not the same as a String.
The above code adds characters to the existing array. In effect, StringBuffer is mutable; the String it produces is not.
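
A small sketch of Case 2, with invented class and method names, showing that append changes the one existing buffer while a String taken from it earlier stays as it was:

```java
class StringBufferDemo {
    // Returns { contents seen through an alias after append, String snapshot taken before append }.
    static String[] appendInPlace() {
        StringBuffer str = new StringBuffer("Good");
        StringBuffer alias = str;          // second reference to the same buffer
        String before = str.toString();    // immutable snapshot of the current contents
        str.append(" Morning");            // modifies the buffer in place
        return new String[] { alias.toString(), before };
    }

    public static void main(String[] args) {
        String[] result = appendInPlace();
        System.out.println(result[0]); // prints "Good Morning"
        System.out.println(result[1]); // prints "Good"
    }
}
```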

Is new String() immutable as well?

new String() is an expression that produces a String ... and a String is immutable, no matter how it is produced.

(Asking if new String() is mutable or not is nonsensical. It is program code, not a value. But I take it that that is not what you really meant.)


If I create a string object as String c = ""; is an empty entry created in the pool?

Yes; that is, an entry is created for the empty string. There is nothing special about an empty String.

(To be pedantic, the pool entry for "" gets created long before your code is executed. In fact, it is created when your code is loaded ... or possibly even earlier than that.)
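
To make the two answers above concrete, here is a hedged sketch (class and method names are invented) comparing the pooled empty literal with an object produced by new String():

```java
class EmptyStringDemo {
    // Two "" literals resolve to the same pooled object.
    static boolean literalsShareOneObject() {
        String c = "";
        String d = "";
        return c == d;
    }

    // new String() yields a separate object that is still equal to, and interns to, the pooled "".
    static boolean newStringIsSeparateObject() {
        String pooled = "";
        String fresh = new String();   // a new, empty String object
        return fresh != pooled && fresh.equals(pooled) && fresh.intern() == pooled;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());    // prints "true"
        System.out.println(newStringIsSeparateObject()); // prints "true"
    }
}
```

Both objects are immutable; the only difference is where they come from.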


So, I wanted to know whether the new heap object is immutable as well, ...

Yes it is. But the immutability is a fundamental property of String objects. All String objects.

You see, the String API simply does not provide any methods for changing a String. So (apart from some dangerous and foolish [1] tricks using reflection), you can't mutate a String.

and if so what was the purpose?

The primary reason that Java String is designed as an immutable class is simplicity. It makes it easier to write correct programs, and read / reason about other people's code if the core string class provides an immutable interface.

An important second reason is that the immutability of String has fundamental implications for the Java security model. But I don't think this was a driver in the original language design ... in Java 1.0 and earlier.

Going by the answer, I gather that other references to the same variable is one of the reasons. Please let me know if I am right in understanding this.

No. It is more fundamental than that. Simply, all String objects are immutable. There is no complicated special case reasoning required to understand this. It just >>is<<.

For the record, if you want a mutable "string-like" object in Java, you can use StringBuilder or StringBuffer. But these are different types to String.


[1] The reason these tricks are (IMO) dangerous and foolish is that they affect the values of strings that are potentially shared by other parts of your application via the string pool. This can cause chaos ... in ways that the next guy maintaining your code has little chance of tracking down.

Java - String immutability and Array mutability

a and b are references to the same Array (there is a single Array object in memory.)

a ---> ["a", "b", "c"] <---- b

You are changing this array value with this line :

a[0] = "Z";

So you now have this in memory :

a ---> ["Z", "b", "c"] <---- b

For the Strings, it's different.

At first, you have two variables pointing to the same value :

String s1 = "hello";
String s2 = s1;

You have this in memory :

s1 ---> "hello" <---- s2

But then, you assign s1 to a new value with this code :

s1 = "world";

The variable s2 still points to the string "hello". There are now 2 string objects in memory.

s1 ---> "world" 
s2 ---> "hello"
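
The two diagrams can be reproduced with a short program. This is a sketch only: the array declaration is not shown in the answer, so the variable names a and b and the initial contents are taken from the diagram, and the class and method names are invented:

```java
import java.util.Arrays;

class ArrayVersusStringDemo {
    // Mutating the shared array is visible through the second reference.
    static String arrayThroughAlias() {
        String[] a = { "a", "b", "c" };
        String[] b = a;     // b refers to the same array as a
        a[0] = "Z";         // changes the array object itself
        return Arrays.toString(b);
    }

    // Reassigning s1 does not touch the object s2 refers to.
    static String stringThroughAlias() {
        String s1 = "hello";
        String s2 = s1;
        s1 = "world";       // rebinds s1 only
        return s2;
    }

    public static void main(String[] args) {
        System.out.println(arrayThroughAlias());  // prints "[Z, b, c]"
        System.out.println(stringThroughAlias()); // prints "hello"
    }
}
```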

In Java, Strings are immutable, but arrays are mutable.

Note that if you define a class of your own, the behavior will be closer to that of the array.

public class Foo {
    private int _bar = 0;

    public void setBar(int bar) {
        this._bar = bar;
    }

    public int getBar() {
        return this._bar;
    }
}


Foo f1 = new Foo();
Foo f2 = f1; // f2 refers to the same object as f1

You have this :

f1 ----> Foo [ _bar = 0 ] <---- f2

You can work on the object :

f1.setBar(1);
f2.setBar(2); // This is the same object

This makes something a bit "like" the array :

f1 ----> Foo [ _bar = 2 ] <---- f2

But if you assign f2 to another value, you get this :

f2 = new Foo();

Which creates a new value in memory, but still keeps the first reference
pointing to the first object.

f1 ----> Foo [ _bar = 2 ] 
f2 ----> Foo [ _bar = 0 ]

String is immutable. What exactly is the meaning?

Before proceeding further with the fuss of immutability, let's just take a look into the String class and its functionality a little before coming to any conclusion.

This is how String works:

String str = "knowledge";

This, as usual, creates a string containing "knowledge" and assigns it to the reference str. Simple enough? Let's perform some more operations:

 String s = str;     // assigns a new reference to the same string "knowledge"

Let's see how the statement below works:

  str = str.concat(" base");

This appends the string " base" to str. But wait, how is this possible, since String objects are immutable? Perhaps to your surprise, it is possible, because the original object is never changed.

When the above statement is executed, the VM takes the value of String str, i.e. "knowledge", and appends " base", giving us the value "knowledge base". Now, since Strings are immutable, the VM can't store this value in the existing String object, so it creates a new String object, gives it the value "knowledge base", and makes str refer to it.

An important point to note here is that, while the String object is immutable, its reference variable is not. So that's why, in the above example, the reference was made to refer to a newly formed String object.

At this point in the example above, we have two String objects: the first one we created with value "knowledge", pointed to by s, and the second one "knowledge base", pointed to by str. But, technically, we have three String objects, the third one being the literal " base" in the concat statement.

Important Facts about String and Memory usage

What if we didn't have another reference s to "knowledge"? We would have lost that String. It would still have existed, but it would be considered lost because nothing refers to it.
Look at one more example below:

String s1 = "java";
s1.concat(" rules");
System.out.println("s1 refers to "+s1); // Yes, s1 still refers to "java"

What's happening:

  1. The first line is pretty straightforward: create a new String "java" and refer s1 to it.
  2. Next, the VM creates another new String "java rules", but nothing
    refers to it. So, the second String is instantly lost. We can't reach
    it.

The reference variable s1 still refers to the original String "java".

Almost every method applied to a String object in order to modify it creates a new String object. So, where do these String objects go? Well, these exist in memory, and one of the key goals of any programming language is to make efficient use of memory.

As applications grow, it's very common for String literals to occupy a large area of memory, which can even cause redundancy. So, in order to make Java more efficient, the JVM sets aside a special area of memory called the "String constant pool".

When the JVM loads a String literal, it looks for the String in the pool. If a match is found, the reference to the new literal is directed to the existing String and no new String object is created. The existing String simply has one more reference. Here comes the point of making String objects immutable:

In the String constant pool, a String object is likely to have one or many references. If several references point to the same String without even knowing it, it would be bad if one of the references modified that String value. That's why String objects are immutable.
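
The sharing described here can be observed directly. In the sketch below (class and method names are invented), two unrelated methods use the same literal and end up with the same object, and a compile-time constant expression is pooled in the same way:

```java
class LiteralPoolDemo {
    static String fromOneMethod() {
        return "java";
    }

    static String fromAnotherMethod() {
        return "java";
    }

    // Equal literals in different places refer to one pooled object.
    static boolean literalsShareOneObject() {
        return fromOneMethod() == fromAnotherMethod();
    }

    // A constant expression such as "ja" + "va" is evaluated at compile time and pooled too.
    static boolean constantExpressionIsPooledToo() {
        return ("ja" + "va") == "java";
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());        // prints "true"
        System.out.println(constantExpressionIsPooledToo()); // prints "true"
    }
}
```

Neither method knows about the other, which is why a mutation through one of them would be so surprising.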

Well, now you could say, what if someone overrides the functionality of String class? That's the reason that the String class is marked final so that nobody can override the behavior of its methods.
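
That String is final can be checked with the standard reflection API. A minimal sketch, with an invented class name:

```java
import java.lang.reflect.Modifier;

class FinalStringCheck {
    // Reads the class modifiers of java.lang.String and tests the "final" bit.
    static boolean stringClassIsFinal() {
        return Modifier.isFinal(String.class.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println(stringClassIsFinal()); // prints "true"
        // class MyString extends String {}       // would be rejected by the compiler
    }
}
```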

How does Java strings being immutable increase security?

A very common practice in writing class libraries is storing the parameters passed into your API, say, in a constructor, like this:

public class MyApi {
    final String myUrl;

    public MyApi(String urlString) {
        // Verify that urlString points to an approved server
        if (!checkApprovedUrl(urlString)) throw new IllegalArgumentException();
        myUrl = urlString;
    }
}

Were String mutable, this would lead to a subtle exploit: an attacker would pass a good URL, wait for a few microseconds, and then set the URL to point to an attack site.

Since storing without copying is a reasonably common practice, and because strings are among the most commonly used data types, leaving strings mutable would expose many APIs to a serious security problem. Making strings immutable closes this particular security hole for all APIs, including the ones that are not written yet.
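
To see what the exploit would look like, the sketch below swaps String for a genuinely mutable type, StringBuilder. Everything here is hypothetical: the class MutableUrlApi, the isApproved check, and the example.invalid-style host names exist only to illustrate the "check, store, then mutate" sequence.

```java
class MutableUrlApi {
    private final StringBuilder myUrl;   // stored without copying, as in the answer above

    MutableUrlApi(StringBuilder url) {
        if (!isApproved(url)) throw new IllegalArgumentException("URL not approved");
        myUrl = url;
    }

    // Hypothetical check: only one host is considered approved.
    static boolean isApproved(CharSequence url) {
        return url.toString().startsWith("https://good.example/");
    }

    String currentUrl() {
        return myUrl.toString();
    }

    // Passes the check with a good URL, then changes the same object afterwards.
    static String attack() {
        StringBuilder url = new StringBuilder("https://good.example/api");
        MutableUrlApi api = new MutableUrlApi(url);   // validation succeeds here
        url.setLength(0);                             // caller empties the shared buffer...
        url.append("https://evil.example/");          // ...and writes a different address
        return api.currentUrl();
    }

    public static void main(String[] args) {
        System.out.println(attack()); // prints "https://evil.example/"
    }
}
```

With a String parameter the last two steps are impossible, so the value that was checked is the value that is stored.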

Why string is called immutable in java?

But string can be modified when we use functions.

No, what you get back is a different string. Example:

String a, b, c;

a = "testing 1 2 3";
b = a.substring(0, 7); // Creates new string for `b`, does NOT modify `a`
c = a.substring(8);

System.out.println(b); // "testing"
System.out.println(c); // "1 2 3", proves that `a` was not modified when we created `b`

As you can see, the string "testing 1 2 3" was not modified by the substring call; instead, we got back a new string with just "testing".

String objects are immutable in Java because they provide no methods that modify the state of an existing String object. They only provide methods that create new String objects based on the content of existing ones.

(The above is, of course, unless you play very naughty games indeed with reflection.)

Java String Immutability storage when String object is changed

Modifying Strings

The value is not updated when running

s = "value2";

In Java, except for the primitive types, all other variables are references to objects. This means that the assignment only makes s point to a new value; the old String object is left untouched.

Immutability guarantees that the state of an object cannot change after construction. In other words, there are no means to modify the content of any String object in Java. If, for instance, you write s = s+"a"; you have created a new String that stores the new text.

Garbage collection

Another answer already covers this in depth. Below is a short summary if you don't want to read the full answer; it omits some details.

By default new String(...) objects are not interned and thus the normal rules of garbage collection apply. These are just ordinary objects.

The constant strings in your code, which are interned, are typically never removed, as it is likely that you will eventually refer back to them.

There is, however, a side note in that answer that classes are sometimes dynamically (un)loaded, in which case the literals can be removed from the pool.


To answer your additional questions:

Will it immediately free the space from the heap after assigning the literal?

No, that would not be very efficient: the garbage collector needs to analyze which objects can be removed. You may have shared the reference to your old string with other objects, so it is not guaranteed that the object can be recycled. Furthermore, there is not much wrong with keeping data that is no longer useful, as long as you don't need to ask the operating system for additional memory. (Compare it with your computer: as long as all your data fits on your hard disk drive, you don't really have to worry about useless files; once you would have to buy an additional drive, you will probably try to remove some files first.) The analysis requires some computational effort, so in general a garbage collector runs only when memory is running low. So you shouldn't worry much about memory.

Can anyone explain what value goes where from the first statement to the second, and what will happen to the memory area (heap and String Pool)?

Your first string:

String s = new String("Value1");

is a reference to an object on the heap. When this statement executes, it allocates space on the heap for the new String.

Now if you execute:

s = "value2";

"value2" is an element of the String Pool; it will typically remain there until your program ends.

Since you no longer have a reference to your old string ("Value1"), that object is a candidate for collection. If the garbage collector later walks by, it will remove the object from the heap and mark the space as free.

What makes String immutable?

String is immutable because there is no way to change the contents of its internal representation, no way to gain a reference to its internal representation, and it cannot be subclassed. This means that no matter what you do to a String, you cannot change its contents.
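
Two of these points can be checked with the public API: the char[] constructor copies its argument, and toCharArray() hands back a newly allocated array rather than the internal storage. A small sketch with invented class and method names:

```java
class DefensiveCopyDemo {
    // The constructor copies the array, so later changes to it do not reach the String.
    static String constructorCopiesInput() {
        char[] chars = { 'T', 'e', 's', 't' };
        String s = new String(chars);
        chars[0] = 'B';                  // modifies only the caller's array
        return s;
    }

    // toCharArray() returns a fresh array; writing to it does not change the String.
    static String toCharArrayReturnsCopy() {
        String s = "Test";
        char[] copy = s.toCharArray();
        copy[0] = 'B';                   // modifies only the copy
        return s;
    }

    public static void main(String[] args) {
        System.out.println(constructorCopiesInput()); // prints "Test"
        System.out.println(toCharArrayReturnsCopy()); // prints "Test"
    }
}
```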

What methods like toUpperCase() actually do is return a new String whose contents are derived from the old String. However, the new String is a separate object. References to the initial String will not now point to the new one.

If you wanted to print out the results of toUpperCase() then do so like the following:

String s="Example";
String upperCase = s.toUpperCase();
System.out.println(upperCase);

EDIT: references vs. objects

A variable of a non-primitive type in Java is a reference to an object. A reference is essentially a pointer that tells Java "your object is here", referring to some memory location. However, the reference and the object are not the same thing. Let's look at the following as an example:

String s = "Test";
String a = s;
s = "another test";

What I have done here is declared a reference called s, and set its pointer to a newly created object - in this case a String with value "Test". I then declare another reference a and set it to reference the same object that s references. Note, that there are two references but only one object.

From there I change the reference s to point to a different object - in this case a String with value "another test". This creates a new object, despite the reference being reused. However, the reference a still points to the original object, which has remained unchanged. If we do println(a) it will still print "Test". This is because the object itself is immutable. Anything that points to it will always return the same value. I can make the references point somewhere else, but cannot change the object.

Think about the variables like entries in a phone book. If I put in an entry for someone and say his last name is "BOB", and someone later comes along and crosses that out and writes "ALICE", this does not suddenly change Bob's gender, or his name. BOB does not spontaneously turn into ALICE because someone rewrote my phone book. It merely changes the value in my phone book, not the actual person. I hope this metaphor helps.


