Why .Net String Is Immutable

Why .NET String is immutable?

  1. Instances of immutable types are inherently thread-safe: since no thread can modify them, the risk of one thread modifying an instance in a way that interferes with another is removed (the reference itself is a different matter).
  2. Similarly, the fact that aliasing can't produce changes (if x and y both refer to the same object a change to x entails a change to y) allows for considerable compiler optimisations.
  3. Memory-saving optimisations are also possible. Interning and atomising are the most obvious examples, though other versions of the same principle exist. I once produced a memory saving of about half a GB by comparing immutable objects and replacing references to duplicates so that they all pointed to the same instance (time-consuming, but a minute's extra start-up to save a massive amount of memory was a performance win in the case in question). With mutable objects that can't be done.
  4. No side-effects can come from passing an immutable type as a parameter to a method unless it is out or ref (since that changes the reference, not the object). A programmer therefore knows that if string x = "abc" at the start of a method, and that doesn't change in the body of the method, then x == "abc" at the end of the method.
  5. Conceptually, the semantics are more like those of value types; in particular, equality is based on state rather than identity. This means that "abc" == "ab" + "c". While this doesn't require immutability, the fact that a reference to such a string will always equal "abc" throughout its lifetime (which does require immutability) makes it much easier to ensure correctness when strings are used as keys, where maintaining equality to previous values is vital (strings are indeed commonly used as keys).
  6. Conceptually, it can make more sense to be immutable. If we add a month onto Christmas, we haven't changed Christmas, we have produced a new date in late January. It makes sense therefore that Christmas.AddMonths(1) produces a new DateTime rather than changing a mutable one. (Another example: if I, as a mutable object, change my name, what has changed is which name I am using; "Jon" remains immutable and other Jons will be unaffected.)
  7. Copying is fast and simple: to create a clone, just return this. Since the copy can't be changed anyway, pretending something is its own copy is safe.
  8. [Edit, I'd forgotten this one]. Internal state can be safely shared between objects. For example, if you were implementing a list that was backed by an array, a start index and a count, then the most expensive part of creating a sub-range would be copying the objects. However, if it was immutable then the sub-range object could reference the same array, with only the start index and count having to change, giving a very considerable reduction in construction time.
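
Points 3 and 5 can be seen in a few lines. The sketch below is only an illustration: EqualityDemo and its method names are made up for this example, while == on strings, ReferenceEquals and string.Intern are the standard .NET calls.

```csharp
using System;

// Hypothetical helper class, named only for this illustration.
public static class EqualityDemo
{
    // Builds a fresh string object at run time, so two calls with the
    // same characters give two distinct objects with equal contents.
    public static string Build(params char[] chars) => new string(chars);

    // Equality by state: compares the characters (point 5).
    public static bool SameState(string x, string y) => x == y;

    // Equality by identity: true only if both references point at one object.
    public static bool SameObject(string x, string y) => ReferenceEquals(x, y);

    // Interning (point 3): returns the single pooled instance for these contents.
    public static string Canonical(string s) => string.Intern(s);
}
```

With x = Build('a', 'b', 'c') and y = Build('a', 'b', 'c'), SameState(x, y) is true and SameObject(x, y) is false, while SameObject(Canonical(x), Canonical(y)) is true. Collapsing duplicates onto one shared instance like this is only safe because nobody can later modify that instance.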

In all, for objects which don't have undergoing change as part of their purpose, there can be many advantages in being immutable. The main disadvantage is in requiring extra constructions, though even here it's often overstated (remember, you have to do several appends before StringBuilder becomes more efficient than the equivalent series of concatenations, with their inherent construction).

It would be a disadvantage if mutability was part of the purpose of an object (who'd want to be modeled by an Employee object whose salary could never ever change?) though sometimes even then it can be useful (in many web and other stateless applications, code doing read operations is separate from that doing updates, and using different objects may be natural - I wouldn't make an object immutable and then force that pattern, but if I already had that pattern I might make my "read" objects immutable for the performance and correctness-guarantee gain).

Copy-on-write is a middle ground. Here the "real" class holds a reference to a "state" class. State classes are shared on copy operations, but if you change the state, a new copy of the state class is created. This is more often used with C++ than C#, which is why its std::string (in implementations that use copy-on-write) enjoys some, but not all, of the advantages of immutable types, while remaining mutable.
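
For comparison, here is roughly what copy-on-write could look like in C#. This is a minimal single-threaded sketch, not how any real library implements it; CowBuffer and all of its members are hypothetical names invented for the example.

```csharp
using System;

// Hypothetical copy-on-write wrapper around a char array (not thread-safe).
public sealed class CowBuffer
{
    private char[] _state;   // the "state" object, possibly shared
    private bool _shared;    // true when _state may be referenced by another CowBuffer

    public CowBuffer(string text)
    {
        _state = text.ToCharArray();
        _shared = false;
    }

    private CowBuffer(char[] sharedState)
    {
        _state = sharedState;
        _shared = true;
    }

    // Copying is cheap: both buffers point at the same state.
    public CowBuffer Copy()
    {
        _shared = true;
        return new CowBuffer(_state);
    }

    // Writing clones the state first if it might be shared. The flag is
    // conservative, so the last remaining owner may clone once needlessly.
    public void Set(int index, char value)
    {
        if (_shared)
        {
            _state = (char[])_state.Clone();
            _shared = false;
        }
        _state[index] = value;
    }

    public bool SharesStateWith(CowBuffer other) => ReferenceEquals(_state, other._state);

    public override string ToString() => new string(_state);
}
```

After var b = a.Copy(), the two buffers share one array; the first b.Set(...) gives b its own array and leaves a untouched. The object stays mutable, but copies are as cheap as they would be for an immutable type until someone writes.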

Why is the C# string called immutable when compared with its F# analogue?

You are confusing the immutability of the variable in F# with the immutability of the object in C#.

In C#, you can change which object representing a string a variable is pointing at. But you can't change anything about that object, hence the object is immutable.

In F#, you cannot change which value a variable has, so the variable is also immutable.

If strings are immutable in C#, how come I am doing this?

Use a decompiler such as Reflector to look at the IL code and you will see exactly what is going on. Although your code logically appends new contents onto the end of the string, behind the scenes the compiler emits IL that creates a new string for each assignment.

The picture gets a little muddier if you concatenate literal strings in a single statement like this:

str = "a" + "b" + "c" ...

In this case the compiler is usually smart enough not to create all the extra strings (and thus work for the garbage collector) and will translate it for you to IL equivalent to:

str = "abc"

That said, doing it on separate lines like that might not trigger that optimization.
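
Both behaviours can be checked without reading IL. The sketch below leans on one rule of the language: a const initializer must be a compile-time constant, so the first member compiles only if the compiler does the concatenation itself. ConcatDemo and its members are hypothetical names for this illustration.

```csharp
using System;

// Hypothetical demo class, named only for this illustration.
public static class ConcatDemo
{
    // Compiles only because "a" + "b" + "c" is a constant expression,
    // i.e. the compiler evaluates the concatenation at compile time.
    public const string Folded = "a" + "b" + "c";

    // Appends at run time and reports whether that produced a different
    // object while leaving the original untouched.
    // Assumes suffix is non-empty; appending "" may hand back the same object.
    public static bool AppendMakesNewString(string original, string suffix)
    {
        char[] snapshot = original.ToCharArray();   // contents before the append
        string variable = original;
        variable += suffix;                          // rebinds the variable to a new string
        bool newObject = !ReferenceEquals(variable, original);
        bool originalUntouched = new string(snapshot) == original;
        return newObject && originalUntouched;
    }
}
```

ConcatDemo.Folded equals "abc", and AppendMakesNewString("hello", "world") returns true: the += changed which object the variable refers to, not the string it started with.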

Immutable Strings

Firstly, strings are immutable and that's that.

var string1 = "string";
var string2 = string1;
string2 = "string2";

Console.WriteLine(string1);
Console.WriteLine(string2);

Output


string
string2

Secondly, why do you really want a mutable string? There are many reasons why strings are immutable; see Why .NET String is immutable?

Lastly, if you really want a mutable string, you can create an instance of StringBuilder. You get mutability, although it will reallocate its internal buffer whenever it runs out of capacity; or you can roll your own fancy-pants class.
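
As a rough illustration of the StringBuilder route next to plain concatenation (JoinDemo and its method names are hypothetical; StringBuilder.Append and ToString are the standard calls):

```csharp
using System.Text;

// Hypothetical demo class, named only for this illustration.
public static class JoinDemo
{
    // Immutable route: every += builds a brand-new string object.
    public static string WithConcat(string[] parts)
    {
        string result = "";
        foreach (string part in parts)
            result += part;
        return result;
    }

    // Mutable route: one builder is modified in place, and a string
    // is only created at the end by ToString().
    public static string WithBuilder(string[] parts)
    {
        var builder = new StringBuilder();
        foreach (string part in parts)
            builder.Append(part);
        return builder.ToString();
    }
}
```

Both methods return the same text for the same input; they differ only in how many intermediate objects are created along the way.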

Why is string immutable and StringBuilder mutable?

You're not "changing" the original string - you're creating a new string. By immutable it means that things like this:

a.ToUpper();

do not modify a - they return a new string, so with

b = a.ToUpper();

b and a are different strings.

In your example,

string a = "hello";
a = "hello" + "world";
Console.WriteLine(a);

a is a variable that references a new string after the second line is executed.

string is immutable and stringbuilder is mutable

You cannot edit the value of a string; each method of the string object returns a new string instead of altering the original. StringBuilder, on the other hand, can alter its content (by appending new strings, for example).

string original = "the brown fox jumped over the lazy dog";
// Replace returns a new string; original itself is left unchanged.
string altered = original.Replace("fox", "cat");
// altered = the brown cat jumped over the lazy dog

You cannot change the content of the original string; you can only create a new string, or point the variable at another string object.

Why only string is immutable & not other data types

Many other languages provide a similar design for strings: Java with StringBuffer and StringBuilder, Scala with StringBuilder, Python with MutableString (though there are other, better solutions in Python). In C++ strings are mutable, so there is no need for a builder.

The reasons why builders exist for strings are:

  1. Many languages define string as immutable (any change requires a new object in memory)
  2. Strings tend to be large, much larger than ints
  3. [1] and [2] combined cause inefficiency

The reasons why a builder doesn't exist for int are:

  1. It is a simple data structure by itself
  2. Most CPUs have optimised instructions to deal with simple numbers (add, subtract, etc.)
  3. Most CPUs process the instructions in [2] in just one or a few cycles, using registers or fast CPU cache
  4. [2] and [3] combined remove the need for optimisation
  5. There is little need to mutate an int per se, however, if you need to, you can use BitConverter or binary shift operations

Why can't strings be mutable in Java and .NET?

According to Effective Java, chapter 4, page 73, 2nd edition:

"There are many good reasons for this: Immutable classes are easier to
design, implement, and use than mutable classes. They are less prone
to error and are more secure."

[...]

"Immutable objects are simple. An immutable object can be in
exactly one state, the state in which it was created. If you make sure
that all constructors establish class invariants, then it is
guaranteed that these invariants will remain true for all time, with
no effort on your part."

[...]

"Immutable objects are inherently thread-safe; they require no
synchronization. They cannot be corrupted by multiple threads
accessing them concurrently. This is far and away the easiest approach
to achieving thread safety. In fact, no thread can ever observe any
effect of another thread on an immutable object. Therefore,
immutable objects can be shared freely."

[...]

Other small points from the same chapter:

Not only can you share immutable objects, but you can share their internals.
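
On the .NET side, sharing internals might look like the sketch below, where a sub-range reuses its parent's backing array instead of copying the elements. ImmutableSlice and its members are hypothetical names for this illustration, not an existing .NET type.

```csharp
using System;

// Hypothetical immutable view over an array: a start index and a count.
public sealed class ImmutableSlice<T>
{
    private readonly T[] _items;
    private readonly int _start;

    public int Count { get; }

    // Copies the caller's array once, so nobody outside can mutate the storage.
    public ImmutableSlice(T[] source)
    {
        _items = (T[])source.Clone();
        _start = 0;
        Count = _items.Length;
    }

    // Used for sub-ranges: shares the existing storage without copying.
    private ImmutableSlice(T[] items, int start, int count)
    {
        _items = items;
        _start = start;
        Count = count;
    }

    public T this[int index]
    {
        get
        {
            if (index < 0 || index >= Count)
                throw new ArgumentOutOfRangeException(nameof(index));
            return _items[_start + index];
        }
    }

    // Creating a sub-range only computes a new start and count.
    public ImmutableSlice<T> SubRange(int start, int count)
    {
        if (start < 0 || count < 0 || start > Count - count)
            throw new ArgumentOutOfRangeException(nameof(start));
        return new ImmutableSlice<T>(_items, _start + start, count);
    }

    public bool SharesStorageWith(ImmutableSlice<T> other) => ReferenceEquals(_items, other._items);
}
```

Given var all = new ImmutableSlice<int>(new[] { 1, 2, 3, 4, 5 }), the call all.SubRange(1, 3) yields a three-element view starting at 2 that shares storage with all. This is safe only because neither object can ever write to the shared array.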

[...]

Immutable objects make great building blocks for other objects, whether mutable or immutable.

[...]

The only real disadvantage of immutable classes is that they require a separate object for each distinct value.
