What Is the Purpose of the Expression "New String(...)" in Java

What is the purpose of the expression new String(...) in Java?

The one place where you may think you want new String(String) is to force a distinct copy of the internal character array, as in

small=new String(huge.substring(10,20));

However, this behavior is unfortunately undocumented and implementation dependent.

I have been burned by this when reading large files (some up to 20 MiB) into a String and carving it into lines after the fact. I ended up with all the strings for the lines referencing the char[] holding the entire file. Unfortunately, that unintentionally kept the entire array reachable for the few lines I held on to for longer than it took to process the file. I was forced to use new String() to work around it, since processing 20,000 files very quickly consumed huge amounts of RAM.

The only implementation-agnostic way to do this is:

small=new String(huge.substring(10,20).toCharArray());

This unfortunately must copy the array twice, once for toCharArray() and once in the String constructor.
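A minimal sketch of that workaround, wrapped in a helper so the intent is visible at the call site. The class name CompactCopyDemo and the method compactCopy are made up for illustration; the sketch relies only on the documented facts that toCharArray() returns a newly allocated array and that String(char[]) copies its argument.

```java
public class CompactCopyDemo {

    // Hypothetical helper: returns a String built from a fresh copy of the
    // characters, so it cannot keep a larger backing array reachable.
    public static String compactCopy(String s) {
        return new String(s.toCharArray());
    }

    public static void main(String[] args) {
        // Stand-in for a large file that was read into a single String.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            sb.append("line ").append(i).append('\n');
        }
        String huge = sb.toString();

        // Keep only a small piece, detached from the big string.
        String small = compactCopy(huge.substring(10, 20));

        System.out.println(small.length());                        // 10
        System.out.println(small.equals(huge.substring(10, 20)));  // true
    }
}
```

Whether the extra copy actually saves memory depends on how the running JVM implements substring, so this is a defensive measure rather than a guaranteed win.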

There needs to be a documented way to get a new String by copying the chars of an existing one; or the documentation of String(String) needs to be improved to make it more explicit (there is an implication there, but it's rather vague and open to interpretation).

Pitfall of Assuming what the Doc Doesn't State

In response to the comments, which keep coming in, observe what the Apache Harmony implementation of the String(String) constructor was:

public String(String string) {
    value = string.value;
    offset = string.offset;
    count = string.count;
}

That's right, no copy of the underlying array there. And yet, it still conforms to the (Java 7) String documentation, in that it:

Initializes a newly created String object so that it represents the same sequence of characters as the argument; in other words, the newly created string is a copy of the argument string. Unless an explicit copy of original is needed, use of this constructor is unnecessary since Strings are immutable.

The salient piece being "copy of the argument string"; it does not say "copy of the argument string and the underlying character array supporting the string".

Be careful that you program to the documentation and not one implementation.

Why to create a String object using new

The basic difference between them is memory allocation.

The first option is:

String s1 = "hello";

When you use this form, s1 refers to a string literal. The literal is stored in the class file at compile time, and the JVM creates a single pooled String object for it at run time.

But in the second case:

String s2 = new String("hello");

In this case, s2 refers to a new String object representing hello.

When you create two string literals as in the first case, both refer to the same object in memory, because string literals work with the concept of a string pool. When you create a second string literal with the same content, no new space is allocated; the JVM returns the same reference. Hence you get true when you compare those two literals with the == operator.

In the second case, however, the JVM creates a new object each time, so you have to compare the contents with the equals() method rather than with the == operator.
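The difference between the two forms can be checked directly. The following is a small sketch; the class name LiteralVsNewDemo and the helper sameObject are invented for this example.

```java
public class LiteralVsNewDemo {

    // Identity comparison: true only if both references point to one object.
    public static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String a = "hello";               // literal, taken from the string pool
        String b = "hello";               // same literal, same pooled object
        String c = new String("hello");   // always a new object
        String d = new String("hello");   // another new object

        System.out.println(sameObject(a, b));  // true
        System.out.println(sameObject(a, c));  // false
        System.out.println(sameObject(c, d));  // false
        System.out.println(c.equals(d));       // true, the contents match
    }
}
```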

If you create a string using the second form but do not want a separate object, you can call the intern() method to get the pooled object.

String s = "hello";
String s1 = new String("hello").intern();
System.out.println(s == s1);

In this case, instead of returning the newly created object, intern() returns the same reference as s, so the output is true.

Is using new String("hello") completely useless compared to a simple "hello", when it indirectly points to "hello"?

There is an interesting statement in Joshua Bloch’s “Effective Java”, 2nd edition, chapter 4, item 15:

A consequence of the fact that immutable objects can be shared freely is that
you never have to make defensive copies (Item 39). In fact, you never have to
make any copies at all because the copies would be forever equivalent to the
originals. Therefore, you need not and should not provide a clone method or
copy constructor (Item 11) on an immutable class. This was not well understood
in the early days of the Java platform, so the String class does have a copy
constructor, but it should rarely, if ever, be used (Item 5).

(page 76 in my copy)

I think Joshua Bloch can be seen as an authoritative source, especially as James Gosling, one of the Java inventors, has been cited as saying, “I sure wish I had this book ten years ago…” (referring to the 1st edition from 2001).


So the existence of the String(String) constructor can be seen as a design mistake, much like the parameterless String() constructor. Note also the presence of the factory methods String.valueOf(char[])/ String.valueOf(char[],int,int) and String.copyValueOf(char[])/ String.copyValueOf(char[],int,int), whose naming suggests a fundamental difference that simply isn’t there. The immutable nature of String mandates that all variants create a defensive copy of the provided array, to protect against subsequent modifications. So the behavior is exactly the same (the documentation states this explicitly), whether you use valueOf or copyValueOf.
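A short sketch of that equivalence: both factory methods yield strings that are unaffected when the source array is modified afterwards. The class name ValueOfDemo and the method buildThenMutate are made up for the example.

```java
import java.util.Arrays;

public class ValueOfDemo {

    // Builds one String with each factory method, then overwrites the source
    // array to show that neither String shares storage with it.
    public static String[] buildThenMutate(char[] chars) {
        String viaValueOf = String.valueOf(chars);
        String viaCopyValueOf = String.copyValueOf(chars);
        Arrays.fill(chars, 'x');
        return new String[] { viaValueOf, viaCopyValueOf };
    }

    public static void main(String[] args) {
        char[] chars = { 'a', 'b', 'c' };
        String[] result = buildThenMutate(chars);

        System.out.println(result[0]);          // abc
        System.out.println(result[1]);          // abc
        System.out.println(new String(chars));  // xxx
    }
}
```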


That said, there are some practical use cases, though they are not necessarily within the original intentions. Some of them are described in the answers to this question. As the new operation is guaranteed to produce a new instance, it might be useful for any subsequent operation relying on a distinct identity, e.g. synchronizing on that instance (not that this would be a good idea) or recognizing that instance via identity comparison to be sure that it doesn’t originate from an external source. For example, you might want to distinguish between a property’s default value and a value that has been explicitly set. This, however, is of limited use, as other code might not guarantee to maintain the object identity in its operations, even if the string contents don’t change. Or it might remember your special instance and reuse it once it encounters the same string again.
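The default-value case might look like the following sketch. The class DefaultValueDemo, the title property, and the sentinel DEFAULT_TITLE are hypothetical names chosen for illustration; the only language guarantee used is that new always yields a distinct instance.

```java
public class DefaultValueDemo {

    // A distinct instance that no literal and no external caller can supply.
    private static final String DEFAULT_TITLE = new String("untitled");

    private String title = DEFAULT_TITLE;

    public void setTitle(String title) {
        this.title = title;
    }

    public String getTitle() {
        return title;
    }

    // Deliberate identity comparison: an equal string set by the caller
    // still counts as "explicitly set".
    public boolean isTitleExplicitlySet() {
        return title != DEFAULT_TITLE;
    }

    public static void main(String[] args) {
        DefaultValueDemo doc = new DefaultValueDemo();
        System.out.println(doc.isTitleExplicitlySet());         // false

        doc.setTitle("untitled");                               // same text, other object
        System.out.println(doc.isTitleExplicitlySet());         // true
        System.out.println(doc.getTitle().equals("untitled"));  // true
    }
}
```

The limitation described above shows up immediately: a caller that does doc.setTitle(doc.getTitle()) on a fresh object hands the sentinel back, and the value still looks like the default.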

Before Java 7 update 6, String had offset and count fields, which allowed a cheap substring operation that referred to a range within the original array without copying. This led to the scenario in which a (conceptually) small string could hold a reference to a rather large array, preventing its garbage collection. For the reference implementation (the one shipped by Sun, later Oracle), recreating the string via the String(String) constructor produced a String with a fresh copy of the array, occupying only as much memory as needed. So this was a use case incorporating an implementation-specific fix to an implementation-specific problem…

Current Java releases do not maintain these two fields, which implies a potentially more expensive substring operation, but also means there is no copying behavior in the String(String) constructor anymore. This is the version whose source code you have cited in the question. The older version can be found in this answer.

Strings [= new String vs = ]

From the Javadoc:

Initializes a newly created String object so that it represents the
same sequence of characters as the argument; in other words, the newly
created string is a copy of the argument string. Unless an explicit
copy of original is needed, use of this constructor is unnecessary
since Strings are immutable.

So no, you have no reason not to use the simple literal.

Simply do

String s1 = "Stackoverflow";

Historically, this constructor was mainly used to get a lighter copy of a string obtained by splitting a bigger one (see this question). Now, there's no normal reason to use it.

String s = new String("xyz"). How many objects have been created after this line of code executes?

THERE ARE ERRORS BELOW DEPENDING ON THE JVM/JRE THAT YOU USE. IT IS BETTER TO NOT WORRY ABOUT THINGS LIKE THIS ANYWAYS. SEE COMMENTS SECTION FOR ANY CORRECTIONS/CONCERNS.

First, this question is really about the topic addressed here:
Is String Literal Pool a collection of references to the String Object, Or a collection of Objects

That is a useful guide for everyone on this matter.

...

Given this line of code: String s = new String("xyz")

There are two ways of looking at this:

(1) What happens when the line of code executes -- the literal moment it runs in the program?

(2) What is the net effect of how many Objects are created by the statement?

Answer:

1) After this executes, one additional object has been created.

a) The "xyz" String is created and interned when the JVM loads the class that contains this line of code (some JVMs defer this until the literal is first used).

  • If an "xyz" is already in the intern pool from some other code, then the literal produces no new String object.

b) When the new String is created, it represents the same characters as the interned "xyz" string; whether its internal char[] is shared with the literal or copied depends on the implementation.

c) That means that when the line executes, only one additional String object is created.

The fact is that the "xyz" object is created for the literal itself, at class load time or at the latest when the literal is first evaluated, not by the constructor call.

...next scenario ...

2) There are three objects created by the code (including the interned "a")

String s1 = "a";
String s2 = "a";
String s3 = new String("a");

a) s1 and s2 are just references, not objects, and they point to the same String in memory.

b) The "a" is interned and is a compound object: one char[] object and the String object itself. It consists of two objects in memory.

c) s3, new String("a"), produces one more object. new String("a") does not copy the char[] of "a"; it only references it internally. Here is the constructor:

public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}

One interned String ("a") equals two objects, and one new String("a") equals one more object. The net effect of the code is three objects.
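The String-level part of this count can be observed from ordinary code. The sketch below counts distinct String instances by identity; the class name DistinctInstancesDemo and the method countDistinct are invented for the example. The backing array is not reachable through the public API, so it does not appear in the count.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class DistinctInstancesDemo {

    // Counts distinct String objects by identity (==), not by equals().
    public static int countDistinct(String... strings) {
        Set<String> seen =
                Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        Collections.addAll(seen, strings);
        return seen.size();
    }

    public static void main(String[] args) {
        String s1 = "a";
        String s2 = "a";
        String s3 = new String("a");

        System.out.println(countDistinct(s1, s2));      // 1
        System.out.println(countDistinct(s1, s2, s3));  // 2
    }
}
```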

Java String creation and String pool

Regardless of where you use them, all string literals are stored in the String pool. So the answer is YES.

String hello = new String("Hello");
                          >-----< goes to pool.

But the thing is that h2 won't refer to that h :)

How do you use a variable in a regular expression?

Instead of using the /regex\d/g syntax, you can construct a new RegExp object:

var replace = "regex\\d";
var re = new RegExp(replace,"g");

You can dynamically create regex objects this way. Then you will do:

"mystring1".replace(re, "newstring");

