Java String Concatenation with + Operator

String concatenation: concat() vs + operator

No, a.concat(b) and a += b are not quite the same.

Firstly, there's a slight difference in semantics. If a is null, then a.concat(b) throws a NullPointerException, but a += b treats the original value of a as if it were the string "null". Furthermore, the concat() method only accepts String values, while the + operator silently converts its argument to a String (using the toString() method for objects). So the concat() method is stricter in what it accepts.
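
As a minimal sketch of that difference (the class and method names here are made up for illustration, not taken from the original answer), the following shows the null handling and the implicit conversion side by side:

```java
class NullConcatDemo {
    // Uses +=: a null left operand is rendered as the text "null".
    static String plus(String a, String b) {
        a += b;
        return a;
    }

    // Uses concat(): a null receiver throws NullPointerException.
    static String concat(String a, String b) {
        return a.concat(b);
    }

    public static void main(String[] args) {
        System.out.println(plus(null, "x")); // prints "nullx"
        try {
            concat(null, "x");
        } catch (NullPointerException e) {
            System.out.println("concat threw NullPointerException");
        }
        // + converts a non-String operand; "n=".concat(42) would not compile.
        System.out.println("n=" + 42); // prints "n=42"
    }
}
```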

To look under the hood, write a simple class with a += b;

public class Concat {
    String cat(String a, String b) {
        a += b;
        return a;
    }
}

Now disassemble with javap -c (included in the Sun JDK). You should see a listing including:

java.lang.String cat(java.lang.String, java.lang.String);
  Code:
   0: new #2; //class java/lang/StringBuilder
   3: dup
   4: invokespecial #3; //Method java/lang/StringBuilder."<init>":()V
   7: aload_1
   8: invokevirtual #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   11: aload_2
   12: invokevirtual #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   15: invokevirtual #5; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   18: astore_1
   19: aload_1
   20: areturn

So, a += b is the equivalent of

a = new StringBuilder()
    .append(a)
    .append(b)
    .toString();

For joining just two strings, the concat method should be faster. However, with more strings the StringBuilder approach wins, at least in terms of performance.

The source code of String and StringBuilder (and its package-private base class) is available in src.zip of the Sun JDK. You can see that you are building up a char array (resizing as necessary) and then throwing it away when you create the final String. In practice memory allocation is surprisingly fast.

Update: As Pawel Adamski notes, performance has changed in more recent HotSpot. javac still produces exactly the same code, but the JIT compiler cheats. Simple testing fails entirely because the whole body of code is optimized away. Summing System.identityHashCode (not String.hashCode) shows that the StringBuilder code has a slight advantage. This is subject to change when the next update is released, or if you use a different JVM. From @lukaseder, a list of HotSpot JVM intrinsics.

String concatenation with the + symbol

The rule

"do not concatenate Strings with + !!!"

is wrong, because it is incomplete and therefore misleading.

The rule is

do not concatenate Strings with + in a loop

and that rule still holds. The original rule was never meant to be applied outside of loops!

A simple loop

String s = "";
for (int i = 0; i < 10000; i++) { s += i; }
System.out.println(s);

is still much slower than

StringBuilder sb = new StringBuilder();
for (int i = 0; i < 10000; i++) { sb.append(i); }
System.out.println(sb.toString());

because the Java compiler has to translate the first loop into

String s = "";
for (int i = 0; i < 10000; i++) { s = new StringBuilder().append(s).append(i).toString(); }
System.out.println(s);
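
To see the difference on your own machine, the two loops can be wrapped in methods and timed. This is only a rough sketch: the class and method names are invented here, System.nanoTime around a single run is not a rigorous benchmark, and the numbers will vary with the JVM and its warm-up state.

```java
class LoopConcatDemo {
    // Builds "0123..." with +=; every iteration copies the whole string built so far.
    static String withPlus(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // Builds the same text with one StringBuilder that grows in place.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int n = 10000;
        long t0 = System.nanoTime();
        String a = withPlus(n);
        long t1 = System.nanoTime();
        String b = withBuilder(n);
        long t2 = System.nanoTime();
        System.out.println("same result: " + a.equals(b));
        System.out.println("+= loop:       " + (t1 - t0) / 1_000 + " us");
        System.out.println("StringBuilder: " + (t2 - t1) / 1_000 + " us");
    }
}
```

Both methods return the same text; only the amount of copying differs.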

Also the claim

Today the JVM compiles the + symbol into a string builder (in most cases).

is misleading at best, because this translation was already done in Java 1.0 (OK, not with StringBuilder but with StringBuffer, because StringBuilder was only added in Java 5).


One could also argue that the claim

Today the JVM compiles the + symbol into a string builder (in most cases).

is simply wrong, because the compilation is not done by the JVM. It is done by the Java Compiler.


For the question: when does the Java compiler use StringBuilder.append() and when does it use some other mechanism?

The source code of the Java compiler (version 1.8) contains two places where String concatenation through the + operator is handled.

  • The first place is String constant folding (http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/com/sun/tools/javac/comp/ConstFold.java?av=f#314). In this case the compiler can calculate the resulting string at compile time and works with that result.
  • The second place is where the compiler creates the code for assignment operations (http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/com/sun/tools/javac/jvm/Gen.java?av=f#2056). In this case the compiler always emits code to create a StringBuilder.

The conclusion is that for the Java compiler from OpenJDK (which is also the compiler distributed by Oracle), the phrase "in most cases" means "always", apart from concatenations that are folded into a constant at compile time. (This could change with Java 9, and another Java compiler, such as the one included in Eclipse, may use some other mechanism.)
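
The two code paths can be observed from the outside with reference comparisons. The sketch below is an illustration with invented names, not code from the compiler: a folded constant is the same pooled object as the equivalent literal, while a concatenation performed at run time yields a new object.

```java
class ConstFoldDemo {
    // Both operands are literals: a constant expression, folded by the compiler.
    static boolean literalsFolded() {
        return ("Hello" + "World") == "HelloWorld";
    }

    // final locals initialised with literals are constant variables, so this folds too.
    static boolean finalsFolded() {
        final String a = "Hello";
        final String b = "World";
        return (a + b) == "HelloWorld";
    }

    // A parameter is not a constant, so the concatenation happens at run time
    // and produces a new String object that is not the pooled literal.
    static boolean runtimeConcatIsSameObject(String a) {
        return (a + "World") == "HelloWorld";
    }

    public static void main(String[] args) {
        System.out.println(literalsFolded());                   // true
        System.out.println(finalsFolded());                     // true
        System.out.println(runtimeConcatIsSameObject("Hello")); // false
    }
}
```

Comparing strings with == is done here only to expose object identity; equals() remains the right way to compare contents.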

Java String Concatenation with + operator

Addition is left associative. Taking the first case

20+30+"abc"+(10+10)
-----       -------
  50 +"abc"+  20    <--- here both operands are integers with the + operator, which is addition
  ---------
  "50abc"  +  20    <--- + operator on integer and string results in concatenation
  --------------
    "50abc20"       <--- + operator on integer and string results in concatenation

In the second case:

20+30+"abc"+10+10
-----
  50 +"abc"+10+10   <--- here both operands are integers with the + operator, which is addition
  ---------
  "50abc"  +10+10   <--- + operator on integer and string results in concatenation
  ------------
  "50abc10"   +10   <--- + operator on integer and string results in concatenation
  ---------------
  "50abc1010"       <--- + operator on integer and string results in concatenation

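Both evaluations can be checked directly. The class below is a small illustrative wrapper (its name and method names are not from the original question) around the two expressions:

```java
class AssociativityDemo {
    // The parenthesised (10 + 10) is integer addition, evaluated before it is appended.
    static String withParentheses() {
        return 20 + 30 + "abc" + (10 + 10);
    }

    // Without parentheses, each 10 is appended to the string built so far.
    static String withoutParentheses() {
        return 20 + 30 + "abc" + 10 + 10;
    }

    public static void main(String[] args) {
        System.out.println(withParentheses());    // prints 50abc20
        System.out.println(withoutParentheses()); // prints 50abc1010
    }
}
```
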
How do I concatenate two strings in Java?

You can concatenate Strings using the + operator:

System.out.println("Your number is " + theNumber + "!");

If theNumber holds the int 42, for example, it is implicitly converted to the String "42".

Concatenation operator (+) vs. concat()

The concat method always produces a new String with the result of concatenation.

The plus operator is backed by a StringBuilder: one is created, all the String values you need are appended to it, and toString() is then called on it.

So, if you need to concatenate two values, concat() will be the better choice. If you need to concatenate 100 values, you should use the plus operator or explicitly use StringBuilder (e.g. when appending in a loop).

Where is the new Object of String created when we concat using + operator

First of all, String s = new String("abs"); creates two objects: one in the string pool (for the literal) and another outside the pool, because you are using new with a string literal as the parameter.

String str1 = "Hello";
String str2 = "World";
String str3 = new String("HelloWorld");
String str4 = str1 + str2;

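At this point you have five String objects: three in the String Constant Pool (the literals "Hello", "World" and "HelloWorld") and two on the heap outside the pool (str3, created with new, and str4). So str4 is a new object altogether, created at run time outside the String Pool, because str1 + str2 is not a constant expression.
Please check the code below as well: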

String str5 = "HelloWorld"; // This line does not create a new object: the literal "HelloWorld" is already in the String Constant Pool (from the str3 line), so str5 refers to it.
String str6 = "HelloWorld"; // This line does not create an object either; str6 refers to the same pooled object as str5.

For test

System.out.println(str3==str4); //false
System.out.println(str4==str5);//false
System.out.println(str5==str6);//true
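
For convenience, the fragments above can be assembled into one runnable class. The class name and the compare() helper are made up for this sketch, and the final intern() comparison is an addition that is not part of the original answer; it shows that str4 has the same contents as the pooled literal even though it is a different object.

```java
import java.util.Arrays;

class StringPoolDemo {
    // Returns {str3 == str4, str4 == str5, str5 == str6, str4.intern() == str5}.
    static boolean[] compare() {
        String str1 = "Hello";
        String str2 = "World";
        String str3 = new String("HelloWorld"); // new object outside the pool
        String str4 = str1 + str2;              // built at run time, also outside the pool
        String str5 = "HelloWorld";             // the pooled literal
        String str6 = "HelloWorld";             // the same pooled literal
        return new boolean[] {
            str3 == str4,
            str4 == str5,
            str5 == str6,
            str4.intern() == str5 // intern() returns the pooled object with equal contents
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare())); // prints [false, false, true, true]
    }
}
```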

Java '+' operator between Arithmetic Add & String concatenation?

This is basic left-to-right evaluation (associativity), combined with String concatenation vs. numerical addition.

Quoting:

If only one operand expression is of type String, then string
conversion (§5.1.11) is performed on the other operand to produce a
string at run time.

The result of string concatenation is a reference to a String object
that is the concatenation of the two operand strings. The characters
of the left-hand operand precede the characters of the right-hand
operand in the newly created string.

The String object is newly created (§12.5) unless the expression is a
constant expression (§15.28).

An implementation may choose to perform conversion and concatenation
in one step to avoid creating and then discarding an intermediate
String object. To increase the performance of repeated string
concatenation, a Java compiler may use the StringBuffer class or a
similar technique to reduce the number of intermediate String objects
that are created by evaluation of an expression.

For primitive types, an implementation may also optimize away the
creation of a wrapper object by converting directly from a primitive
type to a string.

See the Java Language Specification, §15.18.1 (String Concatenation Operator +).

TL;DR

  • The + operator is left-associative, so a chain of + operations is evaluated from left to right
  • If either operand of a binary + is a String, the result is a String
  • If both operands are numbers, the result is a number

String Concatenation using concat operator (+) or String.format() method

The first (the + operator) won't actually create any extra intermediate strings. It will be compiled into something like:

String str = new StringBuilder("This the String1 ")
    .append(str1)
    .append(" merged with Sting2 ")
    .append(str2)
    .toString();

Given that the second form requires parsing of the format string, I wouldn't be surprised if the first one actually runs quicker.

However, unless you've got benchmarks to prove that this is really a bit of code which is running too slowly for you, you shouldn't be too worried about the efficiency. You should be more worried about readability. Which code do you find more readable?
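
For a side-by-side look at the two forms, here is a small sketch. The class and method names are invented, and the format string is an assumption reconstructed from the literals in the snippet above, since the original String.format() call is not shown here.

```java
class FormatVsPlusDemo {
    // The + form, using the same literals as the snippet above.
    static String withPlus(String str1, String str2) {
        return "This the String1 " + str1 + " merged with Sting2 " + str2;
    }

    // An assumed String.format() equivalent; %s inserts each argument as text.
    static String withFormat(String str1, String str2) {
        return String.format("This the String1 %s merged with Sting2 %s", str1, str2);
    }

    public static void main(String[] args) {
        String a = withPlus("A", "B");
        String b = withFormat("A", "B");
        System.out.println(a);
        System.out.println("same text: " + a.equals(b)); // prints "same text: true"
    }
}
```

Both produce the same text, so the choice can be made on readability.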

Why is possible to concatenate Char and String in Java?

Java's '+' operator also serves as a concatenation operator. It can concatenate primitives and objects, and the result is a String.

The following explanation assumes that you are familiar with Java's wrapper classes. If you are not, please read up on them first.

Conceptually, Java's '+' operator converts each primitive value used in the expression to its equivalent wrapper class, invokes the toString() method on that instance, and uses the resulting String in the expression. (An implementation may skip the wrapper object and convert the primitive directly to a String.)

Ex: In Java, a statement like System.out.println( 3 + " Four " + 'C' ); ends up creating a String with the content "3 Four C".

In the above statement, 3 is a primitive int literal, " Four " is a String object and 'C' is a primitive char literal.

During the '+' concat operation:

  • 3 gets converted to its corresponding wrapper class, Integer, and then the toString() method is called on it. The output is the String "3".
  • " Four " is already a String and needs no further processing.
  • 'C' gets converted to the Character wrapper class, and its toString() method returns the String "C".

So finally, these three are joined so that you get "3 Four C".

To Sum up:

  1. If a primitive is used with the '+' operator, it is (conceptually) converted to its wrapper class, the toString() method is called on it, and the result is used for appending.
  2. If an object other than a String is used, its toString() method is called and the result is used for appending.
  3. If a String is used, well, there is not much to do and the string itself gets used for appending.
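
A short sketch of these rules follows; the class and method names are invented for illustration. The second method is a cautionary addition: the conversion only applies once a String operand is involved, so a char and an int that meet first are added numerically.

```java
class CharConcatDemo {
    // int + String + char: each non-String operand is converted and appended.
    static String stringInTheMiddle() {
        return 3 + " Four " + 'C';
    }

    // char + int is evaluated first as numeric addition ('C' is 67), giving 70.
    static String charFirst() {
        return 'C' + 3 + " Four ";
    }

    // Converting the char explicitly makes the first + a concatenation.
    static String charFirstAsText() {
        return String.valueOf('C') + 3 + " Four ";
    }

    public static void main(String[] args) {
        System.out.println(stringInTheMiddle()); // prints "3 Four C"
        System.out.println(charFirst());         // prints "70 Four "
        System.out.println(charFirstAsText());   // prints "C3 Four "
    }
}
```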

