The Concatenation of Chars to Form a String Gives Different Results

The result of the following expression

ret + str.charAt(i) + str.charAt(i); 

is the result of String concatenation. The Java language specification states

The result of string concatenation is a reference to a String object
that is the concatenation of the two operand strings. The characters
of the left-hand operand precede the characters of the right-hand
operand in the newly created string.

The result of

str.charAt(i) + str.charAt(i); 

is the result of the additive operator applied to two numeric types. The Java language specification states

The binary + operator performs addition when applied to two operands
of numeric type, producing the sum of the operands.
[...]
The type of an additive expression on numeric operands is the promoted
type of its operands.

In which case

str.charAt(i) + str.charAt(i); 

becomes an int holding the sum of the two char values. That is then concatenated to ret.
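A minimal sketch of the two cases side by side. The class name PromotionDemo, the method names, and the sample inputs "x" and "ab" are assumptions made for illustration and do not come from the original question.

```java
public class PromotionDemo {
    // The expression starts with a String, so each + is string concatenation.
    static String concatenated(String ret, String str, int i) {
        return ret + str.charAt(i) + str.charAt(i);
    }

    // The parentheses force char + char first, which is int addition.
    static String summedFirst(String ret, String str, int i) {
        return ret + (str.charAt(i) + str.charAt(i));
    }

    public static void main(String[] args) {
        System.out.println(concatenated("x", "ab", 0)); // xaa
        System.out.println(summedFirst("x", "ab", 0));  // x194 ('a' is 97)
    }
}
```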


You might also want to know this about the compound assignment expression +=

A compound assignment expression of the form E1 op= E2 is equivalent
to E1 = (T) ((E1) op (E2)), where T is the type of E1, except that E1
is evaluated only once.

In other words

ret += str.charAt(i) + str.charAt(i);

is equivalent to

ret = (String) ((ret) + (str.charAt(i) + str.charAt(i)));
                      |                ^ integer addition
                      |
                      ^ string concatenation
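The equivalence can be checked with a small sketch. CompoundAssignDemo and its method names are hypothetical, and the inputs "r" and "hi" are chosen only for illustration.

```java
public class CompoundAssignDemo {
    // Compound form: the right-hand side is evaluated as a whole first.
    static String compound(String ret, String str, int i) {
        ret += str.charAt(i) + str.charAt(i);
        return ret;
    }

    // Expanded form following the E1 = (T) ((E1) op (E2)) rule.
    static String expanded(String ret, String str, int i) {
        ret = (String) ((ret) + (str.charAt(i) + str.charAt(i)));
        return ret;
    }

    public static void main(String[] args) {
        System.out.println(compound("r", "hi", 0)); // r208 ('h' is 104)
        System.out.println(expanded("r", "hi", 0)); // r208
    }
}
```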

Why does char concatenation return an int sum?

The difference is in the way the concatenations are constructed.

First: res += str.charAt(0) + str.charAt(2);

Here, the two char values are added together first. Binary numeric promotion occurs (JLS, Section 5.6.2).

Widening primitive conversion (§5.1.2) is applied to convert either or both operands as specified by the following rules:

  • If either operand is of type double, the other is converted to double.

  • Otherwise, if either operand is of type float, the other is converted to float.

  • Otherwise, if either operand is of type long, the other is converted to long.

  • Otherwise, both operands are converted to type int.

That means that both char values are promoted to int and added, producing your 196. That int is then concatenated to res, appending "196".
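One way to observe the promoted type is to let overload resolution pick a method. This is only a sketch: PromotedTypeDemo and kind are hypothetical names, and the chars 'a' (97) and 'c' (99) are assumed because they sum to 196.

```java
public class PromotedTypeDemo {
    // One overload per primitive type; the compiler picks the one
    // matching the static type of the argument expression.
    static String kind(char v)   { return "char"; }
    static String kind(int v)    { return "int"; }
    static String kind(long v)   { return "long"; }
    static String kind(float v)  { return "float"; }
    static String kind(double v) { return "double"; }

    public static void main(String[] args) {
        char a = 'a', c = 'c';
        System.out.println(kind(a));        // char
        System.out.println(kind(a + c));    // int
        System.out.println(a + c);          // 196
        System.out.println(kind(a + 1L));   // long
        System.out.println(kind(a + 1.0f)); // float
        System.out.println(kind(a + 1.0));  // double
    }
}
```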

Second: res = res + str.charAt(0) + str.charAt(2);

Here, res + str.charAt(0) is performed first. A String plus a char appends the char (via string conversion, JLS 15.18.1), resulting in a new String.

If only one operand expression is of type String, then string conversion (§5.1.11) is performed on the other operand to produce a string at run time.

Then, the second char is appended similarly.

If you were to say

res = res + (str.charAt(0) + str.charAt(2));

then the result would be the same (appending the 196) as with +=.
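The three forms can be compared in a short sketch. GroupingDemo and its method names are hypothetical, and the input "abc" is an assumption chosen because 'a' + 'c' is 196.

```java
public class GroupingDemo {
    // += evaluates the whole right-hand side (int addition) first.
    static String compound(String res, String str) {
        res += str.charAt(0) + str.charAt(2);
        return res;
    }

    // Left-to-right grouping: (res + char) + char, all concatenation.
    static String chained(String res, String str) {
        res = res + str.charAt(0) + str.charAt(2);
        return res;
    }

    // Explicit parentheses reproduce the += behavior.
    static String parenthesized(String res, String str) {
        res = res + (str.charAt(0) + str.charAt(2));
        return res;
    }

    public static void main(String[] args) {
        System.out.println(compound("", "abc"));      // 196
        System.out.println(chained("", "abc"));       // ac
        System.out.println(parenthesized("", "abc")); // 196
    }
}
```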

How does the concatenation of a String with characters work in Java?

str.charAt(i) returns a char, and adding two chars results in an int equal to the sum of the two character codes, not a char and not a String. When you start with str1 +, the first concatenation is between a String and a char, which results in a String, followed by the second concatenation, also between a String and a char.

You can fix this a few ways, such as:

str1 += String.valueOf(str.charAt(i)) + str.charAt(i);

or

str1 += "" + str.charAt(i) + str.charAt(i);

or, as you've already discovered, and likely the most readable:

str1 = str1 + str.charAt(i) + str.charAt(i);
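A sketch that runs the three fixes next to the broken version, assuming the surrounding code is a loop that doubles every character. That loop, the class name DoubleCharDemo, and the input "Hi" are assumptions, not taken from the question.

```java
public class DoubleCharDemo {
    static String withValueOf(String str) {
        String str1 = "";
        for (int i = 0; i < str.length(); i++) {
            str1 += String.valueOf(str.charAt(i)) + str.charAt(i);
        }
        return str1;
    }

    static String withEmptyString(String str) {
        String str1 = "";
        for (int i = 0; i < str.length(); i++) {
            str1 += "" + str.charAt(i) + str.charAt(i);
        }
        return str1;
    }

    static String withPlainAssignment(String str) {
        String str1 = "";
        for (int i = 0; i < str.length(); i++) {
            str1 = str1 + str.charAt(i) + str.charAt(i);
        }
        return str1;
    }

    // For comparison: the char values are summed as ints.
    static String broken(String str) {
        String str1 = "";
        for (int i = 0; i < str.length(); i++) {
            str1 += str.charAt(i) + str.charAt(i);
        }
        return str1;
    }

    public static void main(String[] args) {
        System.out.println(withValueOf("Hi"));         // HHii
        System.out.println(withEmptyString("Hi"));     // HHii
        System.out.println(withPlainAssignment("Hi")); // HHii
        System.out.println(broken("Hi"));              // 144210
    }
}
```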

Concatenation of Strings and characters

You see this behavior as a result of the combination of operator precedence and string conversion.

JLS 15.18.1 states:

If only one operand expression is of type String, then string conversion (§5.1.11) is performed on the other operand to produce a string at run time.

Therefore the right hand operands in your first expression are implicitly converted to string: string = string + ((char)65) + 5;

For the second expression, string += ((char)65) + 5;, the += compound assignment operator has to be considered along with +. Since += has lower precedence than +, the + operator is evaluated first. Its operands are a char and an int, so binary numeric promotion yields an int (65 + 5 = 70). Only then is += evaluated, and by that point the right-hand side is already the number 70, which is appended as "70".
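Both expressions can be run side by side in a sketch like the following. PrecedenceDemo, the method names, and the starting value "x" are hypothetical.

```java
public class PrecedenceDemo {
    // (string + 'A') + 5: both + operators are string concatenation.
    static String plainAssignment(String string) {
        string = string + ((char) 65) + 5;
        return string;
    }

    // 'A' + 5 is evaluated first as int addition (70), then appended.
    static String compoundAssignment(String string) {
        string += ((char) 65) + 5;
        return string;
    }

    public static void main(String[] args) {
        System.out.println(plainAssignment("x"));    // xA5
        System.out.println(compoundAssignment("x")); // x70
    }
}
```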

Why is this Java statement generating this output?

  • Expressions are evaluated left to right.
  • Hence the first two chars are evaluated and added, which results in an int (value = 205), not a char.
  • Next, this int (205) is concatenated with a String, which results in a String.

Hence the strange output.
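The original statement is not shown here, so the following sketch assumes it had the form char + char + String + char + char with str = "hell". LeftToRightDemo and the method names are hypothetical.

```java
public class LeftToRightDemo {
    // The leading char + char is int addition: 'h' (104) + 'e' (101) = 205.
    static String charsFirst(String str) {
        return str.charAt(0) + str.charAt(1) + str + str.charAt(0) + str.charAt(1);
    }

    // A leading empty String makes every + a string concatenation.
    static String stringFirst(String str) {
        return "" + str.charAt(0) + str.charAt(1) + str + str.charAt(0) + str.charAt(1);
    }

    public static void main(String[] args) {
        System.out.println(charsFirst("hell"));  // 205hellhe
        System.out.println(stringFirst("hell")); // hehellhe
    }
}
```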

Fix:

Use a StringBuilder instead

public class Main {
    public static void main(String[] args) {
        String str = "hell";
        StringBuilder buff = new StringBuilder();
        // append(char) appends the character itself, so no numeric addition occurs
        buff.append(str.charAt(0))
            .append(str.charAt(1))
            .append(str)
            .append(str.charAt(0))
            .append(str.charAt(1));
        System.out.println(buff.toString()); // prints 'hehellhe'
    }
}

What is happening when I add a char and a String in Java?

You precisely named the reason why using the + operator for string concatenation can be seen as a historical design mistake. Providing a builtin concatenation operator is not wrong, but it should not have been the plus operator.

Besides the confusion about different behavior, e.g. for 'a'+'b' and ""+'a'+'b', the plus operator is normally expected to be commutative, i.e. a + b has the same result as b + a, which doesn’t hold for string concatenation. Further, the operator precedence can lead to surprises.
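A few of these surprises, collected in one sketch. PlusSurprisesDemo and its method names are hypothetical and exist only to make each case callable.

```java
public class PlusSurprisesDemo {
    // char + char is int addition; String.valueOf(int) renders the sum.
    static String charSum(char a, char b) {
        return String.valueOf(a + b);
    }

    // A leading String turns every following + into concatenation.
    static String charConcat(char a, char b) {
        return "" + a + b;
    }

    // (s + x) + y: concatenation twice.
    static String stringThenInts(String s, int x, int y) {
        return s + x + y;
    }

    // (x + y) + s: addition first, then concatenation.
    static String intsThenString(int x, int y, String s) {
        return x + y + s;
    }

    public static void main(String[] args) {
        System.out.println(charSum('a', 'b'));          // 195
        System.out.println(charConcat('a', 'b'));       // ab
        System.out.println(charConcat('b', 'a'));       // ba (not commutative)
        System.out.println(stringThenInts("1", 2, 3));  // 123
        System.out.println(intsThenString(1, 2, "3"));  // 33
    }
}
```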

The behavior is precisely specified (JLS §15.18.1):

15.18.1. String Concatenation Operator +

If only one operand expression is of type String, then string conversion (§5.1.11) is performed on the other operand to produce a string at run time.

The result of string concatenation is a reference to a String object that is the concatenation of the two operand strings. The characters of the left-hand operand precede the characters of the right-hand operand in the newly created string.

This definition links to §5.1.11:

5.1.11. String Conversion

Any type may be converted to type String by string conversion.

A value x of primitive type T is first converted to a reference value as if by giving it as an argument to an appropriate class instance creation expression (§15.9):

  • If T is boolean, then use new Boolean(x).

  • If T is char, then use new Character(x).

  • If T is byte, short, or int, then use new Integer(x).

  • If T is long, then use new Long(x).

  • If T is float, then use new Float(x).

  • If T is double, then use new Double(x).

This reference value is then converted to type String by string conversion.

Now only reference values need to be considered:

  • If the reference is null, it is converted to the string "null" (four ASCII characters n, u, l, l).

  • Otherwise, the conversion is performed as if by an invocation of the toString method of the referenced object with no arguments; but if the result of invoking the toString method is null, then the string "null" is used instead.

(The spec really does typeset only the word null in code font, with the quotation marks outside it.)

So the behavior of String foo = 'a' + "bee"; is specified to be as-if you’ve written String foo = new Character('a').toString() + "bee";
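The as-if rule and the null handling can be sketched as follows. StringConversionDemo and its methods are hypothetical, and Character.valueOf stands in for the new Character(x) of the quoted text because that constructor is deprecated in recent Java versions.

```java
public class StringConversionDemo {
    // Concatenation with the + operator.
    static String viaOperator(char c, String s) {
        return c + s;
    }

    // The as-if form: box the char, call toString(), then concatenate.
    static String viaWrapper(char c, String s) {
        return Character.valueOf(c).toString() + s;
    }

    // Any reference, including null, is converted to a String.
    static String withReference(Object o) {
        return "value: " + o;
    }

    public static void main(String[] args) {
        System.out.println(viaOperator('a', "bee")); // abee
        System.out.println(viaWrapper('a', "bee"));  // abee
        System.out.println(withReference(null));     // value: null

        // An object whose toString() returns null is also rendered as "null".
        Object nullToString = new Object() {
            @Override
            public String toString() {
                return null;
            }
        };
        System.out.println(withReference(nullToString)); // value: null
    }
}
```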

But the cited §15.18.1 continues with:

The String object is newly created (§12.5) unless the expression is a constant expression (§15.28).

An implementation may choose to perform conversion and concatenation in one step to avoid creating and then discarding an intermediate String object. To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression.

For primitive types, an implementation may also optimize away the creation of a wrapper object by converting directly from a primitive type to a string.

So for your specific example, 'a' + "bee", the actual behavior of

String foo = 'a' + "bee";

will be

String foo = "abee";

without any additional operations at runtime, because it is a compile-time constant.

If one of the operands is not a compile-time constant, like

char c = 'a';
String foo = c + "bee";

The optimized variant, as used by most if not all compilers from Java 5 to Java 8 (inclusive), is

char c = 'a';
String foo = new StringBuilder().append(c).append("bee").toString();

Starting with Java 9, javac instead compiles such concatenations to an invokedynamic call bootstrapped by java.lang.invoke.StringConcatFactory (JEP 280).

Either way, the resulting behavior is always as specified.
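The difference between the constant and the non-constant case becomes visible through reference identity. This is a sketch for illustration only (ConstantFoldingDemo and its methods are hypothetical), and comparing strings with == is used here purely to expose identity; it is not a recommended way to compare content.

```java
public class ConstantFoldingDemo {
    // 'a' + "bee" is a constant expression, folded to the literal "abee"
    // at compile time, so both sides refer to the same interned String.
    static boolean constantIsSameObject() {
        String foo = 'a' + "bee";
        return foo == "abee";
    }

    // c is a parameter, not a constant, so the String is built at run time
    // and is a newly created object.
    static boolean runtimeIsSameObject(char c) {
        String foo = c + "bee";
        return foo == "abee";
    }

    // The content is equal in both cases.
    static boolean runtimeHasSameContent(char c) {
        String foo = c + "bee";
        return foo.equals("abee");
    }

    public static void main(String[] args) {
        System.out.println(constantIsSameObject());     // true
        System.out.println(runtimeIsSameObject('a'));   // false
        System.out.println(runtimeHasSameContent('a')); // true
    }
}
```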

Order of empty string in string conversion by concatenating empty string

The + operator is left-associative, which means that it is grouped from left-to-right.

str = ch + ch + "";

This is equivalent to

str = (ch + ch) + "";
// = ('A' + 'A') + "";
// = 130 + "";
// = "130";

not

str = ch + (ch + "");
// = 'A' + ('A' + "");
// = 'A' + "A";
// = "AA";

char + String and String + char both result in a String. But char + char returns an int. Do you see now why a second + ch doesn't work?
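The groupings above can be checked with a short sketch. AssociativityDemo and its method names are hypothetical, and 'A' (65) is assumed as the value of ch, matching the worked comments.

```java
public class AssociativityDemo {
    // Grouped as (ch + ch) + "": int addition, then concatenation.
    static String sumThenConvert(char ch) {
        return ch + ch + "";
    }

    // Explicit right grouping: ch + (ch + ""), concatenation twice.
    static String explicitRightGrouping(char ch) {
        return ch + (ch + "");
    }

    // Putting the empty String first also makes every + a concatenation.
    static String emptyStringFirst(char ch) {
        return "" + ch + ch;
    }

    public static void main(String[] args) {
        System.out.println(sumThenConvert('A'));        // 130
        System.out.println(explicitRightGrouping('A')); // AA
        System.out.println(emptyStringFirst('A'));      // AA
    }
}
```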


