String Concatenation Using '+' Operator

String concatenation: concat() vs + operator

No, not quite.

Firstly, there's a slight difference in semantics. If a is null, then a.concat(b) throws a NullPointerException, but a += b treats the original value of a as if it were the string "null". Furthermore, the concat() method only accepts String values, while the + operator silently converts its argument to a String (using the toString() method for objects). So the concat() method is stricter in what it accepts.
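A minimal sketch of that difference. The class and method names here (ConcatNullDemo, plus, viaConcat) are made up for illustration and are not part of any library:

```java
final class ConcatNullDemo {
    // The + operator converts a null operand to the text "null"
    // and converts non-String operands to their string form.
    static String plus(String a, Object b) {
        return a + b;
    }

    // String.concat is called on the receiver, so a null receiver
    // throws NullPointerException.
    static String viaConcat(String a, String b) {
        return a.concat(b);
    }

    public static void main(String[] args) {
        System.out.println(plus(null, "b")); // prints "nullb"
        System.out.println(plus("a", 42));   // prints "a42"
        try {
            viaConcat(null, "b");
        } catch (NullPointerException e) {
            System.out.println("concat on a null receiver throws NPE");
        }
    }
}
```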

To look under the hood, write a simple class with a += b;

public class Concat {
    String cat(String a, String b) {
        a += b;
        return a;
    }
}

Now disassemble with javap -c (included in the Sun JDK). You should see a listing including:

java.lang.String cat(java.lang.String, java.lang.String);
Code:
0: new #2; //class java/lang/StringBuilder
3: dup
4: invokespecial #3; //Method java/lang/StringBuilder."<init>":()V
7: aload_1
8: invokevirtual #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
11: aload_2
12: invokevirtual #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
15: invokevirtual #5; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
18: astore_1
19: aload_1
20: areturn

So, a += b is the equivalent of

a = new StringBuilder()
.append(a)
.append(b)
.toString();

The concat method should be faster. However, with more strings the StringBuilder method wins, at least in terms of performance.
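For the "more strings" case, a rough sketch of what using one StringBuilder directly looks like. JoinDemo and joinAll are hypothetical names, and this makes no claim about actual timings on any particular JVM:

```java
import java.util.Arrays;
import java.util.List;

final class JoinDemo {
    // Appends every part to a single StringBuilder and creates
    // the final String only once, at the end.
    static String joinAll(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinAll(Arrays.asList("a", "b", "c", "d"))); // prints "abcd"
    }
}
```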

The source code of String and StringBuilder (and its package-private base class) is available in src.zip of the Sun JDK. You can see that you are building up a char array (resizing as necessary) and then throwing it away when you create the final String. In practice memory allocation is surprisingly fast.

Update: As Pawel Adamski notes, performance has changed in more recent HotSpot. javac still produces exactly the same code, but the JIT compiler cheats. Simple testing entirely fails because the entire body of code is thrown away. Summing System.identityHashCode (not String.hashCode) shows the StringBuffer code has a slight advantage. Subject to change when the next update is released, or if you use a different JVM. From @lukaseder, a list of HotSpot JVM intrinsics.

Concatenate strings using ## operators in C

To use the call to puts() in this way, the macro catstr() should be constructed to do two things:

  • stringify and concatenate lmao to the string "catting these strings\t"
  • stringify and concatenate elephant to lmao.

You can accomplish this by changing your existing macro definition from:

#define catstr(x, y) x##y

To:

#define catstr(x, y) #x#y

This essentially results in:

"catting these strings\t"#lmao#elephant

Or:

"catting these strings   lmaoelephant"  

Making it a single null terminated string, and suitable as an argument to puts():

puts("catting these strings\t" catstr(lmao, elephant));

String concatenation using operator || or format() function

There are basically four standard tools for concatenating strings. Simplest / cheapest first:

The concatenation operator || ...

  • returns NULL if any operand is NULL. (May or may not be desirable.)
  • is a bit faster than format() or concat().
  • allows shortest syntax for very few strings to concatenate.
  • is more picky about input types as there are multiple different || operators, and the input types need to be unambiguous for operator type resolution.
  • is IMMUTABLE when concatenating string types, which allows its safe use in indexes or other places where immutable volatility is required.

concat() ...

  • does not return NULL if one argument is NULL. (May or may not be desirable.)
  • is less picky about input types as all input is coerced to text.
  • allows shortest syntax for more than a couple of strings to concatenate.
  • has only function volatility STABLE (because it takes "any" input type and coerces the input to text, and some of these conversions depend on locale or time-related settings). So it is not suitable where immutable volatility is required. See:
    • CONCAT used in INDEX causes ERROR: functions in index expression must be marked IMMUTABLE

concat_ws() ("with separator") ...

  • allows shortest syntax when concatenating strings with separators.
  • only inserts a separator for not-null strings, simplifying that particular (frequent) case a lot.
  • is otherwise like concat().

format() ...

  • allows for readable, short code when concatenating variables and constants.
  • provides format specifiers to safely and conveniently quote strings and identifiers (to defend against SQL injection and syntax errors), making it the first choice for dynamic SQL. (You mention trigger functions, where a lot of dynamic SQL is used.)
  • is the most sophisticated tool. You can reuse the same input multiple times (with different quotation using different format specifiers).
  • also does not return NULL if any of the input parameters are NULL. (May or may not be desirable.)
  • also has only volatility STABLE.

Further reading:

  • Combine two columns and add into one new column
  • How to concatenate columns in a Postgres SELECT?
  • Insert text with single quotes in PostgreSQL
  • SQL syntax: Concatenating multiple columns into one

Reference is not created while using + operator to concat two strings

It is all in the documentation.

For String.concat, the javadoc states this:

If the length of the argument string is 0, then this String object is returned.

For the + operator, JLS 15.18.1 states:

The result of string concatenation is a reference to a String object that is the concatenation of the two operand strings. The characters of the left-hand operand precede the characters of the right-hand operand in the newly created string.

The String object is newly created (§12.5) unless the expression is a constant expression (§15.29).

As you can see, the results will be different for the case where the 2nd string has length zero and this is not a constant expression.

That is what happens in your example.
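A small sketch of that edge case. ConcatIdentityDemo and its methods are hypothetical names; the expected results in the comments follow from the javadoc and JLS text quoted above rather than from anything specific to one JVM:

```java
final class ConcatIdentityDemo {
    // True when concat handed back the receiver itself.
    static boolean concatReturnsSame(String a, String b) {
        return a.concat(b) == a;
    }

    // a and b are parameters, so a + b is not a constant expression
    // and the result is a newly created String.
    static boolean plusReturnsSame(String a, String b) {
        return (a + b) == a;
    }

    public static void main(String[] args) {
        System.out.println(concatReturnsSame("abc", "")); // true: same object returned
        System.out.println(plusReturnsSame("abc", ""));   // false: new object created
    }
}
```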


You also said:

But while using + operator a new reference will be created in the string pool constant.

This is not directly relevant to your question, but ... actually, no it won't be created there. It will create a reference to a regular (not interned) String object in the heap. (It would only be in the class file's constant pool ... and hence the string pool ... if it was a constant expression; see JLS 15.29)

Note that the string pool and the classfile constant pool are different things.
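To see the pooled versus non-pooled distinction, here is a hedged sketch (InternDemo and runtimeConcat are illustrative names only). It compares a literal against a compile-time constant expression, a runtime concatenation, and the interned form of that runtime result:

```java
final class InternDemo {
    // Concatenation of parameters is evaluated at run time,
    // so the result is an ordinary heap String, not a pooled one.
    static String runtimeConcat(String a, String b) {
        return a + b;
    }

    public static void main(String[] args) {
        String literal = "abc";
        String constant = "ab" + "c";             // constant expression, folded by the compiler
        String runtime = runtimeConcat("ab", "c");

        System.out.println(literal == constant);         // true
        System.out.println(literal == runtime);          // false
        System.out.println(literal == runtime.intern()); // true
        System.out.println(literal.equals(runtime));     // true
    }
}
```

This is only meant to show where the objects live; as noted further down, real code should compare strings with equals.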


Can I add a couple of things:

  • You probably shouldn't be using String.concat. The + operator is more concise, and the JIT compiler should know how to optimize away the creation of unnecessary intermediate strings ... in the few cases where you might consider using concat for performance reasons.

  • It is a bad idea to exploit the fact that no new object is created so that you can use == rather than equals(Object). Your code will be fragile. Just use equals always for comparing String and the primitive wrapper types. It is simpler and safer.

In short, the fact that you are even asking this question suggests that you are going down a blind alley. Knowledge of this edge-case difference between concat and + is ... pointless ... unless you are planning to enter a quiz show for Java geeks.

String Concatenation using '+' operator

It doesn't - the C# compiler does :)

So this code:

string x = "hello";
string y = "there";
string z = "chaps";
string all = x + y + z;

actually gets compiled as:

string x = "hello";
string y = "there";
string z = "chaps";
string all = string.Concat(x, y, z);


The benefit of the C# compiler noticing that there are multiple string concatenations here is that you don't end up creating an intermediate string of x + y which then needs to be copied again as part of the concatenation of (x + y) and z. Instead, we get it all done in one go.

EDIT: Note that the compiler can't do anything if you concatenate in a loop. For example, this code:

string x = "";
foreach (string y in strings)
{
    x += y;
}

just ends up as equivalent to:

string x = "";
foreach (string y in strings)
{
    x = string.Concat(x, y);
}

... so this does generate a lot of garbage, and it's why you should use a StringBuilder for such cases. I have an article going into more details about the two which will hopefully answer further questions.

Concatenate 2 string using operator+= in class C++

In many of your constructors, you do not set length which leaves it with an indeterminate value - and reading such values makes the program have undefined behavior. So, first fix that:

#include <algorithm> // std::copy_n
#include <cstring>   // std::strlen
#include <utility>   // std::swap

// Constructor with no arguments
String::String() : data{new char[1]{'\0'}}, length{0} {}

// Constructor with one argument
String::String(const char* s) { // note: const char*
    if (s == nullptr) {
        data = new char[1]{'\0'};
        length = 0;
    } else {
        length = std::strlen(s);
        data = new char[length + 1];
        std::copy_n(s, length + 1, data); // copies the terminating '\0' too
    }
}

// Copy Constructor
String::String(const String& source) : data{new char[source.length + 1]},
                                       length{source.length}
{
    std::copy_n(source.data, length + 1, data);
}

// Move Constructor
String::String(String&& source) : String() {
    std::swap(data, source.data);
    std::swap(length, source.length);
}

In operator+= you are trying to use the subscript operator, String::operator[], but you haven't added such an operator so instead of s[i], use s.data[i]:

String& String::operator+=(const String& s) {
    unsigned len = length + s.length;
    char* str = new char[len + 1];
    for (unsigned j = 0; j < length; j++) str[j] = data[j];
    for (unsigned i = 0; i < s.length; i++) str[length + i] = s.data[i];
    str[len] = '\0';
    delete[] data; // note: delete[] - not delete
    length = len;
    data = str;
    return *this;
}

If you want to be able to use the subscript operator on String objects, you would need to add a pair of member functions:

class String {
public:
    char& operator[](size_t idx);
    char operator[](size_t idx) const;
};
char& String::operator[](size_t idx) { return data[idx]; }
char String::operator[](size_t idx) const { return data[idx]; }

And for String s3 = s1 + s2; to work, you need a free operator+ overload:

String operator+(const String& lhs, const String& rhs) {
    String rv(lhs);
    rv += rhs;
    return rv;
}

Also, to support printing a String like you try in your alternative main function, you need an operator<< overload. Example:

class String {
    friend std::ostream& operator<<(std::ostream& os, const String& s) {
        os.write(s.data, s.length);
        return os;
    }
};


String Concatenation using concat operator (+) or String.format() method

The first won't actually create any extra strings. It will be compiled into something like:

String str = new StringBuilder("This the String1 ")
.append(str1)
.append(" merged with Sting2 ")
.append(str2)
.toString();

Given that the second form requires parsing of the format string, I wouldn't be surprised if the first one actually runs quicker.

However, unless you've got benchmarks to prove that this is really a bit of code which is running too slowly for you, you shouldn't be too worried about the efficiency. You should be more worried about readability. Which code do you find more readable?
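For comparison, a sketch of the two forms side by side. FormatDemo, withPlus and withFormat are made-up names, and the message text is a placeholder rather than the string from the question:

```java
final class FormatDemo {
    // Concatenation with the + operator.
    static String withPlus(String str1, String str2) {
        return "String1: " + str1 + ", String2: " + str2;
    }

    // Same result via String.format, which has to parse the format string first.
    static String withFormat(String str1, String str2) {
        return String.format("String1: %s, String2: %s", str1, str2);
    }

    public static void main(String[] args) {
        System.out.println(withPlus("foo", "bar"));   // prints "String1: foo, String2: bar"
        System.out.println(withFormat("foo", "bar")); // prints "String1: foo, String2: bar"
    }
}
```

Both produce the same text, so the choice between them mostly comes down to which reads better at the call site.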

Java String Concatenation with + operator

Addition is left associative. Taking the first case

20+30+"abc"+(10+10)
-----       -------
  50 +"abc"+  20      <--- here both operands are integers with the + operator, which is addition
  ---------
  "50abc"  +  20      <--- + operator on integer and string results in concatenation
  ------------
    "50abc20"         <--- + operator on integer and string results in concatenation

In the second case:

20+30+"abc"+10+10
-----
  50 +"abc"+10+10     <--- here both operands are integers with the + operator, which is addition
  ---------
  "50abc"  +10+10     <--- + operator on integer and string results in concatenation
  ----------
  "50abc10"   +10     <--- + operator on integer and string results in concatenation
  ------------
  "50abc1010"         <--- + operator on integer and string results in concatenation

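The two evaluations traced above can be checked with a short sketch (PlusAssociativityDemo and its method names are illustrative only):

```java
final class PlusAssociativityDemo {
    // Parentheses force 10 + 10 to be evaluated as integer addition first.
    static String grouped() {
        return 20 + 30 + "abc" + (10 + 10);
    }

    // Without parentheses, everything after "abc" is string concatenation.
    static String ungrouped() {
        return 20 + 30 + "abc" + 10 + 10;
    }

    public static void main(String[] args) {
        System.out.println(grouped());   // prints "50abc20"
        System.out.println(ungrouped()); // prints "50abc1010"
    }
}
```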
