Is String.Format as efficient as StringBuilder?

NOTE: This answer was written when .NET 2.0 was the current version. This may no longer apply to later versions.

String.Format uses a StringBuilder internally:

public static string Format(IFormatProvider provider, string format, params object[] args)
{
    if ((format == null) || (args == null))
    {
        throw new ArgumentNullException((format == null) ? "format" : "args");
    }

    StringBuilder builder = new StringBuilder(format.Length + (args.Length * 8));
    builder.AppendFormat(provider, format, args);
    return builder.ToString();
}

The above code is a snippet from mscorlib, so the question becomes "is StringBuilder.Append() faster than StringBuilder.AppendFormat()"?

Without benchmarking I'd probably say that the code sample above would run more quickly using .Append(). But that's a guess; try benchmarking and/or profiling the two to get a proper comparison.

This chap, Jerry Dixon, did some benchmarking:

http://jdixon.dotnetdevelopersjournal.com/string_concatenation_stringbuilder_and_stringformat.htm

Updated:

Sadly the link above has since died. However, there's still a copy on the Wayback Machine:

http://web.archive.org/web/20090417100252/http://jdixon.dotnetdevelopersjournal.com/string_concatenation_stringbuilder_and_stringformat.htm

At the end of the day it depends on whether your string formatting is going to be called repetitively, i.e. you're doing some serious text processing over hundreds of megabytes of text, or whether it's being called when a user clicks a button now and again. Unless you're doing some huge batch processing job I'd stick with String.Format; it aids code readability. If you suspect a perf bottleneck then stick a profiler on your code and see where it really is.

Performance between String.format and StringBuilder

After doing a little test of StringBuilder vs String.format I got an idea of how much time each of them takes to do the concatenation. Here are the code snippet and the results:

Code:

String name = "stackover";
String lName = " flow";
String nick = " stackoverflow";
String email = "stackoverflow@email.com";
int phone = 123123123;

//for (int i = 0; i < 10; i++) {
long initialTime1 = System.currentTimeMillis();
String response = String.format(" - Contact {name=%s, lastName=%s, nickName=%s, email=%s, phone=%d}",
        name, lName, nick, email, phone);
long finalTime1 = System.currentTimeMillis();
long totalTime1 = finalTime1 - initialTime1;
System.out.println(totalTime1 + response);

long initialTime2 = System.currentTimeMillis();
final StringBuilder sb = new StringBuilder(" - Contact {");
sb.append("name=").append(name)
        .append(", lastName=").append(lName)
        .append(", nickName=").append(nick)
        .append(", email=").append(email)
        .append(", phone=").append(phone)
        .append('}');
String response2 = sb.toString();
long finalTime2 = System.currentTimeMillis();
long totalTime2 = finalTime2 - initialTime2;
System.out.println(totalTime2 + response2);
//}

After running the code several times, I saw that String.format takes more time:

String.format: 46: Contact {name=stackover, lastName= flow, nickName= stackoverflow, email=stackoverflow@email.com, phone=123123123}
StringBuilder: 0: Contact {name=stackover, lastName= flow, nickName= stackoverflow, email=stackoverflow@email.com, phone=123123123}
String.format: 38: Contact {name=stackover, lastName= flow, nickName= stackoverflow, email=stackoverflow@email.com, phone=123123123}
StringBuilder: 0: Contact {name=stackover, lastName= flow, nickName= stackoverflow, email=stackoverflow@email.com, phone=123123123}
String.format: 51: Contact {name=stackover, lastName= flow, nickName= stackoverflow, email=stackoverflow@email.com, phone=123123123}
StringBuilder: 0: Contact {name=stackover, lastName= flow, nickName= stackoverflow, email=stackoverflow@email.com, phone=123123123}

But if I run the same code inside a loop, the results change:

String.format: 43: Contact {name=stackover, lastName= flow, nickName= stackoverflow, email=stackoverflow@email.com, phone=123123123}
StringBuilder: 0: Contact {name=stackover, lastName= flow, nickName= stackoverflow, email=stackoverflow@email.com, phone=123123123}
String.format: 1: Contact {name=stackover, lastName= flow, nickName= stackoverflow, email=stackoverflow@email.com, phone=123123123}
StringBuilder: 0: Contact {name=stackover, lastName= flow, nickName= stackoverflow, email=stackoverflow@email.com, phone=123123123}
String.format: 1: Contact {name=stackover, lastName= flow, nickName= stackoverflow, email=stackoverflow@email.com, phone=123123123}
StringBuilder: 0: Contact {name=stackover, lastName= flow, nickName= stackoverflow, email=stackoverflow@email.com, phone=123123123}

The first time String.format runs it takes much longer. After that the time drops, although it never settles at a consistently low value the way the StringBuilder result does.

As @G.Fiedler said: "String.format has to parse the format string..."

With these results it can be said that StringBuilder is more efficient than String.format.
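The single-shot timings above include one-off costs such as class loading and JIT warm-up, and System.currentTimeMillis() is too coarse for one call. A minimal sketch of a steadier comparison follows; the class name, the iteration counts, and the use of Locale.ROOT (to keep the output deterministic) are assumptions for illustration, not part of the original test. For serious measurements a benchmark harness would be a better choice.

```java
import java.util.Locale;

public class FormatVsBuilderTiming {

    // Builds the contact line with String.format; the format string is parsed on every call.
    static String viaFormat(String name, String lName, String nick, String email, int phone) {
        return String.format(Locale.ROOT,
                " - Contact {name=%s, lastName=%s, nickName=%s, email=%s, phone=%d}",
                name, lName, nick, email, phone);
    }

    // Builds the same line with explicit StringBuilder appends.
    static String viaBuilder(String name, String lName, String nick, String email, int phone) {
        return new StringBuilder(" - Contact {")
                .append("name=").append(name)
                .append(", lastName=").append(lName)
                .append(", nickName=").append(nick)
                .append(", email=").append(email)
                .append(", phone=").append(phone)
                .append('}')
                .toString();
    }

    public static void main(String[] args) {
        final int warmup = 20_000;      // assumed value: iterations run before timing starts
        final int iterations = 200_000; // assumed value: timed iterations per variant
        long sink = 0;                  // consume results so the work cannot be optimized away

        for (int i = 0; i < warmup; i++) {
            sink += viaFormat("stackover", " flow", " stackoverflow", "stackoverflow@email.com", i).length();
            sink += viaBuilder("stackover", " flow", " stackoverflow", "stackoverflow@email.com", i).length();
        }

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += viaFormat("stackover", " flow", " stackoverflow", "stackoverflow@email.com", i).length();
        }
        long formatNanos = System.nanoTime() - start;

        start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += viaBuilder("stackover", " flow", " stackoverflow", "stackoverflow@email.com", i).length();
        }
        long builderNanos = System.nanoTime() - start;

        System.out.println("String.format: " + (formatNanos / iterations) + " ns per call");
        System.out.println("StringBuilder: " + (builderNanos / iterations) + " ns per call");
        System.out.println("(checksum " + sink + ")");
    }
}
```

The absolute numbers will vary by machine and JVM version, so treat the output as a relative comparison only.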

Should I use Java's String.format() if performance is important?

I wrote a small class to test which of the two performs better, and + comes ahead of format by a factor of 5 to 6.
Try it yourself:

public class StringTest {

    public static void main(String[] args) {
        int i = 0;
        long prev_time = System.currentTimeMillis();
        long time;

        // Time 100,000 concatenations with the + operator.
        for (i = 0; i < 100000; i++) {
            String s = "Blah" + i + "Blah";
        }
        time = System.currentTimeMillis() - prev_time;

        System.out.println("Time after for loop " + time);

        // Time 100,000 corresponding String.format calls.
        prev_time = System.currentTimeMillis();
        for (i = 0; i < 100000; i++) {
            String s = String.format("Blah %d Blah", i);
        }
        time = System.currentTimeMillis() - prev_time;
        System.out.println("Time after for loop " + time);
    }
}

Running the above for different N shows that both behave linearly, but String.format is 5-30 times slower.

The reason is that in the current implementation String.format first parses the input with regular expressions and then fills in the parameters. Concatenation with plus, on the other hand, gets optimized by javac (not by the JIT) and uses StringBuilder.append directly.

[Figure: Runtime comparison]

Is string.concat as efficient as a StringBuilder?

This depends on what you want. If you are concatenating 2 strings then I suppose this will be faster than StringBuilder, as there is a certain overhead in instantiating the StringBuilder class.
If, on the other hand, you are concatenating 100 strings then the StringBuilder class will be much faster. The reason is that it uses a buffer rather than normal string concatenation (strings are immutable, so every concatenation allocates a new string).

string.Concat() does not use StringBuilder internally so this assumption is wrong.

Try looking at the string.Concat overloads and you might get a better idea of when to use string.Concat and when to use StringBuilder. You will notice that there are a couple of overloads (10 to be exact, but let's look at these 5).

string.Concat(arg0)
string.Concat(arg0, arg1)
string.Concat(arg0, arg1, arg2)
string.Concat(arg0, arg1, arg2, arg3)
string.Concat(params string[])

If you make a benchmark where you use all these methods along with StringBuilder you will notice that somewhere around 4 string concatenations StringBuilder will take the lead. This is because the first 4 string.Concat() methods are more optimized than the last one, which simply takes as many arguments as needed.

Interestingly enough, if you add another measurement series, this time for a StringBuilder instantiated with an initial capacity, that StringBuilder takes the lead, since there is an overhead whenever StringBuilder reaches its current capacity ceiling.
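The answer above is about .NET, but Java's StringBuilder has the same notion of capacity, so the effect can be sketched in Java. The class and method names here are hypothetical; the sketch only shows the buffer growing or not growing, it does not measure time.

```java
public class BuilderCapacityDemo {

    // Appends `count` copies of `piece` to the given builder and returns its final capacity.
    static int fill(StringBuilder sb, String piece, int count) {
        for (int i = 0; i < count; i++) {
            sb.append(piece);
        }
        return sb.capacity();
    }

    public static void main(String[] args) {
        String piece = "0123456789"; // 10 characters
        int count = 10;              // 100 characters in total

        // Default constructor: starts with a small buffer that must be regrown while appending.
        StringBuilder grown = new StringBuilder();
        // Presized constructor: the buffer is allocated once, large enough for the final result.
        StringBuilder presized = new StringBuilder(piece.length() * count);

        System.out.println("default start capacity:  " + grown.capacity());
        System.out.println("default final capacity:  " + fill(grown, piece, count));
        System.out.println("presized final capacity: " + fill(presized, piece, count));
    }
}
```

Each regrowth of the default builder copies the existing contents into a larger array, which is the overhead a presized builder avoids.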

EDIT:
I have found my benchmark results (this was for a dissertation in 2011). Each measurement was repeated 100,000 times for reliability. The X axis is the number of strings used in the concatenation and the Y axis is the duration.

[Figure: benchmark chart; X axis is the number of strings concatenated, Y axis is the duration]

Is it better practice to use String.format over string Concatenation in Java?

I'd suggest that it is better practice to use String.format(). The main reason is that String.format() can be more easily localised with text loaded from resource files whereas concatenation can't be localised without producing a new executable with different code for each language.

If you plan on your app being localisable you should also get into the habit of specifying argument positions for your format tokens:

"Hello %1$s the time is %2$tT"

This can then be localised and have the name and time tokens swapped without requiring a recompile of the executable to account for the different ordering. With argument positions you can also re-use the same argument without passing it into the function twice:

String.format("Hello %1$s, your name is %1$s and the time is %2$tT", name, time)
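As a small sketch of the reordering and re-use described above: the two patterns below are hypothetical stand-ins for entries loaded from per-language resource files, and the class name is made up for illustration. Plain strings are used instead of a time value so the output is deterministic.

```java
import java.util.Locale;

public class PositionalFormatDemo {

    // Formats a greeting from a pattern that would normally come from a resource file.
    static String greet(String pattern, String name, String city) {
        return String.format(Locale.ROOT, pattern, name, city);
    }

    public static void main(String[] args) {
        // The first pattern uses argument 1 twice; the second swaps the argument order.
        String english = "Hello %1$s, welcome to %2$s. Goodbye, %1$s!";
        String reordered = "%2$s welcomes you, %1$s.";

        System.out.println(greet(english, "Ann", "Oslo"));   // Hello Ann, welcome to Oslo. Goodbye, Ann!
        System.out.println(greet(reordered, "Ann", "Oslo")); // Oslo welcomes you, Ann.
    }
}
```

The calling code passes the arguments in the same order both times; only the pattern changes.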

String.format() vs string concatenation performance

The second one will be even slower (if you look at the source code of String.format() you will see why). It is just because String.format() executes much more code than the simple concatenation. And at the end of the day, both code versions create 3 instances of String. There are other reasons, not performance related, to use String.format(), as others already pointed out.

String.Format vs string + string or StringBuilder?

  • Compiler will optimize as much string concat as it can, so for example strings that are just broken up for line break purposes can usually be optimized into a single string literal.
  • Concatenation with variables will get compiled into String.Concat
  • StringBuilder can be a lot faster if you're doing several (more than 10 or so I guess) "modifications" to a string but it carries some extra overhead because it allocates more space than you need in its buffer and resizes its internal buffer when it needs to.

I personally use String.Format almost all of the time for two reasons:

  • It's a lot easier to maintain the format string than rearranging a bunch of variables.
  • String.Format takes an IFormatProvider which is passed to any IFormattable types embedded in the string (such as numeric) so that you get appropriate numeric formatting for the specified culture and overall just more control over how values are formatted.

For example, since some cultures use a comma as a decimal point you would want to ensure with either StringBuilder or String.Format that you specify CultureInfo.InvariantCulture if you wanted to ensure that numbers were formatted the way you intend.
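The point above is made for .NET, but the same issue exists in Java, where String.format accepts a Locale. A rough Java analogue, with a hypothetical class name, and with Locale.ROOT playing a role similar to CultureInfo.InvariantCulture:

```java
import java.util.Locale;

public class LocaleFormatDemo {

    // Formats a number with two decimals using the decimal separator of the given locale.
    static String twoDecimals(Locale locale, double value) {
        return String.format(locale, "%.2f", value);
    }

    public static void main(String[] args) {
        System.out.println(twoDecimals(Locale.GERMANY, 1.5)); // 1,50
        System.out.println(twoDecimals(Locale.ROOT, 1.5));    // 1.50
    }
}
```

Passing the locale explicitly avoids depending on whatever default the machine happens to have.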

Two more things to note:

  • StringBuilder also has an AppendFormat function which gives you the flexibility of String.Format without requiring an unnecessary second buffer.
  • When using StringBuilder, make sure you don't defeat the purpose by concatenating parameters that you pass to Append. It's an easy one to miss.
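The last pitfall is easy to show in code. This sketch uses Java's StringBuilder.append as a stand-in for the .NET Append discussed above; the class and method names are hypothetical.

```java
public class AppendChainDemo {

    // Defeats the purpose: "+" builds a temporary string before append ever sees it.
    static String withConcat(String[] names) {
        StringBuilder sb = new StringBuilder();
        for (String n : names) {
            sb.append("name=" + n + ";");
        }
        return sb.toString();
    }

    // Appends each piece directly into the builder's buffer, with no temporary strings.
    static String withChain(String[] names) {
        StringBuilder sb = new StringBuilder();
        for (String n : names) {
            sb.append("name=").append(n).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] names = {"a", "b"};
        System.out.println(withConcat(names)); // name=a;name=b;
        System.out.println(withChain(names));  // name=a;name=b;
    }
}
```

Both methods return the same text; the difference is only in how many intermediate strings get allocated along the way.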

String.format() vs + operator

If you are looking for performance only, I believe that using StringBuilder/StringBuffer is the most efficient way to build strings, even if the Java compiler is smart enough to translate most String concatenations to their StringBuilder equivalent.

If you are looking for readability, I think String.format is much clearer, and it is what I use too unless I need high performance.

So if your main concern is not performance, meaning this code is not in a path that is called a lot, you may prefer to use String.format as it gives a better idea of the resulting String (like you said).

Besides, String.format gives you format specifiers, which means you can use it for padding Strings, formatting numbers, dates, and so on; doing the same with simple concatenation would make the code even worse.
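A short sketch of the padding and number formatting mentioned above. The class name, the column widths, and the sample values are assumptions chosen for illustration; Locale.ROOT is used so the decimal separator does not depend on the machine.

```java
import java.util.Locale;

public class PaddingFormatDemo {

    // Left-aligns the label in 10 columns, right-aligns the count in 5,
    // and zero-pads the ratio to 8 columns with 3 decimals.
    static String row(String label, int count, double ratio) {
        return String.format(Locale.ROOT, "%-10s|%5d|%08.3f", label, count, ratio);
    }

    public static void main(String[] args) {
        System.out.println(row("abc", 42, 3.14159));
        System.out.println(row("longer", 7, 0.5));
    }
}
```

Getting the same alignment with + would need manual padding loops or helper methods for each column.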

Edit for Chuu:

Using JAD, you can see that the following code:

public class Test {
    public static void main(String[] args) {
        String str = "a" + "b" + "c";
        String str2 = "foo" + str + "bar" + str;
        System.out.println(str2);
    }
}

when decompiled will look like:

public class Test {
    public static void main(String[] args) {
        String str = "abc";
        String str2 = new StringBuilder("foo").append(str).append("bar").append(str).toString();
        System.out.println(str2);
    }
}

Proof of that can also be found using the javap utility that will show you the Java bytecode under a .class file:

public static void main(java.lang.String[] args);
0 ldc <String "abc"> [16]
2 astore_1 [str]
3 new java.lang.StringBuilder [18]
6 dup
7 ldc <String "foo"> [20]
9 invokespecial java.lang.StringBuilder(java.lang.String) [22]
12 aload_1 [str]
13 invokevirtual java.lang.StringBuilder.append(java.lang.String) : java.lang.StringBuilder [25]
16 ldc <String "bar"> [29]
18 invokevirtual java.lang.StringBuilder.append(java.lang.String) : java.lang.StringBuilder [25]
21 aload_1 [str]
22 invokevirtual java.lang.StringBuilder.append(java.lang.String) : java.lang.StringBuilder [25]
25 invokevirtual java.lang.StringBuilder.toString() : java.lang.String [31]
28 astore_2 [str2]
29 getstatic java.lang.System.out : java.io.PrintStream [35]
32 aload_2 [str2]
33 invokevirtual java.io.PrintStream.println(java.lang.String) : void [41]
36 return

Is string.format more efficient than appending to a string using '+'?

Appending using operators is generally more efficient. Format has to scan the string for "%" specifiers and replace them with the corresponding values. Appending is simpler, and shorter to type!

Imagine you are the compiler.

Go through the string to find the %s symbol. Replace it with the value there. Then concatenate.

versus

Concatenate.

When is it better to use String.Format vs string concatenation?

Before C# 6

To be honest, I think the first version is simpler - although I'd simplify it to:

xlsSheet.Write("C" + rowIndex, null, title);

I suspect other answers may talk about the performance hit, but to be honest it'll be minimal if present at all - and this concatenation version doesn't need to parse the format string.

Format strings are great for purposes of localisation etc, but in a case like this concatenation is simpler and works just as well.

With C# 6

String interpolation makes a lot of things simpler to read in C# 6. In this case, your second code becomes:

xlsSheet.Write($"C{rowIndex}", null, title);

which is probably the best option, IMO.


