String.Replace() vs. StringBuilder.Replace()

Profiling the following code with the RedGate profiler:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

class Program
{
    // The alphabet repeated 56 times (26 * 56 = 1456 characters).
    static string data = string.Concat(Enumerable.Repeat("abcdefghijklmnopqrstuvwxyz", 56));
    static Dictionary<string, string> values;

    static void Main(string[] args)
    {
        Console.WriteLine("Data length: " + data.Length);
        values = new Dictionary<string, string>()
        {
            { "ab", "aa" },
            { "jk", "jj" },
            { "lm", "ll" },
            { "yz", "zz" },
            { "ef", "ff" },
            { "st", "uu" },
            { "op", "pp" },
            { "x", "y" }
        };

        StringReplace(data);
        StringBuilderReplace1(data);
        StringBuilderReplace2(new StringBuilder(data, data.Length * 2));

        Console.ReadKey();
    }

    // Every Replace call returns a new string.
    private static void StringReplace(string data)
    {
        foreach (string k in values.Keys)
        {
            data = data.Replace(k, values[k]);
        }
    }

    // #1: the StringBuilder is created inside the measured method.
    private static void StringBuilderReplace1(string data)
    {
        StringBuilder sb = new StringBuilder(data, data.Length * 2);
        foreach (string k in values.Keys)
        {
            sb.Replace(k, values[k]);
        }
    }

    // #2: the StringBuilder is created by the caller, outside the measured method.
    private static void StringBuilderReplace2(StringBuilder data)
    {
        foreach (string k in values.Keys)
        {
            data.Replace(k, values[k]);
        }
    }
}

  • String.Replace = 5.843ms
  • StringBuilder.Replace #1 = 4.059ms
  • StringBuilder.Replace #2 = 0.461ms

String length = 1456

StringBuilder #1 creates the StringBuilder inside the method while #2 does not, so the overall cost will most likely end up the same: you're just moving the construction work out of the method. If you already start with a StringBuilder instead of a string, #2 might be the way to go.

As for memory, the RedGate memory profiler shows nothing to worry about until you get into MANY replace operations, at which point StringBuilder is going to win overall.

StringBuilder vs. String considering replace

It is true that StringBuilder tends to be better than concatenating or modifying Strings manually, since StringBuilder is mutable, while String is immutable and you need to create a new String for each modification.

Just to note, though, the Java compiler will automatically convert an example like this:

String result = someString + someOtherString + anotherString;

into something like:

String result = new StringBuilder().append(someString).append(someOtherString).append(anotherString).toString();

That said, unless you're replacing a whole lot of Strings, go for whichever is more readable and more maintainable. So if you can keep it cleaner by having a sequence of 'replace' calls, go ahead and do that over the StringBuilder method. The difference will be negligible compared to the stress you save from dealing with the sad tragedy of micro-optimizations.

PS

For your code sample (which, as OscarRyz pointed out, won't work if you have more than one "$VARIABLE1" in someString, in which case you'll need to use a loop), you could cache the result of the indexOf call in:

someString.replace(someString.indexOf("$VARIABLE1"), someString.indexOf("$VARIABLE1")+10, "abc");

With

int index = someString.indexOf("$VARIABLE1");    
someString.replace(index, index+10, "abc");

No need to search the String twice :-)
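
If the placeholder can appear more than once, the loop mentioned above might look roughly like this. It is only a sketch: it assumes someString is a java.lang.StringBuilder, and the class and method names (PlaceholderReplacer, replaceAll) are made up for illustration, not part of the JDK.

```java
final class PlaceholderReplacer {
    private PlaceholderReplacer() {}

    /**
     * Replaces every occurrence of target inside sb, in place.
     * Hypothetical helper for illustration only.
     */
    static StringBuilder replaceAll(StringBuilder sb, String target, String replacement) {
        if (target.isEmpty()) {
            return sb; // an empty target would match everywhere; leave sb alone
        }
        // Search once per occurrence and reuse the index for both bounds.
        int index = sb.indexOf(target);
        while (index >= 0) {
            sb.replace(index, index + target.length(), replacement);
            // Resume after the inserted text so the replacement itself is never rescanned.
            index = sb.indexOf(target, index + replacement.length());
        }
        return sb;
    }

    public static void main(String[] args) {
        StringBuilder someString = new StringBuilder("$VARIABLE1 and $VARIABLE1 again");
        System.out.println(replaceAll(someString, "$VARIABLE1", "abc")); // abc and abc again
    }
}
```

Resuming the search after the inserted text also keeps the loop from running forever when the replacement happens to contain the target.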

Java String.replace() or StringBuilder.replace()

Using StringBuilder won't make a useful difference here.

A better improvement would be to use a single regex:

someData = someData.replaceAll("(?s)<tag_(one|two|three|four|five)>.*?</tag_\\1>", "");

Here, the \\1 matches the same thing that was captured in the (one|two|etc) group.
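
To see the backreference at work, here is a small sketch that wraps the same regex in a precompiled Pattern. The class name TagStripper and the sample inputs are assumptions made for the example.

```java
import java.util.regex.Pattern;

final class TagStripper {
    // (?s) lets '.' match line breaks; \\1 must repeat whatever group 1 captured.
    private static final Pattern PAIRED_TAG =
            Pattern.compile("(?s)<tag_(one|two|three|four|five)>.*?</tag_\\1>");

    private TagStripper() {}

    /** Removes every element whose closing tag name matches its opening tag name. */
    static String strip(String someData) {
        return PAIRED_TAG.matcher(someData).replaceAll("");
    }

    public static void main(String[] args) {
        // Matching pairs are removed, even across a line break.
        System.out.println(strip("a<tag_one>x</tag_one>b<tag_two>y\nz</tag_two>c")); // abc
        // A mismatched pair is left alone because \\1 cannot match "two" after capturing "one".
        System.out.println(strip("<tag_one>x</tag_two>")); // <tag_one>x</tag_two>
    }
}
```

Compiling the pattern once is a design choice for repeated calls; String.replaceAll compiles the regex on every invocation.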

Is StringBuilder.Replace() more efficient than String.Replace?

This is exactly the type of thing StringBuilder is for - repeated modification of the same text object - it's not just for repeated concatenation, though that appears to be what it's used for most commonly.

string.replace vs StringBuilder.replace for memory

When you use:

string temp2 = temp.Replace("\n", "\r\n");

the call leaves temp untouched (string is immutable) and allocates one new string that contains all the replacements. It does not allocate a string per match; the extra allocations come from chaining several Replace calls, each of which produces its own full copy.

With StringBuilder this doesn't happen, because StringBuilder is mutable: each Replace modifies the same object, and a new string is only created when you call ToString().

Example:

temp = "test1\ntest2\ntest3\n"

With the first method (string)

string temp2 = temp.Replace("\n", "\r\n");

a second string is allocated next to temp:

string temp2 = "test1\r\ntest2\r\ntest3\r\n";

and every further Replace chained onto it allocates another one.

With the second method (StringBuilder)

string temp2 = new StringBuilder(temp).Replace("\n", "\r\n").ToString();

is conceptually equivalent to

StringBuilder aux = new StringBuilder("test1\ntest2\ntest3\n");
// aux now holds "test1\r\ntest2\ntest3\n"
// aux now holds "test1\r\ntest2\r\ntest3\n"
// aux now holds "test1\r\ntest2\r\ntest3\r\n"
string temp2 = aux.ToString();
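
The same immutable versus mutable distinction can be checked in Java. This is a sketch, not a translation of the C# calls: Java's StringBuilder has no replace(String, String), so the positional replace(int, int, String) overload is used instead, and the class and method names are invented for the example.

```java
final class MutabilityDemo {
    private MutabilityDemo() {}

    /** String.replace leaves the receiver untouched and returns the result as a separate String. */
    static String viaString(String temp) {
        return temp.replace("\n", "\r\n");
    }

    /**
     * Replaces the first "\n" in aux and reports whether the builder returned by
     * replace(int, int, String) is the very same object that was passed in.
     */
    static boolean builderIsEditedInPlace(StringBuilder aux) {
        int newline = aux.indexOf("\n");
        if (newline < 0) {
            return false; // nothing to replace
        }
        StringBuilder returned = aux.replace(newline, newline + 1, "\r\n");
        return returned == aux;
    }

    public static void main(String[] args) {
        String temp = "test1\ntest2\n";
        String temp2 = viaString(temp);
        System.out.println(temp.equals("test1\ntest2\n"));      // true: the original is unchanged
        System.out.println(temp2.equals("test1\r\ntest2\r\n")); // true: the result is a new value

        StringBuilder aux = new StringBuilder(temp);
        System.out.println(builderIsEditedInPlace(aux));               // true: same object
        System.out.println(aux.toString().equals("test1\r\ntest2\n")); // true: first newline replaced
    }
}
```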

Why StringBuilder.Replace is slower than String.Replace

According to several tests (links to more tests at the bottom) as well as a quick and sloppy test of my own, String.Replace performs better than StringBuilder.Replace. You do not seem to be missing anything.

For completeness' sake, here's my testing code:

// Requires: using System.Diagnostics; using System.Text;
int big = 500;
String s;
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 100; ++i)
{
    sb.Append("cat mouse");
}
s = sb.ToString();

// Test 1: String.Replace, replacement has the same length
Stopwatch sw = new Stopwatch();
sw.Start();
for (int i = 0; i < big; ++i)
{
    s = s.Replace("cat", "moo");
    s = s.Replace("moo", "cat");
}
sw.Stop();
Trace.WriteLine(sw.ElapsedMilliseconds);

// Test 2: StringBuilder.Replace, replacement has the same length
sw.Restart();
for (int i = 0; i < big; ++i)
{
    sb.Replace("cat", "moo");
    sb.Replace("moo", "cat");
}
sw.Stop();
Trace.WriteLine(sw.ElapsedMilliseconds);

// Test 3: String.Replace, replacement is one character longer
sw.Restart();
for (int i = 0; i < big; ++i)
{
    s = s.Replace("cat", "mooo");
    s = s.Replace("mooo", "cat");
}
sw.Stop();
Trace.WriteLine(sw.ElapsedMilliseconds);

// Test 4: StringBuilder.Replace, replacement is one character longer
sw.Restart();
for (int i = 0; i < big; ++i)
{
    sb.Replace("cat", "mooo");
    sb.Replace("mooo", "cat");
}
sw.Stop();
Trace.WriteLine(sw.ElapsedMilliseconds);

The output (in milliseconds), on my machine, is:

9
11
7
1977

[EDIT]

I missed one very important case: the one where the replacement string is different on every iteration. This could matter because of the way C# handles strings. What follows is the code that tests the missing case, and the results on my system.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text;

class Program
{
    static void Main()
    {
        // 500 distinct random strings, 4 characters each
        var repl = GenerateRandomStrings(4, 500);
        String s;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100; ++i)
        {
            sb.Append("cat mouse");
        }
        s = sb.ToString();

        // String.Replace with a different replacement on every iteration
        Stopwatch sw = new Stopwatch();
        sw.Start();
        foreach (string str in repl)
        {
            s = s.Replace("cat", str);
            s = s.Replace(str, "cat");
        }
        sw.Stop();
        Trace.WriteLine(sw.ElapsedMilliseconds);

        // StringBuilder.Replace with a different replacement on every iteration
        sw.Restart();
        foreach (string str in repl)
        {
            sb.Replace("cat", str);
            sb.Replace(str, "cat");
        }
        sw.Stop();
        Trace.WriteLine(sw.ElapsedMilliseconds);
    }

    // A HashSet guarantees that all generated strings are distinct.
    static HashSet<string> GenerateRandomStrings(int length, int amount)
    {
        HashSet<string> strings = new HashSet<string>();
        while (strings.Count < amount)
            strings.Add(RandomString(length));
        return strings;
    }

    static Random rnd = new Random();

    static string RandomString(int length)
    {
        StringBuilder b = new StringBuilder();
        for (int i = 0; i < length; ++i)
            // The upper bound of Next is exclusive, so this yields 'a' (97) through 'y' (121).
            b.Append(Convert.ToChar(rnd.Next(97, 122)));
        return b.ToString();
    }
}

Output:

8
1933

However, as we start to increase the length of the random strings, the StringBuilder solution comes closer and closer to the String solution. For random strings with a length of 1000 characters, my results are

138
328

Using this new knowledge on the old tests, I get similar results when increasing the length of the string to replace with. When replacing with a string that is a thousand 'a' characters instead of "mooo", my results for the original answer become:

8
11
160
326

Although the results do become closer, it still seems that for any real world use, String.Replace beats StringBuilder.Replace.

How can I know if String.Replace or StringBuilder.Replace will modify the string?

You can speed this up a little by modifying your second method like so:

var pos = myString.ToString().IndexOf("a");
if (pos >= 0) // IndexOf returns -1 when there is no match; 0 is a valid position
{
    myString = myString.Replace("a", "b", pos, myString.Length - pos);
    //After this line the string is replaced.
    //My Code
}

We now call the overload of StringBuilder.Replace() that takes a start index and a count.

Now it doesn't need to search the first part of the string again. This is unlikely to save much time, but it will save a little.

Regex.Replace, String.Replace or StringBuilder.Replace which is the fastest?

If you're just trying to do it within a single string, I'd expect string.Replace to be as fast as anything else. StringBuilder is useful when you want to perform a number of separate steps and want to avoid creating an intermediate string on each step.

Have you benchmarked string.Replace to find out whether or not it's fast enough for you?

I would personally only start using regular expressions when I was actually dealing with a pattern, rather than just a fixed sequence of characters. If the performance of this is absolutely crucial, you could benchmark that as well of course.
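
The difference between a fixed sequence of characters and a pattern is easy to trip over. As a Java analogue (the question itself is about .NET), String.replace treats its argument literally while String.replaceAll treats it as a regular expression. The class and method names below are invented for the sketch.

```java
final class LiteralVsPattern {
    private LiteralVsPattern() {}

    /** Literal replacement: "." means the dot character and nothing else. */
    static String literal(String s) {
        return s.replace(".", "-");
    }

    /** Regex replacement: "." means "any character", so every character is replaced. */
    static String asRegex(String s) {
        return s.replaceAll(".", "-");
    }

    /** An actual pattern, where a regular expression earns its keep: runs of digits. */
    static String maskNumbers(String s) {
        return s.replaceAll("\\d+", "#");
    }

    public static void main(String[] args) {
        System.out.println(literal("a.b.c"));               // a-b-c
        System.out.println(asRegex("a.b.c"));               // -----
        System.out.println(maskNumbers("order 12 of 345")); // order # of #
    }
}
```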

Replace String with StringBuilder?

StringBuilder is a class whose instances you can pass around like any other object:

public static void passStringBuilder(StringBuilder sb) {
    // Find the slash that precedes the year
    int yearSlash = sb.lastIndexOf("/");
    // Drop everything after that slash, then append the new year
    sb.delete(yearSlash + 1, sb.length());
    sb.append("2012");
    System.out.println("New date string: " + sb.toString());
}

StringBuilder#substring returns a String which is a substring, leaving sb as it was. You need to delete certain characters.

Basically, StringBuilder holds characters and lets you easily manipulate them. Later, it allows you to extract String from it. You can use substring or toString.
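
The difference between reading with substring and modifying with delete can be sketched as follows. The sample date "12/31/2011" and the names DateEdit, yearOf and setYear are assumptions for illustration.

```java
final class DateEdit {
    private DateEdit() {}

    /** substring only reads: it returns a new String and leaves sb as it was. */
    static String yearOf(StringBuilder sb) {
        return sb.substring(sb.lastIndexOf("/") + 1);
    }

    /** delete and append modify sb itself. */
    static void setYear(StringBuilder sb, String year) {
        int yearSlash = sb.lastIndexOf("/");
        sb.delete(yearSlash + 1, sb.length()).append(year);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("12/31/2011");
        System.out.println(yearOf(sb)); // 2011
        System.out.println(sb);         // 12/31/2011 (unchanged by substring)
        setYear(sb, "2012");
        System.out.println(sb);         // 12/31/2012
    }
}
```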

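For a simple expression such as String + String there is no point in using it by hand: the Java compiler already turns the concatenation into StringBuilder calls (and, since Java 9, into an invokedynamic-based equivalent). Building a string piece by piece in a loop is still a case for an explicit StringBuilder.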


