Memory Efficiency and Performance of String.Replace .NET Framework

All characters in a .NET string are Unicode characters. Do you mean they're non-ASCII? That shouldn't make any difference, unless you run into composition issues, e.g. an "e" followed by a combining acute accent not being replaced when you try to replace a precomposed "e acute".
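
To make the composition issue concrete, here is a minimal sketch. The class and method names are made up for illustration; the only library calls are String.Normalize and String.Replace. It assumes you are happy to normalize the input to Form C before replacing.

```csharp
using System;
using System.Text;

public static class CompositionDemo
{
    // Replaces the precomposed "e acute" (U+00E9) with a plain "e".
    // With normalizeFirst = true the input is normalized to Form C first,
    // which folds "e" + combining acute (U+0301) into U+00E9 so that
    // both spellings of the accented letter are matched.
    public static string ReplaceEAcute(string input, bool normalizeFirst)
    {
        string text = normalizeFirst
            ? input.Normalize(NormalizationForm.FormC)
            : input;
        return text.Replace("\u00E9", "e");
    }
}
```

Without normalization, a string such as "cafe\u0301" comes back unchanged because String.Replace compares ordinally; with it, the accented letter is replaced.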

You could try using a regular expression with Regex.Replace, or StringBuilder.Replace. Here's sample code doing the same thing with both:

using System;
using System.Text;
using System.Text.RegularExpressions;

class Test
{
    static void Main(string[] args)
    {
        string original = "abcdefghijkl";

        Regex regex = new Regex("a|c|e|g|i|k", RegexOptions.Compiled);

        string removedByRegex = regex.Replace(original, "");
        string removedByStringBuilder = new StringBuilder(original)
            .Replace("a", "")
            .Replace("c", "")
            .Replace("e", "")
            .Replace("g", "")
            .Replace("i", "")
            .Replace("k", "")
            .ToString();

        Console.WriteLine(removedByRegex);
        Console.WriteLine(removedByStringBuilder);
    }
}

I wouldn't like to guess which is more efficient - you'd have to benchmark with your specific application. The regex way may be able to do it all in one pass, but that pass will be relatively CPU-intensive compared with each of the many replaces in StringBuilder.
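
If you want to measure it yourself, a rough timing harness could look like the sketch below. The class name, method names and the warm-up strategy are my own assumptions, not a rigorous benchmark; use your real input strings and iteration counts.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class ReplaceBenchmark
{
    // Built once, outside the timed loop.
    private static readonly Regex Pattern =
        new Regex("a|c|e|g|i|k", RegexOptions.Compiled);

    public static string RemoveWithRegex(string input)
    {
        return Pattern.Replace(input, "");
    }

    public static string RemoveWithStringBuilder(string input)
    {
        return new StringBuilder(input)
            .Replace("a", "").Replace("c", "").Replace("e", "")
            .Replace("g", "").Replace("i", "").Replace("k", "")
            .ToString();
    }

    // Calls action(input) the given number of times and returns the
    // elapsed milliseconds. One untimed warm-up call runs first so that
    // JIT and regex compilation are not included in the measurement.
    public static long Time(Func<string, string> action, string input, int iterations)
    {
        action(input);
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action(input);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

You would call it as ReplaceBenchmark.Time(ReplaceBenchmark.RemoveWithRegex, text, 100000) and again with RemoveWithStringBuilder, then compare the two numbers on your own data.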

Replace {#Text} and {$Text} in a string, in a performant way

IndexOf vs Regex: tested using Stopwatch ticks over 100000 iterations with a string of roughly 500 characters.

Method IndexOf

public static string Re(string str)
{
    int strSIndex = -1;
    int strEIndex = -1;

    strSIndex = str.IndexOf("{#");
    if (strSIndex == -1) strSIndex = str.IndexOf("{$");
    if (strSIndex == -1) return str;

    strEIndex = str.IndexOf("}");
    if (strEIndex == -1) return str;

    if (strEIndex < strSIndex)
    {
        strSIndex = str.IndexOf("{$");
        if (strSIndex == -1) return str;
    }

    str = str.Substring(0, strSIndex) + str.Substring(strEIndex + 1);

    return Re(str);
}

Regex Method

Regex re = new Regex(@"\{(?:#|\$)(\w+)}", RegexOptions.Compiled);
str = re.Replace(str, "");  // Replace returns a new string; the original is not modified

Results (few replaces):

Fn: IndexOf
Ticks: 1181967

Fn: Regex
Ticks: 1482261

Note that the regex was constructed with RegexOptions.Compiled before the timed iterations, so compilation is not included in the ticks.

Results (lots of replaces):

Fn: Regex
Ticks: 19136772

Fn: IndexOf
Ticks: 37457111
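
The recursive IndexOf method builds a new string for every placeholder it removes, which may be part of why it falls behind when there are lots of replaces. As an untested-for-speed sketch (the names are mine, and I have not benchmarked it against the numbers above), a single pass with a StringBuilder could look like this. Like the IndexOf method, and unlike the regex, it removes anything between the opening marker and the next "}".

```csharp
using System;
using System.Text;

public static class PlaceholderStripper
{
    // Removes every "{#...}" or "{$...}" span in one left-to-right pass.
    public static string Strip(string input)
    {
        StringBuilder result = new StringBuilder(input.Length);
        int position = 0;
        while (position < input.Length)
        {
            int open = input.IndexOf('{', position);
            if (open == -1 || open + 1 >= input.Length)
            {
                break; // no further "{" that could start a placeholder
            }
            char marker = input[open + 1];
            int close = (marker == '#' || marker == '$')
                ? input.IndexOf('}', open + 2)
                : -1;
            if (close == -1)
            {
                // Not a placeholder (or unterminated): keep the brace, move on.
                result.Append(input, position, open + 1 - position);
                position = open + 1;
                continue;
            }
            result.Append(input, position, open - position); // text before it
            position = close + 1;                             // skip the placeholder
        }
        // Copy whatever is left after the last placeholder.
        result.Append(input, position, input.Length - position);
        return result.ToString();
    }
}
```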

replace a character in string without using replace function in .NET

You should have a look at some of the solutions discussed here:

Memory Efficiency and Performance of String.Replace .NET Framework

It mentions the use of Regex.Replace and StringBuilder.Replace.

An efficient solution to a String.Replace problem?

It doesn't help with the number of loops, but if you use a StringBuilder as an intermediate, its Replace method accepts the same parameter signatures as String.Replace.

Edit:

Not sure if it's faster, but you can use Regex.Replace with an evaluator delegate.

If you build a search regex with your keys:
(key1|key2|key3|key4...)

and then pass in the delegate to .Replace, you can return a lookup based on the Match's Value property.

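  // pairs is assumed to be a Dictionary<string, string> field on the same class
  public string ReplaceData(Match m)
  {
      return pairs[m.Value];
  }

...

  pairs.Add("foo", "bar");
  pairs.Add("homer", "simpson");
  Regex r = new Regex("(?>foo|homer)");
  // "class" is a keyword, so pass the method itself (or instance.ReplaceData)
  MatchEvaluator myEval = new MatchEvaluator(ReplaceData);
  string sOutput = r.Replace(sInput, myEval);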
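
Putting those pieces together, a self-contained version might look like the sketch below. The class and method names are illustrative. It also escapes the keys and sorts longer keys first, which are my own additions to guard against regex metacharacters and against one key being a prefix of another.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class MultiReplace
{
    // Replaces every occurrence of a key in pairs with its value,
    // using one regex pass and a MatchEvaluator lookup.
    public static string ReplaceAll(string input, IDictionary<string, string> pairs)
    {
        if (pairs.Count == 0)
        {
            return input;
        }
        // Build "key1|key2|..." from the escaped keys, longest first.
        string pattern = string.Join("|",
            pairs.Keys
                 .OrderByDescending(k => k.Length)
                 .Select(k => Regex.Escape(k))
                 .ToArray());
        // The lambda is converted to a MatchEvaluator delegate.
        return Regex.Replace(input, pattern, m => pairs[m.Value]);
    }
}
```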

Performance char vs string

If you have two horses and want to know which is faster...

  String replaceMe = new String('a', 10000000) + 
                     new String('b', 10000000) + 
                     new String('a', 10000000);

  Stopwatch sw = new Stopwatch();

  sw.Start();

  // String replacement 
  if (replaceMe.Contains("a")) {
    replaceMe = replaceMe.Replace("a", "b");
  }

  // Char replacement
  //if (replaceMe.Contains('a')) {
  //  replaceMe = replaceMe.Replace('a', 'b');
  //}

  sw.Stop();

  Console.Write(sw.ElapsedMilliseconds);

I got 60 ms for the char replacement and 500 ms for the string one (Core i5 3.2 GHz, 64-bit, .NET 4.6). So

 replaceMe = replaceMe.Replace('a', 'b')

is about 8 times faster in this test.

Which is the most performant way to clear a StringBuilder?

Update: It turns out that you are using .NET 3.5, and Clear was added in .NET 4. So you should use Length = 0. Actually, I'd probably add an extension method named Clear to do this, since it is far more readable, in my view, than Length = 0.
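
Such an extension method could be as small as the sketch below. The containing class name is arbitrary, and returning the builder for chaining is my own choice, made to mirror the .NET 4 method.

```csharp
using System.Text;

public static class StringBuilderExtensions
{
    // Empties the builder by setting Length to 0, mirroring what
    // StringBuilder.Clear() does from .NET 4 onwards. On .NET 4+ the
    // built-in instance method takes precedence over this extension.
    public static StringBuilder Clear(this StringBuilder builder)
    {
        builder.Length = 0;
        return builder;
    }
}
```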

I would use none of those and instead call Clear.

Clear is a convenience method that is equivalent to setting the Length property of the current instance to 0 (zero).

I can't imagine that it's slower than any of your variants and I also can't imagine that clearing a StringBuilder instance could ever be a bottleneck. If there is a bottleneck anywhere it will be in the appending code.

If performance of clearing the object really is a bottleneck then you will need to time your code to know which variant is faster. There's never a real substitute for benchmarking when considering performance.

Using String+string+string vs using string.replace

Your colleague is completely wrong.

He is mis-applying the fact that strings are immutable, and that appending two strings will create a third string object.

Your method (a + b + c) is the most efficient way to do this.

The compiler transforms your code into a single call to String.Concat (an overload taking two to four strings, or String.Concat(string[]) for longer chains), which uses unsafe code to allocate a single buffer for all of the strings and copy them into the buffer.

His advice should be to use a StringBuilder when concatenating strings in a loop.

EDIT: String.Concat (which is equivalent to + concatenation, like your first example) is the fastest way to do this. Using a StringBuilder like in your edit will be slower, because it will need to resize the string during each Replace call.
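
As a small illustration of the two cases (the names are made up for the example), the first method is a single concatenation expression and the second is the loop case where a StringBuilder is the better fit:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ConcatDemo
{
    // One expression: the compiler emits a single String.Concat call,
    // so only the final string is allocated.
    public static string JoinThree(string a, string b, string c)
    {
        return a + b + c;
    }

    // Concatenating inside a loop: using += here would allocate a new
    // string on every iteration, so a StringBuilder is used instead.
    public static string JoinMany(IEnumerable<string> parts)
    {
        StringBuilder builder = new StringBuilder();
        foreach (string part in parts)
        {
            builder.Append(part);
        }
        return builder.ToString();
    }
}
```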
