C# Sort and OrderBy comparison
Why not measure it:
class Program
{
class NameComparer : IComparer<string>
{
public int Compare(string x, string y)
{
return string.Compare(x, y, true);
}
}
class Person
{
public Person(string id, string name)
{
Id = id;
Name = name;
}
public string Id { get; set; }
public string Name { get; set; }
}
static void Main()
{
List<Person> persons = new List<Person>();
persons.Add(new Person("P005", "Janson"));
persons.Add(new Person("P002", "Aravind"));
persons.Add(new Person("P007", "Kazhal"));
Sort(persons);
OrderBy(persons);
const int COUNT = 1000000;
Stopwatch watch = Stopwatch.StartNew();
for (int i = 0; i < COUNT; i++)
{
Sort(persons);
}
watch.Stop();
Console.WriteLine("Sort: {0}ms", watch.ElapsedMilliseconds);
watch = Stopwatch.StartNew();
for (int i = 0; i < COUNT; i++)
{
OrderBy(persons);
}
watch.Stop();
Console.WriteLine("OrderBy: {0}ms", watch.ElapsedMilliseconds);
}
static void Sort(List<Person> list)
{
list.Sort((p1, p2) => string.Compare(p1.Name, p2.Name, true));
}
static void OrderBy(List<Person> list)
{
var result = list.OrderBy(n => n.Name, new NameComparer()).ToArray();
}
}
On my computer when compiled in Release mode this program prints:
Sort: 1162ms
OrderBy: 1269ms
UPDATE:
As suggested by @Stefan here are the results of sorting a big list fewer times:
List<Person> persons = new List<Person>();
for (int i = 0; i < 100000; i++)
{
persons.Add(new Person("P" + i.ToString(), "Janson" + i.ToString()));
}
Sort(persons);
OrderBy(persons);
const int COUNT = 30;
Stopwatch watch = Stopwatch.StartNew();
for (int i = 0; i < COUNT; i++)
{
Sort(persons);
}
watch.Stop();
Console.WriteLine("Sort: {0}ms", watch.ElapsedMilliseconds);
watch = Stopwatch.StartNew();
for (int i = 0; i < COUNT; i++)
{
OrderBy(persons);
}
watch.Stop();
Console.WriteLine("OrderBy: {0}ms", watch.ElapsedMilliseconds);
Prints:
Sort: 8965ms
OrderBy: 8460ms
In this scenario it looks like OrderBy performs better.
UPDATE2:
And using random names:
List<Person> persons = new List<Person>();
for (int i = 0; i < 100000; i++)
{
persons.Add(new Person("P" + i.ToString(), RandomString(5, true)));
}
Where:
private static Random randomSeed = new Random();
public static string RandomString(int size, bool lowerCase)
{
var sb = new StringBuilder(size);
int start = (lowerCase) ? 97 : 65;
for (int i = 0; i < size; i++)
{
sb.Append((char)(26 * randomSeed.NextDouble() + start));
}
return sb.ToString();
}
Yields:
Sort: 8968ms
OrderBy: 8728ms
Still OrderBy is faster
Order By vs Sort for creating ordered collection
- You use
Sort
when you want to sort the original list. It performs an in-place sort. - You use
OrderBy
when you don't want to change the original list as it returns anIOrderedEnumerable<T>
that leaves the original list untouched. Or when you don't have a list but some other enumerable.
Sorting a List in C# using List.Sort(Comparison T comparison
You can use Linq OrderBy method for that -
sm = sm.OrderBy(i => i.num_of_words).ToList();
Why is OrderBy which returns IOrderedEnumerable T much faster than Sort?
In this case, OrderBy
is far faster because you're not actually executing it.
Until you enumerate the results, the query is deferred, so it's never actually doing the ordering. Until you actually enumerate through the results, the IOrderedEnumerable<T>
doesn't process the input and do any form of ordering.
Try changing your benchmark to:
BenchMark(persons => people = persons.OrderBy(n => n.Name).Count());
The Count()
call will force the ordering to actually occur (since it needs to enumerate the IOrderedEnumerable<T>
to generate a count), which should even out your timings significantly.
Most LINQ extension methods work this way - until you enumerate them (via Count()
, calling ToList()
, or just using them in a normal foreach
loop, etc), they will have negligible impact, as they don't actually do anything directly other than build the enumerable. The reason the other benchmarks compare to OrderBy(...).ToList()
is that the addition of ToList()
forces the OrderBy
to fully execute and actually order the results.
LINQ orderby vs IComparer
I would choose LINQ for two reasons.
- LINQ queries are generally shorter and easier to read.
- If you really do have a large number of elements, Linq also gives you the ability to scale out to multiple CPU cores by using PLinq, which might help you out significantly.
I would expect performance to be roughly similar for a single-threaded implementation, if you consider that the lambda expression in your OrderBy clause compiles to a function -- which is pretty much all you get by implementing IComparer anyway.
That being said, you might get more of a performance boost by changing your sort algorithm to be tailored to how your data is already sorted, rather than by changing your comparison method. But I'd be willing to bet my coffee this morning that OrderBy in your Linq statements uses an implementation of Quicksort, so it's probably pretty decent in the general case already.
Easiest method to OrderBy a String using StringComparison.Ordinal
There is a pre-built StringComparer
that applies StringComparison.Ordinal
- that's StringComparer.Ordinal
:
items.OrderBy(i => i.Name, StringComparer.Ordinal)
How to sort list using multiple comparators
You can have more complex functions passed as a lambda for the comparer, for example:
allFilesData.Sort((x, y) =>
{
int c = DateTime.Compare(x.timeStamp, y.timeStamp);
if (c == 0) {
c = x.type.Length.CompareTo(y.type.Length);
if (c == 0)
return x.index.CompareTo(y.index);
}
return c;
});
(assuming the type and index are stored in type
(string
) and index
(int
) members.
Related Topics
How to Close a Streamwriter Without Closing Its Basestream
Dependency Injection in Attributes
How to Show the "Paste JSON Class" in Visual Studio 2012 When Clicking on Paste Special
In Unity (C#), Why am I Getting a Nullreferenceexception and How to Fix It
What How to Use for Good Quality Code Coverage for C#/.Net
Why Would I Ever Need to Use C# Nested Classes
Setting an Object to Null VS Dispose()
How to Update Record Using Entity Framework 6
How to Create a Custom Membership Provider for ASP.NET MVC 2
How to Seed a Random Class to Avoid Getting Duplicate Random Values
Is Using a 'Goto' Statement Bad
When to Use Blockingcollection and When Concurrentbag Instead of List<T>
Read Xml Attribute Using Xmldocument
Serializing and Deserializing Expression Trees in C#
Stop Visual Studio Debug Putting Slash in String Containing Double Quotes