Quickest Way to Compare Two Generic Lists for Differences

Quickest way to compare two generic lists for differences

Use Except:

var firstNotSecond = list1.Except(list2).ToList();
var secondNotFirst = list2.Except(list1).ToList();

I suspect there are approaches which would actually be marginally faster than this, but even this will be vastly faster than your O(N * M) approach.

If you want to combine these, you could create a method with the above and then a return statement:

return !firstNotSecond.Any() && !secondNotFirst.Any();

One point to note is that there is a difference in results between the original code in the question and the solution here: any duplicate elements which are only in one list will only be reported once with my code, whereas they'd be reported as many times as they occur in the original code.

For example, with lists of [1, 2, 2, 2, 3] and [1], the "elements in list1 but not list2" result in the original code would be [2, 2, 2, 3]. With my code it would just be [2, 3]. In many cases that won't be an issue, but it's worth being aware of.
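The combined check and the duplicate-collapsing behaviour described above can be sketched as follows. This is only an illustration: the class and method names (ListDiff, HaveSameDistinctElements, OnlyInFirst) are made up here and do not come from the original answer.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ListDiff
{
    // True when neither list has an element the other lacks.
    // Duplicates and ordering are ignored, because Except works on sets.
    public static bool HaveSameDistinctElements<T>(List<T> list1, List<T> list2)
    {
        var firstNotSecond = list1.Except(list2);
        var secondNotFirst = list2.Except(list1);
        return !firstNotSecond.Any() && !secondNotFirst.Any();
    }

    // Elements of list1 that do not occur in list2.
    // Each distinct value is reported once, however often it appears in list1.
    public static List<T> OnlyInFirst<T>(List<T> list1, List<T> list2)
        => list1.Except(list2).ToList();
}
```

Calling ListDiff.OnlyInFirst with [1, 2, 2, 2, 3] and [1] gives [2, 3], matching the example above.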

How can I quickly compare two generic lists and their contents to check whether they are identical?

If (and only if) the following is true:

  • Your individual lists do not contain any duplicates
  • The type T of your list elements implements Equals() (ideally via IEquatable<T>) and GetHashCode() correctly

Then you can remove each list that matches an earlier list like so. Note that you must traverse the outer list backwards when removing items from it; otherwise the loop indices could go out of range:

for (int i = lists.Count - 1; i > 0; i--)
{
    for (int j = i - 1; j >= 0; j--)
    {
        if (!lists[i].Except(lists[j]).Any())
        {
            lists.RemoveAt(i);
            break;
        }
    }
}

The important line here is: !lists[i].Except(lists[j]).Any().

Let's break it down:

lists[i].Except(lists[j]) produces a sequence of all the elements of lists[i] that are NOT in lists[j], regardless of order.

Thus if all of the items in lists[i] are also in lists[j], this will produce an empty sequence; otherwise, it will produce a non-empty sequence.

The .Any() will return true for a non-empty sequence, and false for an empty sequence.

So lists[i].Except(lists[j]).Any() will return false if every item of lists[i] also appears in lists[j] (which, for duplicate-free lists of equal length, means the two lists hold the same items) and true otherwise.

This is the opposite of what we want for the lists.RemoveAt() so we just negate the result, giving the final code !lists[i].Except(lists[j]).Any().

Compilable console app:

using System;
using System.Collections.Generic;
using System.Linq;

static class Program
{
    static void Main()
    {
        var lists = new List<List<int>>
        {
            new() {1, 2, 3, 4, 5}, // [0]
            new() {2, 3, 4, 5, 6}, // [1]
            new() {3, 4, 5, 6, 7}, // [2]
            new() {5, 4, 3, 2, 1}, // [3] Dupe of [0]
            new() {4, 5, 6, 7, 8}, // [4]
            new() {6, 5, 4, 3, 2}, // [5] Dupe of [1]
            new() {5, 6, 7, 8, 9}, // [6]
            new() {3, 4, 5, 2, 1}, // [7] Dupe of [0]
            new() {6, 7, 8, 9, 0}  // [8]
        };

        // Walk backwards so RemoveAt(i) never shifts an index still to be visited.
        for (int i = lists.Count - 1; i > 0; i--)
        {
            for (int j = i - 1; j >= 0; j--)
            {
                if (!lists[i].Except(lists[j]).Any())
                {
                    lists.RemoveAt(i);
                    break;
                }
            }
        }

        for (int i = 0; i < lists.Count; ++i)
        {
            Console.WriteLine(string.Join(", ", lists[i]));
        }
    }
}

Try it on DotNetFiddle: https://dotnetfiddle.net/nWnOcP
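One caveat is worth illustrating: !a.Except(b).Any() only shows that every element of a is also in b. If the lists can have different lengths, a shorter list that is a subset of an earlier one would be removed as well. Below is a minimal sketch of a stricter check, still assuming no duplicates within a list. The names ListDedup, SameElements and RemoveDuplicateLists are hypothetical and not part of the original answer.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ListDedup
{
    // Assumes neither list contains duplicates. With equal counts,
    // "a has nothing that b lacks" implies both hold the same elements.
    public static bool SameElements<T>(List<T> a, List<T> b)
        => a.Count == b.Count && !a.Except(b).Any();

    // Removes every list that matches an earlier list, scanning backwards
    // so that RemoveAt does not disturb indices still to be visited.
    public static void RemoveDuplicateLists<T>(List<List<T>> lists)
    {
        for (int i = lists.Count - 1; i > 0; i--)
        {
            for (int j = i - 1; j >= 0; j--)
            {
                if (SameElements(lists[i], lists[j]))
                {
                    lists.RemoveAt(i);
                    break;
                }
            }
        }
    }
}
```

With the count check in place, a list such as [1, 2] is kept even when [1, 2, 3] appears earlier.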

Quickest way to compare two generic lists C#

It sounds like you're trying to verify that all items in list2 are contained within at least one item in list1. If that's the case, this should do the trick:

bool list1ContainsList2Items = list2.All(l2 => list1.Any(l1 => l1.Contains(l2)));
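As a small illustration of what that expression checks, here is a sketch that assumes both lists hold strings, so that l1.Contains(l2) is a substring test. The element type is an assumption, as the question's types are not shown, and the names ContainmentCheck and AllContained are invented for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ContainmentCheck
{
    // True when every string in list2 occurs as a substring
    // of at least one string in list1.
    public static bool AllContained(IEnumerable<string> list1, IEnumerable<string> list2)
        => list2.All(l2 => list1.Any(l1 => l1.Contains(l2)));
}
```

Note that list1 is scanned once per element of list2, so the cost grows with the product of the two lengths.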

Generic solution to compare two generic lists whose type is known only at run time

Your easiest solution of all would be to implement IEquatable<T> on your 27 different types. Then you can call dataFromExcel.Equals(dataFromTable), or use dataFromExcel == dataFromTable if you also overload the == operator. However, if you need to do anything with the values during the comparison, you have other options.

Using an idea from this answer, you can get this without having to use reflection. Utilize the Newtonsoft JSON.NET library to do the "heavy lifting" for you, and keep your code readable.

The one thing you'd want to do is restrict the type of each list to the same type. Your method would look something like this:

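// Requires: using Newtonsoft.Json.Linq;
private bool CompareTwoObjects<T>(T one, T two)
{
    var json1 = JObject.FromObject(one);
    var json2 = JObject.FromObject(two);

    foreach (JProperty prop1 in json1.Properties())
    {
        // JObject.Property returns null when no property has that name.
        JProperty prop2 = json2.Property(prop1.Name);

        // Compare by value; != on JToken instances only compares references.
        if (prop2 == null || !JToken.DeepEquals(prop1.Value, prop2.Value))
        {
            return false;
        }
    }

    return true;
}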

Seeing in your comment that you want to compare a collection of each, you can still utilize both options.

IEquatable<T> method:

var allAreEqual = dataFromExcel.All(one => dataFromTable.Any(two => one.Equals(two)));

Custom method:

var allAreEqual = dataFromExcel.All(one => dataFromTable.Any(two => CompareTwoObjects(one, two)));

I'm sure there's some optimization that you can do here to reduce the N factor, but this points you in the right direction.
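One possible optimization, sketched under assumptions: if the element type implements IEquatable<T> together with a consistent GetHashCode(), the inner Any scan can be replaced by a HashSet lookup. The ExcelRow type and its Id and Name properties are hypothetical stand-ins, since the real types are not shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type; the real properties are not shown in the question.
public sealed class ExcelRow : IEquatable<ExcelRow>
{
    public int Id { get; set; }
    public string Name { get; set; }

    public bool Equals(ExcelRow other)
        => other is not null && Id == other.Id && Name == other.Name;

    public override bool Equals(object obj) => Equals(obj as ExcelRow);

    // Must agree with Equals: equal rows produce equal hash codes.
    public override int GetHashCode() => (Id, Name).GetHashCode();
}

public static class RowComparison
{
    // True when every Excel row has an equal row in the table data.
    // Building the set is O(M); each lookup is O(1) on average.
    public static bool AllPresent(IEnumerable<ExcelRow> dataFromExcel, IEnumerable<ExcelRow> dataFromTable)
    {
        var tableRows = new HashSet<ExcelRow>(dataFromTable);
        return dataFromExcel.All(row => tableRows.Contains(row));
    }
}
```

This brings the comparison down from roughly N * M equality checks to about N + M hash operations, at the cost of writing Equals and GetHashCode for each type.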

How can I compare two lists of objects, ignoring order?

If you don't have any duplicates, and you just want to see whether both lists contain the same set of IDs, then this is a set operation and the easiest solution uses a HashSet<int>:

bool same = list1.Select(x => x.Id).ToHashSet().SetEquals(list2.Select(x => x.Id));

You can optimize by checking the lengths of your two lists first: if they're different lengths, they're obviously different:

bool same = list1.Count == list2.Count &&
list1.Select(x => x.Id).ToHashSet().SetEquals(list2.Select(x => x.Id));

If you want to get the objects which differ between the two lists, you can use HashSet<T>.SymmetricExceptWith. The problem here is that we now need to compare Person objects directly, rather than taking their IDs first, because we need those Person objects to come out the other side.

Therefore, we'll need to write our own IEqualityComparer:

public class PersonIdComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y) => x.Id == y.Id;
    public int GetHashCode(Person x) => x.Id.GetHashCode();
}

var set = list1.ToHashSet(new PersonIdComparer());
set.SymmetricExceptWith(list2);
// set now contains the differences
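To make the behaviour concrete, here is a self-contained sketch of the same idea. The Person class shown is a hypothetical shape (only Id matters for the comparison), and the PersonDiff.Differences wrapper is invented for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical Person type; only Id takes part in the comparison.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class PersonIdComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y) => x.Id == y.Id;
    public int GetHashCode(Person x) => x.Id.GetHashCode();
}

public static class PersonDiff
{
    // Returns the people whose Id appears in exactly one of the two lists.
    public static HashSet<Person> Differences(IEnumerable<Person> list1, IEnumerable<Person> list2)
    {
        var set = new HashSet<Person>(list1, new PersonIdComparer());
        set.SymmetricExceptWith(list2);
        return set;
    }
}
```

For lists with IDs {1, 2, 3} and {2, 3, 4}, the returned set holds the Person objects with IDs 1 and 4.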

Most efficient way to compare two generic lists based on id elements contained within nested list (C#)

Your current algorithm seems to be O(n*m*s*s), where n = number of existing items, m = number of potential matches, and s = average number of suppliers for each existing item or potential match. You could reduce the running time to O(n*m*s) by using a hash set to match suppliers.

A generic version would look like this

public static IEnumerable<(T1, T2)> SetJoin<T1, T2, TKey>(
    IEnumerable<T1> t1s,
    IEnumerable<T2> t2s,
    Func<T1, IEnumerable<TKey>> t1Key,
    Func<T2, IEnumerable<TKey>> t2Key) where TKey : IEquatable<TKey>
{
    foreach (var t1 in t1s)
    {
        var t1Keys = new HashSet<TKey>(t1Key(t1));
        foreach (var t2 in t2s)
        {
            // t2Key(t2) would be called many times,
            // might be worth pre-computing it for each t2.
            if (t2Key(t2).Any(t1Keys.Contains))
            {
                yield return (t1, t2);
            }
        }
    }
}

And call it like

SetJoin<ExistingItems, PotentialMatches, int>(
    existingItems,
    potentialMatches,
    e => e.Suppliers.Select(s => s.Id),
    p => p.Suppliers.Select(s => s.Id))

Also, while LINQ results in compact and readable code, it is often faster to write the equivalent logic with regular loops if performance is important.
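The pre-computation mentioned in the code comment could look roughly like this. It is a sketch rather than a measured improvement: the method name SetJoinPrecomputed is made up, and whether it helps depends on how expensive t2Key is.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetJoinExtensions
{
    public static IEnumerable<(T1, T2)> SetJoinPrecomputed<T1, T2, TKey>(
        IEnumerable<T1> t1s,
        IEnumerable<T2> t2s,
        Func<T1, IEnumerable<TKey>> t1Key,
        Func<T2, IEnumerable<TKey>> t2Key) where TKey : IEquatable<TKey>
    {
        // Evaluate t2Key once per t2 instead of once per (t1, t2) pair.
        var t2WithKeys = t2s
            .Select(t2 => (Item: t2, Keys: t2Key(t2).ToList()))
            .ToList();

        foreach (var t1 in t1s)
        {
            var t1Keys = new HashSet<TKey>(t1Key(t1));
            foreach (var (item, keys) in t2WithKeys)
            {
                // A pair matches when the two key collections share at least one key.
                if (keys.Any(k => t1Keys.Contains(k)))
                {
                    yield return (t1, item);
                }
            }
        }
    }
}
```

It is called in the same way as SetJoin above; the trade-off is the extra memory needed to hold the materialized key lists for every t2.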

Faster way to compare two lists of differing objects

You can use hash sets (assuming you are comparing strings) to check quickly whether a value is in a set; each lookup is O(1) on average instead of O(N):

var serverCustomerNumbers = new HashSet<string>(AllServer.Select(c => c.customerNumber));
var localReferences = new HashSet<string>(AllLocal.Select(c => c.Reference));

Now if you need to get whole customer objects:

List<customer> NotOnLocal =
    AllServer.Where(c => !localReferences.Contains(c.customerNumber)).ToList();

Or you can use set operations to get required customer numbers

 var notLocalCustomerNumbers = serverCustomerNumbers.Except(localReferences);
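Put together, the hash-set lookup might be wrapped up as below. The ServerCustomer and LocalCustomer classes are hypothetical shapes that model only the key properties; the original question's types are not shown in full.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical record shapes; only the key properties are modelled.
public class ServerCustomer { public string CustomerNumber { get; set; } }
public class LocalCustomer { public string Reference { get; set; } }

public static class CustomerSync
{
    // Server customers whose number has no matching local reference.
    public static List<ServerCustomer> NotOnLocal(
        IEnumerable<ServerCustomer> allServer, IEnumerable<LocalCustomer> allLocal)
    {
        // Build the lookup once; each Contains call is then O(1) on average.
        var localReferences = new HashSet<string>(allLocal.Select(c => c.Reference));
        return allServer.Where(c => !localReferences.Contains(c.CustomerNumber)).ToList();
    }
}
```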

Compare two different generic lists based on some common columns and return unmatched items

Let's take two classes as an example:

public class ShippingCharges
{
    public string ProductLine { get; set; }
    public int Family { get; set; }
}

public class FreightCharges
{
    public string Brand { get; set; }
    public int? Family { get; set; }
}

We add some values:

var ListA = new List<ShippingCharges>() {
    new ShippingCharges()
    {
        ProductLine = "1",
        Family = 1
    },
    new ShippingCharges()
    {
        ProductLine = "1",
        Family = 2
    },
};

var ListB = new List<FreightCharges>() {
    new FreightCharges()
    {
        Brand = "2",
        Family = 2
    },
    new FreightCharges()
    {
        Brand = "3",
        Family = 3
    },
};

Add some LINQ extensions:

public static class LinqExtensions
{
    // Elements of first that match no element of second.
    public static IEnumerable<TSource> Except<TSource, VSource>(this IEnumerable<TSource> first, IEnumerable<VSource> second, Func<TSource, VSource, bool> comparer)
    {
        return first.Where(x => !second.Any(y => comparer(x, y)));
    }

    // Elements of first that match at least one element of second.
    // Any() is used instead of FirstOrDefault() != null so that value-type elements also work.
    public static IEnumerable<TSource> Contains<TSource, VSource>(this IEnumerable<TSource> first, IEnumerable<VSource> second, Func<TSource, VSource, bool> comparer)
    {
        return first.Where(x => second.Any(y => comparer(x, y)));
    }

    // Elements of first that match exactly one element of second.
    public static IEnumerable<TSource> Intersect<TSource, VSource>(this IEnumerable<TSource> first, IEnumerable<VSource> second, Func<TSource, VSource, bool> comparer)
    {
        return first.Where(x => second.Count(y => comparer(x, y)) == 1);
    }
}

Get all elements from list A that are not in list b:

var newData = ListA.Except(ListB, (a,b) => a.Family == b.Family);

Short note: f.Brand != null ? f.Brand : null is equal to f.Brand ?? null is equal to f.Brand
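The Except extension above rescans the second list for every element of the first. When the match is a plain equality on one key, as with Family here, a hash set gives the same result in roughly linear time. This is a sketch of that alternative, not part of the original answer; the names FamilyDiff and NotInB are invented, and the two classes are repeated only to keep the example self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;

public class ShippingCharges
{
    public string ProductLine { get; set; }
    public int Family { get; set; }
}

public class FreightCharges
{
    public string Brand { get; set; }
    public int? Family { get; set; }
}

public static class FamilyDiff
{
    // Items of listA whose Family does not appear in listB.
    public static List<ShippingCharges> NotInB(
        IEnumerable<ShippingCharges> listA, IEnumerable<FreightCharges> listB)
    {
        // Collect the non-null Family values of listB once.
        var familiesInB = new HashSet<int>(
            listB.Where(b => b.Family.HasValue).Select(b => b.Family.Value));
        return listA.Where(a => !familiesInB.Contains(a.Family)).ToList();
    }
}
```

With the sample data above (families 1 and 2 in list A, 2 and 3 in list B), only the ShippingCharges item with Family = 1 is returned.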

Compare two generic lists using Linq

Note: the return type of your LINQ query is already an IEnumerable, so you do not need to create it again by casting.

Have you tried something like this:

public IEnumerable<SalesOrder> GetModifiedRecords(IEnumerable<SalesOrder> oldSalesOrderList, List<SalesOrder> newSalesOrderList)
{
    return oldSalesOrderList.Where((x, i) => newSalesOrderList[i].Value != x.Value);
}

The code above works only if both lists are in the same order. If they are not, you can try something like this (assuming that OrderId is a unique field):

return oldSalesOrderList.Where(x =>
    newSalesOrderList.Any(y => y.OrderId == x.OrderId && y.Value != x.Value));
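For larger lists, the nested Any scan can be replaced by a dictionary keyed on OrderId. The sketch below assumes OrderId is an int, Value is a decimal, and OrderId is unique in the new list; the question shows only the property names, so the types and the SalesOrderDiff wrapper are assumptions.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical SalesOrder shape; the property types are assumed.
public class SalesOrder
{
    public int OrderId { get; set; }
    public decimal Value { get; set; }
}

public static class SalesOrderDiff
{
    // Old orders that also exist in the new list but with a different Value.
    public static IEnumerable<SalesOrder> GetModifiedRecords(
        IEnumerable<SalesOrder> oldSalesOrderList, IEnumerable<SalesOrder> newSalesOrderList)
    {
        // Assumes OrderId is unique in the new list; ToDictionary throws otherwise.
        var newById = newSalesOrderList.ToDictionary(o => o.OrderId);
        return oldSalesOrderList.Where(o =>
            newById.TryGetValue(o.OrderId, out var updated) && updated.Value != o.Value);
    }
}
```

Orders that exist only in the old list are not reported, which matches the behaviour of the Any-based query.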

