Intersect with a Custom IEqualityComparer Using LINQ

Intersect with a custom IEqualityComparer using Linq

First of all, this is wrong:

public bool Equals(MyClass item1, MyClass item2)
{
    return GetHashCode(item1) == GetHashCode(item2);
}

If the hash codes are different, the two items are certainly different; but if the hash codes are equal, the two items are not guaranteed to be equal.

So this is the correct Equals implementation:

public bool Equals(MyClass item1, MyClass item2)
{
    if (object.ReferenceEquals(item1, item2))
        return true;
    if (item1 == null || item2 == null)
        return false;
    return item1.MyString1.Equals(item2.MyString1) &&
           item1.MyString2.Equals(item2.MyString2);
}

As Slacks suggested (anticipating me), the code is the following:

var Default = new List<MyClass>
{
    new MyClass{MyString1="A",MyString2="A",MyString3="-"},
    new MyClass{MyString1="B",MyString2="B",MyString3="-"},
    new MyClass{MyString1="X",MyString2="X",MyString3="-"},
    new MyClass{MyString1="Y",MyString2="Y",MyString3="-"},
    new MyClass{MyString1="Z",MyString2="Z",MyString3="-"},
};
var Good = new List<MyClass>
{
    new MyClass{MyString1="A",MyString2="A",MyString3="+"},
    new MyClass{MyString1="B",MyString2="B",MyString3="+"},
    new MyClass{MyString1="C",MyString2="C",MyString3="+"},
    new MyClass{MyString1="D",MyString2="D",MyString3="+"},
    new MyClass{MyString1="E",MyString2="E",MyString3="+"},
};
var wantedResult = Good.Intersect(Default, new MyEqualityComparer())
                       .Union(Default, new MyEqualityComparer());

// wantedResult:
// A A +
// B B +
// X X -
// Y Y -
// Z Z -
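
For reference, here is a minimal, self-contained sketch of the whole setup. The answer never shows the MyClass definition or the comparer's GetHashCode, so both are assumptions here: MyClass is taken to be a plain class with three string properties, and the hash combines only the two properties that Equals compares. The lists are shortened, and MergeDemo is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of MyClass (not shown in the original answer).
public class MyClass
{
    public string MyString1 { get; set; }
    public string MyString2 { get; set; }
    public string MyString3 { get; set; }
}

public class MyEqualityComparer : IEqualityComparer<MyClass>
{
    public bool Equals(MyClass item1, MyClass item2)
    {
        if (ReferenceEquals(item1, item2)) return true;
        if (item1 == null || item2 == null) return false;
        // string.Equals(a, b) is null-safe, unlike a.Equals(b).
        return string.Equals(item1.MyString1, item2.MyString1)
            && string.Equals(item1.MyString2, item2.MyString2);
    }

    // Hash only the properties used by Equals, so equal items share a hash.
    public int GetHashCode(MyClass item)
    {
        if (item == null) return 0;
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (item.MyString1?.GetHashCode() ?? 0);
            hash = hash * 31 + (item.MyString2?.GetHashCode() ?? 0);
            return hash;
        }
    }
}

public static class MergeDemo
{
    public static List<string> Run()
    {
        var defaults = new List<MyClass>
        {
            new MyClass { MyString1 = "A", MyString2 = "A", MyString3 = "-" },
            new MyClass { MyString1 = "X", MyString2 = "X", MyString3 = "-" },
        };
        var good = new List<MyClass>
        {
            new MyClass { MyString1 = "A", MyString2 = "A", MyString3 = "+" },
            new MyClass { MyString1 = "C", MyString2 = "C", MyString3 = "+" },
        };
        var comparer = new MyEqualityComparer();

        // Intersect keeps the "+" version of shared items; Union then appends
        // the defaults that are not already present.
        return good.Intersect(defaults, comparer)
                   .Union(defaults, comparer)
                   .Select(x => $"{x.MyString1} {x.MyString2} {x.MyString3}")
                   .ToList();
    }
}
```

Calling MergeDemo.Run() from a Main method should give "A A +" followed by "X X -".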

IEnumerable.Intersect with custom comparer, don't understand behaviour

public override int GetHashCode(string obj)
{
    return obj.GetHashCode();
}

If Equals(a, b) is true, then GetHashCode(a) == GetHashCode(b) is required. Your EqualityComparer does not guarantee this, and this bug means the Intersect() call does not find the matching values.

A sensible enough implementation to correspond with your Equals would be:

public override int GetHashCode(string obj)
{
    if (obj == null) return 0;
    if (_re.IsMatch(obj)) return 1;
    return obj.GetHashCode(); // catch the reference-equals part for non-matches.
}

The fact that two strings with the same characters would be considered non-equal (i.e. it considers new string('1', 1) different to "1") is perhaps deliberate or perhaps a bug. Maybe the ReferenceEquals() should be string's ==?
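
To make the Equals/GetHashCode pairing concrete, here is a sketch of a complete comparer. The question's Equals method is not reproduced in this answer, so the version below is a guess at its shape: two strings are equal if they are the same reference or if both match the regex. PatternComparer, PatternComparerDemo and the pattern ^\d+$ are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical comparer: treats every string matching the pattern as equal.
public class PatternComparer : EqualityComparer<string>
{
    private readonly Regex _re;

    public PatternComparer(string pattern)
    {
        _re = new Regex(pattern);
    }

    public override bool Equals(string x, string y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return _re.IsMatch(x) && _re.IsMatch(y);
    }

    public override int GetHashCode(string obj)
    {
        if (obj == null) return 0;
        // All matching strings are equal to each other, so they must share a hash.
        if (_re.IsMatch(obj)) return 1;
        return obj.GetHashCode();
    }
}

public static class PatternComparerDemo
{
    public static List<string> Run()
    {
        var first = new List<string> { "10", "abc" };
        var second = new List<string> { "25", "xyz" };

        // "10" and "25" both match the pattern, so they count as equal.
        return first.Intersect(second, new PatternComparer(@"^\d+$")).ToList();
    }
}
```

With the question's original GetHashCode, "10" and "25" would hash differently and Intersect would never call Equals on them; with the constant hash for matches, the sketch returns just "10".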

Intersect or union with a custom IEqualityComparer using Linq

It sounds like you might want a join, or you might just want to concatenate the collections, group by the key and then sum the properties:

// Property names changed to conform with normal naming conventions
var results = collection1.Concat(collection2)
                         .GroupBy(x => x.Key)
                         .Select(g => new Item {
                             Key = g.Key,
                             Total1 = g.Sum(x => x.Total1),
                             Total2 = g.Sum(x => x.Total2)
                         });
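
As a runnable illustration of the concatenate, group and sum approach, here is a small sketch. The Item class is not shown in the answer, so its shape (a string Key and two int totals) is an assumption, as are the sample values and the MergeTotalsDemo name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Item (not shown in the original answer).
public class Item
{
    public string Key { get; set; }
    public int Total1 { get; set; }
    public int Total2 { get; set; }
}

public static class MergeTotalsDemo
{
    public static List<string> Run()
    {
        var collection1 = new List<Item>
        {
            new Item { Key = "a", Total1 = 1, Total2 = 10 },
            new Item { Key = "b", Total1 = 2, Total2 = 20 },
        };
        var collection2 = new List<Item>
        {
            new Item { Key = "a", Total1 = 3, Total2 = 30 },
            new Item { Key = "c", Total1 = 4, Total2 = 40 },
        };

        // Items sharing a key collapse into one item with summed totals;
        // keys that occur in only one collection pass through unchanged.
        return collection1.Concat(collection2)
                          .GroupBy(x => x.Key)
                          .Select(g => new Item
                          {
                              Key = g.Key,
                              Total1 = g.Sum(x => x.Total1),
                              Total2 = g.Sum(x => x.Total2)
                          })
                          .Select(i => $"{i.Key} {i.Total1} {i.Total2}")
                          .ToList();
    }
}
```

No custom IEqualityComparer is needed here, because GroupBy compares the string keys with the default comparer.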

Collection priority in LINQ Intersect, Union, using IEqualityComparer

The first collection should always win.

MSDN:

When the object returned by this method is enumerated, Union
enumerates first and second in that order and yields each element that
has not already been yielded.

Here is the implementation of Union (ILSpy, .NET 4); the first collection is enumerated first:

// System.Linq.Enumerable
private static IEnumerable<TSource> UnionIterator<TSource>(IEnumerable<TSource> first, IEnumerable<TSource> second, IEqualityComparer<TSource> comparer)
{
    Set<TSource> set = new Set<TSource>(comparer);
    foreach (TSource current in first)
    {
        if (set.Add(current))
        {
            yield return current;
        }
    }
    foreach (TSource current2 in second)
    {
        if (set.Add(current2))
        {
            yield return current2;
        }
    }
    yield break;
}
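
A quick way to observe this priority is to tag each element with the list it came from and compare by Id only. The sketch below does that; Tagged, TaggedIdComparer and UnionPriorityDemo are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type: Id is the comparison key, Source records the origin list.
public class Tagged
{
    public int Id { get; set; }
    public string Source { get; set; }
}

public class TaggedIdComparer : IEqualityComparer<Tagged>
{
    public bool Equals(Tagged x, Tagged y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Id == y.Id;
    }

    public int GetHashCode(Tagged obj)
    {
        return obj == null ? 0 : obj.Id.GetHashCode();
    }
}

public static class UnionPriorityDemo
{
    public static List<string> Run()
    {
        var first = new List<Tagged>
        {
            new Tagged { Id = 1, Source = "first" },
            new Tagged { Id = 2, Source = "first" },
        };
        var second = new List<Tagged>
        {
            new Tagged { Id = 2, Source = "second" },
            new Tagged { Id = 3, Source = "second" },
        };

        // Id 2 exists in both lists; the copy from 'first' is the one yielded.
        return first.Union(second, new TaggedIdComparer())
                    .Select(t => $"{t.Id}:{t.Source}")
                    .ToList();
    }
}
```

The result should be 1:first, 2:first, 3:second, so the duplicate with Id 2 comes from the first list.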

The same applies to Intersect (and other similar methods in Linq-To-Objects as well):

When the object returned by this method is enumerated, Intersect
enumerates first, collecting all distinct elements of that sequence.
It then enumerates second, marking those elements that occur in both
sequences. Finally, the marked elements are yielded in the order in
which they were collected.

Update: As Rawling mentioned in his comment, the MSDN documentation of Intersect is wrong. I've looked at Intersect with ILSpy: it enumerates the second collection first and only then the first, even though it is documented the other way around.

Actually Jon Skeet has also mentioned this "lie" in EduLinq: http://msmvps.com/blogs/jon_skeet/archive/2010/12/30/reimplementing-linq-to-objects-part-16-intersect-and-build-fiddling.aspx (in his words: "This is demonstrably incorrect.")

However, even though it isn't implemented as documented, it still returns the elements of the first collection, as you can see in the implementation:

// System.Linq.Enumerable
private static IEnumerable<TSource> IntersectIterator<TSource>(IEnumerable<TSource> first, IEnumerable<TSource> second, IEqualityComparer<TSource> comparer)
{
    Set<TSource> set = new Set<TSource>(comparer);
    foreach (TSource current in second)
    {
        set.Add(current);
    }
    foreach (TSource current2 in first)
    {
        if (set.Remove(current2))
        {
            yield return current2;
        }
    }
    yield break;
}
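
The same kind of experiment works for Intersect: whichever list the method is called on supplies the returned objects. In this sketch, Labeled, LabeledIdComparer and IntersectPriorityDemo are hypothetical names, and equality is by Id only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type: Id is the comparison key, Source records the origin list.
public class Labeled
{
    public int Id { get; set; }
    public string Source { get; set; }
}

public class LabeledIdComparer : IEqualityComparer<Labeled>
{
    public bool Equals(Labeled x, Labeled y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Id == y.Id;
    }

    public int GetHashCode(Labeled obj)
    {
        return obj == null ? 0 : obj.Id.GetHashCode();
    }
}

public static class IntersectPriorityDemo
{
    private static readonly List<Labeled> First = new List<Labeled>
    {
        new Labeled { Id = 1, Source = "first" },
        new Labeled { Id = 2, Source = "first" },
    };

    private static readonly List<Labeled> Second = new List<Labeled>
    {
        new Labeled { Id = 2, Source = "second" },
        new Labeled { Id = 3, Source = "second" },
    };

    // first.Intersect(second): the yielded object comes from 'first'.
    public static List<string> Forward()
    {
        return First.Intersect(Second, new LabeledIdComparer())
                    .Select(t => $"{t.Id}:{t.Source}")
                    .ToList();
    }

    // second.Intersect(first): now the yielded object comes from 'second'.
    public static List<string> Reversed()
    {
        return Second.Intersect(First, new LabeledIdComparer())
                     .Select(t => $"{t.Id}:{t.Source}")
                     .ToList();
    }
}
```

Forward() should yield 2:first and Reversed() should yield 2:second, regardless of which sequence the implementation enumerates first internally.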

Custom intersect in lambda

Sure you can! You can use this overload of Linq's Intersect extension method which takes an IEqualityComparer<T>, like this:

public class FooComparer : IEqualityComparer<Foo>
{
    public bool Equals(Foo x, Foo y)
    {
        return x.Id == y.Id && x.someKey != y.someKey;
    }

    public int GetHashCode(Foo x)
    {
        return x.Id.GetHashCode();
    }
}

...

var comparer = new FooComparer();
List<Foo> listOne = service.GetListOne();
List<Foo> listTwo = service.GetListTwo();
List<Foo> result = listOne.Intersect(listTwo, comparer).ToList();
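
Here is a sketch of how that comparer behaves on sample data. The Foo class and the service are not shown in the answer, so Foo is assumed to have an int Id and a string someKey, and the lists are built inline. FooComparer is repeated so the sketch compiles on its own. One caveat: this Equals is not reflexive (an item never equals itself), which goes against the usual IEqualityComparer contract, so treat it as a trick that happens to work with Intersect rather than a general-purpose comparer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Foo (not shown in the original answer).
public class Foo
{
    public int Id { get; set; }
    public string someKey { get; set; }
}

public class FooComparer : IEqualityComparer<Foo>
{
    // "Equal" here means: same Id but a different someKey.
    public bool Equals(Foo x, Foo y)
    {
        return x.Id == y.Id && x.someKey != y.someKey;
    }

    // Items that Equals treats as equal share an Id, so hashing the Id is consistent.
    public int GetHashCode(Foo x)
    {
        return x.Id.GetHashCode();
    }
}

public static class FooIntersectDemo
{
    public static List<string> Run()
    {
        var listOne = new List<Foo>
        {
            new Foo { Id = 1, someKey = "a" },
            new Foo { Id = 2, someKey = "b" },
        };
        var listTwo = new List<Foo>
        {
            new Foo { Id = 1, someKey = "z" }, // same Id, different key: counts as a match
            new Foo { Id = 2, someKey = "b" }, // same Id, same key: not a match
        };

        return listOne.Intersect(listTwo, new FooComparer())
                      .Select(f => $"{f.Id}:{f.someKey}")
                      .ToList();
    }
}
```

Only the item with Id 1 from listOne comes back, because it is the only one whose someKey differs from its counterpart in listTwo.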

Linq Except with custom IEqualityComparer

The solution which I ended up using could not be described as fast, but that is not a concern of mine and it does what I want in that it can be re-used and is not restricted to any particular class.

It uses the Newtonsoft.Json library to serialize the object to a string and then compares the result. This also has the advantage of working with anonymous classes and nested classes.

I am assuming that the way the comparison works is that it first calls GetHashCode on both objects and if they match it then calls Equals, which in this routine will mean that matching objects will be serialized twice.

public class JSonEqualityComparer<T> : IEqualityComparer<T>
{
    public bool Equals(T x, T y)
    {
        return String.Equals
        (
            Newtonsoft.Json.JsonConvert.SerializeObject(x),
            Newtonsoft.Json.JsonConvert.SerializeObject(y)
        );
    }

    public int GetHashCode(T obj)
    {
        return Newtonsoft.Json.JsonConvert.SerializeObject(obj).GetHashCode();
    }
}

public static partial class LinqExtensions
{
    public static IEnumerable<T> ExceptUsingJSonCompare<T>
        (this IEnumerable<T> first, IEnumerable<T> second)
    {
        return first.Except(second, new JSonEqualityComparer<T>());
    }
}

To use it, you swap Except with ExceptUsingJSonCompare, for example:

var differences = list2.ExceptUsingJSonCompare(list1); 
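
If you would rather not take a dependency on Newtonsoft.Json, the same idea can be sketched with System.Text.Json, which ships with .NET Core 3.0 and later. This is a swapped-in serializer, not what the answer above uses, and the two libraries may format some types differently. SerializedFormComparer, Person, Address and SerializedCompareDemo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Compares objects by their serialized JSON text (System.Text.Json variant).
public class SerializedFormComparer<T> : IEqualityComparer<T>
{
    public bool Equals(T x, T y)
    {
        return JsonSerializer.Serialize(x) == JsonSerializer.Serialize(y);
    }

    public int GetHashCode(T obj)
    {
        return JsonSerializer.Serialize(obj).GetHashCode();
    }
}

// Hypothetical nested types used only for the demonstration.
public class Address
{
    public string City { get; set; }
}

public class Person
{
    public string Name { get; set; }
    public Address Home { get; set; }
}

public static class SerializedCompareDemo
{
    public static List<string> Run()
    {
        var list1 = new List<Person>
        {
            new Person { Name = "Ann", Home = new Address { City = "Oslo" } },
            new Person { Name = "Bob", Home = new Address { City = "Rome" } },
        };
        var list2 = new List<Person>
        {
            new Person { Name = "Ann", Home = new Address { City = "Oslo" } },
            new Person { Name = "Bob", Home = new Address { City = "Paris" } },
            new Person { Name = "Cy", Home = new Address { City = "Rome" } },
        };

        // Items of list2 whose serialized form does not occur in list1.
        return list2.Except(list1, new SerializedFormComparer<Person>())
                    .Select(p => $"{p.Name} {p.Home.City}")
                    .ToList();
    }
}
```

Here Bob counts as different because his nested City changed, and Cy is new, so both are returned while Ann is filtered out.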

IEqualityComparer to use Except, Intersect

OK, I figured out that the problem wasn't with the code; it was with my logic.

When selecting the previous records, I wasn't excluding the current record set. So when comparing, all records were considered old, because every record was present in the historical records, including the ones I was trying to check.

I basically just needed to update the query to this line:

where i.CustomerId == customerId && i.ImportId != importId

Custom IEqualityComparer to calculate Difference between two lists using LINQ Except

You should pass a comparer for the same type your list uses to the Except method. In your example you are using Guid, but it should be Contact. Also, your Contact class doesn't have a property called "ObjectId"; try changing that to "ContactId". The following seems to work fine:

static void Main(string[] args)
{
    var list1 = new List<Contact>();
    list1.Add(new Contact() { ContactId = Guid.Parse("FB58F102-0CE4-4914-ABFF-ABBD3895D719") });
    list1.Add(new Contact() { ContactId = Guid.Parse("5A201238-6036-4385-B848-DEE598A3520C") });

    var list2 = new List<Contact>();
    list2.Add(new Contact() { ContactId = Guid.Parse("FB58F102-0CE4-4914-ABFF-ABBD3895D719") });

    var list3 = list1.Except(list2, new PropertyComparer<Contact>("ContactId"));

    foreach (var item in list3)
        Console.WriteLine(item.ContactId.ToString());

    Console.ReadLine();
}

public class Contact
{
    public Guid ContactId { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Requires: using System.Reflection;
public class PropertyComparer<T> : IEqualityComparer<T>
{
    private PropertyInfo _PropertyInfo;

    public PropertyComparer(string propertyName)
    {
        _PropertyInfo = typeof(T).GetProperty(propertyName,
            BindingFlags.GetProperty | BindingFlags.Instance | BindingFlags.Public);
        if (_PropertyInfo == null)
        {
            throw new ArgumentException(
                string.Format("{0} is not a property of type {1}.", propertyName, typeof(T)));
        }
    }

    public int GetHashCode(T obj)
    {
        // Check the property value (not obj) for null before hashing it.
        object propertyValue = _PropertyInfo.GetValue(obj, null);
        if (propertyValue == null) return 0;
        return propertyValue.GetHashCode();
    }

    public bool Equals(T x, T y)
    {
        object xValue = _PropertyInfo.GetValue(x, null);
        object yValue = _PropertyInfo.GetValue(y, null);

        if (xValue == null)
        {
            return yValue == null;
        }

        return xValue.Equals(yValue);
    }
}

Output:

5a201238-6036-4385-b848-dee598a3520c
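
If the property is known at compile time, a reflection-free alternative is to pass a key selector delegate instead of a property name. This is a different technique from the PropertyComparer above, sketched here under that assumption; KeyComparer, Entry and KeyComparerDemo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Compares items by a key extracted with a delegate instead of reflection.
public class KeyComparer<T, TKey> : IEqualityComparer<T>
{
    private readonly Func<T, TKey> _keySelector;
    private readonly IEqualityComparer<TKey> _keyComparer = EqualityComparer<TKey>.Default;

    public KeyComparer(Func<T, TKey> keySelector)
    {
        _keySelector = keySelector ?? throw new ArgumentNullException(nameof(keySelector));
    }

    public bool Equals(T x, T y)
    {
        if (x == null || y == null) return x == null && y == null;
        return _keyComparer.Equals(_keySelector(x), _keySelector(y));
    }

    public int GetHashCode(T obj)
    {
        if (obj == null) return 0;
        TKey key = _keySelector(obj);
        return key == null ? 0 : _keyComparer.GetHashCode(key);
    }
}

// Hypothetical stand-in for the Contact class.
public class Entry
{
    public Guid Id { get; set; }
}

public static class KeyComparerDemo
{
    public static List<string> Run()
    {
        var list1 = new List<Entry>
        {
            new Entry { Id = Guid.Parse("FB58F102-0CE4-4914-ABFF-ABBD3895D719") },
            new Entry { Id = Guid.Parse("5A201238-6036-4385-B848-DEE598A3520C") },
        };
        var list2 = new List<Entry>
        {
            new Entry { Id = Guid.Parse("FB58F102-0CE4-4914-ABFF-ABBD3895D719") },
        };

        // Entries of list1 whose Id does not occur in list2.
        return list1.Except(list2, new KeyComparer<Entry, Guid>(e => e.Id))
                    .Select(e => e.Id.ToString())
                    .ToList();
    }
}
```

A misspelled property then becomes a compile error rather than a runtime ArgumentException, at the cost of not being able to choose the property from a string.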

Substring comparer on Intersect

You're doing an intersection between two lists, which will give you the common items between them. Since neither list contains an identical item, you are getting no results.

If you want to get all the items from lst that contain an item from num, then you can do something like the code below, which uses the string.Contains method to filter the items from lst:

var fin = lst.Where(item => num.Any(item.Contains));

Result:

{ "abcXdef", "abcXdef", "aYcde" }

Alternatively, if you do want to do a case-insensitive query, you can use the IndexOf method instead:

var fin = lst.Where(item => num.Any(n =>
    item.IndexOf(n, StringComparison.OrdinalIgnoreCase) >= 0));

If that's hard to understand (sometimes Linq is), the first code snippet above is a shorthand way of writing the following:

var fin = new List<string>();

foreach (var item in lst)
{
    foreach (var n in num)
    {
        if (item.Contains(n))
        {
            fin.Add(item);
            break;
        }
    }
}

