Intersect with a custom IEqualityComparer using Linq
First of all this is wrong:
public bool Equals(MyClass item1, MyClass item2)
{
return GetHashCode(item1) == GetHashCode(item2);
}
If the hash codes are different, the two items are certainly different; but if the hash codes are equal, it is not guaranteed that the two items are equal, because distinct values can share a hash code.
So this is the correct Equals implementation:
public bool Equals(MyClass item1, MyClass item2)
{
if(object.ReferenceEquals(item1, item2))
return true;
if(item1 == null || item2 == null)
return false;
return item1.MyString1.Equals(item2.MyString1) &&
item1.MyString2.Equals(item2.MyString2);
}
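For completeness, here is a minimal sketch of a whole comparer whose GetHashCode is consistent with the Equals above. The shape of MyClass (three string properties) is an assumption inferred from the object initializers used further down, and the 17/31 hash combination is just one common choice, not something the original code prescribes.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of MyClass, inferred from the initializers in the answer.
public class MyClass
{
    public string MyString1 { get; set; }
    public string MyString2 { get; set; }
    public string MyString3 { get; set; }
}

public class MyEqualityComparer : IEqualityComparer<MyClass>
{
    public bool Equals(MyClass item1, MyClass item2)
    {
        if (ReferenceEquals(item1, item2)) return true;
        if (item1 == null || item2 == null) return false;
        // Static string.Equals tolerates null property values.
        return string.Equals(item1.MyString1, item2.MyString1)
            && string.Equals(item1.MyString2, item2.MyString2);
    }

    public int GetHashCode(MyClass item)
    {
        if (item == null) return 0;
        // Hash exactly the members that Equals compares, so equal items
        // always produce equal hash codes. MyString3 is deliberately ignored.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (item.MyString1 == null ? 0 : item.MyString1.GetHashCode());
            hash = hash * 31 + (item.MyString2 == null ? 0 : item.MyString2.GetHashCode());
            return hash;
        }
    }
}
```

The important property is only that GetHashCode looks at the same members as Equals; any combination that respects this would do.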
As Slacks suggested (anticipating me) the code is the following:
var Default = new List<MyClass>
{
new MyClass{MyString1="A",MyString2="A",MyString3="-"},
new MyClass{MyString1="B",MyString2="B",MyString3="-"},
new MyClass{MyString1="X",MyString2="X",MyString3="-"},
new MyClass{MyString1="Y",MyString2="Y",MyString3="-"},
new MyClass{MyString1="Z",MyString2="Z",MyString3="-"},
};
var Good = new List<MyClass>
{
new MyClass{MyString1="A",MyString2="A",MyString3="+"},
new MyClass{MyString1="B",MyString2="B",MyString3="+"},
new MyClass{MyString1="C",MyString2="C",MyString3="+"},
new MyClass{MyString1="D",MyString2="D",MyString3="+"},
new MyClass{MyString1="E",MyString2="E",MyString3="+"},
};
var wantedResult = Good.Intersect(Default, new MyEqualityComparer())
.Union(Default, new MyEqualityComparer());
// wantedResult:
// A A +
// B B +
// X X -
// Y Y -
// Z Z -
IEnumerable.Intersect with custom comparer, don't understand behaviour
public override int GetHashCode(string obj)
{
return obj.GetHashCode();
}
If Equals(a, b) is true, then it is required that GetHashCode(a) == GetHashCode(b). This is not guaranteed by your EqualityComparer, and this bug means the matching values are not found by the Intersect() call.
A sensible enough implementation to correspond with your Equals would be:
public override int GetHashCode(string obj)
{
if (obj == null) return 0;
if (_re.IsMatch(obj)) return 1;
return obj.GetHashCode(); // catch the reference-equals part for non-matches.
}
The fact that two strings with the same characters would be considered non-equal (i.e. it considers new string('1', 1) different to "1") is perhaps deliberate or perhaps a bug. Maybe the ReferenceEquals() should be string's ==?
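To make the pairing concrete, here is a sketch of a complete comparer along these lines. The question's Equals is not shown here, so this assumes it treats any two strings matching the regex as equal; the class name PatternEqualityComparer and the constructor parameter are hypothetical, and non-matching strings are compared by value (string ==) rather than by reference.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical comparer: all strings matching the pattern are considered
// equal to each other; everything else compares by value.
public sealed class PatternEqualityComparer : IEqualityComparer<string>
{
    private readonly Regex _re;

    public PatternEqualityComparer(string pattern)
    {
        _re = new Regex(pattern);
    }

    public bool Equals(string x, string y)
    {
        if (x == null || y == null) return x == y;
        if (_re.IsMatch(x) && _re.IsMatch(y)) return true;
        return x == y;
    }

    public int GetHashCode(string obj)
    {
        if (obj == null) return 0;
        // Every matching string must land in the same bucket,
        // because Equals says they are all equal.
        if (_re.IsMatch(obj)) return 1;
        return obj.GetHashCode();
    }
}

public static class PatternDemo
{
    public static List<string> Run()
    {
        var first = new[] { "a1", "b" };
        var second = new[] { "a2", "b", "c" };
        // "a1" and "a2" both match "^a", so they count as the same element.
        return first.Intersect(second, new PatternEqualityComparer("^a")).ToList();
    }
}
```

With a consistent GetHashCode, PatternDemo.Run() should find both "a1" (matched against "a2") and "b".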
Intersect or union with a custom IEqualityComparer using Linq
It sounds like you might want a join, or you might just want to concatenate the collections, group by the key and then sum the properties:
// Property names changed to conform with normal naming conventions
var results = collection1.Concat(collection2)
.GroupBy(x => x.Key)
.Select(g => new Item {
Key = g.Key,
Total1 = g.Sum(x => x.Total1),
Total2 = g.Sum(x => x.Total2)
});
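If a join is what you are after, a rough sketch could look like the following. The Item class with Key, Total1 and Total2 is an assumption based on the snippet above (the key and total types are guesses), and the sketch assumes keys are unique within each collection; with duplicates, Join yields one result per matching pair.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed item shape, matching the property names used above.
public class Item
{
    public string Key { get; set; }
    public int Total1 { get; set; }
    public int Total2 { get; set; }
}

public static class JoinSketch
{
    // Inner join: only keys present in BOTH collections survive,
    // and their totals are added together.
    public static List<Item> SumMatching(IEnumerable<Item> collection1,
                                         IEnumerable<Item> collection2)
    {
        return collection1.Join(
                collection2,
                a => a.Key,
                b => b.Key,
                (a, b) => new Item
                {
                    Key = a.Key,
                    Total1 = a.Total1 + b.Total1,
                    Total2 = a.Total2 + b.Total2
                })
            .ToList();
    }
}
```

The difference from the Concat/GroupBy version is that the join behaves like an intersection (unmatched keys are dropped), while Concat/GroupBy behaves like a union (every key is kept).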
Collection priority in LINQ Intersect, Union, using IEqualityComparer
The first collection should win always.
MSDN:
When the object returned by this method is enumerated, Union
enumerates first and second in that order and yields each element that
has not already been yielded.
Here is the implementation of Union (ILSpy, .NET 4); the first collection is enumerated first:
// System.Linq.Enumerable
private static IEnumerable<TSource> UnionIterator<TSource>(IEnumerable<TSource> first, IEnumerable<TSource> second, IEqualityComparer<TSource> comparer)
{
Set<TSource> set = new Set<TSource>(comparer);
foreach (TSource current in first)
{
if (set.Add(current))
{
yield return current;
}
}
foreach (TSource current2 in second)
{
if (set.Add(current2))
{
yield return current2;
}
}
yield break;
}
The same applies to Intersect (and other similar methods in Linq-To-Objects as well):
When the object returned by this method is enumerated, Intersect
enumerates first, collecting all distinct elements of that sequence.
It then enumerates second, marking those elements that occur in both
sequences. Finally, the marked elements are yielded in the order in
which they were collected.
Update: As Rawling has mentioned in his comment, the MSDN documentation of Intersect is wrong. I've looked at Intersect with ILSpy and it enumerates the second collection first and only then the first, even though it is documented the other way around.
Actually Jon Skeet has also mentioned this "lie" in EduLinq: http://msmvps.com/blogs/jon_skeet/archive/2010/12/30/reimplementing-linq-to-objects-part-16-intersect-and-build-fiddling.aspx (in his words: "This is demonstrably incorrect.")
However, even if it isn't implemented as expected it will still return the element of the first collection as you can see in the implementation:
// System.Linq.Enumerable
private static IEnumerable<TSource> IntersectIterator<TSource>(IEnumerable<TSource> first, IEnumerable<TSource> second, IEqualityComparer<TSource> comparer)
{
Set<TSource> set = new Set<TSource>(comparer);
foreach (TSource current in second)
{
set.Add(current);
}
foreach (TSource current2 in first)
{
if (set.Remove(current2))
{
yield return current2;
}
}
yield break;
}
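A quick way to see the "first collection wins" behaviour for yourself is a small sketch with a case-insensitive comparer, where the two collections hold elements that are equal under the comparer but distinguishable by casing. The class and method names here are made up for the demonstration.

```csharp
using System;
using System.Linq;

public static class PriorityDemo
{
    private static readonly string[] First = { "A", "B" };
    private static readonly string[] Second = { "b", "a", "c" };

    // Elements common to both; the instances come from First.
    public static string[] IntersectDemo()
    {
        return First.Intersect(Second, StringComparer.OrdinalIgnoreCase).ToArray();
    }

    // All distinct elements; on a tie the instance from First is kept.
    public static string[] UnionDemo()
    {
        return First.Union(Second, StringComparer.OrdinalIgnoreCase).ToArray();
    }
}
```

Here IntersectDemo() yields "A", "B" and UnionDemo() yields "A", "B", "c": the upper-case instances from the first collection are the ones returned, which matches the iterator code shown above.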
Custom intersect in lambda
Sure you can! You can use this overload of Linq's Intersect extension method which takes an IEqualityComparer<T>, like this:
public class FooComparer : IEqualityComparer<Foo>
{
public bool Equals(Foo x, Foo y)
{
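// Note: this Equals is not reflexive (Equals(x, x) is false), which breaks the
// usual IEqualityComparer<T> contract, so treat this comparer as specific to this call.
return x.Id == y.Id && x.someKey != y.someKey;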
}
public int GetHashCode(Foo x)
{
return x.Id.GetHashCode();
}
}
...
var comparer = new FooComparer();
List<Foo> listOne = service.GetListOne();
List<Foo> listTwo = service.GetListTwo();
List<Foo> result = listOne.Intersect(listTwo, comparer).ToList();
Linq Except with custom IEqualityComparer
The solution which I ended up using could not be described as fast, but that is not a concern of mine and it does what I want in that it can be re-used and is not restricted to any particular class.
It uses the Newtonsoft.Json library to serialize the object to a string and then compares the result. This also has the advantage of working with anonymous classes and nested classes.
I am assuming that the way the comparison works is that it first calls GetHashCode on both objects and if they match it then calls Equals, which in this routine will mean that matching objects will be serialized twice.
public class JSonEqualityComparer<T> : IEqualityComparer<T>
{
public bool Equals(T x, T y)
{
return String.Equals
(
Newtonsoft.Json.JsonConvert.SerializeObject(x),
Newtonsoft.Json.JsonConvert.SerializeObject(y)
);
}
public int GetHashCode(T obj)
{
return Newtonsoft.Json.JsonConvert.SerializeObject(obj).GetHashCode();
}
}
public static partial class LinqExtensions
{
public static IEnumerable<T> ExceptUsingJSonCompare<T>
(this IEnumerable<T> first, IEnumerable<T> second)
{
return first.Except(second, new JSonEqualityComparer<T>());
}
}
To use it you swap Except with ExceptUsingJSonCompare, for example:
var differences = list2.ExceptUsingJSonCompare(list1);
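The assumption above about the call order (GetHashCode first, Equals only when hash codes match) can be checked rather than guessed. The following is a small sketch of a wrapper comparer that counts calls; CountingComparer and CallOrderDemo are hypothetical names, and the exact counts may depend on the runtime's set implementation, so none are claimed here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper that forwards to another comparer and counts calls.
public sealed class CountingComparer<T> : IEqualityComparer<T>
{
    private readonly IEqualityComparer<T> _inner;

    public int EqualsCalls { get; private set; }
    public int GetHashCodeCalls { get; private set; }

    public CountingComparer(IEqualityComparer<T> inner)
    {
        _inner = inner;
    }

    public bool Equals(T x, T y)
    {
        EqualsCalls++;
        return _inner.Equals(x, y);
    }

    public int GetHashCode(T obj)
    {
        GetHashCodeCalls++;
        return _inner.GetHashCode(obj);
    }
}

public static class CallOrderDemo
{
    public static CountingComparer<string> Run(out List<string> differences)
    {
        var counting = new CountingComparer<string>(StringComparer.Ordinal);
        // ToList() forces enumeration so the comparer is actually invoked.
        differences = new[] { "a", "b", "c" }.Except(new[] { "b" }, counting).ToList();
        Console.WriteLine("GetHashCode calls: " + counting.GetHashCodeCalls);
        Console.WriteLine("Equals calls: " + counting.EqualsCalls);
        return counting;
    }
}
```

Wrapping JSonEqualityComparer<T> the same way would show how many times each object is really serialized on your runtime.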
IEqualityComparer to use Except, Intersect
Ok, I figured out the problem wasn't with code, it was with my logic.
When selecting the previous records, I wasn't excluding the current record set. So when comparing, all records were considered old because all the records were present in the historical records, including the records I was trying to check against.
Basically, I just needed to update the filter to this line:
where i.CustomerId == customerId && i.ImportId != importId
Custom IEqualityComparer to calculate Difference between two lists using LINQ Except
You should pass the same type used by your list to the Except method. In your example you are using Guid, but it should be of type Contact. Also, your Contact class doesn't have a property called "ObjectId"; try changing that to "ContactId". The following seems to work fine:
static void Main(string[] args)
{
var list1 = new List<Contact>();
list1.Add(new Contact() { ContactId = Guid.Parse("FB58F102-0CE4-4914-ABFF-ABBD3895D719") });
list1.Add(new Contact() { ContactId = Guid.Parse("5A201238-6036-4385-B848-DEE598A3520C") });
var list2 = new List<Contact>();
list2.Add(new Contact() { ContactId = Guid.Parse("FB58F102-0CE4-4914-ABFF-ABBD3895D719") });
var list3 = list1.Except(list2, new PropertyComparer<Contact>("ContactId"));
foreach (var item in list3)
Console.WriteLine(item.ContactId.ToString());
Console.ReadLine();
}
public class Contact
{
public Guid ContactId { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
}
public class PropertyComparer<T> : IEqualityComparer<T>
{
private PropertyInfo _PropertyInfo;
public PropertyComparer(string propertyName)
{
_PropertyInfo = typeof(T).GetProperty(propertyName,
BindingFlags.GetProperty | BindingFlags.Instance | BindingFlags.Public);
if (_PropertyInfo == null)
{
throw new ArgumentException(
string.Format("{0} is not a property of type {1}.", propertyName, typeof(T)));
}
}
public int GetHashCode(T obj)
{
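// Check the object before reading the property, and guard a null property value.
if (obj == null) return 0;
object propertyValue = _PropertyInfo.GetValue(obj, null);
return propertyValue == null ? 0 : propertyValue.GetHashCode();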
}
public bool Equals(T x, T y)
{
object xValue = _PropertyInfo.GetValue(x, null);
object yValue = _PropertyInfo.GetValue(y, null);
if (xValue == null)
{
return yValue == null;
}
return xValue.Equals(yValue);
}
}
Output:
5a201238-6036-4385-b848-dee598a3520c
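If you would rather avoid reflection and property names in strings, one alternative is a comparer built from a key-selector delegate. This is only a sketch of that idea, not part of the original answer; KeyComparer is a hypothetical name, and the Contact class is repeated here (with just the one property) so the block stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Contact
{
    public Guid ContactId { get; set; }
}

// Hypothetical comparer that compares items by a selected key.
public sealed class KeyComparer<T, TKey> : IEqualityComparer<T>
{
    private readonly Func<T, TKey> _keySelector;
    private readonly IEqualityComparer<TKey> _keyComparer = EqualityComparer<TKey>.Default;

    public KeyComparer(Func<T, TKey> keySelector)
    {
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));
        _keySelector = keySelector;
    }

    public bool Equals(T x, T y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return _keyComparer.Equals(_keySelector(x), _keySelector(y));
    }

    public int GetHashCode(T obj)
    {
        if (obj == null) return 0;
        TKey key = _keySelector(obj);
        // Hash the same key that Equals compares.
        return key == null ? 0 : _keyComparer.GetHashCode(key);
    }
}

public static class KeyComparerDemo
{
    public static List<Contact> Run()
    {
        var shared = Guid.Parse("FB58F102-0CE4-4914-ABFF-ABBD3895D719");
        var only1 = Guid.Parse("5A201238-6036-4385-B848-DEE598A3520C");
        var list1 = new List<Contact> { new Contact { ContactId = shared }, new Contact { ContactId = only1 } };
        var list2 = new List<Contact> { new Contact { ContactId = shared } };
        return list1.Except(list2, new KeyComparer<Contact, Guid>(c => c.ContactId)).ToList();
    }
}
```

The trade-off is that the key is checked by the compiler (a typo such as "ObjectId" would not build), at the cost of having to spell out the type arguments.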
Substring comparer on Intersect
You're doing an intersection between two lists, which will give you the common items between them. Since neither list contains an identical item, you are getting no results.
If you want to get all the items from lst that contain an item from num, then you can do something like the code below, which uses the string.Contains method to filter the items from lst:
var fin = lst.Where(item => num.Any(item.Contains));
Result:
{ "abcXdef", "abcXdef", "aYcde" }
Alternatively, if you do want to do a case-insensitive query, you can use the IndexOf method instead:
var fin = lst.Where(item => num.Any(n =>
item.IndexOf(n, StringComparison.OrdinalIgnoreCase) >= 0));
If that's hard to understand (sometimes Linq is), the first code snippet above is a shorthand way of writing the following:
var fin = new List<string>();
foreach (var item in lst)
{
foreach (var n in num)
{
if (item.Contains(n))
{
fin.Add(item);
break;
}
}
}