Remove Duplicates from a List<T> in C#

Remove duplicates from a List<T> in C#

Perhaps you should consider using a HashSet.

From the MSDN link:

using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        HashSet<int> evenNumbers = new HashSet<int>();
        HashSet<int> oddNumbers = new HashSet<int>();

        for (int i = 0; i < 5; i++)
        {
            // Populate numbers with just even numbers.
            evenNumbers.Add(i * 2);

            // Populate oddNumbers with just odd numbers.
            oddNumbers.Add((i * 2) + 1);
        }

        Console.Write("evenNumbers contains {0} elements: ", evenNumbers.Count);
        DisplaySet(evenNumbers);

        Console.Write("oddNumbers contains {0} elements: ", oddNumbers.Count);
        DisplaySet(oddNumbers);

        // Create a new HashSet populated with even numbers.
        HashSet<int> numbers = new HashSet<int>(evenNumbers);
        Console.WriteLine("numbers UnionWith oddNumbers...");
        numbers.UnionWith(oddNumbers);

        Console.Write("numbers contains {0} elements: ", numbers.Count);
        DisplaySet(numbers);
    }

    private static void DisplaySet(HashSet<int> set)
    {
        Console.Write("{");
        foreach (int i in set)
        {
            Console.Write(" {0}", i);
        }
        Console.WriteLine(" }");
    }
}

/* This example produces output similar to the following:
 * evenNumbers contains 5 elements: { 0 2 4 6 8 }
 * oddNumbers contains 5 elements: { 1 3 5 7 9 }
 * numbers UnionWith oddNumbers...
 * numbers contains 10 elements: { 0 2 4 6 8 1 3 5 7 9 }
 */
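
The MSDN sample shows set operations rather than de-duplicating a list as such. As a minimal sketch of applying the idea to a List<T> (the names ListDedupSketch and RemoveDuplicates are made up for illustration), you can copy the list through a HashSet<T>. The enumeration order of a HashSet<T> is not guaranteed, so this is probably best kept for cases where the original order does not matter:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListDedupSketch
{
    // Hypothetical helper: copies the list through a HashSet<T> to drop repeated values.
    // Uses the default equality for T; the order of the result is not guaranteed.
    public static List<T> RemoveDuplicates<T>(List<T> source)
    {
        return new HashSet<T>(source).ToList();
    }

    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 2, 3, 3, 3 };
        List<int> unique = RemoveDuplicates(numbers);
        Console.WriteLine(unique.Count); // 3
    }
}
```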

Remove duplicate items from a list in C#

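I think this would work, provided the goal is to drop every item whose BillId occurs more than once (it keeps only the items with a unique BillId):

var result = myClassObject.GroupBy(x => x.BillId)
                          .Where(x => x.Count() == 1)
                          .Select(x => x.First());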
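
To make the difference concrete, here is a small sketch contrasting that query with the variant that keeps one item per BillId. The Bill class and both method names are assumptions for illustration; only BillId comes from the answer above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type; only BillId appears in the original answer.
public class Bill
{
    public int BillId { get; set; }
    public string Note { get; set; }
}

public static class GroupBySketch
{
    // Keeps only items whose BillId is unique; every member of a repeated group is dropped.
    public static List<Bill> OnlyUnrepeated(IEnumerable<Bill> bills)
    {
        return bills.GroupBy(b => b.BillId)
                    .Where(g => g.Count() == 1)
                    .Select(g => g.First())
                    .ToList();
    }

    // Keeps the first item of every group, so each BillId appears exactly once.
    public static List<Bill> FirstPerBillId(IEnumerable<Bill> bills)
    {
        return bills.GroupBy(b => b.BillId)
                    .Select(g => g.First())
                    .ToList();
    }

    public static void Main()
    {
        var bills = new List<Bill>
        {
            new Bill { BillId = 1, Note = "a" },
            new Bill { BillId = 1, Note = "b" },
            new Bill { BillId = 2, Note = "c" }
        };

        Console.WriteLine(OnlyUnrepeated(bills).Count); // 1 (only BillId 2 survives)
        Console.WriteLine(FirstPerBillId(bills).Count); // 2 (one item each for BillId 1 and 2)
    }
}
```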


Remove duplicates from a List<string> in C#

IEnumerable<Foo> distinctList = sourceList.DistinctBy(x => x.FooName);

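public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector)
{
    // Declare inside a static class. As an iterator, the key set is rebuilt on each
    // enumeration, so iterating the result twice still works.
    var knownKeys = new HashSet<TKey>();
    foreach (var element in source)
    {
        if (knownKeys.Add(keySelector(element)))
        {
            yield return element;
        }
    }
}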

How to remove duplicates from a List<T>?

A HashSet<T> does remove duplicates, because it's a set... but only when your type defines equality appropriately.

I suspect by "duplicate" you mean "an object with equal field values to another object" - you need to override Equals/GetHashCode for that to work, and/or implement IEquatable<Contact>... or you could provide an IEqualityComparer<Contact> to the HashSet<T> constructor.
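
For the comparer route, a rough sketch might look like the following. SimpleContact and SimpleContactComparer are hypothetical names I've picked for illustration; the stand-in type deliberately does not override Equals/GetHashCode, so all the equality logic lives in the comparer:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in type that does NOT override Equals/GetHashCode.
public class SimpleContact
{
    public string Name { get; set; }
    public string Phone { get; set; }
}

// Treats two contacts as equal when both Name and Phone match.
public sealed class SimpleContactComparer : IEqualityComparer<SimpleContact>
{
    public bool Equals(SimpleContact x, SimpleContact y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return x.Name == y.Name && x.Phone == y.Phone;
    }

    public int GetHashCode(SimpleContact contact)
    {
        if (ReferenceEquals(contact, null)) return 0;

        // EqualityComparer<string>.Default copes with null fields.
        var strings = EqualityComparer<string>.Default;
        unchecked
        {
            return strings.GetHashCode(contact.Name) * 31 + strings.GetHashCode(contact.Phone);
        }
    }
}

public static class ComparerSketch
{
    public static void Main()
    {
        var set = new HashSet<SimpleContact>(new SimpleContactComparer());
        set.Add(new SimpleContact { Name = "Ann", Phone = "555" });
        set.Add(new SimpleContact { Name = "Ann", Phone = "555" }); // rejected: equal per the comparer
        Console.WriteLine(set.Count); // 1
    }
}
```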

Instead of using a HashSet<T> you could just call the Distinct LINQ extension method. For example:

list = list.Distinct().ToList();

But again, you'll need to provide an appropriate definition of equality, somehow or other.

Here's a sample implementation. Note how I've made it immutable (equality is odd with mutable types, because two objects can be equal one minute and non-equal the next) and made the fields private, with public properties. Finally, I've sealed the class - immutable types should generally be sealed, and it makes equality easier to talk about.

using System;
using System.Collections.Generic;

public sealed class Contact : IEquatable<Contact>
{
    private readonly string firstName;
    public string FirstName { get { return firstName; } }

    private readonly string lastName;
    public string LastName { get { return lastName; } }

    private readonly string phoneNumber;
    public string PhoneNumber { get { return phoneNumber; } }

    public Contact(string firstName, string lastName, string phoneNumber)
    {
        this.firstName = firstName;
        this.lastName = lastName;
        this.phoneNumber = phoneNumber;
    }

    public override bool Equals(object other)
    {
        return Equals(other as Contact);
    }

    public bool Equals(Contact other)
    {
        if (object.ReferenceEquals(other, null))
        {
            return false;
        }
        if (object.ReferenceEquals(other, this))
        {
            return true;
        }
        return FirstName == other.FirstName &&
               LastName == other.LastName &&
               PhoneNumber == other.PhoneNumber;
    }

    public override int GetHashCode()
    {
        // Note: *not* StringComparer; EqualityComparer<T>
        // copes with null; StringComparer doesn't.
        var comparer = EqualityComparer<string>.Default;

        // Unchecked to allow overflow, which is fine
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + comparer.GetHashCode(FirstName);
            hash = hash * 31 + comparer.GetHashCode(LastName);
            hash = hash * 31 + comparer.GetHashCode(PhoneNumber);
            return hash;
        }
    }
}

EDIT: Okay, in response to requests for an explanation of the GetHashCode() implementation:

  • We want to combine the hash codes of the properties of this object
  • We're not checking for nullity anywhere, so we should assume that some of them may be null. EqualityComparer<T>.Default always handles this, which is nice... so I'm using that to get a hash code of each field.
  • The "add and multiply" approach to combining several hash codes into one is the standard one recommended by Josh Bloch. There are plenty of other general-purpose hashing algorithms, but this one works fine for most applications.
  • I don't know whether you're compiling in a checked context by default, so I've put the computation in an unchecked context. We really don't care if the repeated multiply/add leads to an overflow, because we're not looking for a "magnitude" as such... just a number that we can reach repeatedly for equal objects.

Two alternative ways of handling nullity, by the way:

public override int GetHashCode()
{
    // Unchecked to allow overflow, which is fine
    unchecked
    {
        int hash = 17;
        hash = hash * 31 + (FirstName ?? "").GetHashCode();
        hash = hash * 31 + (LastName ?? "").GetHashCode();
        hash = hash * 31 + (PhoneNumber ?? "").GetHashCode();
        return hash;
    }
}

or

public override int GetHashCode()
{
    // Unchecked to allow overflow, which is fine
    unchecked
    {
        int hash = 17;
        hash = hash * 31 + (FirstName == null ? 0 : FirstName.GetHashCode());
        hash = hash * 31 + (LastName == null ? 0 : LastName.GetHashCode());
        hash = hash * 31 + (PhoneNumber == null ? 0 : PhoneNumber.GetHashCode());
        return hash;
    }
}
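
A further option, not part of the answer above: on newer runtimes (.NET Core 2.1 and later, as far as I know) System.HashCode.Combine does the combining and treats null arguments as hash 0. This is only a sketch on a hypothetical two-field type, PersonName, rather than the Contact class itself:

```csharp
using System;

// Hypothetical two-field type used only to illustrate HashCode.Combine.
public sealed class PersonName : IEquatable<PersonName>
{
    public string First { get; }
    public string Last { get; }

    public PersonName(string first, string last)
    {
        First = first;
        Last = last;
    }

    public bool Equals(PersonName other)
    {
        return !ReferenceEquals(other, null)
            && First == other.First
            && Last == other.Last;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as PersonName);
    }

    public override int GetHashCode()
    {
        // HashCode.Combine treats a null argument as hash 0, so no manual null checks.
        return HashCode.Combine(First, Last);
    }
}

public static class HashCombineSketch
{
    public static void Main()
    {
        var a = new PersonName("Ada", null);
        var b = new PersonName("Ada", null);
        Console.WriteLine(a.Equals(b));                        // True
        Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // True
    }
}
```

The hash values themselves can differ between runs, so they should not be persisted or compared across processes.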

How to remove duplicate entries from a list of list

You can use some handy LINQ extension methods to get the job done. SelectMany will flatten the lists and select all the items, and Distinct will remove any duplicates:

List<string> mergedLists = ListsToMerge.SelectMany(x => x).Distinct().ToList();
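
As a self-contained sketch of that one-liner (MergeSketch and MergeDistinct are names invented for the example):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MergeSketch
{
    // Hypothetical helper: flattens the inner lists, then drops repeated strings.
    public static List<string> MergeDistinct(IEnumerable<IEnumerable<string>> lists)
    {
        return lists.SelectMany(x => x).Distinct().ToList();
    }

    public static void Main()
    {
        var listsToMerge = new List<List<string>>
        {
            new List<string> { "a", "b" },
            new List<string> { "b", "c" },
            new List<string> { "c", "d" }
        };

        List<string> merged = MergeDistinct(listsToMerge);
        Console.WriteLine(merged.Count); // 4
    }
}
```

Note that Distinct here uses the default string comparison, so "A" and "a" would count as different entries.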

Remove duplicates from a List<T> based on a condition in C#

System.Linq has a Distinct method with an overload that takes an IEqualityComparer<T>, which you would have to implement. The details are documented here:

https://msdn.microsoft.com/en-us/library/bb338049(v=vs.110).aspx


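Edit based on your comment: if you apply an OrderBy first, Distinct should keep the one you want, because in practice it yields the first element it encounters among equal items (the documentation describes the result as unordered, so treat this as an implementation detail). Here's some code: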

using System.Collections.Generic;
using System.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            var data = new[]
            {
                new SomeClass { SomeId = 1, AnotherId = 1, SomeOtherId = 1, Timestamp = 10 },
                new SomeClass { SomeId = 1, AnotherId = 1, SomeOtherId = 1, Timestamp = 20 }, // Duplicate
                new SomeClass { SomeId = 1, AnotherId = 2, SomeOtherId = 2, Timestamp = 30 },
                new SomeClass { SomeId = 1, AnotherId = 2, SomeOtherId = 2, Timestamp = 35 }, // Duplicate
                new SomeClass { SomeId = 2, AnotherId = 4, SomeOtherId = 4, Timestamp = 40 },
                new SomeClass { SomeId = 3, AnotherId = 2, SomeOtherId = 2, Timestamp = 50 },
                new SomeClass { SomeId = 1, AnotherId = 1, SomeOtherId = 1, Timestamp = 50 } // Duplicate
            };

            var distinctList = data
                .OrderBy(x => x.Timestamp)
                .Distinct(new SomeClassComparer())
                .ToList();
        }

        public class SomeClass
        {
            public int SomeId { get; set; }
            public int AnotherId { get; set; }
            public int SomeOtherId { get; set; }
            public int Timestamp { get; set; }
        }

        public class SomeClassComparer : IEqualityComparer<SomeClass>
        {
            public bool Equals(SomeClass x, SomeClass y)
            {
                if (ReferenceEquals(x, y))
                {
                    return true;
                }

                // Check whether any of the compared objects is null.
                if (ReferenceEquals(x, null) || ReferenceEquals(y, null))
                {
                    return false;
                }

                // Check whether the SomeClass's properties are equal.
                return x.SomeId == y.SomeId &&
                       x.AnotherId == y.AnotherId &&
                       x.SomeOtherId == y.SomeOtherId;
            }

            public int GetHashCode(SomeClass someClass)
            {
                // Check whether the object is null
                if (ReferenceEquals(someClass, null))
                {
                    return 0;
                }

                // Get hash code for the fields
                var hashSomeId = someClass.SomeId.GetHashCode();
                var hashAnotherId = someClass.AnotherId.GetHashCode();
                var hashSomeOtherId = someClass.SomeOtherId.GetHashCode();

                // Calculate the hash code for the SomeClass.
                return (hashSomeId ^ hashAnotherId) ^ hashSomeOtherId;
            }
        }
    }
}
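
If you would rather not lean on which element Distinct happens to keep, one alternative is to group by the key fields and pick the row explicitly. This is only a sketch of that different technique, with a hypothetical Row type standing in for SomeClass and an anonymous type (which compares by value) as the grouping key:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for SomeClass.
public class Row
{
    public int SomeId { get; set; }
    public int AnotherId { get; set; }
    public int SomeOtherId { get; set; }
    public int Timestamp { get; set; }
}

public static class GroupPickSketch
{
    // For each (SomeId, AnotherId, SomeOtherId) key, keeps the row with the lowest Timestamp.
    public static List<Row> EarliestPerKey(IEnumerable<Row> rows)
    {
        return rows.GroupBy(r => new { r.SomeId, r.AnotherId, r.SomeOtherId })
                   .Select(g => g.OrderBy(r => r.Timestamp).First())
                   .ToList();
    }

    public static void Main()
    {
        var data = new[]
        {
            new Row { SomeId = 1, AnotherId = 1, SomeOtherId = 1, Timestamp = 10 },
            new Row { SomeId = 1, AnotherId = 1, SomeOtherId = 1, Timestamp = 20 },
            new Row { SomeId = 1, AnotherId = 2, SomeOtherId = 2, Timestamp = 30 },
            new Row { SomeId = 1, AnotherId = 2, SomeOtherId = 2, Timestamp = 35 },
            new Row { SomeId = 2, AnotherId = 4, SomeOtherId = 4, Timestamp = 40 },
            new Row { SomeId = 3, AnotherId = 2, SomeOtherId = 2, Timestamp = 50 },
            new Row { SomeId = 1, AnotherId = 1, SomeOtherId = 1, Timestamp = 50 }
        };

        var kept = EarliestPerKey(data);
        Console.WriteLine(string.Join(",", kept.Select(r => r.Timestamp))); // 10,30,40,50
    }
}
```

Swapping OrderBy for OrderByDescending would keep the latest row per key instead.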

Remove duplicate from a list of type class

Say your class is ProgramEntry:

public class ProgramEntry {

    public long Id;
    public string Name;
    public long VM;
    public long Vm;

    public ProgramEntry (long id, string name, long vM, long vm) {
        Id = id;
        Name = name;
        VM = vM;
        Vm = vm;
    }

    public override string ToString () {
        return this.Id+":"+this.Name+"("+this.VM+"."+this.Vm+")";
    }

}

(Yes, using public fields is not good practice, but this is simply a quick-and-dirty solution.)

Now you can order them by version (first major, then minor):

List<ProgramEntry> programs = new List<ProgramEntry>();
//fill list with programs
var order = programs.OrderBy(x => -x.VM).ThenBy(x => -x.Vm);

This results in an IEnumerable<ProgramEntry> ordered with the largest major version first and, for equal major versions, the largest minor version first.

Next you can use this duplicate filter, to filter out elements with the same Name:

List<ProgramEntry> result = order.DistinctBy(x => x.Name).ToList();

DistinctBy is part of the MoreLINQ library (and is built into System.Linq from .NET 6 onward). Alternatively, you can implement it yourself in a static extension class:

public static class Foo {

    public static IEnumerable<TSource> DistinctBy<TSource, TKey>
        (this IEnumerable<TSource> source, Func<TSource, TKey> keySelector) {
        HashSet<TKey> seenKeys = new HashSet<TKey>();
        foreach (TSource element in source) {
            if (seenKeys.Add(keySelector(element))) {
                yield return element;
            }
        }
    }

}

Demo (using the csharp interactive shell):

$ csharp
Mono C# Shell, type "help;" for help

Enter statements below.
csharp> public class ProgramEntry {
>
> public long Id;
> public string Name;
> public long VM;
> public long Vm;
>
> public ProgramEntry (long id, string name, long vM, long vm) {
> Id = id;
> Name = name;
> VM = vM;
> Vm = vm;
> }
>
> public override string ToString () {
> return this.Id+":"+this.Name+"("+this.VM+"."+this.Vm+")";
> }
>
> }
csharp> List<ProgramEntry> programs = new List<ProgramEntry>();
csharp> programs.Add(new ProgramEntry(1,"ssim",2,1));
csharp> programs.Add(new ProgramEntry(2,"ssim",3,1));
csharp> programs.Add(new ProgramEntry(3,"Counter",5,1));
csharp> programs.Add(new ProgramEntry(4,"Counter",6,2));
csharp> programs.Add(new ProgramEntry(5,"Counter",6,5));
csharp> programs
{ 1:ssim(2.1), 2:ssim(3.1), 3:Counter(5.1), 4:Counter(6.2), 5:Counter(6.5) }
csharp> var order = programs.OrderBy(x => -x.VM).ThenBy(x => -x.Vm);
csharp> order
{ 5:Counter(6.5), 4:Counter(6.2), 3:Counter(5.1), 2:ssim(3.1), 1:ssim(2.1) }
csharp> List<ProgramEntry> result = order.DistinctBy(x => x.Name).ToList();
csharp> result
{ 5:Counter(6.5), 2:ssim(3.1) }

Is this the expected behavior?


