Remove Duplicates from a List<Int>

Delete duplicates in a List of int arrays

Use GroupBy:

var result = intArrList.GroupBy(c => String.Join(",", c))
.Select(c => c.First().ToList()).ToList();

The result:

{0, 0, 0}

{20, 30, 10, 4, 6}

{1, 2, 5}

{12, 22, 54}

{1, 2, 6, 7, 8}

{0, 0, 0, 0}

EDIT: If you want to consider {1,2,3,4} be equal to {2,3,4,1} you need to use OrderBy like this:

var result = intArrList.GroupBy(p => string.Join(", ", p.OrderBy(c => c)))
.Select(c => c.First().ToList()).ToList();

EDIT2: To help understanding how the LINQ GroupBy solution works consider the following method:

public List<int[]> FindDistinctWithoutLinq(List<int[]> lst)
{
var dic = new Dictionary<string, int[]>();
foreach (var item in lst)
{
string key = string.Join(",", item.OrderBy(c=>c));

if (!dic.ContainsKey(key))
{
dic.Add(key, item);
}
}

return dic.Values.ToList();
}

Remove duplicates from a listint

Using the list::remove_if member function, a temporary hashed set, and lambda expression.

std::list<int> l;
std::unordered_set<int> s;

l.remove_if([&](int n) {
return (s.find(n) == s.end()) ? (s.insert(n), false) : true;
});

Remove duplicates from a ListT in C#

Perhaps you should consider using a HashSet.

From the MSDN link:

using System;
using System.Collections.Generic;

class Program
{
static void Main()
{
HashSet<int> evenNumbers = new HashSet<int>();
HashSet<int> oddNumbers = new HashSet<int>();

for (int i = 0; i < 5; i++)
{
// Populate numbers with just even numbers.
evenNumbers.Add(i * 2);

// Populate oddNumbers with just odd numbers.
oddNumbers.Add((i * 2) + 1);
}

Console.Write("evenNumbers contains {0} elements: ", evenNumbers.Count);
DisplaySet(evenNumbers);

Console.Write("oddNumbers contains {0} elements: ", oddNumbers.Count);
DisplaySet(oddNumbers);

// Create a new HashSet populated with even numbers.
HashSet<int> numbers = new HashSet<int>(evenNumbers);
Console.WriteLine("numbers UnionWith oddNumbers...");
numbers.UnionWith(oddNumbers);

Console.Write("numbers contains {0} elements: ", numbers.Count);
DisplaySet(numbers);
}

private static void DisplaySet(HashSet<int> set)
{
Console.Write("{");
foreach (int i in set)
{
Console.Write(" {0}", i);
}
Console.WriteLine(" }");
}
}

/* This example produces output similar to the following:
* evenNumbers contains 5 elements: { 0 2 4 6 8 }
* oddNumbers contains 5 elements: { 1 3 5 7 9 }
* numbers UnionWith oddNumbers...
* numbers contains 10 elements: { 0 2 4 6 8 1 3 5 7 9 }
*/

How to remove duplicate List in my ListListint object?

All the arithmetics like 20, 10 + 10, 5 + 15, 2 + 18 and 18 + 2 will be computed at the compile time,
so at the run time you can't distinguish 20's from one another.

However, you may change design from sums (18 + 2) into just tems (18, 2):

// please, notice commas instead of +'s 
var lists = new List<List<int>>() {
new List<int> { 20 },
new List<int> { 10, 10 },
new List<int> { 5, 15 },
new List<int> { 2, 18 },
new List<int> { 18, 2 },
};

In this case case you can implement duplicates eliminations

// simplest, providing that list doesn't contain null's
for (int i = 0; i < lists.Count; ++i) {
// since we want to compare sequecnes, we shall ensure the same order of their items
var item = lists[i].OrderBy(x => x).ToArray();

for (int j = lists.Count - 1; j > i; --j)
if (item.SequenceEqual(lists[j].OrderBy(x => x)))
lists.RemoveAt(j);
}

Test

var result = lists.Select(line => string.Join(" + ", line));
Console.Write(string.Join(Environment.NewLine, result));

The output is

20
10 + 10
5 + 15
2 + 18

Most efficient way to remove duplicates from a List

There is a big difference between these two approaches:

List<int> Result1 = new HashSet<int>(myList).ToList(); //3700 ticks
List<int> Result2 = myList.Distinct().ToList(); //4700 ticks

The first one can (will probably) change the order of the elements of the returned List<>: Result1 elements won't be in the same order of myList's ones. The second maintains the original ordering.

There is probably no faster way than the first one.

There is probably no "more correct" (for a certain definition of "correct" based on ordering) than the second one.

(the third one is similar to the second one, only slower)

Just out of curiousity, the Distinct() is:

// Reference source http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,712
public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source) {
if (source == null) throw Error.ArgumentNull("source");
return DistinctIterator<TSource>(source, null);
}

// Reference source http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,722
static IEnumerable<TSource> DistinctIterator<TSource>(IEnumerable<TSource> source, IEqualityComparer<TSource> comparer) {
Set<TSource> set = new Set<TSource>(comparer);
foreach (TSource element in source)
if (set.Add(element)) yield return element;
}

So in the end the Distinct() simply uses an internal implementation of an HashSet<> (called Set<>) to check for the uniqueness of items.

For completeness sake, I'll add a link to the question Does C# Distinct() method keep original ordering of sequence intact?

C# remove duplicates from ListListint

Build custom of EqualityComparer<List<int>>:

public class CusComparer : IEqualityComparer<List<int>>
{
public bool Equals(List<int> x, List<int> y)
{
return x.SequenceEqual(y);
}

public int GetHashCode(List<int> obj)
{
int hashCode = 0;

for (var index = 0; index < obj.Count; index++)
{
hashCode ^= new {Index = index, Item = obj[index]}.GetHashCode();
}

return hashCode;
}
}

Then you can get the result by using Distinct with custom comparer method:

var result = my_list.Distinct(new CusComparer());

Edit:

Include the index into method GetHashCode to make sure different orders will not be equal

Removing duplicates in lists

The common approach to get a unique collection of items is to use a set. Sets are unordered collections of distinct objects. To create a set from any iterable, you can simply pass it to the built-in set() function. If you later need a real list again, you can similarly pass the set to the list() function.

The following example should cover whatever you are trying to do:

>>> t = [1, 2, 3, 1, 2, 3, 5, 6, 7, 8]
>>> list(set(t))
[1, 2, 3, 5, 6, 7, 8]
>>> s = [1, 2, 3]
>>> list(set(t) - set(s))
[8, 5, 6, 7]

As you can see from the example result, the original order is not maintained. As mentioned above, sets themselves are unordered collections, so the order is lost. When converting a set back to a list, an arbitrary order is created.

Maintaining order

If order is important to you, then you will have to use a different mechanism. A very common solution for this is to rely on OrderedDict to keep the order of keys during insertion:

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(t))
[1, 2, 3, 5, 6, 7, 8]

Starting with Python 3.7, the built-in dictionary is guaranteed to maintain the insertion order as well, so you can also use that directly if you are on Python 3.7 or later (or CPython 3.6):

>>> list(dict.fromkeys(t))
[1, 2, 3, 5, 6, 7, 8]

Note that this may have some overhead of creating a dictionary first, and then creating a list from it. If you don’t actually need to preserve the order, you’re often better off using a set, especially because it gives you a lot more operations to work with. Check out this question for more details and alternative ways to preserve the order when removing duplicates.


Finally note that both the set as well as the OrderedDict/dict solutions require your items to be hashable. This usually means that they have to be immutable. If you have to deal with items that are not hashable (e.g. list objects), then you will have to use a slow approach in which you will basically have to compare every item with every other item in a nested loop.



Related Topics



Leave a reply



Submit