Delete duplicates in a List of int arrays
Use GroupBy:
var result = intArrList.GroupBy(c => String.Join(",", c))
.Select(c => c.First().ToList()).ToList();
The result:
{0, 0, 0}
{20, 30, 10, 4, 6}
{1, 2, 5}
{12, 22, 54}
{1, 2, 6, 7, 8}
{0, 0, 0, 0}
EDIT: If you want to consider {1,2,3,4} to be equal to {2,3,4,1}, you need to use OrderBy like this:
var result = intArrList.GroupBy(p => string.Join(", ", p.OrderBy(c => c)))
.Select(c => c.First().ToList()).ToList();
EDIT2: To help understand how the LINQ GroupBy solution works, consider the following method:
public List<int[]> FindDistinctWithoutLinq(List<int[]> lst)
{
    var dic = new Dictionary<string, int[]>();
    foreach (var item in lst)
    {
        // Sorting first makes the key independent of element order.
        string key = string.Join(",", item.OrderBy(c => c));
        if (!dic.ContainsKey(key))
        {
            dic.Add(key, item);
        }
    }
    return dic.Values.ToList();
}
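The same key-based idea can be sketched in Python, where tuples are hashable and can serve as dictionary keys directly, so no joined string is needed. This is only an illustration of the technique, not part of the original answer; the function name `distinct_arrays` and the `ignore_order` flag are made up for the example.

```python
def distinct_arrays(arrays, ignore_order=False):
    """Return the first occurrence of each distinct array, in input order."""
    seen = {}
    for arr in arrays:
        # A tuple is hashable, so it can act as the grouping key.
        # Sorting first makes [1, 2, 5] and [5, 2, 1] share one key.
        key = tuple(sorted(arr)) if ignore_order else tuple(arr)
        # setdefault keeps the first array stored under each key.
        seen.setdefault(key, arr)
    return list(seen.values())


data = [[0, 0, 0], [1, 2, 5], [5, 2, 1], [0, 0, 0], [0, 0, 0, 0]]
print(distinct_arrays(data))
print(distinct_arrays(data, ignore_order=True))
```

The sketch relies on dictionaries preserving insertion order (Python 3.7 or later), which is why the first occurrence of each array comes back in its original position.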
Remove duplicates from a list<int>
Using the list::remove_if member function, a temporary hashed set, and a lambda expression.
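#include <list>
#include <unordered_set>

std::list<int> l;            // the list to deduplicate
std::unordered_set<int> s;   // values seen so far
l.remove_if([&](int n) {
    // insert() reports whether n was new; remove n when it was already present
    return !s.insert(n).second;
});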
Remove duplicates from a List<T> in C#
Perhaps you should consider using a HashSet.
From the MSDN link:
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        HashSet<int> evenNumbers = new HashSet<int>();
        HashSet<int> oddNumbers = new HashSet<int>();

        for (int i = 0; i < 5; i++)
        {
            // Populate evenNumbers with just even numbers.
            evenNumbers.Add(i * 2);

            // Populate oddNumbers with just odd numbers.
            oddNumbers.Add((i * 2) + 1);
        }

        Console.Write("evenNumbers contains {0} elements: ", evenNumbers.Count);
        DisplaySet(evenNumbers);

        Console.Write("oddNumbers contains {0} elements: ", oddNumbers.Count);
        DisplaySet(oddNumbers);

        // Create a new HashSet populated with even numbers.
        HashSet<int> numbers = new HashSet<int>(evenNumbers);
        Console.WriteLine("numbers UnionWith oddNumbers...");
        numbers.UnionWith(oddNumbers);

        Console.Write("numbers contains {0} elements: ", numbers.Count);
        DisplaySet(numbers);
    }

    private static void DisplaySet(HashSet<int> set)
    {
        Console.Write("{");
        foreach (int i in set)
        {
            Console.Write(" {0}", i);
        }
        Console.WriteLine(" }");
    }
}

/* This example produces output similar to the following:
 * evenNumbers contains 5 elements: { 0 2 4 6 8 }
 * oddNumbers contains 5 elements: { 1 3 5 7 9 }
 * numbers UnionWith oddNumbers...
 * numbers contains 10 elements: { 0 2 4 6 8 1 3 5 7 9 }
 */
How to remove duplicate List in my List<List<int>> object?
All the arithmetic, such as 20, 10 + 10, 5 + 15, 2 + 18 and 18 + 2, will be computed at compile time, so at run time you can't distinguish the 20's from one another.
However, you may change the design from sums (18 + 2) into just terms (18, 2):
// please notice the commas instead of +'s
var lists = new List<List<int>>() {
    new List<int> { 20 },
    new List<int> { 10, 10 },
    new List<int> { 5, 15 },
    new List<int> { 2, 18 },
    new List<int> { 18, 2 },
};
In this case you can implement duplicate elimination:
// simplest version, provided that the list doesn't contain nulls
for (int i = 0; i < lists.Count; ++i) {
    // since we want to compare sequences, we must ensure the same order of their items
    var item = lists[i].OrderBy(x => x).ToArray();
    // iterate backwards so that RemoveAt doesn't shift items still to be visited
    for (int j = lists.Count - 1; j > i; --j)
        if (item.SequenceEqual(lists[j].OrderBy(x => x)))
            lists.RemoveAt(j);
}
Test
var result = lists.Select(line => string.Join(" + ", line));
Console.Write(string.Join(Environment.NewLine, result));
The output is
20
10 + 10
5 + 15
2 + 18
Most efficient way to remove duplicates from a List
There is a big difference between these two approaches:
List<int> Result1 = new HashSet<int>(myList).ToList(); //3700 ticks
List<int> Result2 = myList.Distinct().ToList(); //4700 ticks
The first one can (and probably will) change the order of the elements of the returned List<>: Result1's elements won't be in the same order as myList's. The second maintains the original ordering.
There is probably no faster way than the first one.
There is probably no "more correct" way (for a certain definition of "correct" based on ordering) than the second one.
(The third approach is similar to the second one, only slower.)
Just out of curiosity, Distinct() is implemented as:
// Reference source http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,712
public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source) {
    if (source == null) throw Error.ArgumentNull("source");
    return DistinctIterator<TSource>(source, null);
}

// Reference source http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,722
static IEnumerable<TSource> DistinctIterator<TSource>(IEnumerable<TSource> source, IEqualityComparer<TSource> comparer) {
    Set<TSource> set = new Set<TSource>(comparer);
    foreach (TSource element in source)
        if (set.Add(element)) yield return element;
}
So in the end Distinct() simply uses an internal implementation of a HashSet<> (called Set<>) to check for the uniqueness of items, yielding each element the first time it is seen.
For completeness' sake, I'll add a link to the question Does C# Distinct() method keep original ordering of sequence intact?
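The iterator pattern shown above (add each element to a set and yield it only when the add succeeds) can be sketched in Python as a generator. This is an illustration of the pattern, not the .NET implementation; the name `distinct` is chosen for the example.

```python
def distinct(iterable):
    """Yield each item the first time it is seen, preserving input order."""
    seen = set()
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item


my_list = [3, 1, 3, 2, 1]
print(list(distinct(my_list)))  # [3, 1, 2]
```

Because items are yielded in the order they are first encountered, the result keeps the original ordering, which is the practical difference from building a set and converting it back to a list.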
C# remove duplicates from List<List<int>>
Build a custom IEqualityComparer<List<int>>:
public class CusComparer : IEqualityComparer<List<int>>
{
    public bool Equals(List<int> x, List<int> y)
    {
        return x.SequenceEqual(y);
    }

    public int GetHashCode(List<int> obj)
    {
        int hashCode = 0;
        for (var index = 0; index < obj.Count; index++)
        {
            hashCode ^= new {Index = index, Item = obj[index]}.GetHashCode();
        }
        return hashCode;
    }
}
Then you can get the result by calling Distinct with the custom comparer:
var result = my_list.Distinct(new CusComparer());
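Edit:
The index is included in GetHashCode so that lists holding the same items in a different order are unlikely to share a hash code. Equals (via SequenceEqual) already treats such lists as different.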
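Why the index matters can be sketched in Python. XOR is commutative, so a hash that only XORs the item hashes cannot tell permutations apart, while mixing in the index makes the result depend on position. The helper names `xor_hash` and `indexed_xor_hash` are invented for this illustration, and the hash values have nothing to do with the ones .NET would produce.

```python
from functools import reduce


def xor_hash(items):
    """XOR of the item hashes only: order cannot affect the result."""
    return reduce(lambda acc, item: acc ^ hash(item), items, 0)


def indexed_xor_hash(items):
    """XOR of the hashes of (index, item) pairs: position is mixed in."""
    return reduce(lambda acc, pair: acc ^ hash(pair), enumerate(items), 0)


a, b = [1, 2, 3], [3, 2, 1]
print(xor_hash(a) == xor_hash(b))                  # True: same items, same hash
print(indexed_xor_hash(a) == indexed_xor_hash(b))  # usually False, but collisions remain possible
```

Either hash is valid for the comparer, since equal lists always get equal hash codes; the indexed version just spreads reordered lists across more buckets.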
Removing duplicates in lists
The common approach to get a unique collection of items is to use a set. Sets are unordered collections of distinct objects. To create a set from any iterable, you can simply pass it to the built-in set() function. If you later need a real list again, you can similarly pass the set to the list() function.
The following example should cover whatever you are trying to do:
>>> t = [1, 2, 3, 1, 2, 3, 5, 6, 7, 8]
>>> list(set(t))
[1, 2, 3, 5, 6, 7, 8]
>>> s = [1, 2, 3]
>>> list(set(t) - set(s))
[8, 5, 6, 7]
As you can see from the example result, the original order is not maintained. As mentioned above, sets themselves are unordered collections, so the order is lost. When converting a set back to a list, an arbitrary order is created.
Maintaining order
If order is important to you, then you will have to use a different mechanism. A very common solution for this is to rely on OrderedDict to keep the order of keys during insertion:
>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(t))
[1, 2, 3, 5, 6, 7, 8]
Starting with Python 3.7, the built-in dictionary is guaranteed to maintain the insertion order as well, so you can also use that directly if you are on Python 3.7 or later (or CPython 3.6):
>>> list(dict.fromkeys(t))
[1, 2, 3, 5, 6, 7, 8]
Note that this may have some overhead of creating a dictionary first, and then creating a list from it. If you don’t actually need to preserve the order, you’re often better off using a set, especially because it gives you a lot more operations to work with. Check out this question for more details and alternative ways to preserve the order when removing duplicates.
Finally, note that both the set and the OrderedDict/dict solutions require your items to be hashable. This usually means that they have to be immutable. If you have to deal with items that are not hashable (e.g. list objects), then you will have to use a slow approach in which you basically compare every item with every other item in a nested loop.
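A minimal sketch of that slow approach for unhashable items, assuming the items support == comparison. The function name `unique_unhashable` is made up for the example.

```python
def unique_unhashable(items):
    """Remove duplicates from items that are not hashable, keeping order."""
    result = []
    for item in items:
        # `in` on a list compares with ==, one element at a time,
        # which is the inner loop of the nested comparison.
        if item not in result:
            result.append(item)
    return result


rows = [[1, 2], [3], [1, 2], [2, 1], [3]]
print(unique_unhashable(rows))  # [[1, 2], [3], [2, 1]]
```

This takes quadratic time in the worst case, so when the items can be converted to a hashable form (for instance lists to tuples), the set or dict approaches above are usually preferable.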