How to Remove Duplicate Values from a List in C++

Remove duplicates from a ListT in C#

Perhaps you should consider using a HashSet.

From the MSDN link:

using System;
using System.Collections.Generic;

class Program
{
static void Main()
{
HashSet<int> evenNumbers = new HashSet<int>();
HashSet<int> oddNumbers = new HashSet<int>();

for (int i = 0; i < 5; i++)
{
// Populate numbers with just even numbers.
evenNumbers.Add(i * 2);

// Populate oddNumbers with just odd numbers.
oddNumbers.Add((i * 2) + 1);
}

Console.Write("evenNumbers contains {0} elements: ", evenNumbers.Count);
DisplaySet(evenNumbers);

Console.Write("oddNumbers contains {0} elements: ", oddNumbers.Count);
DisplaySet(oddNumbers);

// Create a new HashSet populated with even numbers.
HashSet<int> numbers = new HashSet<int>(evenNumbers);
Console.WriteLine("numbers UnionWith oddNumbers...");
numbers.UnionWith(oddNumbers);

Console.Write("numbers contains {0} elements: ", numbers.Count);
DisplaySet(numbers);
}

private static void DisplaySet(HashSet<int> set)
{
Console.Write("{");
foreach (int i in set)
{
Console.Write(" {0}", i);
}
Console.WriteLine(" }");
}
}

/* This example produces output similar to the following:
* evenNumbers contains 5 elements: { 0 2 4 6 8 }
* oddNumbers contains 5 elements: { 1 3 5 7 9 }
* numbers UnionWith oddNumbers...
* numbers contains 10 elements: { 0 2 4 6 8 1 3 5 7 9 }
*/

I have a linked list, I would like to remove duplicate values

There are two approaches that come to mind right away, and which one to use depends on your specific case, and whether you'd like to preserve the original order of elements or not.


1st approach:

Use a double loop and pick up a node at a time. Then iterate the list after that node, and if you find a duplicate, remove. Repeat picking nodes, until you have iterated over the whole list.

For every node of the list
For every next_node after node
If next_node.value == node.value
Remove that next_node

This approach preserves the original order of the elements.

This approach is I think what you'd have already in mind. I suggest you start with that.

Example:

1 -> 2 -> 3 -> 4 -> 1

I will start from the first node (1), check the second node, the third, the fourth, nothing so far, no duplicates found. I now check the fifth node, which also has the value 1 (duplicate found!), so I remove it.

Now the list looks like this:

1 -> 2 -> 3 -> 4

Now I am looking of duplicates of the second node (I checked for the first node in the previous traversal). I check 3, I check 4, no duplicates found. The list remains the same.

Now I am looking of duplicates of the third node. I check 4, no duplicates found. The list remains the same.

Now I am looking of duplicates of the fourth node. The next node is NULL, which means that the fourth node is the last node (because we removed the fifth node in the first traversal, as a duplicate of 1). There is nothing to check then, the list remains the same:

1 -> 2 -> 3 -> 4

Observe how, for every node that I want to check if duplicates exist, I traverse the list until its end. So, for every node, I am doing an O(N) traversal, where N is the size of the list.

How many nodes do I have? N

So the Time Complexity of this approach is N * O(N) = O(N2)

I strongly recommend trying that out yourself, and practice. When you are done, you could read the Remove duplicates from an unsorted list to check your solution.


2nd approach:

Sort the list, and now the list will have the duplicate values grouped together. So, if there is a duplicate of the current node, it will be its next node. If it's a duplicate, remove next node.

Now, again, if there is a duplicate of the current node, it will be its next node. So do the same above, until the next node is not a duplicate of the current node.

Then, make next node the current node, and do the same process.

Sort list
current_node = head_node
While current_node != NULL
If current_node.value == current_node.next.value
Remove current_node.next
Else
current_node = current_node.next

This approach does not preserve the original order of the elements.

Same Example:

1 -> 2 -> 3 -> 4 -> 1

Sort the list:

1 -> 1 -> 2 -> 3 -> 4

I start from 1. I check its next node, it's also 1, a duplicate found! Remove the next node. Now the list is:

1 -> 2 -> 3 -> 4

Current node is still 1. I check its next node, it's 2. Not a duplicate. List remains the same. Set next node as the current node.

Current node is 2. Check its next node, it's 3, not a duplicate. List remains the same. Set next node as the current node.

Current node is 3. Check its next node, it's 4, not a duplicate. List remains the same. Set next node as the current node.

Current node is 4. It doesn't have a next node, nothing to check, I am done. List remains the same:

1 -> 2 -> 3 -> 4

Observe that for every node, I check only its immediate next node(s). And then, I continue for the last next node checked. That is O(N).

However, I had to sort the list, in order to ensure that the duplicates are grouped. Sorting a list can be done in O(NlogN).

The Time Complexity is O(NlogN) + O(N) = O(NlogN).

I had used Merge Sort to sort the list in C. There is also the Remove duplicates from sorted list for another explanation.

Remove duplicate items from list in C#, according to one of their properties

I think this would work:

var result = myClassObject.GroupBy(x => x.BillId)
.Where(x => x.Count() == 1)
.Select(x => x.First());

Fiddle here

Remove duplicates from a listint

Using the list::remove_if member function, a temporary hashed set, and lambda expression.

std::list<int> l;
std::unordered_set<int> s;

l.remove_if([&](int n) {
return (s.find(n) == s.end()) ? (s.insert(n), false) : true;
});

Recursively removing the duplicate elements in a linked list

There is no need for recursion in this function. Just iterate in a loop either removing the next element or skipping to the next element:

Node *deleteDuplicates(Node *head) {
if (head != NULL) {
Node *p = head;
while (p->next) {
if (p->next->data == p->data) {
Node *x = p->next;
p->next = x->next;
free(x);
} else {
p = p->next;
}
}
}
return head;
}

You could fix your recursive function, but it should be modified to not return the head node as this prevents tail recursion, therefore requiring a potentially huge amount of stack space. A sufficiently long list would cause a Stack overflow.

Here is a modified recursive function:

void deleteDuplicates(Node *head) { 
if (head != NULL && head->next != NULL) {
if (head->data == head->next->data) {
struct Node *x = head->next;
head->next = x->next;
free(x);
deleteDuplicates(head);
} else {
deleteDuplicates(head->next);
}
}
}

The problem in your code is you store the return value of deleteDuplicates into your head pointer, but the function returns the pointer to the last node in the list, not the head node.



Related Topics



Leave a reply



Submit