Find Duplicate Element in Array in Time O(N)

Find duplicate element in array in time O(n)

We have the original array int A[N]. Create a second array bool B[N], with every entry initialised to false. Iterate over the first array and set B[A[i]] = true if it was false; if it was already true, bing! You have found a duplicate. (This only works if every value is a valid index into B, i.e. in the range 0..N-1.)
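
A minimal Java sketch of that idea, assuming the values are known to lie in the range 0..N-1 (the method name is mine, just for illustration):

// Returns the first value seen twice, or -1 if there is no duplicate.
// Assumes every value in a is a valid index, i.e. in the range 0..a.length-1.
static int findDuplicate(int[] a) {
    boolean[] seen = new boolean[a.length];   // the B[N] array, all false by default
    for (int value : a) {
        if (seen[value]) {
            return value;                     // bing! we have met this value before
        }
        seen[value] = true;
    }
    return -1;
}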

How to find the duplicates in an array in O(1) time in Java?

It is mathematically impossible to find duplicates in O(1). You have to examine all N elements of the array at least once to test if each one is a duplicate. That is at least N operations, so the lower bound on the complexity is O(N).

Hint: you can do it in O(N) if you use (say) a HashSet to record each value that you have already seen. The snag is that a HashSet is a space-hungry data structure.
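
For example, a rough sketch of the HashSet approach (the method name is mine):

import java.util.HashSet;
import java.util.Set;

// Expected O(N) time, O(N) extra space.
static Integer findDuplicateWithSet(int[] a) {
    Set<Integer> seen = new HashSet<>();
    for (int value : a) {
        if (!seen.add(value)) {   // add() returns false if the value was already in the set
            return value;
        }
    }
    return null;                  // no duplicates found
}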


Please provide suggestions/alternate methods to sort an array; I have encountered this problem many times.

The simple way to sort an array of integers is to use Arrays::sort(int[]). That will be O(N log N).
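
For instance (a trivial usage sketch with made-up values):

import java.util.Arrays;

int[] values = {5, 3, 8, 3, 1};
Arrays.sort(values);            // sorts in place in O(N log N)
// values is now [1, 3, 3, 5, 8]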

It is theoretically possible to sort an integer array in better than O(N log N), but only if you can place a bound on the range of the integers. Look up counting sort. The complexity is O(max(N, R)), where R is the difference between the smallest and largest numbers. The catch is that O(R) could be much larger than O(N), depending on the inputs.
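
A rough counting-sort sketch along those lines, just to show where the O(R) cost comes from:

// Sorts a in place. Extra space is O(R), where R = max - min.
static void countingSort(int[] a) {
    if (a.length == 0) {
        return;
    }
    int min = a[0], max = a[0];
    for (int v : a) {                         // find the value range, O(N)
        min = Math.min(min, v);
        max = Math.max(max, v);
    }
    int[] counts = new int[max - min + 1];    // O(R) extra space; blows up if R is huge
    for (int v : a) {                         // tally each value, O(N)
        counts[v - min]++;
    }
    int i = 0;
    for (int v = 0; v < counts.length; v++) { // write the values back in order, O(R + N)
        for (int c = 0; c < counts[v]; c++) {
            a[i++] = v + min;
        }
    }
}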

But if you know that R is likely to be less than N log N, you can use a variant of counting sort and use only O(R) bits of extra space to de-duplicate the array in O(max(R, N)). (I will leave you to figure out the details.)
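
For what it's worth, one possible reading of that hint uses java.util.BitSet as the O(R)-bit scratch space; a sketch, assuming every value lies in 0..R-1:

import java.util.BitSet;

// Returns the distinct values of a in sorted order.
// Time O(max(N, R)), extra space R bits.
static int[] sortedDistinct(int[] a, int r) {
    BitSet seen = new BitSet(r);
    for (int v : a) {
        seen.set(v);                          // O(N): mark each value once
    }
    int[] out = new int[seen.cardinality()];
    int i = 0;
    for (int v = seen.nextSetBit(0); v >= 0; v = seen.nextSetBit(v + 1)) {
        out[i++] = v;                         // O(R): stream the set bits in increasing order
    }
    return out;
}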

Finding duplicate elements in an unsorted array of structs in less than O(n^2) - C

Probably the simplest solution: sorting the array is O(n*log(n)), and finding duplicate entries afterwards is a single pass of complexity O(n), because equal elements end up next to each other. All together that is a complexity of O(n*log(n)), which is less than the O(n^2) you wanted to beat. Hope it helps.
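
The question is about C structs, but the same sort-then-scan shape in Java (the record type and field names here are invented for illustration) looks like this:

import java.util.Arrays;
import java.util.Comparator;

record Item(int key, String payload) {}

// Sort by the field(s) that define "duplicate", then compare neighbours.
static boolean hasDuplicates(Item[] items) {
    Arrays.sort(items, Comparator.comparingInt(Item::key));  // O(n log n)
    for (int i = 1; i < items.length; i++) {                  // O(n) single pass
        if (items[i].key() == items[i - 1].key()) {
            return true;                                      // equal keys are now adjacent
        }
    }
    return false;
}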

Finding the duplicate element in a (sorted) Array in less than O(n) time?

Binary search only works if you know the element you're looking for (or, more exactly, if you can tell which half it's in when you've chosen your pivot point).

There is nothing in your question that seems to indicate you have this knowledge so, based on that, O(n) is the best you can currently do.

If there were some extra information it might be possible, things like all the numbers in a range being represented except one, or the duplicate being in a specific range.

However, based on current information, that's not the case.

Find duplicate in array - Time complexity O(n^2) and constant extra space O(1). (Amazon Interview)

Well it's a binary search. You are cutting the search space in half and repeating.

Think about it this way: you have a list of 101 items, and you know it contains values 1-100. Take the halfway point, 50. Count how many items are less than or equal to 50. If there are more than 50 items that are less than or equal to 50, then the duplicate is in the range 1-50, otherwise the duplicate is in the range 51-100.

Binary search is simply cutting the range in half: look at 1-50, take midpoint 25, and repeat.
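
For concreteness, here is a rough Java sketch of that value-range binary search, assuming the classic setup of n+1 values each drawn from 1..n (the method and variable names are mine):

// Binary search on the *value* range, not on array indices.
static int findDuplicateByValueRange(int[] nums) {
    int low = 1;                      // smallest possible value
    int high = nums.length - 1;       // largest possible value: n, for n+1 elements of 1..n
    while (low < high) {
        int mid = low + (high - low) / 2;
        int count = 0;
        for (int v : nums) {          // full O(n) pass on every iteration
            if (v <= mid) {
                count++;
            }
        }
        if (count > mid) {            // pigeonhole: more than mid values drawn from 1..mid
            high = mid;               // so some value in 1..mid repeats
        } else {
            low = mid + 1;            // otherwise a repeat lies in mid+1..high
        }
    }
    return low;                       // low == high == a duplicated value
}

This runs in O(n log n) time with O(1) extra space, which is the point of the exercise once extra memory is ruled out.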


The crucial part of this algorithm, and the part I believe is causing confusion, is the for loop. I'll attempt to explain it. Firstly, note that there is no use of indices anywhere in this algorithm: inspect the code and you'll see that index references simply do not exist. Secondly, note that the algorithm loops through the entire collection on each iteration of the while loop.

Let me make the following change, then consider the value of inspection_count after every iteration of the while loop.

inspection_count = 0
for i in nums:
    inspection_count += 1
    if i <= mid:
        count += 1

Of course inspection_count will be equal to len(nums). The for loop iterates the entire collection, and for every element checks to see whether it is within the candidate range (of values, not indices).

The duplication test itself is simple and elegant; as others pointed out, this is the pigeonhole principle. Given a collection of n values where every value is in the range {p..q}, there are only q - p + 1 distinct values available, so if q - p + 1 < n there must be duplicates in the collection. Think of some easy cases -

p = 0, q = 5, n = 10
"I have ten values, and every value is between zero and five.
At least one of these values must be duplicated."

We can generalize this, but a more relevant example for the algorithm above is

p = 50, q = 99, n = 51
"I have fifty-one values, and every value is between fifty and ninety-nine.
There are only fifty *distinct* values available in that range.
Therefore there is a duplicate."
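
Written out as a check, the test is just that inequality (the helper name here is invented for illustration):

// n values, each drawn from the range p..q, which offers only q - p + 1
// distinct values. Fewer distinct values than values forces a duplicate.
static boolean mustContainDuplicate(int p, int q, int n) {
    return q - p + 1 < n;
}

// mustContainDuplicate(0, 5, 10)   -> true  (six distinct values, ten values)
// mustContainDuplicate(50, 99, 51) -> true  (fifty distinct values, fifty-one values)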

Reduce the time complexity of the code to find duplicates in an Array from N*N

He went on to ask me if there is any better way. I answered: add the elements to a set and compare the sizes; if the size of the set is less than that of the array, it contains duplicates. He then asked what the complexity of that is. Guess what: it's still N*N, because the code that adds elements to the set has to first check whether each element is already in the set.

That's wrong. If you are adding the elements to a HashSet, it takes expected O(1) time to add each element (which includes checking whether the element is already present), since all you have to do is compute the hashCode to locate the bin that may contain the element (constant time), and then search the elements stored in that bin (also expected constant time, assuming the average number of elements in each bin is bounded by a constant).

Therefore the total running time is O(N), and there's nothing to improve (you can't find duplicates in less than O(N)).
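
In code, that size-comparison idea is simply (a rough sketch):

import java.util.HashSet;
import java.util.Set;

// Expected O(N): each add() is expected O(1), and the membership check
// the asker was worried about is part of that same constant-time insertion.
static boolean containsDuplicates(int[] a) {
    Set<Integer> distinct = new HashSet<>();
    for (int value : a) {
        distinct.add(value);
    }
    return distinct.size() < a.length;   // fewer distinct values than elements => duplicates
}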


