Remove duplicate elements from array in Ruby
array = array.uniq
uniq returns a new array with all duplicate elements removed, keeping the first occurrence of each unique element.
This is one of many beauties of the Ruby language.
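As a quick illustration (the sample values are made up for this example), uniq leaves the receiver untouched, while uniq! changes the array in place:

```ruby
array = [1, 2, 2, 3, 1]   # sample data, assumed for illustration

unique = array.uniq        # new array; the first occurrence of each value is kept
p unique                   # prints [1, 2, 3]
p array                    # prints [1, 2, 2, 3, 1] (the original is unchanged)

array.uniq!                # in-place variant
p array                    # prints [1, 2, 3]
```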
Remove duplicate from an array in ruby
To just remove duplicates based on :name, simply try:
output = input.uniq { |x| x[:name] }
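A small sketch of what that does, assuming input is an array of hashes with :name and :score keys (the values here are invented for the example). Entries are compared by the block's return value, and the first entry seen for each name is the one that survives:

```ruby
# Hypothetical input; the key names follow the answer, the values are assumed.
input = [
  { name: "alice", score: 10 },
  { name: "bob",   score: 7 },
  { name: "alice", score: 42 }
]

# Two entries count as duplicates when the block returns the same value.
output = input.uniq { |x| x[:name] }

p output.map { |x| x[:score] }   # prints [10, 7]
```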
Edit: Since you added a sorting requirement in the comments, here's how to select the entry with the highest score for every name if you're using Rails. I see you already got an answer for "standard" Ruby above:
output = input.group_by { |x| x[:name] }
              .map { |_name, entries| entries.max_by { |x| x[:score] } }
A little explanation may be in order: the first line groups the entries by name so that each name gets its own array of entries. The second line goes through the groups, name by name, and maps each name group to the entry with the highest score.
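The two steps can be run separately to see the intermediate result. This is a sketch with invented sample data; only the :name and :score keys come from the answer:

```ruby
# Hypothetical input, assumed for illustration.
input = [
  { name: "alice", score: 10 },
  { name: "bob",   score: 7 },
  { name: "alice", score: 42 }
]

# Step 1: one array of entries per name.
groups = input.group_by { |x| x[:name] }
p groups.keys                      # prints ["alice", "bob"]
p groups["alice"].size             # prints 2

# Step 2: keep the highest-scoring entry from each group.
output = groups.map { |_name, entries| entries.max_by { |x| x[:score] } }
p output.map { |x| x[:score] }     # prints [42, 7]
```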
Ruby: Remove all instances of a duplicate value inside an array
p arr.group_by(&:itself).reject { |_k, v| v.count > 1 }.keys
Output
[2, 3, 5]
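The input array is not shown above, so the following sketch assumes one that would produce that output. It splits the one-liner into its steps and adds an equivalent using Enumerable#tally, which needs Ruby 2.7 or later:

```ruby
arr = [1, 2, 3, 1, 4, 5, 4]   # assumed input; not taken from the original question

# Group equal values together: {1=>[1, 1], 2=>[2], 3=>[3], 4=>[4, 4], 5=>[5]}
groups = arr.group_by(&:itself)

# Drop every value that occurs more than once, then keep the remaining keys.
singles = groups.reject { |_value, occurrences| occurrences.count > 1 }.keys
p singles                      # prints [2, 3, 5]

# Same idea with tally (Ruby 2.7+), which counts occurrences directly.
singles_tally = arr.tally.select { |_value, count| count == 1 }.keys
p singles_tally                # prints [2, 3, 5]
```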
Remove duplicates from array in Ruby and perform an operation on a specific index
arr.each_with_object(Hash.new(0)) { |(*a,n),h| h[a] += n }.map(&:flatten)
#=> [["A", "Red", 15], ["B", "Red", 3], ["B", "Blue", 5], ["C", "Blue", 3],
# ["C", "Black", 1], ["D", nil, 9]]
The first step of the calculation is:
h = arr.each_with_object(Hash.new(0)) { |(*a,n),h| h[a] += n }
#=> {["A", "Red"]=>15, ["B", "Red"]=>3, ["B", "Blue"]=>5,
# ["C", "Blue"]=>3, ["C", "Black"]=>1, ["D", nil]=>9}
This uses the form of Hash::new that takes an argument called the default value. All that means is that when Ruby's parser expands h[a] += n to
h[a] = h[a] + n
h[a] on the right returns h's default value, 0, if h does not have a key a. For example, when h is empty,
h[["A", "Red"]] = h[["A", "Red"]] + 7 #=> 0 + 7 => 7
h[["A", "Red"]] = h[["A", "Red"]] + 8 #=> 7 + 8 => 15
h does not have a key ["A", "Red"] in the first expression, so h[["A", "Red"]] on the right returns the default value, 0, whereas h does have that key in the second expression so the default value does not apply.
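The default-value behaviour can be checked in isolation. A minimal sketch, using the same key as the example above:

```ruby
h = Hash.new(0)
key = ["A", "Red"]

p h.key?(key)   # prints false
p h[key]        # prints 0 (the default; reading it does not store the key)
p h.size        # prints 0

h[key] += 7     # expands to h[key] = h[key] + 7, i.e. 0 + 7
h[key] += 8     # the key now exists, so this is 7 + 8
p h[key]        # prints 15
p h.size        # prints 1
```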
h.map(&:flatten)
is shorthand for
h.map { |a| a.flatten }
When the block variable a is set equal to the first key-value pair of h,
a #=> [["A", "Red"], 15]
So
a.flatten
#=> ["A", "Red", 15]
To understand |(*a,n),h| we need to construct the enumerator:
enum = arr.each_with_object(Hash.new(0))
#=> #<Enumerator: [["A", "Red", 7], ["A", "Red", 8], ["B", "Red", 3],
# ["B", "Blue", 2], ["B", "Blue", 3], ["C", "Blue", 3],
# ["C", "Black", 1], ["D", nil, 4], ["D", nil, 5]]
# :each_with_object({})>
We now generate the first value from the enumerator (using Enumerator#next) and assign values to the block variables:
(*a,n),h = enum.next
#=> [["A", "Red", 7], {}]
a #=> ["A", "Red"]
n # => 7
h #=> {}
The way in which the array returned by enum.next is broken up into constituent elements that are assigned to the block variables is called array decomposition. It is a powerful and highly useful technique.
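Putting the pieces together, here is a runnable sketch. The arr below is reconstructed from the enumerator output shown earlier, and the block variable names key, count, and sums are my own, chosen for readability:

```ruby
arr = [
  ["A", "Red", 7], ["A", "Red", 8], ["B", "Red", 3],
  ["B", "Blue", 2], ["B", "Blue", 3], ["C", "Blue", 3],
  ["C", "Black", 1], ["D", nil, 4], ["D", nil, 5]
]

# Decomposition on its own: the splat collects everything except the last element.
(*a, n), h = [["A", "Red", 7], {}]
p a   # prints ["A", "Red"]
p n   # prints 7

# Sum the last column for each distinct combination of the other columns.
totals = arr.each_with_object(Hash.new(0)) { |(*key, count), sums| sums[key] += count }

# Each pair [key_array, total] is flattened back into a single row.
result = totals.map(&:flatten)
p result.first   # prints ["A", "Red", 15]
p result.last    # prints ["D", nil, 9]
```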
Remove duplicates in Ruby Array
The code for most Ruby methods can be found in the ruby-doc.org API documentation. If you mouse over a method's documentation, a "click to toggle source" button appears. The code is in C, but it's very easy to understand.
if (RARRAY_LEN(ary) <= 1)
    return rb_ary_dup(ary);
if (rb_block_given_p()) {
    hash = ary_make_hash_by(ary);
    uniq = rb_hash_values(hash);
}
else {
    hash = ary_make_hash(ary);
    uniq = rb_hash_values(hash);
}
If there's at most one element, return a copy of the array. Otherwise turn the elements into hash keys, then turn the hash back into an array. Because of a documented property of Ruby hashes ("Hashes enumerate their values in the order that the corresponding keys were inserted"), this technique preserves the original order of the elements in the Array. In other languages it may not.
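A rough Ruby rendering of that idea, as a sketch only: hash_uniq is a hypothetical name, and the code mirrors the no-block branch of the C source rather than reproducing it exactly.

```ruby
# Hypothetical helper illustrating the hash-based approach; not part of Ruby.
def hash_uniq(ary)
  return ary.dup if ary.length <= 1

  seen = {}
  # The first occurrence of each element becomes a key; later ones are skipped.
  ary.each { |e| seen[e] = e unless seen.key?(e) }

  # Hashes keep insertion order, so the values come back in first-seen order.
  seen.values
end

p hash_uniq([2, 3, 4, 5, 1, 2, 4, 5])   # prints [2, 3, 4, 5, 1]
```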
Alternatively, use a Set. A set will never have duplicates. Loading set adds the method to_set to all Enumerable objects, which includes Arrays. However, a Set is usually implemented as a Hash, so you're doing the same thing. If you want unique elements and don't need them to be ordered, you should probably make a set instead and use that.
require "set"
unique = array.to_set
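For instance, with the same sample numbers used in the sorting example that follows (a sketch; the explicit require is needed on older Rubies and harmless on newer ones):

```ruby
require "set"

array = [2, 3, 4, 5, 1, 2, 4, 5]
unique = array.to_set

p unique.size          # prints 5
p unique.include?(3)   # prints true

unique << 3            # adding an element that is already present changes nothing
p unique.size          # prints 5
p unique.to_a.sort     # prints [1, 2, 3, 4, 5]
```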
Alternatively, sort the Array and loop through it pushing each element onto a new Array. If the last element on the new Array matches the current element, discard it.
array = [2, 3, 4, 5, 1, 2, 4, 5]
uniq = []
# This copies the whole array and the duplicates, wasting
# memory. And sort is O(n log n).
array.sort.each { |e|
  uniq.push(e) if e != uniq[-1]
}
puts uniq.inspect
# prints [1, 2, 3, 4, 5]
This method is to be avoided because it is slower and takes more memory than the other methods. The sort makes it slower: sorting is O(n log n), meaning that as the array gets bigger, the sorting time grows faster than the array does. It also requires you to copy the whole array, with duplicates, unless you want to alter the original data by sorting in place with sort!.
The other methods take O(n) time and O(n) memory, meaning they scale linearly as the array gets bigger. They also don't have to copy the duplicates, which can save a substantial amount of memory.
How to remove duplicate entries from array of arrays based on nested value?
Use the uniq method! (Ruby >= 1.9.2)
array = [
  ["John, Doe", "/manager/consumer/123456?status=1", {:data=>{:id=>123456}, :class=>""}],
  ["Jane, smith", "/manager/consumer/7891011?status=1", {:data=>{:id=>7891011}, :class=>""}],
  ["William, Smith", "/manager/consumer/12131415?status=1", {:data=>{:id=>1211415}, :class=>""}],
  ["John, Doe", "/manager/consumer/123456?status=1", {:data=>{:id=>123456}, :class=>""}]
]
array.uniq { |_name, _url, hash| hash[:data][:id] }
When an id is duplicated, this removes all but the first entry with that id, so you need to think about the situation where the id is the same but the rest of the data is not.
NOTE: if you for some reason are running on Ruby before 1.9.2, then uniq will ignore the block. For that reason ActiveSupport had the uniq_by method (which was removed in version 4.0.2).
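One possible way to spot the "same id, different data" case before deduplicating is sketched below. The second row is invented for the example, and the conflicts variable is my own name, not something from the answer:

```ruby
# The second row is hypothetical: same id as the first, but different data.
array = [
  ["John, Doe",   "/manager/consumer/123456?status=1", { data: { id: 123456 }, class: "" }],
  ["Johnny, Doe", "/manager/consumer/123456?status=2", { data: { id: 123456 }, class: "" }]
]

# uniq keeps only the first row for each id.
kept = array.uniq { |_name, _url, hash| hash[:data][:id] }
p kept.size        # prints 1
p kept.first[0]    # prints "John, Doe"

# Group by id and keep the ids whose rows are not all identical.
conflicts = array.group_by { |_name, _url, hash| hash[:data][:id] }
                 .select { |_id, rows| rows.uniq.size > 1 }
p conflicts.keys   # prints [123456]
```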
How can I remove duplicates in an array without using `uniq`?
The problem is that the inner loop is an infinite loop:
while true
  sorted.delete_if {|i| i = i + count}
  count += 1
end #while
You can probably do what you are doing, but it's not eliminating duplicates.
One way to do this would be:
numbers = [1, 4, 2, 4, 3, 1, 5]
target = []
numbers.each {|x| target << x unless target.include?(x) }
puts target.inspect
To add it to the Array class:
class ::Array
  def my_uniq
    target = []
    self.each {|x| target << x unless target.include?(x) }
    target
  end
end
Now you can do:
numbers = [1, 4, 2, 4, 3, 1, 5]
numbers.my_uniq
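Note that target.include?(x) scans the target array on every iteration, so the approach above does roughly quadratic work on large inputs. A possible variant, sketched here with a hypothetical uniq_without_uniq helper, tracks what has been seen in a Hash instead:

```ruby
# Hypothetical helper; same result as my_uniq, but membership checks use a Hash.
def uniq_without_uniq(numbers)
  seen = {}
  target = []
  numbers.each do |x|
    next if seen.key?(x)   # already emitted, skip it
    seen[x] = true
    target << x
  end
  target
end

numbers = [1, 4, 2, 4, 3, 1, 5]
p uniq_without_uniq(numbers)   # prints [1, 4, 2, 3, 5]
```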