How can I efficiently extract repeated elements in a Ruby array?
Inspired by Ilya Haykinson's answer:
def repeated(array)
  counts = Hash.new(0)
  array.each { |val| counts[val] += 1 }
  counts.reject { |val, count| count == 1 }.keys
end
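As a quick check of the method above, here is a minimal usage sketch. The sample array is an assumption chosen for illustration, and the order of the result relies on Ruby hashes preserving insertion order (Ruby 1.9 and later).

```ruby
# Count occurrences in one pass, then keep the values seen more than once.
def repeated(array)
  counts = Hash.new(0)
  array.each { |val| counts[val] += 1 }
  counts.reject { |_val, count| count == 1 }.keys
end

sample = [1, 2, 3, 1, 5, 2] # illustrative input, not from the original answer
result = repeated(sample)
p result # prints [1, 2]
```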
How to find and return a duplicate value in array
a = ["A", "B", "C", "B", "A"]
a.detect{ |e| a.count(e) > 1 }
I know this isn't a very elegant answer, but I love it. It's a beautiful one-liner, and it works perfectly fine unless you need to process a huge data set.
Looking for a faster solution? Here you go!
def find_one_using_hash_map(array)
  map = {}
  dup = nil
  array.each do |v|
    map[v] = (map[v] || 0) + 1
    if map[v] > 1
      dup = v
      break
    end
  end
  return dup
end
It's linear, O(n), but now you have to maintain multiple lines of code, write test cases, etc.
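If the extra lines are the main objection, the same linear idea can probably be written more compactly with a Set. This is only a sketch: first_duplicate is a hypothetical name, not something from the answer or the linked gist.

```ruby
require 'set'

# Hypothetical helper: returns the first value that is seen a second time,
# or nil if no value repeats.
def first_duplicate(array)
  seen = Set.new
  # Set#add? returns nil when the value is already in the set.
  array.find { |v| !seen.add?(v) }
end

p first_duplicate(["A", "B", "C", "B", "A"]) # prints "B"
```

Note that this returns "B", the first value to be seen twice, just like the hash-map method above, whereas the detect one-liner returns "A" for the same input. It also cannot distinguish "no duplicate" from a duplicated nil.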
If you need an even faster solution, maybe try C instead.
And here is the gist comparing different solutions: https://gist.github.com/naveed-ahmad/8f0b926ffccf5fbd206a1cc58ce9743e
how to get repeated elements from ruby array?
arr = [1,2,3,1,5,2]
arr.group_by {|e| e}.map { |e| e[0] if e[1][1]}.compact
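Pretty ugly, but it does the job without rescanning the array for each element. (Note that it misses duplicates of nil or false, because it tests the second member of each group for truthiness.)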
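A slightly tidier variant of the same group_by idea might look like the sketch below. It checks the size of each group rather than its second member, so repeated nil or false values should be handled as well. It assumes Ruby 2.2 or later for Object#itself.

```ruby
arr = [1, 2, 3, 1, 5, 2]

# Group equal elements together, keep the groups with more than one member,
# and return the group keys.
repeated = arr.group_by(&:itself).select { |_value, group| group.size > 1 }.keys
p repeated # prints [1, 2]
```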
Remove duplicate elements from array in Ruby
array = array.uniq
uniq removes all duplicate elements and retains all unique elements in the array.
This is one of many beauties of the Ruby language.
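For reference, here is a small sketch of how uniq and its in-place counterpart uniq! behave. The sample array is just an illustration.

```ruby
array = [1, 2, 2, 3, 1]

unique = array.uniq   # returns a new array; the original is unchanged
p unique              # prints [1, 2, 3]
p array               # prints [1, 2, 2, 3, 1]

changed = array.uniq! # removes duplicates in place
p changed             # prints [1, 2, 3]
p array.uniq!         # prints nil, because nothing was removed this time
```

Because uniq! returns nil when there is nothing to remove, writing array = array.uniq! can silently replace the array with nil, which is why the non-bang form is used above.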
How do I detect duplicate values within an array in Ruby?
You can create a hash to store the number of times each element occurs, so that you iterate over the array only once.
h = Hash.new(0)
['a','b','b','c'].each{ |e| h[e] += 1 }
This should result in:
{"a"=>1, "b"=>2, "c"=>1}
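From that counting hash, detecting the duplicates is one more step. A minimal sketch, where the variable names has_duplicates and duplicates are chosen here for illustration:

```ruby
h = Hash.new(0)
['a', 'b', 'b', 'c'].each { |e| h[e] += 1 }

# Any count above 1 means the array contained a duplicate.
has_duplicates = h.any? { |_element, count| count > 1 }
# Keep only the elements that occurred more than once.
duplicates = h.select { |_element, count| count > 1 }.keys

p has_duplicates # prints true
p duplicates     # prints ["b"]
```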
Fast way to find duplicate in large array
Your code is taking an eon to execute because it calls count for each element, giving it a computational complexity of O(n²).
arr = [*1..35000, 1, 34999]
If you want to know which values appear in the array at least twice...
require 'set'
uniq_set = Set.new
arr.each_with_object(Set.new) { |x,dup_set| uniq_set.add?(x) || dup_set.add(x) }.to_a
#=> [1, 34999]
Set lookups (implemented with a hash under the covers) are extremely fast.
See Set#add? and Set#add.
If you want to know how many times each value that appears at least twice occurs in the array...
arr.each_with_object(Hash.new(0)) { |x,h| h[x] += 1 }.select { |_,v| v > 1 }
#=> {1=>2, 34999=>2}
This uses a counting hash (see note 1 below). See Hash::new when it takes a default value as an argument.
If you want to know the indices of values that appear in the array at least twice...
arr.each_with_index.
  with_object({}) { |(x,i),h| (h[x] ||= []) << i }.
  select { |_,v| v.size > 1 }
#=> {1=>[0, 35000], 34999=>[34998, 35001]}
When the hash h does not already have a key x,
(h[x] ||= []) << i
#=> (h[x] = h[x] || []) << i
#=> (h[x] = nil || []) << i
#=> (h[x] = []) << i
#=> [] << i where [] is now h[x]
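An alternative to the ||= idiom is to let the hash create the empty array itself, by passing a block to Hash.new. This is a sketch of that variant on the same arr; the variable names indices and dups are introduced here for illustration.

```ruby
arr = [*1..35000, 1, 34999]

# The block runs only for missing keys and stores a fresh array under that key.
indices = Hash.new { |h, k| h[k] = [] }
arr.each_with_index { |x, i| indices[x] << i }

dups = indices.select { |_, v| v.size > 1 }
#=> {1=>[0, 35000], 34999=>[34998, 35001]}
```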
1. Ruby v2.7 gave us the method Enumerable#tally, allowing us to write arr.tally.select { |_,v| v > 1 }.
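The Enumerable#tally approach from the note can be sketched as follows. It assumes Ruby 2.7 or later and reuses the same arr as above.

```ruby
arr = [*1..35000, 1, 34999]

# tally builds the counting hash in one call; select keeps counts above 1.
counts = arr.tally.select { |_, v| v > 1 }
#=> {1=>2, 34999=>2}

# If only the duplicated values are needed, take the keys.
values = counts.keys
#=> [1, 34999]
```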
how to get the indexes of duplicating elements in a ruby array
duplicates = arr.each_with_index.group_by(&:first).inject({}) do |result, (val, group)|
  next result if group.length == 1
  result.merge val => group.map { |pair| pair[1] }
end
This will return a hash where the keys will be the duplicate elements and the values will be an array containing the index of each occurrence.
For your test input, the result is:
{"A"=>[0, 6], "X"=>[1, 2]}
If all you care about is the indices, you can do duplicates.values.flatten to get an array with just the indices.
In this case: [0, 6, 1, 2]
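The same grouping can also be written without inject and merge. In the sketch below the input array is hypothetical, chosen only so that it reproduces the result quoted above; the original test input is not shown in this answer. It assumes Ruby 2.4 or later for Hash#transform_values.

```ruby
# Hypothetical input, chosen only to reproduce the result quoted above.
arr = %w[A X X B C D A]

duplicates = arr.each_with_index
                .group_by(&:first)                          # value => [[value, index], ...]
                .select { |_val, group| group.length > 1 }  # keep repeated values only
                .transform_values { |group| group.map(&:last) }

p duplicates.values.flatten # prints [0, 6, 1, 2]
```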
How to count duplicates in Ruby Arrays
This will yield the duplicate elements as a hash with the number of occurrences for each duplicate item. Let the code speak:
#!/usr/bin/env ruby
class Array
  # monkey-patched version
  def dup_hash
    inject(Hash.new(0)) { |h,e| h[e] += 1; h }.select {
      |_k,v| v > 1 }.inject({}) { |r, e| r[e.first] = e.last; r }
  end
end
# un-monkey-patched version
def dup_hash(ary)
  ary.inject(Hash.new(0)) { |h,e| h[e] += 1; h }.select {
    |_k,v| v > 1 }.inject({}) { |r, e| r[e.first] = e.last; r }
end
p dup_hash([1, 2, "a", "a", 4, "a", 2, 1])
# {1=>2, 2=>2, "a"=>3} (on Ruby 1.9+, keys appear in order of first occurrence)
p [1, 2, "Thanks", "You're welcome", "Thanks",
   "You're welcome", "Thanks", "You're welcome"].dup_hash
# {"Thanks"=>3, "You're welcome"=>3}
Find a Duplicate in an array Ruby
Array#difference comes to the rescue yet again. (I confess that @user123's answer is more straightforward, unless you pretend that Array#difference is already a built-in method. Array#difference is probably the more efficient of the two, as it avoids the repeated invocations of count.) See my answer here for a description of the method and links to its use. Note that this is a custom helper: the Array#difference that became built in with Ruby 2.6 behaves like Array#- and does not give the results shown below.
In a nutshell, it differs from Array#- as illustrated in the following example:
a = [1,2,3,4,3,2,4,2]
b = [2,3,4,4,4]
a - b #=> [1]
a.difference b #=> [1, 3, 2, 2]
One day I'd like to see it as a built-in.
For the present problem, if:
arr = [1,2,3,4,3,4]
the duplicate elements are given by:
arr.difference(arr.uniq).uniq
#=> [3, 4]
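Since the helper itself is not shown here, the following is a sketch of one way to get the behaviour illustrated above. It is not necessarily the author's implementation. The name multiset_difference is hypothetical, chosen to avoid clashing with the built-in Array#difference, and Enumerable#tally requires Ruby 2.7 or later.

```ruby
# Hypothetical helper: each element of `other` cancels at most one equal
# element of `array`, starting from the left.
def multiset_difference(array, other)
  remaining = other.tally # how many copies of each value are still to be removed
  array.reject do |e|
    if remaining.fetch(e, 0) > 0
      remaining[e] -= 1
      true  # drop this occurrence
    else
      false # keep it
    end
  end
end

a = [1, 2, 3, 4, 3, 2, 4, 2]
b = [2, 3, 4, 4, 4]
p multiset_difference(a, b) # prints [1, 3, 2, 2]

arr = [1, 2, 3, 4, 3, 4]
p multiset_difference(arr, arr.uniq).uniq # prints [3, 4]
```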