How does Ruby's sort_by {rand} work?
In Ruby 1.8/1.9 both sort
and sort_by
are implemented in C, this is a rough equivalent of how this works:
Say you start with [1,2,3,4]
and call sort_by{rand}
:
(I invented some random numbers):
An array of tuples is created:
[[0.12232, 1],[0.53434, 2],[0.333, 3],[0.99, 4]]
In roughly equivalent Ruby code this is:
[1,2,3,4].map{|x| [rand, x]}
Ruby's quick sort is performed on the array based off the first element: (note the internal implementation is far from trivial and contains a ton of optimisations for already ordered arrays and such)
[[0.12232, 1],[0.333, 3],[0.53434, 2],[0.99, 4]]
In rough Ruby this step is:
ary.sort{|x,y| x[0] <=> y[0]}
Pointers are copied from the new sorted array, to the correct position in the original array.
[1,3,2,4]
In rough Ruby this step is:
ary.map{|x,y| y}
This technique is sometimes referred to as a "Schwartzian Transform". Caching means that the expensive operation is not performed more than N times. Meaning, this is a very efficient way of randomizing an array.
Note: array.shuffle!
will be the most efficient built-in way to shuffle an array (in-place) since it uses a modern version of Fisher-Yates:
static VALUE
rb_ary_shuffle_bang(VALUE ary)
{
long i = RARRAY_LEN(ary);
rb_ary_modify(ary);
while (i) {
long j = rb_genrand_real()*i;
VALUE tmp = RARRAY_PTR(ary)[--i];
RARRAY_PTR(ary)[i] = RARRAY_PTR(ary)[j];
RARRAY_PTR(ary)[j] = tmp;
}
return ary;
}
How to save random value of seed random sort_by {rand} into list in ruby?
You need to save the random numbers into an array first:
rows = ["hello", "nihao", "konnichiwa", "hallo"]
myrand = Random.new(100) # seed - 100
rows.map { |x| [myrand.rand, x] }.sort!
=> [[0.27836938509379616, "nihao"], [0.4245175907491331, "konnichiwa"], [0.5434049417909654, "hello"], [0.8447761323199037, "hallo"]]
The above shown result should always be the same, although this is Random
, since the seed is known - for Random.new(100)
these will always be the first four numbers generated.
You can, technically do exactly what you ask by doing:
rows = ["hello", "nihao", "konnichiwa", "hallo"]
myrand = Random.new(100) # seed - 100
rows = rows.map { |x| [myrand.rand, x] }
=> [[0.5434049417909654, "hello"], [0.27836938509379616, "nihao"], [0.4245175907491331, "konnichiwa"], [0.8447761323199037, "hallo"]]
myrand = Random.new(100) # resetting back to seed - 100
rows.sort_by!{ myrand.rand }
=> [[0.27836938509379616, "nihao"], [0.4245175907491331, "konnichiwa"], [0.5434049417909654, "hello"], [0.8447761323199037, "hallo"]]
How to randomly sort (scramble) an array in Ruby?
Built in now:
[1,2,3,4].shuffle => [2, 1, 3, 4]
[1,2,3,4].shuffle => [1, 3, 2, 4]
How does `sort_by` work with multiple fields?
The problem here is nothing particularly about multiple comparison. If you compare arrays, each of the elements will be compared to the corresponding one in another array. If those values sometimes take numerical values, then they always have to be numerals. Numerals cannot be compared to nil
, and doing so will raise an error. Defaulting them to zero is to ensure they are numerals. As long as you default them to a numeral, it will not raise an error. The particular choice of zero was based on where you want to position the entries with the missing values; you could have chosen infinity, negative infinity, etc for different results.
Ruby - return an array in random order
If you don't have [].shuffle, [].sort_by{rand} works as pointed out by sepp2k. .sort_by temporarily replaces each element by something for the purpose of sorting, in this case, a random number.
[].sort{rand-0.5} however, won't properly shuffle. Some languages (e.g. some Javascript implementations) don't properly shuffle arrays if you do a random sort on the array, with sometimes rather public consequences.
JS Analysis (with graphs!): http://www.robweir.com/blog/2010/02/microsoft-random-browser-ballot.html
Ruby is no different! It has the same problem. :)
#sort a bunch of small arrays by rand-0.5
a=[]
100000.times{a << [0,1,2,3,4].sort{rand-0.5}}
#count how many times each number occurs in each position
b=[]
a.each do |x|
x.each_index do |i|
b[i] ||=[]
b[i][x[i]] ||= 0
b[i][x[i]] += 1
end
end
p b
=>
[[22336, 18872, 14814, 21645, 22333],
[17827, 25005, 20418, 18932, 17818],
[19665, 15726, 29575, 15522, 19512],
[18075, 18785, 20283, 24931, 17926],
[22097, 21612, 14910, 18970, 22411]]
Each element should occur in each position about 20000 times. [].sort_by(rand) gives much better results.
#sort with elements first mapped to random numbers
a=[]
100000.times{a << [0,1,2,3,4].sort_by{rand}}
#count how many times each number occurs in each position
...
=>
[[19913, 20074, 20148, 19974, 19891],
[19975, 19918, 20024, 20030, 20053],
[20028, 20061, 19914, 20088, 19909],
[20099, 19882, 19871, 19965, 20183],
[19985, 20065, 20043, 19943, 19964]]
Similarly for [].shuffle (which is probably fastest)
[[20011, 19881, 20222, 19961, 19925],
[19966, 20199, 20015, 19880, 19940],
[20062, 19894, 20065, 19965, 20014],
[19970, 20064, 19851, 20043, 20072],
[19991, 19962, 19847, 20151, 20049]]
Podium Style Sorting in Ruby
arr.sort_by{|a| a['created_at']}.inject([]){ |r, e| r.reverse << e }
Fun problem!
How to sort an array in descending order in Ruby
It's always enlightening to do a benchmark on the various suggested answers. Here's what I found out:
#!/usr/bin/ruby
require 'benchmark'
ary = []
1000.times {
ary << {:bar => rand(1000)}
}
n = 500
Benchmark.bm(20) do |x|
x.report("sort") { n.times { ary.sort{ |a,b| b[:bar] <=> a[:bar] } } }
x.report("sort reverse") { n.times { ary.sort{ |a,b| a[:bar] <=> b[:bar] }.reverse } }
x.report("sort_by -a[:bar]") { n.times { ary.sort_by{ |a| -a[:bar] } } }
x.report("sort_by a[:bar]*-1") { n.times { ary.sort_by{ |a| a[:bar]*-1 } } }
x.report("sort_by.reverse!") { n.times { ary.sort_by{ |a| a[:bar] }.reverse } }
end
user system total real
sort 3.960000 0.010000 3.970000 ( 3.990886)
sort reverse 4.040000 0.000000 4.040000 ( 4.038849)
sort_by -a[:bar] 0.690000 0.000000 0.690000 ( 0.692080)
sort_by a[:bar]*-1 0.700000 0.000000 0.700000 ( 0.699735)
sort_by.reverse! 0.650000 0.000000 0.650000 ( 0.654447)
I think it's interesting that @Pablo's sort_by{...}.reverse!
is fastest. Before running the test I thought it would be slower than "-a[:bar]
" but negating the value turns out to take longer than it does to reverse the entire array in one pass. It's not much of a difference, but every little speed-up helps.
Please note that these results are different in Ruby 1.9
Here are results for Ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-darwin10.8.0]:
user system total real
sort 1.340000 0.010000 1.350000 ( 1.346331)
sort reverse 1.300000 0.000000 1.300000 ( 1.310446)
sort_by -a[:bar] 0.430000 0.000000 0.430000 ( 0.429606)
sort_by a[:bar]*-1 0.420000 0.000000 0.420000 ( 0.414383)
sort_by.reverse! 0.400000 0.000000 0.400000 ( 0.401275)
These are on an old MacBook Pro. Newer, or faster machines, will have lower values, but the relative differences will remain.
Here's a bit updated version on newer hardware and the 2.1.1 version of Ruby:
#!/usr/bin/ruby
require 'benchmark'
puts "Running Ruby #{RUBY_VERSION}"
ary = []
1000.times {
ary << {:bar => rand(1000)}
}
n = 500
puts "n=#{n}"
Benchmark.bm(20) do |x|
x.report("sort") { n.times { ary.dup.sort{ |a,b| b[:bar] <=> a[:bar] } } }
x.report("sort reverse") { n.times { ary.dup.sort{ |a,b| a[:bar] <=> b[:bar] }.reverse } }
x.report("sort_by -a[:bar]") { n.times { ary.dup.sort_by{ |a| -a[:bar] } } }
x.report("sort_by a[:bar]*-1") { n.times { ary.dup.sort_by{ |a| a[:bar]*-1 } } }
x.report("sort_by.reverse") { n.times { ary.dup.sort_by{ |a| a[:bar] }.reverse } }
x.report("sort_by.reverse!") { n.times { ary.dup.sort_by{ |a| a[:bar] }.reverse! } }
end
# >> Running Ruby 2.1.1
# >> n=500
# >> user system total real
# >> sort 0.670000 0.000000 0.670000 ( 0.667754)
# >> sort reverse 0.650000 0.000000 0.650000 ( 0.655582)
# >> sort_by -a[:bar] 0.260000 0.010000 0.270000 ( 0.255919)
# >> sort_by a[:bar]*-1 0.250000 0.000000 0.250000 ( 0.258924)
# >> sort_by.reverse 0.250000 0.000000 0.250000 ( 0.245179)
# >> sort_by.reverse! 0.240000 0.000000 0.240000 ( 0.242340)
New results running the above code using Ruby 2.2.1 on a more recent Macbook Pro. Again, the exact numbers aren't important, it's their relationships:
Running Ruby 2.2.1
n=500
user system total real
sort 0.650000 0.000000 0.650000 ( 0.653191)
sort reverse 0.650000 0.000000 0.650000 ( 0.648761)
sort_by -a[:bar] 0.240000 0.010000 0.250000 ( 0.245193)
sort_by a[:bar]*-1 0.240000 0.000000 0.240000 ( 0.240541)
sort_by.reverse 0.230000 0.000000 0.230000 ( 0.228571)
sort_by.reverse! 0.230000 0.000000 0.230000 ( 0.230040)
Updated for Ruby 2.7.1 on a Mid-2015 MacBook Pro:
Running Ruby 2.7.1
n=500
user system total real
sort 0.494707 0.003662 0.498369 ( 0.501064)
sort reverse 0.480181 0.005186 0.485367 ( 0.487972)
sort_by -a[:bar] 0.121521 0.003781 0.125302 ( 0.126557)
sort_by a[:bar]*-1 0.115097 0.003931 0.119028 ( 0.122991)
sort_by.reverse 0.110459 0.003414 0.113873 ( 0.114443)
sort_by.reverse! 0.108997 0.001631 0.110628 ( 0.111532)
...the reverse method doesn't actually return a reversed array - it returns an enumerator that just starts at the end and works backwards.
The source for Array#reverse
is:
static VALUE
rb_ary_reverse_m(VALUE ary)
{
long len = RARRAY_LEN(ary);
VALUE dup = rb_ary_new2(len);
if (len > 0) {
const VALUE *p1 = RARRAY_CONST_PTR_TRANSIENT(ary);
VALUE *p2 = (VALUE *)RARRAY_CONST_PTR_TRANSIENT(dup) + len - 1;
do *p2-- = *p1++; while (--len > 0);
}
ARY_SET_LEN(dup, RARRAY_LEN(ary));
return dup;
}
do *p2-- = *p1++; while (--len > 0);
is copying the pointers to the elements in reverse order if I remember my C correctly, so the array is reversed.
incorrect use of ruby's sort_by function
Use
ALL_VIEWS.replace ALL_VIEWS.sort_by {|view| view.permalink.length }
for the first line. sort_by
does not sort in place - it returns the sorted array and does not modify the original one.
Related Topics
How to Solve Insecure World Writable Dir /Usr in Path,Mode 040777 Warning on Ruby
Difference Between Integer(Value) and Value.To_I
Convert String into Array in Rails
Ruby Minitest: Suite- or Class- Level Setup
In Ruby, Is There No Way to Dynamically Define a Local Variable in the Current Context
Sass/Compass Compile into Many Locations
Sass::Syntaxerror: File to Import Not Found or Unreadable: Compass in Production
How to Run a .Rb File from Irb
Name of This Month (Date.Today.Month as Name)
How to Compile Ruby to Byte Code as with Python
What Is '-Mix' in a Ruby Regular Expression
How to Get Past "Http://Gems.Rubyforge.Org/ Does Not Appear to Be a Repository" Error Message
How to Fire Raw Mongodb Queries Directly in Ruby
Dry Way to Assign Hash Values to an Object
Ruby Ssl Error - Sslv3 Alert Unexpected Message
Rails 3.2 'Link_To' (In Email) with 'Method: :Put' Still Producing Get Request