Quick Sort in Ruby Language

Quick Sort Algorithm in Ruby - Stack Level Too Deep

If we throw a debugging p "Left #{left}, Right #{right}...

def quicksort(arr = [], left = 0, right = 0)
  right = arr.length - 1 if right < 1
  if left < right
    index = partition(arr, left, right)
    quicksort(arr, left, index - 1)
    ...

We find there's a problem. left is never set. It's always the default of 0. And right is doing its own thing.

"Left 0, Right 10"
"Left 0, Right 8"
"Left 0, Right 2"
"Left 0, Right 1"
"Left 0, Right 10"
"Left 0, Right 7"
"Left 0, Right 6"
"Left 0, Right 5"
"Left 0, Right 2"
"Left 0, Right 1"
"Left 0, Right 10"
"Left 0, Right 4"
"Left 0, Right 2"
"Left 0, Right 1"
"Left 0, Right 10"
"Left 0, Right 5"
"Left 0, Right 2"
"Left 0, Right 1"
"Left 0, Right 10"
"Left 0, Right 5"
"Left 0, Right 2"
"Left 0, Right 1"

The problem is right = arr.length - 1 if right < 1. If right is ever < 1 it's set back to the end of the array. left is always 0 so left is always less than right. quicksort(arr, 0, index - 1) is recursed into over and over again. quicksort(arr, index, right) is never reached.

Your partition is fine, and good eye noticing pivot can be calculated inside pivot.

What tripped you up is those defaults. You're setting a default for right any time it's less than 1. But it should only be set if it isn't passed in at all.

def quicksort(arr, left = 0, right = arr.length - 1)
  if left < right
    index = partition(arr, left, right)
    quicksort(arr, left, index - 1)
    quicksort(arr, index, right)
  end
  arr
end

Now quicksort(array) is equivalent to quicksort(array, 0, array.length - 1). Subsequent recursive calls pass left and right in, so no need for defaults.

And no default for the array. If they forget to pass an array in that should be an ArgumentError.

I prefer the public wrapper approach used in the video. If someone accidentally passes in too many arguments they get a clear ArgumentError rather than something weird happening.

# Using the ! convention for functions which alter their arguments.
def quicksort!(array)
  _quicksort!(array, 0, array.length - 1)
end

private def _quicksort!(array, left, right)
  ...
end

In ruby, Is it better to receive *array or to duplicate and array inside a method?

In terms of performance, quick_sort6 is your best bet. Using some random data:

require 'benchmark'

def quick_sort3(ary)
  return ary if ary.size <= 1
  left,right = ary.partition { |v| v < ary.last }
  pivot_value = right.pop
  quick_sort3(left) + [pivot_value] +  quick_sort3(right)
end

def quick_sort6(*ary)
  return ary if ary.empty?
  pivot_value = ary.delete_at(rand(ary.size))
  left,right = ary.partition { |v| v < pivot_value }
  return *quick_sort6(*left), pivot_value, *quick_sort6(*right)
end

def quick_sort4(ary)
  return ary if ary.size <= 1
  pivot_value = ary.delete_at(rand(ary.size))
  left,right = ary.partition { |v| v < pivot_value }
  quick_sort4(left) + [pivot_value] +  quick_sort4(right)
end

def quick_sort5(ary_in)
  return ary_in if ary_in.size <= 1
  ary = ary_in.dup
  pivot_value = ary.delete_at(rand(ary.size))
  left,right = ary.partition { |v| v < pivot_value }
  quick_sort5(left) + [pivot_value] +  quick_sort5(right)
end

random_arrays = Array.new(5000) do
  Array.new(500) { rand(1...500) }.uniq
end

Benchmark.bm do |benchmark|
  benchmark.report("quick_sort3") do
    random_arrays.each do |ra|
      quick_sort3(ra.dup)
    end
  end
  benchmark.report("quick_sort6") do
    random_arrays.each do |ra|
      quick_sort6(ra.dup)
    end
  end
  benchmark.report("quick_sort4") do
    random_arrays.each do |ra|
      quick_sort4(ra.dup)
    end
  end
  benchmark.report("quick_sort5") do
    random_arrays.each do |ra|
      quick_sort5(ra.dup)
    end
  end
end

Gives as result

       user     system      total        real
quick_sort3  1.389173   0.019380   1.408553 (  1.411771)
quick_sort6  0.004399   0.000022   0.004421 (  0.004487)
quick_sort4  1.208003   0.002573   1.210576 (  1.214131)
quick_sort5  1.458327   0.000867   1.459194 (  1.459882)

Tail recursion in ruby - what's the difference between these two implementations?

Neither of the methods you've shown are tail recursive (well technically the first one is half tail-recursive: the second recursive call is a tail call, but the first one is not - the second method is not tail recursive at all).

The reason that the first method overflows the stack, but the second one does not is that the first method recurses much deeper than the second (linearly instead of logarithmically) because of a bug (++m just applies the unary + operator to m twice - it does not actually do anything to m).

When given a large enough array both versions will overflow (and would do so even if ruby did perform TCO), but without the bug 10000 elements is not nearly large enough.

Implementing quicksort in a functional language

So, where exactly is the partitioning happening? Or am I thinking about SMLs quicksort wrongly?

A purely functional implementation of quicksort works by structural recursion on the input list (IMO, this is worth mentioning). Moreover, as you see, the two calls to "filt" allow you to partition the input list into two sublists (say A and B), which then can be treated individually. What is important here is that:

all elements of A are less than or equal to the pivot element ("p" in the code)
all elements of B are greater than the pivot element

An imperative implementation works in-place, by swapping elements in the same array. In the pseudocode you've provided, the post-invariant of the "partition" function is that you have two subarrays, one starting at 'left' of input array (and ending at 'pivotIndex'), and another starting right after 'pivotIndex' and ending at 'right'. What is important here is that the two subarrays can be seen as representations of the sublists A and B.

I think that by now, you have the idea where the partitioning step is happening (or conversely, how the imperative and functional are related).

How is Array#sort in Ruby so quick?

Array#sort is implemented in C, see rb_ary_sort in array.c

It also has some checks to compare Fixnums so sorting an array of Integers doesn't even need method lookups.

quicksort implementation: stack level too deep (SystemStackError)

Your algorithm is working for me, even out to 1e7 when I generate an array for testing.

 quicksort Range.new(1,1e7).to_a.shuffle

Granted, that required about 4.5 GB of RAM to finish. As far as cleaning up the output...

ruby-1.9.3-p0 :018 >       quicksort [1,3,2] # => [1, 2, [], 3, []] 
ruby-1.9.3-p0 :019 >     quicksort [1,4,2,3] # => [1, 2, [3, [4]]]

Change this line:

  return (quicksort(less) << pivot << quicksort(greater)).flatten.compact

And it makes it all much cleaner.

ruby-1.9.3-p0 :037 >       quicksort [1,3,2] # => [1, 2, 3] 
ruby-1.9.3-p0 :038 >     quicksort [1,4,2,3] # => [1, 2, 3, 4]