What is stability in sorting algorithms and why is it important?
A sorting algorithm is said to be stable if two objects with equal keys appear in the same order in sorted output as they appear in the input array to be sorted. Some sorting algorithms are stable by nature like Insertion sort, Merge Sort, Bubble Sort, etc. And some sorting algorithms are not, like Heap Sort, Quick Sort, etc.
Background: a "stable" sorting algorithm keeps the items with the same sorting key in order. Suppose we have a list of 5-letter words:
peach
straw
apple
spork
If we sort the list by just the first letter of each word then a stable-sort would produce:
apple
peach
straw
spork
In an unstable sort algorithm, straw
or spork
may be interchanged, but in a stable one, they stay in the same relative positions (that is, since straw
appears before spork
in the input, it also appears before spork
in the output).
We could sort the list of words using this algorithm: stable sorting by column 5, then 4, then 3, then 2, then 1.
In the end, it will be correctly sorted. Convince yourself of that. (by the way, that algorithm is called radix sort)
Now to answer your question, suppose we have a list of first and last names. We are asked to sort "by last name, then by first". We could first sort (stable or unstable) by the first name, then stable sort by the last name. After these sorts, the list is primarily sorted by the last name. However, where last names are the same, the first names are sorted.
You can't stack unstable sorts in the same fashion.
How do I do stable sort?
Put the key that you originally wanted to sort by and the index into an array, and sort by that.
a.sort_by.with_index { |(x, y), i| [y, i] }
# => [[:a, 0], [:c, 0], [:d, 0], [:b, 1]]
How to perform stable sort in C++ when using a custom comparator?
The comparator for sort
and stable_sort
must induce a strict weak order. Your comparator does not satisfy this condition.
One of the properties of a strict weak order is that for any permitted values of i
and j
, at most one of mysort(i,j)
and mysort(j,i)
can return true
. Your comparator returns true
for both cases when, e.g., i=2
and j=4
.
To fix that, you must modify your comparator. When is arg1
"less" than arg2
? There is only one case: When arg1
is even and arg2
is odd. Hence, a usable definition would be:
bool mysort(int arg1,int arg2){
return arg1 % 2 == 0 && arg2 % 2 != 0;
}
BTW, when you define a comparator, you have at no time to think about in which order the algorithm will pass the parameters. All the (earlier, later) discussions are irrelevant. The algorithm will pass parameters in which ever order it likes. There are no guarantees.
How to perform stable sort over multiple columns?
It is a solvable problem. You don't have to throw your hands up and say computers cannot be used to solve problems.
Start with the data, and add a new surrogate "Rank" column.
Date Id Rank
-------------- ---- ----
null 7 null
null 1 null
null 9 null
null 2 null
null 8 null
11/1/2017 null null
11/4/2017 3 null
11/5/2017 null null
11/12/2017 10 null
11/13/2017 null null
Then seed Rank
to the Id
:
UPDATE SortDemo SET Rank = Id
WHERE Id IS NOT NULL
Date Id Rank
-------------- ---- ----
null 7 7
null 1 1
null 9 9
null 2 2
null 8 8
11/1/2017 null null
11/4/2017 3 3
11/5/2017 null null
11/12/2017 10 10
11/13/2017 null null
Then for the remaining items items with a Date, we need to assign it to the "next" rank:
Date Id Rank
-------------- ---- ----
null 7 7
null 1 1
null 9 9
null 2 2
null 8 8
11/1/2017 null null <-- 3
11/4/2017 3 3
11/5/2017 null null <-- 10
11/12/2017 10 10
11/13/2017 null null <-- ?
with
UPDATE SortDemo SET Rank = (
SELECT MIN(Rank) FROM SortDemo s2
WHERE s2.Date >= SortDemo.Date)
WHERE SortDemo.Date IS NOT NULL
Date Id Rank
-------------- ---- ----
null 7 7
null 1 1
null 9 9
null 2 2
null 8 8
11/1/2017 null 3 <--
11/4/2017 3 3
11/5/2017 null 10 <--
11/12/2017 10 10
11/13/2017 null null <-- ?
There's also the edge case where there are no items "after" us; which we can fix by going backwards to one the highest previous rank:
UPDATE SortDemo SET Rank = (
SELECT MAX(Rank) FROM SortDemo s2
WHERE s2.Date <= SortDemo.Date)
WHERE SortDemo.Date IS NOT NULL
Date Id Rank
-------------- ---- ----
null 7 7
null 1 1
null 9 9
null 2 2
null 8 8
11/1/2017 null 3
11/4/2017 3 3
11/5/2017 null 10
11/12/2017 10 10
11/13/2017 null 10 <--10
And we're done
We can now sort by surrogate Rank
, and break ties by sorting by Date
:
SELECT * FROM SortDemo
ORDER BY Rank, Date
Date Id Rank
-------------- ---- ----
null 1 1
null 2 2
11/1/2017 null 3
11/4/2017 3 3
null 7 7
null 8 8
null 9 9
11/5/2017 null 10
11/12/2017 10 10
11/13/2017 null 10
Solution complete. Sql Fiddle
The answer is in escrow until Monday, so i can give people the opportunity to earn reputation for solving a unique problem.
Related Topics
Disable Rdoc and Ri Generation by Default for Rubygems 1.8.X
How to Add Syntax Highlighting to Sublime Text 2
Undefined Method Respond_To in Rails 5 Controller
Install Ruby 2.2 on MAC Osx Catalina with Ruby-Install
How to Quickly Reorder a Ruby Array Given an Order
How to Use Us-Style Dates in Rails Using Ruby 1.9
2 Gems Need Different Versions of the Same Dependency
Rvm Error While Running Make Install. Error Comes While Installing Power_Assert Gem
Error Installing Rubymine, No Sdk Specified, But It Is Listed
How to Get the Destination Url of a Shortened Url Using Ruby
Ruby: What Is the Order of Keys/Values Returned by Hash.Keys and Hash.Values Methods
How to Perform Vector Addition in Ruby
How to Run Rake with --Trace Within Capistrano
Ruby - Extracting the Unique Values Per Key from an Array of Hashes