Difference Between Sort(), Rank(), and Order()

Difference between sort(), rank(), and order()

sort() sorts the vector in an ascending order.

rank() gives the respective rank of the numbers present in the vector, the smallest number receiving the rank 1.

order() returns the indices of the vector in a sorted order.

for example: if we apply these functions are applied to the vector - c (3, 1, 2, 5, 4)

sort(c (3, 1, 2, 5, 4)) will give c(1,2,3,4,5)

rank(c (3, 1, 2, 5, 4)) will give c(3,1,2,5,4)

order(c (3, 1, 2, 5, 4)) will give c(2,3,1,5,4).
if you put these indices in this order, you will get the sorted vector. Notice how v[2] = 1, v[3] = 2, v[1] = 3, v[5] = 4 and v[4] = 5

also there is a tie handling method in R. If you run rank(c (3, 1, 2, 5, 4, 2)) it will give Rank 1 to 1, since there are two 2 present R will rank them on 2 and 3 but assign Rank 2.5 to each of them, next 3 will get Rank 4.0, so

rank(c (3, 1, 2, 5, 4, 2)) will give you output [4.0 1.0 2.5 6.0 5.0 2.5]

Hope this is helpful.

rank and order in R

set.seed(1)
x <- sample(1:50, 30)
x
# [1] 14 19 28 43 10 41 42 29 27 3 9 7 44 15 48 18 25 33 13 34 47 39 49 4 30 46 1 40 20 8
rank(x)
# [1] 9 12 16 25 7 23 24 17 15 2 6 4 26 10 29 11 14 19 8 20 28 21 30 3 18 27 1 22 13 5
order(x)
# [1] 27 10 24 12 30 11 5 19 1 14 16 2 29 17 9 3 8 25 18 20 22 28 6 7 4 13 26 21 15 23

rank returns a vector with the "rank" of each value. the number in the first position is the 9th lowest. order returns the indices that would put the initial vector x in order.

The 27th value of x is the lowest, so 27 is the first element of order(x) - and if you look at rank(x), the 27th element is 1.

x[order(x)]
# [1] 1 3 4 7 8 9 10 13 14 15 18 19 20 25 27 28 29 30 33 34 39 40 41 42 43 44 46 47 48 49

What's the difference between ordering and sorting?

  • An "ordering" is basically a set of rules that determine what items come before, or after, what other items. IE: the relative order items would appear in if they were sorted. For collections that enforce an ordering, that ordering is generally specified in terms of comparison operators (particularly <), interfaces (a la Java's Comparable<T>), or comparison callbacks and/or function objects (like C++'s std::less).

    There are also collections described as "ordered collections". Slightly different use of the word, but related meaning. What that means is that the collection represents a sequence of items, and thus has some inbuilt concept of order. (Contrast with, say, hash tables. You add something to a hash table, you have no idea where it'll appear if ever you iterate over the contents. With an ordered collection, you know.) Lists, vectors, arrays, etc are the quintessential ordered collections. For a non-list example, though, PHP's "array" type is actually an "ordered map" — a dictionary type where the order of keys is retained. Keys appear in the order in which they were first inserted (or the order in which you last put them with a ksort() or the like) when you iterate over the array.

  • "Sorting" is the process of actually arranging a sequence of items in accordance with a given ordering. It's typically only done with ordered collections...as it doesn't make much sense to put one item before another in a container that has no concept of "before" or won't let you rearrange items in the first place. (Structures like sets and heaps can use an ordering too, and adding and removing entries alters the underlying tree based on the ordering. One could argue they're "sorting" bit by bit. But the word is typically used to represent an operation that does the rearranging all at once.)

Java Comparator sorting order

you need to use sortOrder.equalsIgnoreCase(DESCENDING) in compareTo :

@Override
public int compare(Event e1, Event e2) {
if(sortOrder.equalsIgnoreCase(DESCENDING))
return compareTo(e2, e1);
else
return compareTo(e1, e2);
}

public int compareTo(Event e1, Event e2) {
Tag t1 = getTag(e1, tagName, tagType);
Tag t2 = getTag(e2, tagName, tagType);

if(t1!=null && t2!=null){
if(t1.getRank()==null && t2.getRank()!=null)
return sortOrder.equalsIgnoreCase(DESCENDING) ? 1 : -1;
else if(t2.getRank()==null && t1.getRank()!=null)
return sortOrder.equalsIgnoreCase(DESCENDING) ? -1 : 1;
else if (t1.getRank() < t2.getRank())
return -1;
else if (t1.getRank() > t2.getRank())
return 1;
else
return e1.getId().compareTo(e2.getId()); //If rank null or equal compute rank based on id.
}
return 0;
}

Understanding the order() function

This seems to explain it.

The definition of order is that a[order(a)] is in
increasing order. This works with your example, where the correct
order is the fourth, second, first, then third element.

You may have been looking for rank, which returns the rank of the
elements

R> a <- c(4.1, 3.2, 6.1, 3.1)
R> order(a)
[1] 4 2 1 3
R> rank(a)
[1] 3 2 4 1
so rank tells you what order the numbers are in,
order tells you how to get them in ascending order.

plot(a, rank(a)/length(a)) will give a graph of the CDF. To see why
order is useful, though, try plot(a, rank(a)/length(a),type="S")
which gives a mess, because the data are not in increasing order

If you did

oo<-order(a)
plot(a[oo],rank(a[oo])/length(a),type="S")
or simply

oo<-order(a)
plot(a[oo],(1:length(a))/length(a)),type="S")
you get a line graph of the CDF.

I'll bet you're thinking of rank.

Strange behavior when using apply with rank and order on a data.frame with ordered factors

As requested by the OP, here is a detailed explanation which may help other R users to evade the traps.

Trap 1

As joran has pointed out, apply coerces the data frame into a matrix thereby replacing the ordered factors by characters. So, the original data.frame

data1
x y z
1 6 9 10
2 1 3 1
3 3 8 8
4 3 10 3

becomes

as.matrix(data1)
x y z
[1,] "6" "9" "10"
[2,] "1" "3" "1"
[3,] "3" "8" "8"
[4,] "3" "10" "3"

Trap 2

Characters are sorted lexically. Thus, sorting the y column as character returns

sort(c("9", "3", "8", "10"))
[1] "10" "3" "8" "9"

instead of

sort(c(9, 3, 8, 10))
[1] 3 8 9 10

This explains why apply returns a different result for the rank operation here.

Solution

You can use lapply to compute the rank of each column of the data frame.

as.data.frame(lapply(data1, rank))
x y z
1 4.0 3 4
2 1.0 1 1
3 2.5 2 3
4 2.5 4 2

lapply returns a list and a data frame is a special kind of list.

Avoid sapply because sapply takes the output of lapplyand "simplifies" it to something what it thinks is appropriate. Here,

sapply(data1, rank)
x y z
[1,] 4.0 3 4
[2,] 1.0 1 1
[3,] 2.5 2 3
[4,] 2.5 4 2

returns a matrix (again!) which needs to be coerced to a data frame. (See chapter 8.3.20 of The R Inferno by Patrick Burns.The text is a good read, anyway.)

Alternative Solution

The OP has not given an indication why he needs to work with ordered factors. If factors, ordered or not, are not essential to the OPs underlying problem, then applywould have worked as expected.

set.seed(4)
x2 <- sample(1:10, size = 4, replace = T)
y2 <- sample(1:10, size = 4, replace = T)
z2 <- sample(1:10, size = 4, replace = T)
data2 <- data.frame(x2, y2, z2)
data2
x2 y2 z2
1 6 9 10
2 1 3 1
3 3 8 8
4 3 10 3
apply(data2, 2, rank)
x2 y2 z2
[1,] 4.0 3 4
[2,] 1.0 1 1
[3,] 2.5 2 3
[4,] 2.5 4 2

(Nevertheless, better to use lapply instead of apply with a data frame).

Trap 3

When I started to learn R, I was misled by the name of the function ordered(). It took me a while to understand that it creates a special kind of factors. Likewise, it took me some time to figure out the difference between sort() and order() and when to use which function appropriately.

How to sort an array(in ascending order) based on the ranks of each element which is in the another array?

The short answer is: std::sort.

The long answer depends on how you store the values and the ranks, whether you can afford to copy them, and whether you also need the ranks sorted. For the following I assumed that you do not need the ranks sorted and the sorting is done on the original container of values. It is not the most efficient implementation, but it is simple enough to get you started:

#include <vector>
#include <utility>
#include <algorithm>
#include <iostream>
#include <iterator>

template <typename Iter,typename RankIter>
void rank_sort(Iter begin,Iter end,RankIter rankbegin){
std::vector<std::pair<typename std::iterator_traits<RankIter>::value_type,
typename std::iterator_traits<Iter>::value_type >> res;
for (auto beg = begin; beg != end; ++beg,++rankbegin) res.emplace_back(*rankbegin,*beg);
std::sort(res.begin(),res.end());
for (const auto& e : res){
*begin = e.second;
++begin;
}
}

int main() {
std::vector<int> arr{1,5,6,3,10};
std::vector<int> rank{100,0,1,100,2};
rank_sort(arr.begin(),arr.end(),rank.begin());
for (const auto& a : arr) { std::cout << a << " ";}
}

The basic idea is to create a std::vector<std::pair<int,int>> and then simply sort that via std::sort. The rest of the code is about copying the values and ranks into that vector and copying the values out of it after sorting.

If possible you should store the values and ranks in a std::vector<std::pair<int,int>> in the first place. Then sorting them is trivial. Alternatively, as mentioned in a comment, you can use a sorted container, like eg a std::map.

Javascript sort array by a list of ranks

you first need a reference for the rank of the officer, the higher the greater the weight. So let's make a hash and order it in some way of the ranks. For this I'll choose an descending order since you've provided that already:

let ranks = {
CFO: 1,
DCFO: 2,
SSO: 3,
SO: 4,
SFF: 5,
QFF: 6,
FF: 7,
RFF: 8,
OS: 9
}

Next we want to organize our array of officers by their rank so we'll need a comparison function

compareRank( left, right ){
return ranks[left.rank] - ranks[right.rank]
}

next we need to sort our array by the sort function

let sortedOfficers = officers.sort(compareRank)

They will now be sorted in descending order(really ascending, however we gave higher ranks lesser value, so it is reversed).


Another ways to sort would include concatenating of lists ranks in the order you wanted to sort them based on the ranks you wanted to do

function fieldIs(key, value){ return function(object){ return object[key] == value } }

let sortedOfficers = [].concatenate(
officers.filter(fieldIs('rank', 'CFO'),
officers.filter(fieldIs('rank', 'DCFO'),
officers.filter(fieldIs('rank', 'SSO'),
officers.filter(fieldIs('rank', 'SO'),
officers.filter(fieldIs('rank', 'SFF'),
officers.filter(fieldIs('rank', 'QFF'),
officers.filter(fieldIs('rank', 'FF'),
officers.filter(fieldIs('rank', 'RFF'),
officers.filter(fieldIs('rank', 'OS')
)

This method, while more verbose and much slower, is a little more adaptable since you could easily compose the parts based on an input ordered array of values for the field to be

function fieldSort(field, orderedValues, things){
return [].concatenate(
...orderedValues.map(value =>
things.filter(
fieldIs(
field,
value
)
)
)
}

There are of course may other ways to sort, but these are just a few examples to help show that if you have some sort of definition of order, you can sort based on that order. Now the real question becomes how would you combine the two methods based on the idea that using Array.prototype.reduce() on the orderedValues array could be used to create that hash in the first example?

Let's try it:

function fieldSort(field, orderedValues, things){
let valueHash = orderedValues.reduce((acc, value, index) =>{
acc[value] = index
return acc
},
{}
)
return things.sort((left, right) =>
valueHash[left[field]] - valueHash[right[field]]
)
}

The good news about this third option: It removes all of the looping form the second option. It loops only as much as needed to sort + one loop over the ordered array. This make it much faster than the second example by taking the ideas from the first example.

Ranking list in ascending order but keeping the index same

Basically, my answer is an addition to Alex's (which posted 2 minutes after I left this comment).

There's built-in enumerate() function which yields tuple of index and value of each item in iterable, you may pass it's result directly into sorted() with itemgetter(1) as key argument. It will return list of tuples (index, value) sorted by second element (which is value).

To get two separate lists from list of pairs, you can unpack list into zip() and assign result to two separate variables.

Code:

from operator import itemgetter

a = [1,2,3,4,5,22,32,43,53,101]
indexes, values = zip(*sorted(enumerate(a), key=itemgetter(1)))


Related Topics



Leave a reply



Submit