Rank Vector with Some Equal Values

Convert to factor and back to numeric

as.numeric(as.factor(rank(-x)))
#[1] 6 1 5 3 3 2 4

How to rank values in a vector and give them corresponding values?

That's clearer. In that case:

> vect = c(41,42,5,6,3,12,10,15,2,3,4,13,2,33,4,1,1)
> cbind(vect,as.numeric(factor(vect)))
      vect   
 [1,]   41 12
 [2,]   42 13
 [3,]    5  5
 [4,]    6  6
 [5,]    3  3
 [6,]   12  8
 [7,]   10  7
 [8,]   15 10
 [9,]    2  2
[10,]    3  3
[11,]    4  4
[12,]   13  9
[13,]    2  2
[14,]   33 11
[15,]    4  4
[16,]    1  1
[17,]    1  1

No sorting is needed, because factor() takes the sorted unique values as its levels by default. See also ?factor
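The same idea can be sketched in Python for comparison. This is only a minimal illustration, not part of the R answer: the name dense_rank and its reverse flag are made up here, and the sorted distinct values stand in for the factor levels.

```python
def dense_rank(values, reverse=False):
    """Return 1-based ranks with no gaps; equal values share a rank."""
    # The sorted distinct values play the role of R's factor levels.
    levels = sorted(set(values), reverse=reverse)
    # Map each level to its 1-based position, then look up every element.
    position = {level: i for i, level in enumerate(levels, start=1)}
    return [position[v] for v in values]


vect = [41, 42, 5, 6, 3, 12, 10, 15, 2, 3, 4, 13, 2, 33, 4, 1, 1]
print(dense_rank(vect))
# [12, 13, 5, 6, 3, 8, 7, 10, 2, 3, 4, 9, 2, 11, 4, 1, 1]
```

Passing reverse=True ranks the largest value first, which is roughly what rank(-x) followed by the factor conversion does.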

If you would rather number the values in order of first appearance, then:

> cbind(vect,as.numeric(factor(vect,levels=unique(vect))))
      vect   
 [1,]   41  1
 [2,]   42  2
 [3,]    5  3
 [4,]    6  4
 [5,]    3  5
 [6,]   12  6
 [7,]   10  7
 [8,]   15  8
 [9,]    2  9
[10,]    3  5
[11,]    4 10
[12,]   13 11
[13,]    2  9
[14,]   33 12
[15,]    4 10
[16,]    1 13
[17,]    1 13
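For comparison, here is a minimal Python sketch of the same first-appearance numbering. The function name rank_by_first_appearance is hypothetical; a plain dict plays the part of levels=unique(vect).

```python
def rank_by_first_appearance(values):
    """Number distinct values 1, 2, ... in the order they first occur."""
    codes = {}
    result = []
    for v in values:
        # setdefault stores the next unused code the first time v is seen
        # and returns the existing code on every later occurrence.
        result.append(codes.setdefault(v, len(codes) + 1))
    return result


vect = [41, 42, 5, 6, 3, 12, 10, 15, 2, 3, 4, 13, 2, 33, 4, 1, 1]
print(rank_by_first_appearance(vect))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 5, 10, 11, 9, 12, 10, 13, 13]
```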

Looking for an FP ranking implementation which handles ties (i.e. equal values)

This works well for me, provided the input is already sorted (otherwise use vs.distinct.sorted before zipWithIndex):

// scala
val vs = Vector(1, 1, 3, 3, 3, 5, 6)
val rank = vs.distinct.zipWithIndex.toMap
val result = vs.map(i => (rank(i), i))

The same in Java 8 using Javaslang:

// java(slang)
Vector<Integer>                  vs = Vector(1, 1, 3, 3, 3, 5, 6);
Function<Integer, Integer>       rank = vs.distinct().zipWithIndex().toMap(t -> t);
Vector<Tuple2<Integer, Integer>> result = vs.map(i -> Tuple(rank.apply(i), i));

The output of both variants is

Vector((0, 1), (0, 1), (1, 3), (1, 3), (1, 3), (2, 5), (3, 6))

*) Disclosure: I'm the creator of Javaslang

Create ranking for vector of double

One way to do so would be using a multimap.

Place the items in a multimap mapping your objects to size_ts (the initial values are unimportant). You can do this in one line by using the constructor that takes iterators.
Loop over the entries (either with a plain loop or with something from <algorithm>) and assign 0, 1, ... as the values.
Loop over the distinct keys. For each distinct key, call equal_range for the key and set its values to their average (again, you can use <algorithm> for this). If the average can be fractional, store the ranks as double rather than size_t.

The overall complexity should be Theta(n log(n)), where n is the length of the vector.
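The steps above can be sketched in Python as follows. This is an illustration under assumptions, not the C++ multimap code itself: Python has no multimap, so a sorted list of indices plus itertools.groupby stands in for equal_range, and the name average_ranks and its start parameter are made up here.

```python
from itertools import groupby


def average_ranks(values, start=0):
    """Rank values in ascending order; ties receive the mean of their positions."""
    # Step 1: order the items by key (the sort is the n log n part).
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    position = 0  # Step 2: positions 0, 1, ... in sorted order.
    # Step 3: each run of equal keys gets the average of its positions.
    for _, group in groupby(order, key=values.__getitem__):
        indices = list(group)
        first = position
        last = position + len(indices) - 1
        mean_rank = start + (first + last) / 2
        for i in indices:
            ranks[i] = mean_rank
        position = last + 1
    return ranks


print(average_ranks([1, 2, 3, 3, 3, 4, 5]))
# [0.0, 1.0, 3.0, 3.0, 3.0, 5.0, 6.0]
```

With start=1 the ranks begin at 1 instead of 0.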

How to get ranks with no gaps when there are ties among values?

I can think of a quick function to do this. It's not optimal because it uses a for loop, but it works :)

x=c(1,1,2,3,4,5,8,8)

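foo <- function(x){
    # sorted distinct values; the i-th one receives rank i
    su <- sort(unique(x))
    # new vector for the ranks, so a rank already written is never matched again as a value
    r <- integer(length(x))
    # seq_along() is safe when x is empty, unlike 1:length(su)
    for (i in seq_along(su)) r[x == su[i]] <- i
    return(r)
}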

foo(x)

[1] 1 1 2 3 4 5 6 6

Efficient method to calculate the rank vector of a list in Python

Using scipy, the function you are looking for is scipy.stats.rankdata:

In [13]: import scipy.stats as ss
In [19]: ss.rankdata([3, 1, 4, 15, 92])
Out[19]: array([ 2.,  1.,  3.,  4.,  5.])

In [20]: ss.rankdata([1, 2, 3, 3, 3, 4, 5])
Out[20]: array([ 1.,  2.,  4.,  4.,  4.,  6.,  7.])

The ranks start at 1, rather than 0 (as in your example), but then again, that's the way R's rank function works as well.

Here is a pure-Python equivalent of scipy's rankdata function:

def rank_simple(vector):
    # Indices that would sort the vector (an argsort).
    return sorted(range(len(vector)), key=vector.__getitem__)

def rankdata(a):
    n = len(a)
    ivec = rank_simple(a)
    svec = [a[rank] for rank in ivec]  # values in sorted order
    sumranks = 0
    dupcount = 0
    newarray = [0] * n
    for i in range(n):  # range replaces Python 2's xrange
        sumranks += i
        dupcount += 1
        # End of a run of equal values: give every member the average rank.
        if i == n - 1 or svec[i] != svec[i + 1]:
            averank = sumranks / float(dupcount) + 1
            for j in range(i - dupcount + 1, i + 1):
                newarray[ivec[j]] = averank
            sumranks = 0
            dupcount = 0
    return newarray

print(rankdata([3, 1, 4, 15, 92]))
# [2.0, 1.0, 3.0, 4.0, 5.0]
print(rankdata([1, 2, 3, 3, 3, 4, 5]))
# [1.0, 2.0, 4.0, 4.0, 4.0, 6.0, 7.0]
