How to Preserve Continuous (1,2,3,...N) Ranking Notation When Ranking in R

How do I preserve continuous (1,2,3,...n) ranking notation when ranking in R?

You could do it using dplyr:

library(dplyr)
dense_rank(dat)

[1] 1 1 2 3 3 3 3 3 3 4 5 6 7 8 9

If you would rather not load a package, base R can do the same:

match(dat, sort(unique(dat)))

[1] 1 1 2 3 3 3 3 3 3 4 5 6 7 8 9
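
To see what "no gaps" means in practice, here is a minimal base R sketch. The vector dat below is made up for illustration (the post does not show the original data). It contrasts rank with ties.method = "min", which skips numbers after ties, with the match approach above.

```r
# Hypothetical input; the original `dat` is not shown in the post
dat <- c(10, 10, 20, 35, 35, 50)

# rank() with ties.method = "min" leaves gaps after tied values
gapped <- rank(dat, ties.method = "min")

# match() against the sorted unique values gives consecutive ranks
dense <- match(dat, sort(unique(dat)))

print(gapped)  # 1 1 3 4 4 6
print(dense)   # 1 1 2 3 3 4
```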

How to get ranks with no gaps when there are ties among values?

Here is a quick function that does this. The for loop is not optimal, but it works:

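x <- c(1, 1, 2, 3, 4, 5, 8, 8)

foo <- function(x) {
  su <- sort(unique(x))  # distinct values in increasing order
  out <- integer(length(x))
  # Write ranks into a separate vector: overwriting x in place would
  # misrank inputs such as c(0, 1), where a new rank equals a later value
  for (i in seq_along(su)) out[x == su[i]] <- i
  out
}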

foo(x)

[1] 1 1 2 3 4 5 6 6
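
A loop-free variant is possible as well. The sketch below is one possible alternative, not the answerer's code: the helper name dense_rank_vec is hypothetical, and it relies on factor ordering the levels of a numeric vector in increasing order.

```r
# Hypothetical helper: dense ranks via the integer codes of a factor
dense_rank_vec <- function(x) {
  as.integer(factor(x))
}

x <- c(1, 1, 2, 3, 4, 5, 8, 8)
res <- dense_rank_vec(x)
print(res)   # 1 1 2 3 4 5 6 6

# Values smaller than 1 are handled the same way
res0 <- dense_rank_vec(c(0, 1, 1, 8))
print(res0)  # 1 2 2 3
```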

Ranking a column in R without skipping ties

We can use dense_rank from dplyr, negating B so that the largest value gets rank 1:

library(dplyr)
df %>%
  group_by(A) %>%
  mutate(C = dense_rank(-B)) %>%
  ungroup()

Output:

# A tibble: 3 × 3
  A          B     C
  <chr>  <int> <int>
1 group1   325     1
2 group1   325     1
3 group1   123     2

Data:

df <- structure(list(A = c("group1", "group1", "group1"), B = c(325L, 
325L, 123L)), class = "data.frame", row.names = c(NA, -3L))
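
If dplyr is not available, a grouped dense rank can be approximated in base R. This is only a sketch under the assumption that ave plus match is an acceptable stand-in for group_by and dense_rank; it reuses the sample data from this answer.

```r
df <- data.frame(A = c("group1", "group1", "group1"),
                 B = c(325L, 325L, 123L))

# Within each group of A, rank -B densely (largest B gets rank 1)
df$C <- ave(-df$B, df$A,
            FUN = function(v) match(v, sort(unique(v))))

print(df$C)  # 1 1 2
```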

Is there an easy way to rank values in R using two criteria (the second one being for the ties)?

Assuming the second vector, b, actually breaks every tie in a:

order(order(a, b))
#[1] 5 7 3 2 4 1 6

There are probably more efficient alternatives.
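
Why applying order twice yields ranks can be seen on a small case. The vectors a and b below are invented for illustration, since the post does not show the originals.

```r
# Hypothetical inputs: b breaks the tie between the two 1s in a
a <- c(2, 3, 1, 1)
b <- c(1, 1, 2, 1)

# First order(): which element comes 1st, 2nd, ... when sorted by a, then b
o <- order(a, b)
print(o)      # 4 3 1 2

# Second order(): the sorted position of each original element, i.e. its rank
ranks <- order(o)
print(ranks)  # 3 4 2 1
```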

Rank vector with some equal values

Convert the ranks to a factor and back to numeric:

as.numeric(as.factor(rank(-x)))
#[1] 6 1 5 3 3 2 4
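
The intermediate steps make it clearer why this works. The input x below is an assumed example, not the data from the question.

```r
# Hypothetical input with one tie
x <- c(10, 50, 20, 30, 30)

# Default ranking averages tied positions, giving fractional ranks
r <- rank(-x)
print(r)      # 5.0 1.0 4.0 2.5 2.5

# The factor levels are the distinct ranks in increasing order, so the
# integer codes renumber them 1, 2, 3, ... with no gaps
dense <- as.numeric(as.factor(r))
print(dense)  # 4 1 3 2 2
```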

Is there a simple way to rank on multiple criteria that preserves ties in R?

interaction does what you need:

> rank(interaction(c(2,4,1,4,5),c(10,11,12,11,13), lex.order=TRUE))
[1] 2.0 3.5 1.0 3.5 5.0

Here is what is happening.

interaction expects factors, so the numeric vectors are coerced. The coercion orders the factor levels as sort.list would, which for numeric input is nondecreasing numeric order.

Then, to combine the two factors, interaction creates factor levels by varying the second argument fastest (because lex.order=TRUE). Thus ties in the first vector are resolved by the value in the second vector (if possible).

Finally, rank coerces the resulting factor to its integer codes and ranks those.

What is actually ranked:

> as.numeric(interaction(c(2,4,1,4,5),c(10,11,12,11,13), lex.order=TRUE))
[1] 5 10 3 10 16

You will save some memory if you supply the option drop=TRUE to interaction. This will change the ranked numeric values, but not their order, so the final result is the same.
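
A quick way to check the drop=TRUE claim is to compute both versions and compare them. This sketch reuses the two vectors from the example above; the variable names are my own.

```r
a <- c(2, 4, 1, 4, 5)
b <- c(10, 11, 12, 11, 13)

# All 16 level combinations are kept
r_full <- rank(interaction(a, b, lex.order = TRUE))

# Only the 4 combinations that actually occur are kept
r_drop <- rank(interaction(a, b, lex.order = TRUE, drop = TRUE))

print(r_full)  # 2.0 3.5 1.0 3.5 5.0
print(r_drop)  # 2.0 3.5 1.0 3.5 5.0
```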

Rank within groups in R with special NA handling

We can group by 'B' and rank on 'C'. In i, a logical condition selects only the non-NA elements of 'C', and := assigns the rank values to a new 'RANK' column. Rows not selected in i (those where 'C' is NA) are left as NA in the new column.

library(data.table)
setDT(df)[!is.na(C), RANK := rank(-C) , B]
df
#     A  B  C RANK
#  1: A V1  1  4.0
#  2: A V2  2  3.5
#  3: A V3  3  3.0
#  4: B V1  5  1.0
#  5: B V2  2  3.5
#  6: B V3 NA   NA
#  7: C V1  4  2.0
#  8: C V2  6  2.0
#  9: C V3  7  2.0
# 10: D V1  3  3.0
# 11: D V2  7  1.0
# 12: D V3  8  1.0
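
The same idea can be sketched without data.table. The version below is a base R approximation on a small made-up data frame, not the answerer's method: it ranks only the non-NA rows within each group and leaves the rest as NA.

```r
# Hypothetical data: one NA in group V2
df <- data.frame(B = c("V1", "V1", "V2", "V2"),
                 C = c(1, 5, NA, 2))

ok <- !is.na(df$C)       # rows that take part in the ranking
df$RANK <- NA_real_      # rows with NA in C keep NA here
df$RANK[ok] <- ave(-df$C[ok], df$B[ok], FUN = rank)

print(df$RANK)  # 2 1 NA 1
```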

Rank function inconsistency with the expected output in R

There's probably a more efficient/shorter way to compute the unique values of the union of all instances, but otherwise this is pretty much as @whuber suggested in the comments:

Test case:

instances <- list(c(2,3,4,4,5,6),c(2,3,3,3,4,2))

The only tricky part is making sure we have the full range of levels so that zeros get counted properly:

ulevs <- sort(unique(Reduce(union, instances)))
f <- function(x) {
  table(factor(x, levels = ulevs))
}

Apply and convert to a matrix:

t(sapply(instances,f))
##      2 3 4 5 6
## [1,] 1 1 2 1 1
## [2,] 2 3 1 0 0
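
The role of the explicit levels is easiest to see on a single short vector. The input x here is an assumed example.

```r
# Hypothetical input that never contains 4, 5 or 6
x <- c(2, 3, 3)

# Without levels, table() only reports the values that occur
counts_plain <- table(x)
print(names(counts_plain))     # "2" "3"

# With the full set of levels, absent values are counted as 0
counts_full <- table(factor(x, levels = 2:6))
print(as.vector(counts_full))  # 1 2 0 0 0
```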

Rank doesn't start at 1 in R

You need to use dense_rank.

test <- data.frame(column1 = c(5,5,5,6,6,7,7,7,8))
test$rank <- dplyr::dense_rank(test$column1)

How the window ranking functions compare:

library(dplyr)
test %>%
  rename(input = column1) %>%
  mutate(row_num_output = row_number(input),
         rank_output = min_rank(input),
         dense_rank_output = dense_rank(input))

The original post showed the output as an image; it listed row_num_output, rank_output and dense_rank_output next to input.
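
As a rough check on how the three functions differ, the sketch below computes the same three columns with base R only. It assumes, per the dplyr documentation, that row_number and min_rank behave like rank with ties.method "first" and "min"; the match-based dense rank is a stand-in for dense_rank, not dplyr's own code.

```r
input <- c(5, 5, 5, 6, 6, 7, 7, 7, 8)

# Ties broken by position: every row gets its own number
row_num_output <- rank(input, ties.method = "first")

# Ties share the lowest rank, and the next rank is skipped ahead
rank_output <- rank(input, ties.method = "min")

# Ties share a rank and the next distinct value gets the next integer
dense_rank_output <- match(input, sort(unique(input)))

print(row_num_output)     # 1 2 3 4 5 6 7 8 9
print(rank_output)        # 1 1 1 4 4 6 6 6 9
print(dense_rank_output)  # 1 1 1 2 2 3 3 3 4
```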


