How to Get Ranks with No Gaps When There Are Ties Among Values

How to get ranks with no gaps when there are ties among values?

A quick function can do this. The for loop is not optimal, but it works:

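x <- c(1, 1, 2, 3, 4, 5, 8, 8)

foo <- function(x) {
  su <- sort(unique(x))      # distinct values in ascending order
  out <- integer(length(x))  # separate result, so an assigned rank is never mistaken for a later value
  for (i in seq_along(su)) out[x == su[i]] <- i
  out
}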

foo(x)

[1] 1 1 2 3 4 5 6 6
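The loop can also be avoided entirely. As a minimal sketch in base R (the name dense_rank_base is made up for illustration and is not part of the original answer), match() looks up each value's position among the sorted distinct values:

```r
# Hypothetical helper: gap-free ranks without an explicit loop
dense_rank_base <- function(x) {
  # position of each value among the sorted distinct values
  match(x, sort(unique(x)))
}

x <- c(1, 1, 2, 3, 4, 5, 8, 8)
dense_rank_base(x)
#> [1] 1 1 2 3 4 5 6 6
```

Because the lookup is vectorised, it should scale better than the loop on long vectors, although that has not been benchmarked here.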

How to break ties in a ranking with gaps in ranking

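We can group by the original rank and add row_number() - 1 within each group, so that tied rows take the consecutive ranks that the gap left free: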

library(dplyr)
data %>%
  group_by(orig) %>%
  mutate(want = orig + row_number() - 1) %>%
  ungroup()

-output

# A tibble: 8 x 2
   orig  want
  <dbl> <dbl>
1     1     1
2     5     5
3     5     6
4     5     7
5    14    14
6    18    18
7    18    19
8    25    25

Or we can simplify this with rowid from data.table, which needs no grouping step:

library(data.table)
data %>%
  mutate(want = orig + rowid(orig) - 1)
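For readers without dplyr or data.table, the same idea can be sketched in base R. This assumes the ranks sit in a plain numeric vector; the values are copied from the orig column of the output above:

```r
# Ranks with ties and gaps, taken from the 'orig' column shown above
orig <- c(1, 5, 5, 5, 14, 18, 18, 25)

# ave() applies seq_along() within each group of equal values,
# giving 1, 2, 3, ... for the rows that share a rank
want <- orig + ave(orig, orig, FUN = seq_along) - 1
want
#> [1]  1  5  6  7 14 18 19 25
```

Like the versions above, this only fills the gaps correctly when each gap is at least as wide as the number of tied rows before it.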

Ranking column in R without skipping ties

We can use dense_rank, which gives tied values the same rank and leaves no gap after them:

library(dplyr)
df %>%
  group_by(A) %>%
  mutate(C = dense_rank(-B)) %>%
  ungroup()

-output

# A tibble: 3 × 3
  A          B     C
  <chr>  <int> <int>
1 group1   325     1
2 group1   325     1
3 group1   123     2

data

df <- structure(list(A = c("group1", "group1", "group1"), B = c(325L, 
325L, 123L)), class = "data.frame", row.names = c(NA, -3L))
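To see what "without skipping" means, it may help to compare a dense ranking with base rank(). The sketch below uses only base R on the B column of the example data; the match() idiom stands in for dense_rank and is an illustration, not how dplyr implements it:

```r
B <- c(325L, 325L, 123L)  # the B column from the example data

# base rank() with ties.method = "min" skips a rank after the tie
rank(-B, ties.method = "min")
#> [1] 1 1 3

# a dense ranking continues with the next integer instead
match(-B, sort(unique(-B)))
#> [1] 1 1 2
```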

How to get relative rankings of numeric elements in a list or vector in R?

I think you are looking for dplyr::dense_rank():

# Example 1
dplyr::dense_rank(c(1, 1, 1, 3, 1, 4, 1))
#> [1] 1 1 1 2 1 3 1

# Example 2
dplyr::dense_rank(c(4, 1, 1, 1, 3, 5, 1))
#> [1] 3 1 1 1 2 4 1

# Example in code
dplyr::dense_rank(c(1, 1, 1, 2, 1, 3, 1))
#> [1] 1 1 1 2 1 3 1

How I can create a new ties.method with the R rank() function?

I believe rank has no option for this. Here is a custom function that does what you want, although it may be too slow on very large data:

Rank <- function(d) {
  j <- unique(rev(sort(d)))  # distinct values, largest first
  # position of each element among the distinct values
  sapply(d, function(dd) which(dd == j))
}
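If speed is a concern, a vectorised variant is possible. The sketch below is an assumption about what the function is meant to return (largest value gets rank 1, no gaps); the name rank_desc is hypothetical:

```r
# Hypothetical vectorised variant of Rank(): one lookup instead of one which() per element
rank_desc <- function(d) {
  match(d, sort(unique(d), decreasing = TRUE))
}

rank_desc(c(10, 30, 30, 20))
#> [1] 3 1 1 2
```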

Is there an easy way to rank values in R using two criteria (the second one being for the ties)?

Assuming the second criterion actually breaks every tie in the first:

order(order(a, b))
#[1] 5 7 3 2 4 1 6

There are probably more efficient alternatives.
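The vectors a and b are not shown in the answer, so here is a small self-contained illustration with made-up data. The inner order() returns the indices in sorted order, and the outer order() inverts that permutation into ranks:

```r
# Hypothetical example data: 'a' is the main criterion, 'b' breaks ties in 'a'
a <- c(2, 1, 2, 1)
b <- c(1, 2, 2, 1)

# order(a, b) lists the indices in sorted order;
# applying order() again turns that permutation into ranks
order(order(a, b))
#> [1] 3 2 4 1
```

If two elements tie on both a and b, order() falls back to their original positions, so they still receive distinct ranks.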

Rank within groups in R with special NA handling

We can group by 'B' and rank on 'C', using a logical condition in i to select only the non-NA elements of 'C', then assign (:=) the ranks to a new 'RANK' column. Rows excluded by i (the NA ones) are left as NA in the new column.

library(data.table)
setDT(df)[!is.na(C), RANK := rank(-C), by = B]
df
#     A  B  C RANK
#  1: A V1  1  4.0
#  2: A V2  2  3.5
#  3: A V3  3  3.0
#  4: B V1  5  1.0
#  5: B V2  2  3.5
#  6: B V3 NA   NA
#  7: C V1  4  2.0
#  8: C V2  6  2.0
#  9: C V3  7  2.0
# 10: D V1  3  3.0
# 11: D V2  7  1.0
# 12: D V3  8  1.0
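A rough base R equivalent, for comparison. The data frame below is reconstructed from the printed result (the original input is not shown), and ave() with na.last = "keep" replaces the data.table i/by mechanics:

```r
# Data reconstructed from the printed result above
df <- data.frame(
  A = rep(c("A", "B", "C", "D"), each = 3),
  B = rep(c("V1", "V2", "V3"), times = 4),
  C = c(1, 2, 3, 5, 2, NA, 4, 6, 7, 3, 7, 8)
)

# rank -C within each level of B; na.last = "keep" leaves NA values unranked (NA)
df$RANK <- ave(-df$C, df$B, FUN = function(v) rank(v, na.last = "keep"))
df$RANK  # expected to match the RANK column printed above
```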

Rank by two columns and keep ties

We can use frank from data.table with ties.method = "dense", grouped by 'ID', on the absolute difference between 'Date' and the reference date ('2015-01-31'):

library(data.table)
setDT(df)[, Sequence := frank(abs(as.IDate(Date, "%d/%m/%Y") -
                                  as.IDate("2015-01-31")),
                              ties.method = "dense"), by = ID]
df
#    ID       Date Sequence
# 1:  A 01/01/2015        3
# 2:  A 02/01/2015        2
# 3:  A 02/01/2015        2
# 4:  A 02/01/2015        2
# 5:  A 05/01/2015        1
# 6:  B 01/01/2015        1

data

df <- structure(list(ID = c("A", "A", "A", "A", "A", "B"), Date = c("01/01/2015", 
"02/01/2015", "02/01/2015", "02/01/2015", "05/01/2015", "01/01/2015"
)), .Names = c("ID", "Date"), class = "data.frame", row.names = c(NA,
-6L))
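The same steps can be sketched in base R to make each one visible: parse the dates, take the absolute distance to the reference date, and rank the distances densely within each ID. The match() idiom for the dense rank is an assumption standing in for frank:

```r
df <- data.frame(
  ID = c("A", "A", "A", "A", "A", "B"),
  Date = c("01/01/2015", "02/01/2015", "02/01/2015",
           "02/01/2015", "05/01/2015", "01/01/2015")
)

# absolute distance in days from the reference date
days <- abs(as.numeric(as.Date(df$Date, format = "%d/%m/%Y") - as.Date("2015-01-31")))

# dense rank of the distances within each ID
df$Sequence <- ave(days, df$ID, FUN = function(v) match(v, sort(unique(v))))
df$Sequence
#> [1] 3 2 2 2 1 1
```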

