How do I preserve continuous (1,2,3,...n) ranking notation when ranking in R?
You could do it using dplyr:
library(dplyr)
dense_rank(dat)
[1] 1 1 2 3 3 3 3 3 3 4 5 6 7 8 9
If you don't want to load the whole library, you can do it in base R:
match(dat, sort(unique(dat)))
[1] 1 1 2 3 3 3 3 3 3 4 5 6 7 8 9
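To see why the match() idiom gives gap-free ranks, here is a small sketch. The vector dat below is made up for illustration (it is not the data from the question). It compares rank() with ties.method = "min", which leaves gaps after ties, against the match() approach.

```r
# Hypothetical input; not the data from the question
dat <- c(10, 10, 20, 30, 30, 40)

# "min" ranking: tied values share the lowest rank, so gaps follow the ties
gapped <- rank(dat, ties.method = "min")

# Position of each value among the sorted distinct values: no gaps
dense <- match(dat, sort(unique(dat)))

print(gapped)  # 1 1 3 4 4 6
print(dense)   # 1 1 2 3 3 4
```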
How to get ranks with no gaps when there are ties among values?
I can think of a quick function to do this. It's not optimal with a for loop but it works:)
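x <- c(1, 1, 2, 3, 4, 5, 8, 8)
foo <- function(x) {
  su <- sort(unique(x))
  # Write ranks into a separate vector so a newly assigned rank
  # can never be mistaken for an original value later in the loop
  out <- integer(length(x))
  for (i in seq_along(su)) out[x == su[i]] <- i
  out
}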
foo(x)
[1] 1 1 2 3 4 5 6 6
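If the loop bothers you, the same idea can probably be written without one. This is only a sketch of one alternative, using findInterval() on the sorted distinct values; the variable names are just for illustration.

```r
x <- c(1, 1, 2, 3, 4, 5, 8, 8)

# Sorted distinct values act as the lookup table
su <- sort(unique(x))

# findInterval() returns, for each element of x, the index of the largest
# value in su that is <= that element; every x is itself in su, so that
# index is its gap-free rank
dense <- findInterval(x, su)
print(dense)  # 1 1 2 3 4 5 6 6
```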
Ranking a column in R without skipping ties
We can use dense_rank
library(dplyr)
df %>%
  group_by(A) %>%
  mutate(C = dense_rank(-B)) %>%
  ungroup()
Output:
# A tibble: 3 × 3
  A          B     C
  <chr>  <int> <int>
1 group1   325     1
2 group1   325     1
3 group1   123     2
Data:
df <- structure(list(A = c("group1", "group1", "group1"), B = c(325L,
325L, 123L)), class = "data.frame", row.names = c(NA, -3L))
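If you would rather avoid dplyr here, a rough base R sketch of the same grouped, descending, gap-free rank could use ave(). The helper name dense is made up for this example.

```r
df <- data.frame(A = c("group1", "group1", "group1"),
                 B = c(325L, 325L, 123L))

# Hypothetical helper: gap-free rank of a vector (1 = smallest value)
dense <- function(v) match(v, sort(unique(v)))

# Negate B so the largest value gets rank 1, then rank within each level of A
df$C <- ave(-df$B, df$A, FUN = dense)
print(df)
```

With this data, C comes out as 1, 1, 2, matching the dplyr result above.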
Is there an easy way to rank values in R using two criteria (the second one being for the ties)?
Assuming that the second criterion actually breaks every tie in the first:
order(order(a, b))
#[1] 5 7 3 2 4 1 6
There are probably more efficient alternatives.
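To make the double order() trick concrete, here is a sketch with made-up vectors (the a and b from the question are not shown here). order(a, b) gives the row indices sorted by a and then b, and ordering that permutation inverts it into ranks.

```r
# Hypothetical vectors: a has a tie (the two 4s) that b resolves
a <- c(2, 4, 1, 4, 5)
b <- c(10, 12, 12, 11, 13)

# Row indices sorted by a, then by b
perm <- order(a, b)

# Inverting the permutation gives each row's position in the sorted sequence
ranks <- order(perm)

print(perm)   # 3 1 4 2 5
print(ranks)  # 2 4 1 3 5
```

Note that if a pair (a, b) repeats, order() leaves those rows in their original order, so they would still get distinct ranks. That is why the assumption about ties matters.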
Rank vector with some equal values
Convert the ranks to a factor and back to numeric:
as.numeric(as.factor(rank(-x)))
#[1] 6 1 5 3 3 2 4
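Here is a step-by-step sketch of what that one-liner does. The vector x is an assumption chosen for illustration; the original x is not shown in the answer.

```r
# Hypothetical input with one tie (the two 5s)
x <- c(1, 10, 2, 5, 5, 8, 3)

# Descending rank; the tied values get the fractional rank 3.5
r <- rank(-x)

# Factor levels are the sorted distinct ranks, so the integer codes
# renumber them 1..k with no gaps
dense <- as.numeric(as.factor(r))

print(r)      # 7 1 6 3.5 3.5 2 5
print(dense)  # 6 1 5 3 3 2 4
```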
Is there a simple way to rank on multiple criteria that preserves ties in R?
interaction does what you need:
> rank(interaction(c(2,4,1,4,5),c(10,11,12,11,13), lex.order=TRUE))
[1] 2.0 3.5 1.0 3.5 5.0
Here is what is happening.
interaction expects factors, so the vectors are coerced. Doing so produces the order in the factor levels as indicated by sort.list, which for numeric is numerically nondecreasing order.
Then to combine the two factors, the interaction creates factor levels by varying the second argument fastest (because lex.order=TRUE). Thus ties in the first vector are resolved by the value in the second vector (if possible).
Finally, rank coerces the resulting factor to numeric.
What is actually ranked:
> as.numeric(interaction(c(2,4,1,4,5),c(10,11,12,11,13), lex.order=TRUE))
[1] 5 10 3 10 16
You will save some memory if you supply the option drop=TRUE to interaction. This will change the ranked numeric values, but not their order, so the final result is the same.
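A quick sketch to check the drop=TRUE claim on the same vectors. The variable names are only for illustration.

```r
a <- c(2, 4, 1, 4, 5)
b <- c(10, 11, 12, 11, 13)

full    <- interaction(a, b, lex.order = TRUE)
dropped <- interaction(a, b, lex.order = TRUE, drop = TRUE)

print(nlevels(full))     # 16: every combination of the 4 x 4 levels
print(nlevels(dropped))  # 4: only the combinations that actually occur

# The integer codes differ, but their order agrees, so the ranks match
print(as.numeric(dropped))  # 2 3 1 3 4
print(rank(full))           # 2.0 3.5 1.0 3.5 5.0
print(rank(dropped))        # 2.0 3.5 1.0 3.5 5.0
```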
Rank within groups in R with special NA handling
We can group by 'B', rank on 'C', specify the i with a logical condition to select only the non-NA elements from 'C', and assign (:=) the rank values to create the 'RANK' column. By default, the rows that are not used, i.e. those where 'C' is NA, will be NA in the new column:
library(data.table)
setDT(df)[!is.na(C), RANK := rank(-C) , B]
df
#     A  B  C RANK
#  1: A V1  1  4.0
#  2: A V2  2  3.5
#  3: A V3  3  3.0
#  4: B V1  5  1.0
#  5: B V2  2  3.5
#  6: B V3 NA   NA
#  7: C V1  4  2.0
#  8: C V2  6  2.0
#  9: C V3  7  2.0
# 10: D V1  3  3.0
# 11: D V2  7  1.0
# 12: D V3  8  1.0
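For comparison, a base R sketch that should give the same RANK column without data.table. The data frame is reconstructed from the printed table above, and the approach (ave() plus rank() with na.last = "keep") is an alternative, not what the answer itself uses.

```r
# Data reconstructed from the printed table above
df <- data.frame(
  A = rep(c("A", "B", "C", "D"), each = 3),
  B = rep(c("V1", "V2", "V3"), times = 4),
  C = c(1, 2, 3, 5, 2, NA, 4, 6, 7, 3, 7, 8)
)

# Rank descending within each level of B;
# na.last = "keep" leaves NA inputs as NA instead of ranking them last
df$RANK <- ave(-df$C, df$B, FUN = function(v) rank(v, na.last = "keep"))
print(df)
```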
Rank function inconsistency with the expected output in R
There's probably a more efficient/shorter way to compute the unique values of the union of all instances, but otherwise this is pretty much as @whuber suggested in the comments:
Test case:
instances <- list(c(2,3,4,4,5,6),c(2,3,3,3,4,2))
The only tricky part is making sure we have the full range of levels so that zeros get counted properly:
ulevs <- sort(unique(Reduce(union,instances)))
f <- function(x) {
table(factor(x,levels=ulevs))
}
Apply and convert to a matrix:
t(sapply(instances,f))
##      2 3 4 5 6
## [1,] 1 1 2 1 1
## [2,] 2 3 1 0 0
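To show why the shared levels matter, here is a sketch that contrasts a naive table() per instance with the fixed-level version. It also uses unlist() as one possibly shorter way to collect the distinct values; treat that as a suggestion rather than the original author's method.

```r
instances <- list(c(2, 3, 4, 4, 5, 6), c(2, 3, 3, 3, 4, 2))

# Without shared levels, each table only covers the values it happens to see
naive <- lapply(instances, table)
print(lengths(naive))  # 5 3: the second table has no slots for 5 and 6

# With shared levels every table has the same columns, zeros included
ulevs <- sort(unique(unlist(instances)))
counts <- t(sapply(instances, function(x) table(factor(x, levels = ulevs))))
print(counts)
```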
Rank doesn't start at 1 in R
You need to use dense_rank.
test <- data.frame(column1 = c(5,5,5,6,6,7,7,7,8))
test$rank <- dplyr::dense_rank(test$column1)
Here is how the window ranking functions compare:
library(dplyr)
test %>% rename(input = column1) %>%
  mutate(row_num_output = row_number(input),
         rank_output = min_rank(input),
         dense_rank_output = dense_rank(input))
Comparing the three output columns shows how each function treats the ties in your input.
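If dplyr is not available, the three behaviours can be approximated in base R. This is a sketch of rough counterparts on the same input, not dplyr's actual implementation.

```r
input <- c(5, 5, 5, 6, 6, 7, 7, 7, 8)

# Rough base R counterparts of the three dplyr window functions
row_num_output    <- rank(input, ties.method = "first")  # ties broken by position
rank_output       <- rank(input, ties.method = "min")    # ties share the lowest rank, gaps follow
dense_rank_output <- match(input, sort(unique(input)))   # ties share a rank, no gaps

print(data.frame(input, row_num_output, rank_output, dense_rank_output))
# row_num_output:    1 2 3 4 5 6 7 8 9
# rank_output:       1 1 1 4 4 6 6 6 9
# dense_rank_output: 1 1 1 2 2 3 3 3 4
```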