How to get ranks with no gaps when there are ties among values?
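I can think of a quick function to do this. It's not optimal with a for loop, but it works :)
x <- c(1, 1, 2, 3, 4, 5, 8, 8)
foo <- function(x) {
  su <- sort(unique(x))
  # Write ranks to a new vector so later comparisons still see the original values
  out <- integer(length(x))
  for (i in seq_along(su)) out[x == su[i]] <- i
  out
}
foo(x)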
[1] 1 1 2 3 4 5 6 6
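The same gap-free ranking can also be sketched without a loop. This is a minimal base R sketch, not the answerer's code, and the helper name `dense_rank_base` is made up for illustration. It looks up each value's position among the sorted distinct values with `match()`.

```r
# Hypothetical helper (the name is an assumption): gap-free ("dense") ranks in base R.
dense_rank_base <- function(x) {
  # sort(unique(x)) lists each distinct value once, in increasing order;
  # match() returns the position of every element of x in that list.
  match(x, sort(unique(x)))
}

x <- c(1, 1, 2, 3, 4, 5, 8, 8)
dense_rank_base(x)
#> [1] 1 1 2 3 4 5 6 6
```

Because `sort()` drops missing values, any `NA` in the input comes back as `NA` in the result.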
How to break ties in a ranking with gaps in ranking
We may add row_number() after grouping:
library(dplyr)
data %>%
group_by(orig) %>%
mutate(want = orig + row_number() - 1) %>%
ungroup
-output
# A tibble: 8 x 2
orig want
<dbl> <dbl>
1 1 1
2 5 5
3 5 6
4 5 7
5 14 14
6 18 18
7 18 19
8 25 25
Or we may simplify with rowid from data.table:
library(dplyr)
library(data.table)
data %>%
mutate(want = orig + rowid(orig)-1)
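If neither dplyr nor data.table is available, roughly the same result can be sketched in base R. The input below is rebuilt from the `orig` column printed above, which is an assumption about the asker's real data. `ave()` with `seq_along` numbers the members of each group of tied values.

```r
# Input rebuilt from the 'orig' column printed above (an assumption about the real data).
data <- data.frame(orig = c(1, 5, 5, 5, 14, 18, 18, 25))

# ave() applies seq_along() within each group of equal 'orig' values,
# giving 1, 2, 3, ... for the members of each tie.
within_tie <- ave(data$orig, data$orig, FUN = seq_along)

# Shift each tied value by its position in the tie, minus one.
data$want <- data$orig + within_tie - 1
data$want
#> [1]  1  5  6  7 14 18 19 25
```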
Ranking column in R without skipping ties
We can use dense_rank
library(dplyr)
df %>%
group_by(A) %>%
mutate(C = dense_rank(-B)) %>%
ungroup
-output
# A tibble: 3 × 3
A B C
<chr> <int> <int>
1 group1 325 1
2 group1 325 1
3 group1 123 2
data
df <- structure(list(A = c("group1", "group1", "group1"), B = c(325L,
325L, 123L)), class = "data.frame", row.names = c(NA, -3L))
How to get relative rankings of numeric elements in a list or vector in R?
I think you are looking for dplyr::dense_rank():
# Example 1
dplyr::dense_rank(c(1, 1, 1, 3, 1, 4, 1))
#> [1] 1 1 1 2 1 3 1
# Example 2
dplyr::dense_rank(c(4, 1, 1, 1, 3, 5, 1))
#> [1] 3 1 1 1 2 4 1
# Example in code
dplyr::dense_rank(c(1, 1, 1, 2, 1, 3, 1))
#> [1] 1 1 1 2 1 3 1
How I can create a new ties.method with the R rank() function?
I believe there is no option to do it with rank; here is a custom function that will do what you want, but it may be too slow if your data is huge:
Rank <- function(d) {
  # Distinct values in decreasing order; a value's position in j is its rank
  j <- unique(rev(sort(d)))
  sapply(d, function(dd) which(dd == j))
}
}
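If speed on large inputs is a concern, a vectorized variant is one possible sketch. The name `Rank2` and the example values are made up for illustration. It replaces the per-element `which()` scan with a single `match()` call against the distinct values sorted from largest to smallest.

```r
# Hypothetical vectorized variant (the name Rank2 is an assumption).
Rank2 <- function(d) {
  # Distinct values, largest first; match() finds each element's position there.
  match(d, sort(unique(d), decreasing = TRUE))
}

d <- c(10, 20, 20, 5)  # made-up example values
Rank2(d)
#> [1] 2 1 1 3
```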
Is there an easy way to rank values in R using two criteria (the second one being for the ties)?
Assuming that all ties are actually broken:
order(order(a, b))
#[1] 5 7 3 2 4 1 6
There are probably more efficient alternatives.
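The vectors `a` and `b` are not shown in this answer, so the sketch below uses made-up values to illustrate why the double `order()` call yields ranks.

```r
# Made-up example vectors (not the data from the question).
a <- c(2, 1, 2, 1)  # primary criterion
b <- c(1, 2, 0, 1)  # tie-breaker

# order(a, b) gives the indices sorted by a, then by b;
# applying order() again inverts that permutation into ranks.
order(order(a, b))
#> [1] 4 2 3 1
```

If some (a, b) pairs are identical, `order()` leaves them in their original order, so equal pairs still receive different ranks. That is why the approach assumes all ties are broken.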
Rank within groups in R with special NA handling
We can group by 'B', rank on 'C', specify the i with a logical condition to select only the non-NA elements from 'C', and assign (:=) the rank values to create the 'RANK' column. By default, the rows that are not used, i.e. those where 'C' is NA, will be NA in the new column.
library(data.table)
setDT(df)[!is.na(C), RANK := rank(-C) , B]
df
# A B C RANK
# 1: A V1 1 4.0
# 2: A V2 2 3.5
# 3: A V3 3 3.0
# 4: B V1 5 1.0
# 5: B V2 2 3.5
# 6: B V3 NA NA
# 7: C V1 4 2.0
# 8: C V2 6 2.0
# 9: C V3 7 2.0
#10: D V1 3 3.0
#11: D V2 7 1.0
#12: D V3 8 1.0
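A base R sketch of the same idea is possible with `rank(na.last = "keep")`, which leaves `NA` inputs as `NA` instead of ranking them. The data frame below is rebuilt from the printed result above, so it is an assumption about the original input.

```r
# Input rebuilt from the printed result above (an assumption about the original data).
df <- data.frame(
  A = rep(c("A", "B", "C", "D"), each = 3),
  B = rep(c("V1", "V2", "V3"), times = 4),
  C = c(1, 2, 3, 5, 2, NA, 4, 6, 7, 3, 7, 8)
)

# Rank -C within each level of B; na.last = "keep" leaves NA inputs as NA.
df$RANK <- ave(-df$C, df$B, FUN = function(v) rank(v, na.last = "keep"))
df$RANK
```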
Rank by two columns and keep ties
We can use frank from data.table with dense as ties.method, after grouping by 'ID', on the absolute difference (abs) between the 'Date' and the reference date ('2015-01-31'):
library(data.table)
setDT(df)[, Sequence := frank(abs(as.IDate(Date, "%d/%m/%Y")-
as.IDate("2015-01-31")), ties.method = "dense"), by = ID]
df
# ID Date Sequence
#1: A 01/01/2015 3
#2: A 02/01/2015 2
#3: A 02/01/2015 2
#4: A 02/01/2015 2
#5: A 05/01/2015 1
#6: B 01/01/2015 1
data
df <- structure(list(ID = c("A", "A", "A", "A", "A", "B"), Date = c("01/01/2015",
"02/01/2015", "02/01/2015", "02/01/2015", "05/01/2015", "01/01/2015"
)), .Names = c("ID", "Date"), class = "data.frame", row.names = c(NA,
-6L))
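For comparison, here is a base R sketch that should give the same 'Sequence' values without data.table. It is an illustration under the same input data, not the answerer's method: the dense rank within each 'ID' is taken as the position of each distance among that group's sorted distinct distances.

```r
df <- data.frame(
  ID = c("A", "A", "A", "A", "A", "B"),
  Date = c("01/01/2015", "02/01/2015", "02/01/2015",
           "02/01/2015", "05/01/2015", "01/01/2015")
)

# Absolute distance in days from the reference date.
days <- abs(as.numeric(as.Date(df$Date, format = "%d/%m/%Y") - as.Date("2015-01-31")))

# Dense rank within each ID: position among that group's sorted distinct distances.
df$Sequence <- ave(days, df$ID, FUN = function(v) match(v, sort(unique(v))))
df$Sequence
#> [1] 3 2 2 2 1 1
```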