Rank Per Row Over Multiple Columns in R

Rank per row over multiple columns in R

You're looking for rank, applied to each row with apply. To rank in decreasing order (largest value gets rank 1), negate the data.frame first.

data.frame(d, t(apply(-d, 1, rank, ties.method='min')))
#   V1 V2 V3 V1.1 V2.1 V3.1
# 1 11 21 35    3    2    1
# 2 22 12 66    2    3    1
# 3 44 22 12    1    2    3
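
For a reproducible run, here is a minimal sketch of the same idea. The data frame d is reconstructed from the printed output above, and the rank_ column prefix is only an illustrative choice, not part of the original answer.

```r
# Sample data reconstructed from the printed output (an assumption)
d <- data.frame(V1 = c(11, 22, 44),
                V2 = c(21, 12, 22),
                V3 = c(35, 66, 12))

# apply() over rows returns one column per input row, so transpose back
r <- t(apply(-d, 1, rank, ties.method = "min"))

# Name the rank columns explicitly instead of relying on V1.1, V2.1, ...
colnames(r) <- paste0("rank_", names(d))
res <- cbind(d, r)
print(res)
```

Naming the columns up front avoids the automatic V1.1-style suffixes that data.frame() adds when names collide.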

ranking dataframe using two columns in R

You can use data.table::frank or dplyr::min_rank:

data.table::frank

library(data.table)
dt$Rank <- frank(dt, B, A, ties.method = "min")
dt
  A B Rank
1 1 1    1
2 2 1    2
3 2 1    2
4 4 4    5
5 5 3    4

dplyr::min_rank

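library(dplyr)
# paste() builds text keys, so the ranks follow string order; this matches
# numeric order only when the values sort the same way as text
mutate(dt, Rank = min_rank(paste(B,A)))
  A B Rank
1 1 1    1
2 2 1    2
3 2 1    2
4 4 4    5
5 5 3    4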

Data

dt <- data.frame(A = c(1,2,2,4,5), B = c(1,1,1,4,3))
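
Because pasted keys are compared as text, multi-digit values can rank in the wrong order. The sketch below shows that pitfall and one possible base R way to get a "min" rank over (B, A) without building text keys. The helper names (ord, new_group, first_pos, rank_min) are hypothetical and chosen for illustration.

```r
dt <- data.frame(A = c(1, 2, 2, 4, 5), B = c(1, 1, 1, 4, 3))

# Pitfall: as text, "10 1" sorts before "2 1", so 10 outranks 2
text_rank <- rank(paste(c(2, 10), 1), ties.method = "min")
text_rank

# Base R sketch of a "min" rank over (B, A)
ord <- order(dt$B, dt$A)   # row indices sorted by B, then A
sorted <- dt[ord, ]

# TRUE where a sorted row differs from the previous one (start of a tie group)
new_group <- c(TRUE, diff(sorted$B) != 0 | diff(sorted$A) != 0)

# Every row takes the sorted position of the first row in its tie group
first_pos <- which(new_group)[cumsum(new_group)]

# Write the ranks back in the original row order
rank_min <- integer(nrow(dt))
rank_min[ord] <- first_pos
rank_min
```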

tidyverse calculate ranking per row across several columns

We can use unnest_wider

library(dplyr)
library(tidyr)
library(stringr)
dat %>%
  rowwise() %>%
  mutate(my_ranks = list(rank(c_across(starts_with("x"))))) %>%
  unnest_wider(c(my_ranks)) %>%
  rename_at(vars(starts_with("...")), ~ str_replace(., fixed("..."), "rank_x"))
# A tibble: 4 x 7
#  id       x1    x2    x3 rank_x1 rank_x2 rank_x3
#  <chr> <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>
#1 a         1     4     2     1       3       2
#2 b         3     2     2     3       1.5     1.5
#3 c         5     6     5     1.5     3       1.5
#4 d         7     0     9     2       1       3

Another option is pmap/as_tibble_row

library(tibble)
library(purrr)
dat %>%
  mutate(my_ranks = pmap(select(., starts_with('x')), ~
    as_tibble_row(rank(c(...)),
      .name_repair = ~ str_c('rank', seq_along(.))))) %>%
  unnest(c(my_ranks))
# A tibble: 4 x 7
#  id       x1    x2    x3 rank1 rank2 rank3
#  <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 a         1     4     2   1     3     2
#2 b         3     2     2   3     1.5   1.5
#3 c         5     6     5   1.5   3     1.5
#4 d         7     0     9   2     1     3

This can be done more directly with rowRanks from matrixStats

library(matrixStats)
nm1 <- names(dat)[-1]
dat[paste0('rank', nm1)] <- rowRanks(as.matrix(dat[nm1]), ties.method = 'average')
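
If no extra packages are available, the same per-row ranks can be sketched in base R. Here dat is reconstructed from the printed tibble above as a plain data.frame (an assumption), and x_cols and ranks are hypothetical helper names.

```r
# dat reconstructed from the printed output (an assumption)
dat <- data.frame(id = c("a", "b", "c", "d"),
                  x1 = c(1, 3, 5, 7),
                  x2 = c(4, 2, 6, 0),
                  x3 = c(2, 2, 5, 9))

# Columns to rank: every column whose name starts with "x"
x_cols <- grep("^x", names(dat), value = TRUE)

# Rank within each row; ties get the average rank by default
ranks <- t(apply(dat[x_cols], 1, rank))

# Attach the ranks as rank_x1, rank_x2, rank_x3
dat[paste0("rank_", x_cols)] <- ranks
dat
```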

Adding multiple ranking columns in a dataframe in R

We can loop across the columns 'home' to 'work' with across(), apply rank() to each, and create the new columns by adding a prefix through .names. A select() afterwards can reorder the columns if needed

library(dplyr)
df1 <- df %>%
  mutate(across(home:work, ~ rank(-.), .names = "rank_{.col}"))

Alternatively, do this in a for loop, which gives more control over where each new column is placed through .after or .before. Note that the compound assignment operator (%<>% from magrittr) assigns the result back to df in place

library(magrittr)
library(stringr)
for(nm in names(df)[4:6]) df %<>%
  mutate(!!str_c("rank_", nm) := rank(-.data[[nm]]), .after = all_of(nm))

-output

df
  id city uf home rank_home money rank_money work rank_work
1 34   LA RJ   10         1     2          6    2         6
2 33   BA TY    7         2     3          5   65         1
3 32   NY BN    4         4     5          4    4         5
4 12   SP SD    3         5     9          2    7         4
5 14   FR DE    1         6     8          3    9         2
6 17   BL DE    5         3    10          1    8         3

NOTE: If a column has ties, the default method is "average". Pass ties.method to rank() to choose a different one.
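
To see how the choice of ties.method changes the result, here is a small sketch on a hypothetical vector with one tie. The vector x and the name ties are illustrative only.

```r
# Hypothetical input: the two 20s are tied
x <- c(10, 20, 20, 30)

# One column per method, one row per element of x
ties <- sapply(c("average", "min", "max", "first"),
               function(m) rank(x, ties.method = m))
ties
```

The tied values share rank 2.5 under "average", 2 under "min", and 3 under "max", while "first" breaks the tie by position (2, then 3).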

data

df <- structure(list(id = c(34L, 33L, 32L, 12L, 14L, 17L),
  city = c("LA", "BA", "NY", "SP", "FR", "BL"),
  uf = c("RJ", "TY", "BN", "SD", "DE", "DE"),
  home = c(10L, 7L, 4L, 3L, 1L, 5L),
  money = c(2L, 3L, 5L, 9L, 8L, 10L),
  work = c(2L, 65L, 4L, 7L, 9L, 8L)),
  class = "data.frame", row.names = c(NA, -6L))

How to rank rows by two columns at once in R?

How about:

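# order(order(...)) inverts the sort permutation, which gives each row's
# position in the (v2, v1) ordering; rank(order(...)) would just return
# order(...) again and is only correct by coincidence on this data
within(x, rank2 <- order(order(v2, v1)))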

#   v1 v2 rank1 rank2
# 1  2  1     1     2
# 2  1  1     2     1
# 3  1  3     4     4
# 4  2  2     3     3
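
When ranking by several keys, it helps to keep order() and rank apart: order() returns the row indices that would sort the data, and applying order() a second time turns those indices into ranks. A minimal sketch, where y is a hypothetical vector and x is reconstructed from the v1 and v2 columns of the printed output (an assumption):

```r
y <- c(30, 10, 20)
sort_idx <- order(y)          # 2 3 1: indices that would sort y
true_rank <- order(order(y))  # 3 1 2: the rank of each element
same_idx <- rank(order(y))    # 2 3 1: order(y) again, not the ranks

# Two-key ranking: sort by v2 first, break ties with v1
x <- data.frame(v1 = c(2, 1, 1, 2), v2 = c(1, 1, 3, 2))
x$rank2 <- order(order(x$v2, x$v1))
x
```

Rows that tie on every key are ranked in order of appearance, which corresponds to ties.method = 'first'.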

Ranking numerical values by row in data frame in R base

An option would be to use apply to loop over the rows (MARGIN = 1) and use rank

t(apply(df0, 1, rank))

Or use rowRanks from matrixStats after converting to matrix

library(matrixStats)
rowRanks(as.matrix(df0))

Apply the rank function to multiple columns at once

We can use lapply or sapply

df2 <- df1
df2[] <- lapply(df1, rank)

Or we can use dplyr

library(dplyr)
df1 %>%
  mutate(across(everything(), rank))

data

df1 <- structure(list(a = c(24L, 27L, 29L, 34L, 76L),
  b = c(35L, 12L, 76L, 54L, NA),
  c = c(76L, 43L, 56L, 52L, NA)),
  class = "data.frame", row.names = c(NA, -5L))
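
Columns b and c in this data contain NA, so it is worth knowing how rank treats missing values. A short sketch using the values of column b; the names default_rank and keep_rank are hypothetical.

```r
b <- c(35, 12, 76, 54, NA)

# Default na.last = TRUE: the NA is ranked after all other values
default_rank <- rank(b)
default_rank

# na.last = "keep": the NA stays NA in the result
keep_rank <- rank(b, na.last = "keep")
keep_rank
```

Either call can be passed through lapply or across in the same way, for example lapply(df1, rank, na.last = "keep").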

