rank and order in R
set.seed(1)
x <- sample(1:50, 30)
x
# [1] 14 19 28 43 10 41 42 29 27 3 9 7 44 15 48 18 25 33 13 34 47 39 49 4 30 46 1 40 20 8
rank(x)
# [1] 9 12 16 25 7 23 24 17 15 2 6 4 26 10 29 11 14 19 8 20 28 21 30 3 18 27 1 22 13 5
order(x)
# [1] 27 10 24 12 30 11 5 19 1 14 16 2 29 17 9 3 8 25 18 20 22 28 6 7 4 13 26 21 15 23
rank returns a vector with the "rank" of each value: the number in the first position is the 9th lowest. order returns the indices that would put the initial vector x in order.
The 27th value of x is the lowest, so 27 is the first element of order(x), and if you look at rank(x), the 27th element is 1.
x[order(x)]
# [1] 1 3 4 7 8 9 10 13 14 15 18 19 20 25 27 28 29 30 33 34 39 40 41 42 43 44 46 47 48 49
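The relationship described above can be checked directly. The following is a minimal sketch that types the printed vector in by hand, since the output of sample() under a given seed can differ between R versions. It assumes there are no ties, which holds for this vector.

```r
# The vector printed in the example above, entered literally
x <- c(14, 19, 28, 43, 10, 41, 42, 29, 27, 3, 9, 7, 44, 15, 48,
       18, 25, 33, 13, 34, 47, 39, 49, 4, 30, 46, 1, 40, 20, 8)

o <- order(x)  # positions of x, from the smallest value to the largest
r <- rank(x)   # rank of each value, reported at its original position

# Indexing by order(x) gives the same result as sort(x)
identical(x[o], sort(x))

# Without ties, order and rank are inverse permutations of each other
all(order(o) == r)
all(r[o] == seq_along(x))
```

All three checks should return TRUE. With ties the last two can fail, because rank() averages tied ranks by default while order() always returns a permutation.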
Sort a data frame in R largest to smallest from value in a column
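Try the following. In this data frame Letter is the numeric column and Number holds the letters, so negating Letter inside order() sorts the rows from largest to smallest:
library(dplyr)
df[with(df, order(-Letter)), ] %>% select(Number)
The output is as follows.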
6 F
5 E
4 D
3 C
2 B
1 A
The data were created as follows:
df <- data.frame(
  v1 = c('A','B','C','D','E','F'),
  v2 = c(1,2,3,4,5,6),
  v3 = c(11,12,12.5,11.5,11.75,13)
)
colnames(df) <- c("Number","Letter","Age")
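The unary minus only works on numeric columns. As an alternative, here is a base R sketch of the same descending sort using the decreasing argument of order(), which does not need dplyr. The variable name sorted is only illustrative.

```r
# Same data as above, with the final column names set directly
df <- data.frame(
  Number = c("A", "B", "C", "D", "E", "F"),
  Letter = c(1, 2, 3, 4, 5, 6),
  Age    = c(11, 12, 12.5, 11.5, 11.75, 13)
)

# decreasing = TRUE sorts from largest to smallest without negating the column;
# drop = FALSE keeps the result as a one-column data frame
sorted <- df[order(df$Letter, decreasing = TRUE), "Number", drop = FALSE]
sorted
```

The row names of sorted run from 6 down to 1, matching the output shown above.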
Trouble using dplyr::order to rank values from smallest to largest including positive integers smaller than 1
We need rank instead of order. According to ?rank:
Returns the sample ranks of the values in a vector.
library(dplyr)
data %>%
  group_by(pitch_2) %>%
  mutate(rank = rank(euclid_dist))
# A tibble: 6 x 4
# Groups: pitch_2 [1]
# pitch_1 pitch_2 euclid_dist rank
# <chr> <chr> <dbl> <dbl>
#1 429721-CU 493247-SI 2.53 2
#2 114849-FC 493247-SI 3.52 6
#3 430599-FF 493247-SI 3.49 5
#4 458567-FF 493247-SI 2.59 3
#5 435261-CU 493247-SI 3.1 4
#6 425629-CU 493247-SI 2.14 1
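To see why order gives the wrong answer here, it may help to compare the two functions on the same column. The sketch below also shows one possible base R way to rank within groups, using ave(). It rebuilds only the three relevant columns of the data.

```r
data <- data.frame(
  pitch_1 = c("429721-CU", "114849-FC", "430599-FF",
              "458567-FF", "435261-CU", "425629-CU"),
  pitch_2 = rep("493247-SI", 6),
  euclid_dist = c(2.53, 3.52, 3.49, 2.59, 3.1, 2.14)
)

# order() answers "which row comes first, second, ...?"
order(data$euclid_dist)  # 6 1 4 5 3 2

# rank() answers "where does each row stand?", which is what is wanted here
rank(data$euclid_dist)   # 2 6 5 3 4 1

# Rank within each pitch_2 group without dplyr
data$rank <- ave(data$euclid_dist, data$pitch_2, FUN = rank)
data
```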
data
data <- structure(list(pitch_1 = c("429721-CU", "114849-FC", "430599-FF",
"458567-FF", "435261-CU", "425629-CU"), pitch_2 = c("493247-SI",
"493247-SI", "493247-SI", "493247-SI", "493247-SI", "493247-SI"
), euclid_dist = c(2.53, 3.52, 3.49, 2.59, 3.1, 2.14), rank = c(15L,
6L, 14L, 27L, 8L, 17L)), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))
How to rank within groups in R?
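You can do this pretty cleanly with dplyr. The inner order() sorts by order_values and breaks ties by order_dates, both decreasing; applying order() a second time turns that permutation into ranks.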
library(dplyr)
df %>%
  group_by(customer_name) %>%
  mutate(my_ranks = order(order(order_values, order_dates, decreasing=TRUE)))
# Source: local data frame [5 x 4]
# Groups: customer_name
#
#   customer_name order_dates order_values my_ranks
# 1          John  2010-11-01           15        3
# 2           Bob  2008-03-25           12        1
# 3          Alex  2009-11-15            5        1
# 4          John  2012-08-06           15        2
# 5          John  2015-05-07           20        1
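For comparison, the same order(order(...)) idea can be sketched in base R. The data frame below is reconstructed from the printed output above, so its exact construction is an assumption; ave() is applied to the row indices so that each group's function can look up more than one column.

```r
# Data reconstructed from the printed result (an assumption about the original df)
df <- data.frame(
  customer_name = c("John", "Bob", "Alex", "John", "John"),
  order_dates   = as.Date(c("2010-11-01", "2008-03-25", "2009-11-15",
                            "2012-08-06", "2015-05-07")),
  order_values  = c(15, 12, 5, 15, 20)
)

# For each customer, i holds that customer's row numbers.
# The inner order() sorts by value, then date (both decreasing);
# the outer order() converts that permutation into ranks.
df$my_ranks <- ave(
  seq_len(nrow(df)), df$customer_name,
  FUN = function(i) {
    order(order(df$order_values[i], df$order_dates[i], decreasing = TRUE))
  }
)
df
```

John's two orders of 15 tie on value, so the later date (2012-08-06) receives the better rank of the two.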
Create a ranking variable with dplyr?
It sounds like you're looking for dense_rank from "dplyr", but applied in the reverse order of what rank normally does.
Try this:
df %>% mutate(rank = dense_rank(desc(score)))
# name score rank
# 1 A 10 1
# 2 B 10 1
# 3 C 9 2
# 4 D 8 3
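To make clear what "dense" means here, the following base R sketch reproduces the same ranks without dplyr and contrasts them with rank()'s "min" method. It is an illustration of the idea, not how dense_rank is implemented; the scores are taken from the output above.

```r
score <- c(10, 10, 9, 8)

# Dense rank, descending: the position of each score among the
# distinct scores sorted from largest to smallest (no gaps after ties)
dense_desc <- match(score, sort(unique(score), decreasing = TRUE))
dense_desc  # 1 1 2 3

# For contrast, ties.method = "min" leaves a gap after the tied pair
min_desc <- rank(-score, ties.method = "min")
min_desc    # 1 1 3 4
```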
Ranking continuous data in decreasing order
Simply negate df0. rank gives 1 to the smallest value, so ranking -df0 gives 1 to the largest value in each row:
t(apply(-df0, 1, rank))
# x1 x2 x3 x4
#[1,] 4 2 3 1
#[2,] 4 3 2 1
#[3,] 1 3 2 4
#[4,] 3 4 2 1
#[5,] 1 4 2 3
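Since df0 itself is not shown, here is a small self-contained sketch with made-up values (the numbers are hypothetical, chosen only for illustration). It also shows why t() is needed: apply() over rows returns each row's result as a column.

```r
# Hypothetical stand-in for df0: two rows of continuous values
df0 <- data.frame(
  x1 = c(0.1, 0.2),
  x2 = c(0.7, 0.5),
  x3 = c(0.4, 0.6),
  x4 = c(0.9, 0.8)
)

# apply(..., 1, rank) ranks within each row but stores each row's
# ranks as a column, so the result is 4 x 2 until it is transposed
ranks <- t(apply(-df0, 1, rank))
ranks
```

In the first row the largest value is in x4, so x4 receives rank 1 and x1, the smallest, receives rank 4.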
Difference between sort(), rank(), and order()
sort() sorts the vector in ascending order.
rank() gives the rank of each number in the vector, with the smallest number receiving rank 1.
order() returns the indices that would put the vector in sorted order.
For example, if these functions are applied to the vector v = c(3, 1, 2, 5, 4):
sort(c(3, 1, 2, 5, 4)) will give c(1, 2, 3, 4, 5)
rank(c(3, 1, 2, 5, 4)) will give c(3, 1, 2, 5, 4)
order(c(3, 1, 2, 5, 4)) will give c(2, 3, 1, 5, 4)
If you take the elements of v at these indices, in this order, you get the sorted vector. Notice how v[2] = 1, v[3] = 2, v[1] = 3, v[5] = 4 and v[4] = 5.
R also has a tie-handling method. If you run rank(c(3, 1, 2, 5, 4, 2)), the value 1 gets rank 1. The two 2s occupy positions 2 and 3, so by default each is assigned the average rank 2.5. The next value, 3, gets rank 4.0, so
rank(c(3, 1, 2, 5, 4, 2))
will give you the output [4.0 1.0 2.5 6.0 5.0 2.5]
Hope this is helpful.
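The averaging described above is only the default. As a short sketch, the ties.method argument of rank() selects other ways of resolving ties on the same vector:

```r
v <- c(3, 1, 2, 5, 4, 2)

r_avg   <- rank(v)                         # default "average": 4.0 1.0 2.5 6.0 5.0 2.5
r_min   <- rank(v, ties.method = "min")    # both 2s get the lower rank:  4 1 2 6 5 2
r_max   <- rank(v, ties.method = "max")    # both 2s get the higher rank: 4 1 3 6 5 3
r_first <- rank(v, ties.method = "first")  # earlier position wins:       4 1 2 6 5 3
```

With "first", the result is always a permutation of 1 to n, which makes it the variant that agrees with order(order(v)).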
in r, how to rank dataframe some columns descending and other columns ascending based on lists elements?
Another option is to use map to handle both directions at once, by creating a column of 1s and -1s that each column is multiplied by before ranking:
library(dplyr)
library(tidyr)
library(purrr)
library(stringr)
tibble(col1 = list(Asclist1, Deslist2), col2 = c(1, -1)) %>%
  unnest_longer(col1) %>%
  group_split(col2) %>%
  map_dfc(~ DF1 %>%
    mutate(tmp = first(.x$col2)) %>%
    select(one_of(.x$col1), tmp) %>%
    transmute_at(vars(-tmp), list(rank = ~rank(tmp * .)))) %>%
  bind_cols(DF1, .)
# name sex age grade income score age_rank grade_rank income_rank score_rank
#1 john m 99 96 59 99 1.0 1 4 5.0
#2 adam m 46 46 36 46 3.0 4 3 3.0
#3 leo m 23 63 93 23 4.5 2 5 1.5
#4 lena f 54 54 34 54 2.0 3 2 4.0
#5 Di f 23 23 23 23 4.5 5 1 1.5
#Warning message:
#Unknown columns: `spending`
Any column that is named in the lists but missing from the data, such as spending here, is reported in a warning.
Update
If transmute_at is applied to a single column, the new column is not named in the column_rank pattern. To work around that, we can create a function that uses rename_if:
f1 <- function(dat) {
  nm1 <- setdiff(names(dat), "tmp")
  n1 <- length(nm1)
  dat %>%
    transmute_at(vars(-tmp), list(rank = ~rank(tmp * .))) %>%
    rename_if(rep(n1 == 1, n1), ~ str_c(nm1, "_", .))
}
tibble(col1 = list(Asclist1, Deslist2), col2 = c(1, -1)) %>%
  unnest_longer(col1) %>%
  group_split(col2) %>%
  map_dfc(~ DF1 %>%
    mutate(tmp = first(.x$col2)) %>%
    select(one_of(.x$col1), tmp) %>%
    f1(.)) %>%
  bind_cols(DF1, .)
# CAT PN SP Quantity Price amount amount_rank Quantity_rank Price_rank
# 1 sweets gum trident 23 10 23 9.5 3.5 1
# 2 sweets gum clortes 34 20 34 6.0 7.0 3
# 3 sweets biscuits loacker 23 26 23 9.5 3.5 6
# 4 sweets biscuits tuc 23 22 23 9.5 3.5 4
# 5 sweets choc aftereight 54 51 54 3.0 10.0 9
# 6 sweets choc lindt 32 52 32 7.0 6.0 10
# 7 drinks hotdrinks tea 45 45 45 4.0 9.0 8
# 8 drinks hotdrinks green tea 23 23 23 9.5 3.5 5
# 9 drinks juices orange 12 12 12 12.0 1.0 2
# 10 drinks juices mango 56 56 56 2.0 11.0 11
# 11 drinks energydrinks powerhorse 76 76 76 1.0 12.0 12
# 12 drinks energydrinks redbull 43 43 43 5.0 8.0 7
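The core trick in both versions is multiplying a column by 1 or -1 before calling rank. A stripped-down base R sketch of just that step follows. The data are reconstructed from the first printed result (without the sex column), and asc_cols and desc_cols are hypothetical stand-ins for Asclist1 and Deslist2, whose contents are not shown.

```r
# Data reconstructed from the first printed output (an assumption)
DF1 <- data.frame(
  name   = c("john", "adam", "leo", "lena", "Di"),
  age    = c(99, 46, 23, 54, 23),
  grade  = c(96, 46, 63, 54, 23),
  income = c(59, 36, 93, 34, 23),
  score  = c(99, 46, 23, 54, 23)
)

# Hypothetical column lists: which columns rank descending and which ascending
desc_cols <- c("age", "grade")
asc_cols  <- c("income", "score")

# One sign per column: -1 flips the ranking direction, 1 leaves it as is
signs <- c(setNames(rep(-1, length(desc_cols)), desc_cols),
           setNames(rep(1, length(asc_cols)), asc_cols))

# Rank each column after multiplying it by its sign
ranks <- sapply(names(signs), function(col) rank(signs[[col]] * DF1[[col]]))
colnames(ranks) <- paste0(names(signs), "_rank")

result <- cbind(DF1, ranks)
result
```

Unlike the one_of() approach, this sketch stops with an error if a listed column is missing from the data, so the lists need to be checked against names(DF1) first.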