Rank variable by group (dplyr)
The following ranks Sepal.Length within each Species and then keeps the three lowest-ranked rows per group, using either filter or slice:
library(dplyr)
by_species <- iris %>%
  arrange(Species, Sepal.Length) %>%
  group_by(Species) %>%
  mutate(rank = rank(Sepal.Length, ties.method = "first"))
by_species %>% filter(rank <= 3)
##Source: local data frame [9 x 6]
##Groups: Species [3]
##
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species rank
## (dbl) (dbl) (dbl) (dbl) (fctr) (int)
##1 4.3 3.0 1.1 0.1 setosa 1
##2 4.4 2.9 1.4 0.2 setosa 2
##3 4.4 3.0 1.3 0.2 setosa 3
##4 4.9 2.4 3.3 1.0 versicolor 1
##5 5.0 2.0 3.5 1.0 versicolor 2
##6 5.0 2.3 3.3 1.0 versicolor 3
##7 4.9 2.5 4.5 1.7 virginica 1
##8 5.6 2.8 4.9 2.0 virginica 2
##9 5.7 2.5 5.0 2.0 virginica 3
by_species %>% slice(1:3)
##Source: local data frame [9 x 6]
##Groups: Species [3]
##
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species rank
## (dbl) (dbl) (dbl) (dbl) (fctr) (int)
##1 4.3 3.0 1.1 0.1 setosa 1
##2 4.4 2.9 1.4 0.2 setosa 2
##3 4.4 3.0 1.3 0.2 setosa 3
##4 4.9 2.4 3.3 1.0 versicolor 1
##5 5.0 2.0 3.5 1.0 versicolor 2
##6 5.0 2.3 3.3 1.0 versicolor 3
##7 4.9 2.5 4.5 1.7 virginica 1
##8 5.6 2.8 4.9 2.0 virginica 2
##9 5.7 2.5 5.0 2.0 virginica 3
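The choice of ties.method = "first" matters for the filter step: it gives tied values distinct ranks, so rank <= 3 returns exactly three rows per group. A minimal base R sketch of the difference, using a small made-up vector (x is hypothetical, not taken from iris):

```r
# Hypothetical vector with one tie (the two 4.4 values)
x <- c(4.4, 4.3, 4.4, 5.0)

# "first" breaks ties by position, so every rank is unique
rank_first <- rank(x, ties.method = "first")  # 2 1 3 4

# "min" gives tied values the same (lowest) rank
rank_min <- rank(x, ties.method = "min")      # 2 1 2 4

# Keeping "rank <= 2" selects 2 rows with "first" but 3 rows with "min"
n_first <- sum(rank_first <= 2)
n_min <- sum(rank_min <= 2)
```

If tied rows should all be kept, "min" (or dplyr's min_rank) is probably the better fit; if a fixed number of rows per group is needed, "first" or slice is safer.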
Rank subgroup by group (dplyr)
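We could use match after grouping. Because unique() keeps values in order of first appearance, group_rank reflects the order in which each var2 value first occurs within its var1 group.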
library(dplyr)
my_df %>%
  group_by(var1) %>%
  mutate(group_rank = match(var2, unique(var2))) %>%
  ungroup()
-output
# A tibble: 20 x 3
var1 var2 group_rank
<chr> <chr> <int>
1 A long_string_x 1
2 A long_string_x 1
3 A long_string_x 1
4 A long_string_x 1
5 A long_string_y 2
6 A long_string_y 2
7 A long_string_y 2
8 A long_string_y 2
9 B long_string_x 1
10 B long_string_x 1
11 B long_string_x 1
12 B long_string_x 1
13 B long_string_y 2
14 B long_string_y 2
15 B long_string_y 2
16 B long_string_y 2
17 B long_string_z 3
18 B long_string_z 3
19 B long_string_z 3
20 B long_string_z 3
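To see what match(var2, unique(var2)) does on its own, here is a small base R sketch. The vector below is hypothetical and deliberately unsorted, to show that the index follows first appearance rather than alphabetical order:

```r
# Hypothetical, unsorted values
var2 <- c("long_string_y", "long_string_x", "long_string_y", "long_string_z")

# Index by order of first appearance: y is seen first, then x, then z
by_appearance <- match(var2, unique(var2))      # 1 2 1 3

# Index by sorted order instead: x < y < z
by_sorted <- match(var2, sort(unique(var2)))    # 2 1 2 3
```

In the output above the two variants would agree, since var2 already appears in sorted order within each group. If the data were not sorted, sort(unique(var2)) is one way to get an alphabetical rank.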
Apply a rank across groups
You could take each group's Value from its most recent YEAR, ungroup, and then rank those values across groups:
library(dplyr)
data %>%
  group_by(Grp) %>%
  mutate(Rank = Value[which.max(YEAR)]) %>%
  ungroup() %>%
  mutate(Rank = dense_rank(-Rank))
# YEAR Grp Value Rank
# 1 2020 A 25 3
# 2 2019 A 24 3
# 3 2020 B 35 2
# 4 2019 B 34 2
# 5 2020 C 45 1
# 6 2019 C 44 1
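For a self-contained run, the sketch below rebuilds the input from the output shown above (the data frame is reconstructed from that output, so treat it as an assumption about the original data) and keeps the intermediate column under a separate, hypothetical name Latest. It assumes dplyr is installed.

```r
library(dplyr)

# Input reconstructed from the printed result above
data <- data.frame(
  YEAR  = c(2020, 2019, 2020, 2019, 2020, 2019),
  Grp   = c("A", "A", "B", "B", "C", "C"),
  Value = c(25, 24, 35, 34, 45, 44)
)

# Step 1: copy each group's most recent Value to every row of the group
latest <- data %>%
  group_by(Grp) %>%
  mutate(Latest = Value[which.max(YEAR)]) %>%
  ungroup()

# Step 2: rank the groups by that value, largest first
ranked <- latest %>%
  mutate(Rank = dense_rank(-Latest))
```

Here ranked$Rank comes out as 3 3 2 2 1 1, matching the result above. The ungroup() call is the important part: without it, dense_rank would run inside each group and every row would get rank 1.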
Add a grouping variable based on ranked data
We can use cumsum to create the index:
library(dplyr)
df %>%
  mutate(event = c("Hurdles", "Long Jump")[cumsum(rank == 1)])
# name rank event
#1 Sally 1 Hurdles
#2 Dave 2 Hurdles
#3 Aaron 1 Long Jump
#4 Jane 2 Long Jump
#5 Michael 3 Long Jump
Or in base R (just in case):
df$event <- c("Hurdles", "Long Jump")[cumsum(df$rank == 1)]
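The trick is that rank == 1 marks the start of each event, and cumsum turns those marks into a running block number. A small base R sketch of the intermediate steps, using the rank column from the output above (the variable names here are only for illustration):

```r
# Rank column as shown in the output above
rank_col <- c(1, 2, 1, 2, 3)

# TRUE wherever a new event starts
starts <- rank_col == 1          # TRUE FALSE TRUE FALSE FALSE

# Running count of starts = block index for every row
block <- cumsum(starts)          # 1 1 2 2 2

# Use the block index to look up the event name
events <- c("Hurdles", "Long Jump")[block]
```

This assumes every event begins with a rank of 1 and that the name vector has one entry per event; a third block with only two names would produce NA.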
Create a ranking variable with dplyr?
It sounds like you're looking for dense_rank from "dplyr", but applied in the reverse of the order that rank normally uses (highest score first). Try this:
df %>% mutate(rank = dense_rank(desc(score)))
# name score rank
# 1 A 10 1
# 2 B 10 1
# 3 C 9 2
# 4 D 8 3
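For comparison, here is a short sketch of how dplyr's three main ranking helpers treat the same scores. It uses the score values from the output above and assumes dplyr is installed:

```r
library(dplyr)

# Scores from the example above
score <- c(10, 10, 9, 8)

# Ties share a rank and no ranks are skipped
dense <- dense_rank(desc(score))    # 1 1 2 3

# Ties share a rank, and the following rank is skipped
minr <- min_rank(desc(score))       # 1 1 3 4

# Ties are broken by position, so ranks are unique
rown <- row_number(desc(score))     # 1 2 3 4
```

Which one is right depends on whether gaps after ties are wanted; dense_rank matches the output shown above.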
R data frame rank by groups (group by rank) with package dplyr
I had a similar issue. My solution was to sort by group and by the ranked variable(s) first, and then use row_number() after group_by().
# Sample dataset
# No seed is set, so the values will differ on each run
df <- data.frame(group = rep(c("GROUP 1", "GROUP 2"), 10),
                 value = as.integer(rnorm(20, mean = 1000, sd = 500)))
library(dplyr)
print.data.frame(df[1:10, ])
group value
1 GROUP 1 1273
2 GROUP 2 1261
3 GROUP 1 1189
4 GROUP 2 1390
5 GROUP 1 1942
6 GROUP 2 1111
7 GROUP 1 530
8 GROUP 2 893
9 GROUP 1 997
10 GROUP 2 237
sorted <- df %>%
  arrange(group, -value) %>%
  group_by(group) %>%
  mutate(rank = row_number())
print.data.frame(sorted)
group value rank
1 GROUP 1 1942 1
2 GROUP 1 1368 2
3 GROUP 1 1273 3
4 GROUP 1 1249 4
5 GROUP 1 1189 5
6 GROUP 1 997 6
7 GROUP 1 562 7
8 GROUP 1 535 8
9 GROUP 1 530 9
10 GROUP 1 1 10
11 GROUP 2 1472 1
12 GROUP 2 1390 2
13 GROUP 2 1281 3
14 GROUP 2 1261 4
15 GROUP 2 1111 5
16 GROUP 2 893 6
17 GROUP 2 774 7
18 GROUP 2 669 8
19 GROUP 2 631 9
20 GROUP 2 237 10
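If the rows do not need to be reordered, a possible alternative is to pass the variable to row_number() directly, which ranks within each group without a prior arrange(). A minimal sketch, using four values picked from the sample above as a hypothetical mini dataset and assuming dplyr is installed:

```r
library(dplyr)

# Hypothetical subset of the sample data, left unsorted
df_small <- data.frame(
  group = c("GROUP 1", "GROUP 2", "GROUP 1", "GROUP 2"),
  value = c(1273L, 1261L, 1942L, 1390L)
)

# Rank within each group, largest value first, keeping the original row order
ranked <- df_small %>%
  group_by(group) %>%
  mutate(rank = row_number(desc(value))) %>%
  ungroup()
```

Here ranked$rank is 2 2 1 1: within each group the larger value gets rank 1, and the rows stay where they were.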
Ranking with dplyr between groups
After ungrouping, use dense_rank:
d %>%
  group_by(group2) %>%
  mutate(total_value = sum(value)) %>%
  arrange(-total_value) %>%
  ungroup() %>%
  mutate(rank = dense_rank(-total_value))
# A tibble: 4 x 5
# group1 group2 value total_value rank
# <fct> <fct> <dbl> <dbl> <int>
#1 B f 2 6 1
#2 B f 4 6 1
#3 A e 1 4 2
#4 A e 3 4 2
How to rank within groups in R?
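You can do this pretty cleanly with dplyr. Calling order() twice turns a sort order into ranks, and the second argument (order_dates) breaks ties in order_values: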
library(dplyr)
df %>%
  group_by(customer_name) %>%
  mutate(my_ranks = order(order(order_values, order_dates, decreasing = TRUE)))
Source: local data frame [5 x 4]
Groups: customer_name
customer_name order_dates order_values my_ranks
1 John 2010-11-01 15 3
2 Bob 2008-03-25 12 1
3 Alex 2009-11-15 5 1
4 John 2012-08-06 15 2
5 John 2015-05-07 20 1
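The nested order() call can be hard to read at first. A small base R sketch of what each layer returns, using a hypothetical vector of three order values:

```r
# Hypothetical order values
x <- c(15, 12, 20)

# Inner call: positions of the elements from largest to smallest
# (20 is at position 3, 15 at position 1, 12 at position 2)
sort_idx <- order(x, decreasing = TRUE)   # 3 1 2

# Outer call: for each original element, where it lands in that sorted order,
# which is its rank
ranks <- order(sort_idx)                  # 2 3 1
```

So 15 gets rank 2, 12 gets rank 3, and 20 gets rank 1. For a single variable with no ties this gives the same result as rank(-x); the advantage of order() is that it accepts several sort keys.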
R: Get ranking of factor levels by group
Use dplyr::dense_rank, or as.numeric(factor(Days, ordered = TRUE)) in base R:
df %>%
  group_by(Number) %>%
  mutate(Ranking = dense_rank(Days),
         Ranking2 = as.numeric(factor(Days, ordered = TRUE)))
output
# A tibble: 15 × 4
# Groups: Number [3]
Number Days Ranking Ranking2
<dbl> <dbl> <int> <dbl>
1 1 5 1 1
2 1 5 1 1
3 1 10 2 2
4 1 10 2 2
5 1 15 3 3
6 2 3 1 1
7 2 3 1 1
8 2 3 1 1
9 2 5 2 2
10 2 5 2 2
11 3 11 1 1
12 3 11 1 1
13 3 13 2 2
14 3 13 2 2
15 3 13 2 2
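The factor approach works because factor levels are sorted by default, so the integer codes are the dense ranks. A base R sketch on the Days values of the first group above; the match() variant is an extra alternative, not part of the original answer:

```r
# Days values for Number == 1 in the output above
days <- c(5, 5, 10, 10, 15)

# Integer codes of a factor follow the sorted unique values
via_factor <- as.integer(factor(days))           # 1 1 2 2 3

# Same idea written out explicitly with match()
via_match <- match(days, sort(unique(days)))     # 1 1 2 2 3
```

Both need to run per group (inside group_by, or with ave() in base R) to reproduce the per-Number ranking shown above.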