Arrange a grouped_df by group variable not working

Try switching the order of your group_by statement:

df %>%
  group_by(year, client) %>%
  summarise(tot = sum(rev)) %>%
  arrange(year, desc(tot))

I think arrange() is ordering within groups. After summarise(), the last grouping variable is dropped, so in your first example, with group_by(client, year), the result is still grouped by client and arrange() sorts the rows within each client group. Switching the order to group_by(year, client) seems to fix it because client is then the grouping variable that gets dropped after summarise().

Alternatively, you can remove the grouping with ungroup() before arranging:

df %>%
  group_by(client, year) %>%
  summarise(tot = sum(rev)) %>%
  ungroup() %>%
  arrange(year, desc(tot))

Edit, @lucacerone: since dplyr 0.5, arrange() no longer sorts within groups, so the explanation above applies only to older versions:

"Breaking changes: arrange() once again ignores grouping, reverting back to the behaviour of dplyr 0.3 and earlier. This makes arrange() inconsistent with other dplyr verbs, but I think this behaviour is generally more useful. Regardless, it's not going to change again, as more changes will just cause more confusion."
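
To make the version difference concrete, here is a minimal sketch. It assumes dplyr 1.0 or later (for the .groups argument of summarise()) and uses a made-up df whose client, year and rev columns only mirror the names in the answer above. It compares a plain arrange() on a grouped result with arrange(..., .by_group = TRUE), which sorts by the grouping variable first.

```r
library(dplyr)

# Hypothetical revenue data; values are invented for illustration.
df <- data.frame(
  client = c("a", "a", "b", "b"),
  year   = c(2014, 2015, 2014, 2015),
  rev    = c(10, 40, 30, 20),
  stringsAsFactors = FALSE
)

# Keep the result grouped by client, as summarise() does by default
# when the data is grouped by (client, year).
by_client <- df %>%
  group_by(client, year) %>%
  summarise(tot = sum(rev), .groups = "drop_last")

# Plain arrange() ignores the grouping and sorts the whole table.
sorted_all <- by_client %>% arrange(year, desc(tot))

# .by_group = TRUE sorts by the grouping variable (client) first.
sorted_within <- by_client %>% arrange(year, desc(tot), .by_group = TRUE)
```

If sorting inside each group is what you actually want on a current dplyr, .by_group = TRUE is the explicit way to ask for it; otherwise the grouping has no effect on arrange().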

arrange() not working on grouped data frame

It's not working because you need to ungroup() the data before arranging by cyl. The code you are using attempts to order the cyl column while it's still grouped by cyl. Since those values are all the same (within each group), nothing changes.

To arrange the entire data by cyl after ranking, we need to remove the grouping first, and then we can run arrange() again.

library(dplyr)

group_by(mtcars, cyl) %>%              ## group by cylinder
  mutate(rank = row_number(mpg)) %>%   ## rank by mpg (1 = lowest mpg)
  filter(rank <= 3) %>%                ## three lowest-mpg rows for each cyl
  arrange(rank) %>%                    ## arrange each group by rank
  ungroup() %>%                        ## remove grouping
  arrange(desc(cyl))                   ## arrange all by cylinder (descending)

#    mpg cyl  disp  hp drat    wt  qsec vs am gear carb rank
# 1 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4    1
# 2 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4    2
# 3 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4    3
# 4 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4    1
# 5 18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1    2
# 6 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4    3
# 7 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2    1
# 8 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1    2
# 9 22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1    3

As a side note, I would recommend using the %>% operator to chain these calls together, as it considerably cuts down on intermediate assignments made with <-.

R, dplyr - combination of group_by() and arrange() does not produce expected result?

I think you want

ToothGrowth %>%
  arrange(supp, len)

The pipe just replaces nested calls, so you are first grouping and then ordering that grouped result, and the second step breaks the original ordering.
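
As a small illustration of "the pipe replaces nested calls", the sketch below compares the piped form with the equivalent direct call on the built-in ToothGrowth data set. The variable names piped and nested are only for this example.

```r
library(dplyr)

# x %>% f(y) is another way of writing f(x, y).
piped  <- ToothGrowth %>% arrange(supp, len)
nested <- arrange(ToothGrowth, supp, len)

# Both forms give the same data frame, sorted by supp and then by len.
same_result <- identical(piped, nested)
```

Because the two forms are interchangeable, a single arrange(supp, len) is enough here and no group_by() step is needed.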

r - arrange values in a dataframe, within a group, based on a variable, in ascending or descending order

I think this accomplishes what you're looking for. You can add other cases to the case_when statement as well if there are more scenarios for var3 that you need to tackle. Also, for default sorting, you can add something like TRUE ~ var2 for the last case to handle an unknown value in var3.

test %>%
  group_by(var1) %>%
  arrange(case_when(var3 == "i" ~ var2,
                    var3 == "d" ~ -var2),
          .by_group = TRUE)

# A tibble: 10 x 3
# Groups:   var1 [2]
    var1  var2 var3
   <dbl> <dbl> <fct>
 1     1     1 i
 2     1     2 i
 3     1     3 i
 4     1     5 i
 5     1     9 i
 6     2     9 d
 7     2     8 d
 8     2     7 d
 9     2     5 d
10     2     3 d
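
The default case mentioned above could look roughly like this. It is a sketch on invented data: the values in test and the "x" marker are assumptions, chosen only to show a var3 value that matches neither "i" nor "d" and therefore falls through to TRUE ~ var2 (ascending).

```r
library(dplyr)

# Hypothetical data: var3 marks the sort direction for each var1 group;
# "x" stands for an unknown marker handled by the default case.
test <- data.frame(
  var1 = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  var2 = c(3, 1, 2, 5, 9, 7, 6, 4, 8),
  var3 = c("i", "i", "i", "d", "d", "d", "x", "x", "x"),
  stringsAsFactors = FALSE
)

sorted <- test %>%
  group_by(var1) %>%
  arrange(case_when(var3 == "i" ~ var2,    # ascending
                    var3 == "d" ~ -var2,   # descending
                    TRUE        ~ var2),   # default: ascending
          .by_group = TRUE)
```

Negating var2 only works for numeric columns, so this trick assumes the sort key is numeric.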

How to dplyr::arrange groups within a df by the mean of group measure?

I would try creating a variable that holds the mean of vizCredPrcnt for each CredentialQ group, and then pass it to the arrange() call like so:

credData <- ReShapeAdCredSubset %>%
  group_by(CredentialQ, year) %>%
  summarise(vizCredPrcnt = sum(credential_wIndiv, na.rm = TRUE) /
                           sum(credential_wAll, na.rm = TRUE)) %>%
  ungroup() %>%
  group_by(CredentialQ) %>%
  # mutate() rather than summarise(), so the year column survives for arrange()
  mutate(meanVizCredPrcnt = mean(vizCredPrcnt, na.rm = TRUE)) %>%
  ungroup() %>%
  # sort by the group mean first, otherwise it never affects the order
  arrange(desc(meanVizCredPrcnt), CredentialQ, year)

Arrange groups in dplyr with aggregation function

Does this work:

df %>% group_by(g) %>% mutate(m = mean(x)) %>% arrange(m) %>% select(-m)
# A tibble: 7 x 2
# Groups:   g [3]
      g     x
  <dbl> <dbl>
1     2    -2
2     2    -3
3     2    -3
4     0     0
5     0     1
6     1     1
7     1     2
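
For a version you can run end to end, here is a sketch with an invented df. The g and x values are assumptions, picked so that the group means differ, and an ungroup() is added before arrange() so the result is a plain, ungrouped tibble.

```r
library(dplyr)

# Hypothetical input in scrambled order.
df <- data.frame(
  g = c(0, 1, 2, 2, 0, 1, 2),
  x = c(0, 1, -2, -3, 1, 2, -3)
)

sorted <- df %>%
  group_by(g) %>%
  mutate(m = mean(x)) %>%   # attach each group's mean to its rows
  ungroup() %>%
  arrange(m) %>%            # order whole groups by their mean
  select(-m)                # drop the helper column
```

The helper column m is constant within a group, so sorting on it keeps each group's rows together.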

Dplyr Arrange Giving Error when Sorting by more than 1 Column

You can use this code:

library(dplyr)

df <- data.frame(year = c(2022, 2017, 2021, 2022, 2020, 2019, 2017, 2022, 2020),
                 amount = c(5, 2, 2, 3, 4, 3, 1, 2, 1))

df %>%
  arrange(desc(year))

Output:

year amount
2022 5
2022 3
2022 2
2021 2
2020 4
2020 1
2019 3
2017 2
2017 1

You should only pass one variable to desc(). To sort by several columns in descending order, wrap each column in its own desc() call.
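
A short sketch of sorting by two columns in descending order, reusing the df from this answer. The name sorted is only for this example.

```r
library(dplyr)

df <- data.frame(year = c(2022, 2017, 2021, 2022, 2020, 2019, 2017, 2022, 2020),
                 amount = c(5, 2, 2, 3, 4, 3, 1, 2, 1))

# One desc() per column: year descending, ties broken by amount descending.
sorted <- df %>%
  arrange(desc(year), desc(amount))
```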

Deploying arrange(desc(.)) on each variable passed previously via enquos

You can use arrange_at like this:

quick_smry <- function(df, x, ...) {
  group_by_vars <- enquos(...)
  check_var <- enquo(x)
  df %>%
    group_by(!!!group_by_vars) %>%
    summarise(num_missing = sum(is.na(!!check_var))) %>%
    arrange_at(group_by_vars, desc)
}

quick_smry(test_data, tst_var, t_year, t_mnth)
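
If you would rather not rely on the scoped arrange_at() variant, one possible alternative is to wrap each captured variable in desc() yourself and splice the result into arrange(). This is only a sketch: quick_smry_desc, the test_data values and the use of .groups = "drop" (dplyr 1.0 or later) are assumptions for illustration, not part of the original answer.

```r
library(dplyr)

# Hypothetical variant of quick_smry() that builds desc(<var>) calls by hand.
quick_smry_desc <- function(df, x, ...) {
  group_by_vars <- enquos(...)
  check_var <- enquo(x)
  # Turn each captured grouping variable into the call desc(<variable>).
  desc_vars <- lapply(group_by_vars, function(q) rlang::expr(desc(!!q)))
  df %>%
    group_by(!!!group_by_vars) %>%
    summarise(num_missing = sum(is.na(!!check_var)), .groups = "drop") %>%
    arrange(!!!desc_vars)
}

# Made-up data using the column names from the call above.
test_data <- data.frame(
  t_year  = c(2019, 2019, 2020, 2020, 2020),
  t_mnth  = c(1, 2, 1, 2, 2),
  tst_var = c(NA, 1, NA, NA, 3)
)

smry <- quick_smry_desc(test_data, tst_var, t_year, t_mnth)
```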

