Arrange within a group with dplyr
I think the problem in your second example is that you are using desc
on all the variables at the same time, so it is only applied to the month
column. Apply desc to just the column you want in descending order:
flights %>%
  group_by(month, day) %>%
  top_n(3, dep_delay) %>%
  arrange(
    month,
    day,
    desc(dep_delay)
  )
Source: local data frame [1,108 x 19]
Groups: month, day [365]
year month day dep_time sched_dep_time dep_delay arr_time sched_arr_time arr_delay carrier flight tailnum origin
<int> <int> <int> <int> <int> <dbl> <int> <int> <dbl> <chr> <int> <chr> <chr>
1 2013 1 1 848 1835 853 1001 1950 851 MQ 3944 N942MQ JFK
2 2013 1 1 2343 1724 379 314 1938 456 EV 4321 N21197 EWR
3 2013 1 1 1815 1325 290 2120 1542 338 EV 4417 N17185 EWR
4 2013 1 2 2131 1512 379 2340 1741 359 UA 488 N593UA LGA
5 2013 1 2 1607 1030 337 2003 1355 368 AA 179 N324AA JFK
6 2013 1 2 1412 838 334 1710 1147 323 UA 468 N474UA EWR
7 2013 1 3 2056 1605 291 2239 1754 285 9E 3459 N928XJ JFK
8 2013 1 3 2008 1540 268 2339 1909 270 DL 2027 N338NW JFK
9 2013 1 3 2012 1600 252 2314 1857 257 B6 369 N558JB LGA
10 2013 1 4 2123 1635 288 2332 1856 276 EV 3805 N29917 EWR
# ... with 1,098 more rows, and 6 more variables: dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>, minute <dbl>,
# time_hour <dttm>
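As a smaller, self-contained illustration of the same pattern, here is a minimal sketch that uses the built-in mtcars data in place of flights (that substitution, and the name sorted, are assumptions made only so the example runs without extra packages). Only the column that should be descending is wrapped in desc():

```r
library(dplyr)

# mtcars stands in for flights here (illustrative assumption)
sorted <- mtcars %>%
  arrange(cyl, desc(mpg))   # cyl ascending, mpg descending within each cyl

head(sorted[, c("cyl", "mpg")])
```

Each sort key is passed as its own argument, so the direction can be chosen per column.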
R, dplyr - combination of group_by() and arrange() does not produce expected result?
I think you want
ToothGrowth %>%
arrange(supp,len)
The chaining system just replaces nested commands, so first you are grouping, then ordering that grouped result, which breaks the original ordering.
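In current dplyr versions, arrange() ignores grouping unless told otherwise. A minimal sketch of the alternative, assuming a dplyr release that supports the .by_group argument (the names by_group and plain are illustrative):

```r
library(dplyr)

# Sort by the grouping variable first, then by len inside each group
by_group <- ToothGrowth %>%
  group_by(supp) %>%
  arrange(len, .by_group = TRUE)

# The ungrouped equivalent suggested above
plain <- ToothGrowth %>%
  arrange(supp, len)
```

Both calls should give the same row order; the first one additionally keeps the result grouped by supp.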
How can I use arrange in dplyr to order groups?
Perhaps this? First, group by cyl, then fill a new column with mean(mpg), which you can then arrange by however you want, and finally remove the temporary mean(mpg) column.
mtcars %>%
  group_by(cyl) %>%
  mutate(mean_mpg = mean(mpg)) %>%
  arrange(desc(mean_mpg)) %>%
  select(-mean_mpg)
#> # A tibble: 32 x 11
#> # Groups: cyl [3]
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
#> 2 24.4 4 147. 62 3.69 3.19 20 1 0 4 2
#> 3 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2
#> 4 32.4 4 78.7 66 4.08 2.2 19.5 1 1 4 1
#> 5 30.4 4 75.7 52 4.93 1.62 18.5 1 1 4 2
#> 6 33.9 4 71.1 65 4.22 1.84 19.9 1 1 4 1
#> 7 21.5 4 120. 97 3.7 2.46 20.0 1 0 3 1
#> 8 27.3 4 79 66 4.08 1.94 18.9 1 1 4 1
#> 9 26 4 120. 91 4.43 2.14 16.7 0 1 5 2
#> 10 30.4 4 95.1 113 3.77 1.51 16.9 1 1 5 2
#> # ... with 22 more rows
Arrange groups in dplyr with aggregation function
Does this work:
df %>% group_by(g) %>% mutate(m = mean(x)) %>% arrange(m) %>% select(-m)
# A tibble: 7 x 2
# Groups: g [3]
g x
<dbl> <dbl>
1 2 -2
2 2 -3
3 2 -3
4 0 0
5 0 1
6 1 1
7 1 2
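The input df is not shown in this answer. The sketch below uses a hypothetical df chosen only to be consistent with the printed output, to show how the temporary group mean drives the order of the groups:

```r
library(dplyr)

# Hypothetical input (not from the original question)
df <- tibble(
  g = c(0, 0, 1, 1, 2, 2, 2),
  x = c(0, 1, 1, 2, -2, -3, -3)
)

res <- df %>%
  group_by(g) %>%
  mutate(m = mean(x)) %>%   # group means: g = 2 is lowest, then g = 0, then g = 1
  arrange(m) %>%            # groups ordered by their mean, ascending
  select(-m)                # drop the helper column

res
```

Rows within a group share the same m, so they stay in their original relative order.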
Arrange values within a specific group
You could do:
library(dplyr)
library(purrr)
df_nested$data <- map2(df_nested$data, df_nested$group2,
                       ~ if (.y == 2) arrange(.x, -.x$id) else .x)
So data where group2 is not equal to 2 is not sorted
df_nested$data[[1]]
# A tibble: 2 x 3
# value2 value3 id
# <dbl> <dbl> <int>
#1 13.1 -89.0 1
#2 9.76 -3.29 2
and where group2 is 2 is sorted.
df_nested$data[[4]]
# A tibble: 2 x 3
#value2 value3 id
# <dbl> <dbl> <int>
#1 15.0 -28.4 4
#2 31.0 -22.8 3
If you want to combine them, do:
map2_df(df_nested$data, df_nested$group2,~if(.y == 2) arrange(.x, -.x$id) else .x)
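The construction of df_nested is not shown here. As a rough sketch, assuming it was built with tidyr::nest() (the data, the column names group1 and value, and the use of tidyr are all assumptions), the conditional sort can be tried end to end like this:

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical data; only group2 and id mirror the names used above
df <- tibble(
  group1 = c("a", "a", "b", "b"),
  group2 = c(1, 1, 2, 2),
  id     = 1:4,
  value  = c(10, 20, 30, 40)
)

# One row per (group1, group2), with the remaining columns in a list-column
df_nested <- df %>% nest(data = c(id, value))

# Sort by descending id only inside the nested tibbles where group2 == 2
df_nested$data <- map2(
  df_nested$data, df_nested$group2,
  ~ if (.y == 2) arrange(.x, desc(id)) else .x
)

df_nested$data[[2]]
```

Using desc(id) instead of -.x$id is a stylistic choice in this sketch; it also works for non-numeric columns.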
dplyr arrange - sort groups by another column and then sort within each group
You may try
library(dplyr)
df %>%
arrange(Tot_Orders, OrderNo) %>%
group_by(Customer)
# Customer OrderNo Tot_Orders
#1 ccc 1 1
#2 bbb 1 2
#3 bbb 2 2
#4 aaa 1 3
#5 aaa 2 3
#6 aaa 3 3
df1 %>%
arrange(Tot_Orders, OrderNo) %>%
group_by(Customer)
# Customer OrderNo Tot_Orders
#1 bbb 1 1
#2 ccc 1 2
#3 ccc 2 2
#4 aaa 1 3
#5 aaa 2 3
#6 aaa 3 3
data
df <- structure(list(Customer = c("aaa", "aaa", "aaa", "bbb", "bbb",
"ccc"), OrderNo = c(2L, 1L, 3L, 2L, 1L, 1L), Tot_Orders = c(3L,
3L, 3L, 2L, 2L, 1L)), .Names = c("Customer", "OrderNo", "Tot_Orders"
), class = "data.frame", row.names = c(NA, -6L))
df1 <- structure(list(Customer = c("aaa", "aaa", "aaa", "bbb", "ccc",
"ccc"), OrderNo = c(2L, 1L, 3L, 1L, 1L, 2L), Tot_Orders = c(3L,
3L, 3L, 1L, 2L, 2L)), .Names = c("Customer", "OrderNo", "Tot_Orders"
), class = "data.frame", row.names = c(NA, -6L))
How to dplyr::arrange groups within a df by the mean of group measure?
I would try creating a variable that holds the mean of the vizCredPrcnt variable by CredentialQ group, and then pass that to the arrange call like so:
credData <- ReShapeAdCredSubset %>%
  group_by(CredentialQ, year) %>%
  summarise(vizCredPrcnt = sum(credential_wIndiv, na.rm = TRUE) / sum(credential_wAll, na.rm = TRUE)) %>%
  ungroup() %>%
  group_by(CredentialQ) %>%
  # mutate() rather than summarize(), so the year rows survive for arrange()
  mutate(meanVizCredPrcnt = mean(vizCredPrcnt, na.rm = TRUE)) %>%
  ungroup() %>%
  arrange(desc(meanVizCredPrcnt), CredentialQ, year)
dplyr - How to obtain the order of one column within a group?
library(dplyr)
tibbly = tibble(age = c(10,30,50,10,30,50,10,30,50,10,30,50),
grouping1 = c("A","A","A","A","A","A","B","B","B","B","B","B"),
grouping2 = c("X", "X", "X","Y","Y","Y","X","X","X","Y","Y","Y"),
value = c(1,2,3,4,4,6,2,5,3,6,3,2))
tibbly %>%
group_by(grouping1, grouping2) %>% # for each group
arrange(desc(value)) %>% # arrange value descending
summarise(order = paste0(age, collapse = ",")) %>% # get the order of age as a strings
ungroup() # forget the grouping
# # A tibble: 4 x 3
# grouping1 grouping2 order
# <chr> <chr> <chr>
# 1 A X 50,30,10
# 2 A Y 50,10,30
# 3 B X 30,50,10
# 4 B Y 10,30,50