## Applying group_by and summarise on data while keeping all the columns' info

Here are two options from dplyr: a) `filter` and b) `slice`. In this case no group has a duplicated minimum in column `c`, so a) and b) give the same result. If there *were* duplicated minima, approach a) would return every row tied for the minimum in each group, while b) would return only one row per group (the first minimum).

**a)**

    data %>% group_by(b) %>% filter(c == min(c))
    # Source: local data frame [4 x 4]
    # Groups: b
    #
    #    a b   c     d
    # 1  1 a 1.2 small
    # 2  4 b 1.7  larg
    # 3  6 c 3.1   med
    # 4 10 d 2.2   med

Or, similarly:

    data %>% group_by(b) %>% filter(min_rank(c) == 1L)
    # Source: local data frame [4 x 4]
    # Groups: b
    #
    #    a b   c     d
    # 1  1 a 1.2 small
    # 2  4 b 1.7  larg
    # 3  6 c 3.1   med
    # 4 10 d 2.2   med

**b)**

    data %>% group_by(b) %>% slice(which.min(c))
    # Source: local data frame [4 x 4]
    # Groups: b
    #
    #    a b   c     d
    # 1  1 a 1.2 small
    # 2  4 b 1.7  larg
    # 3  6 c 3.1   med
    # 4 10 d 2.2   med
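The difference between a) and b) only shows up when a group has tied minima. Here is a minimal sketch on a made-up data frame (`toy` and its values are hypothetical, not the asker's data) in which group `"a"` has two rows tied for the minimum of `c`:

```r
library(dplyr)

# Hypothetical data: group "a" has two rows tied for the minimum of `c`.
toy <- data.frame(
  a = 1:5,
  b = c("a", "a", "a", "b", "b"),
  c = c(1.2, 1.2, 3.0, 1.7, 2.5)
)

# filter() keeps every row that equals the group minimum (ties included).
by_filter <- toy %>% group_by(b) %>% filter(c == min(c)) %>% ungroup()

# slice(which.min()) keeps only the first minimal row in each group.
by_slice <- toy %>% group_by(b) %>% slice(which.min(c)) %>% ungroup()

nrow(by_filter)  # 3 rows: two tied rows for "a", one for "b"
nrow(by_slice)   # 2 rows: one per group
```

Which behaviour is preferable depends on whether tied rows carry information you want to keep.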

## How can I keep columns when grouping/summarizing?

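You can do this in base R with `aggregate`. The formula `B ~ .` summarises `B` within every combination of the remaining columns, so those columns are kept in the result:

    aggregate(B ~ ., data = df1, FUN = mean)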
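As a rough illustration of what the `B ~ .` formula does, here is a sketch on a made-up `df1` (the column names `A` and `C` and all values are assumptions, since the original data is not shown):

```r
# Hypothetical data: `B` is the value to average; `A` and `C` act as grouping columns.
df1 <- data.frame(
  A = c("x", "x", "y", "y"),
  C = c("u", "u", "v", "w"),
  B = c(1, 3, 5, 7)
)

# `B ~ .` means "B by every other column", so A and C both survive in the result.
res <- aggregate(B ~ ., data = df1, FUN = mean)
res  # one row per (A, C) combination: (x, u) -> 2, (y, v) -> 5, (y, w) -> 7
```

Note that the formula interface of `aggregate` drops rows containing `NA` by default (`na.action = na.omit`), which may or may not be what you want.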

## Grouping and summarizing by keeping other columns in R

Try:

    summarize(MM_group,
              rank        = which.max(Yield),
              Year_rank   = Year[rank],
              County_rank = County[rank])
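To make the call above concrete, here is a self-contained sketch. The data frame `MM`, the grouping column `State`, and all values are hypothetical stand-ins, since the original data and grouping are not shown:

```r
library(dplyr)

# Hypothetical data standing in for the asker's table.
MM <- data.frame(
  State  = c("IA", "IA", "IA", "NE", "NE"),
  County = c("Adair", "Boone", "Cass", "Adams", "Boyd"),
  Year   = c(2001, 2002, 2003, 2001, 2002),
  Yield  = c(150, 170, 160, 140, 120)
)

# Assumed grouping; the original answer only shows the grouped object.
MM_group <- group_by(MM, State)

# `rank` is the within-group position of the largest Yield; later arguments
# can reuse it to pull the matching Year and County.
best <- summarize(MM_group,
                  rank        = which.max(Yield),
                  Year_rank   = Year[rank],
                  County_rank = County[rank])
best
```

This works because `summarize` evaluates its arguments in order, so `Year[rank]` can refer to the `rank` value computed just before it.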

## Applying group_by and summarise(sum) but keep columns with non-relevant conflicting data?

Here is a `data.table` solution. I'm assuming you want the `mean()` of `Proportion`, since these grouped proportions are likely not additive.

    library(data.table)

    setDT(df)  # convert df to a data.table by reference
    df[, .(Type = paste(Type, collapse = "_"),
           Proportion = mean(Proportion),
           N = sum(N),
           C = sum(C)),
       by = .(Label, Code)][order(Label)]

    #    Label Code                        Type Proportion  N  C
    # 1:  203c    c                   wholefish   1.000000  1  1
    # 2:  203c    a                       flesh   1.000000  2  2
    # 3:  204a    a               flesh_formula   0.499995  8  8
    # 4:  204a    b     fleshdelip_formuladelip   0.499995 10 10
    # 5:  204a    c           formula_wholefish   0.499995 16 16
    # 6:  204a    d formuladelip_wholefishdelip   0.499995 18 18

I'm not sure this is the cleanest `dplyr` solution, but it works:

    df %>%
      group_by(Label, Code) %>%
      mutate(Type = paste(Type, collapse = "_")) %>%
      group_by(Label, Type, Code) %>%
      summarise(N = sum(N), C = sum(C), Proportion = mean(Proportion))

Note that the key here is to regroup once you have created the combined `Type` column, so that `Type` is carried through `summarise` as a grouping variable.

       Label                        Type   Code     N     C Proportion
      <fctr>                       <chr> <fctr> <int> <int>      <dbl>
    1   203c                       flesh      a     2     2   1.000000
    2   203c                   wholefish      c     1     1   1.000000
    3   204a               flesh_formula      a     8     8   0.499995
    4   204a     fleshdelip_formuladelip      b    10    10   0.499995
    5   204a           formula_wholefish      c    16    16   0.499995
    6   204a formuladelip_wholefishdelip      d    18    18   0.499995

## Applying group_by and summarise(sum) but keep a large number of additional columns

We can overwrite `count` with the group total using `mutate`, drop the `date` column, and then apply `distinct` to keep one row per combination of the important columns:

    library(dplyr)

    df %>%
      group_by(location) %>%
      mutate(count = sum(count)) %>%
      select(-date) %>%
      distinct(location, important_1, important_30, .keep_all = TRUE)
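A small worked sketch of the same pipeline follows. The data frame and its values are hypothetical, and it assumes that the `important_*` columns are constant within each `location`:

```r
library(dplyr)

# Hypothetical data: `important_1` and `important_30` are constant within a location.
df <- data.frame(
  location     = c("A", "A", "B"),
  date         = as.Date(c("2020-01-01", "2020-01-02", "2020-01-01")),
  count        = c(1, 2, 5),
  important_1  = c("p", "p", "q"),
  important_30 = c("r", "r", "s")
)

out <- df %>%
  group_by(location) %>%
  mutate(count = sum(count)) %>%   # every row now carries its group total
  select(-date) %>%                # drop the column that varies within a group
  distinct(location, important_1, important_30, .keep_all = TRUE) %>%
  ungroup()
out
```

If an `important_*` column did vary within a location, `distinct` would return more than one row for that location, so it is worth checking the row count of the result.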

If there are many such columns, we can instead pick their names programmatically, convert them to symbols with `rlang::syms`, and splice them into `distinct` with `!!!`:

    df %>%
      group_by(location) %>%
      mutate(count = sum(count)) %>%
      select(-date) %>%
      distinct(location,
               !!!rlang::syms(names(.)[startsWith(names(.), "important")]),
               .keep_all = TRUE)

## Keep columns after summarising using tidyverse in R

We can use `slice_max` to return the full row that holds the maximum value of `year` within each `group`/`month` block:

    library(dplyr)

    dat %>%
      group_by(group, month) %>%
      slice_max(year)
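One detail worth knowing is that `slice_max` keeps ties by default (`with_ties = TRUE`), so a block can contribute more than one row. The sketch below uses a made-up `dat` (the `value` column and all values are hypothetical) to show both behaviours:

```r
library(dplyr)

# Hypothetical data: the (g1, Jan) block has two rows tied on the latest year.
dat <- data.frame(
  group = c("g1", "g1", "g1", "g2"),
  month = c("Jan", "Jan", "Jan", "Feb"),
  year  = c(2019, 2020, 2020, 2018),
  value = c(10, 20, 30, 40)
)

# Default (with_ties = TRUE): both 2020 rows are returned for (g1, Jan).
with_ties <- dat %>%
  group_by(group, month) %>%
  slice_max(year) %>%
  ungroup()

# with_ties = FALSE: exactly one row per block.
one_each <- dat %>%
  group_by(group, month) %>%
  slice_max(year, n = 1, with_ties = FALSE) %>%
  ungroup()

nrow(with_ties)  # 3
nrow(one_each)   # 2
```

Use `with_ties = FALSE` when exactly one row per block is required.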
