Applying group_by and summarise on data while keeping all the columns' info
Here are two options from dplyr: a) filter and b) slice. In this case there are no duplicated minimum values in column c for any of the groups, so the results of a) and b) are the same. If there were duplicated minima, approach a) would return every row tied for the minimum in each group, while b) would return only one row (the first) per group.
a)
> data %>% group_by(b) %>% filter(c == min(c))
#Source: local data frame [4 x 4]
#Groups: b
#
# a b c d
#1 1 a 1.2 small
#2 4 b 1.7 larg
#3 6 c 3.1 med
#4 10 d 2.2 med
Or similarly
> data %>% group_by(b) %>% filter(min_rank(c) == 1L)
#Source: local data frame [4 x 4]
#Groups: b
#
# a b c d
#1 1 a 1.2 small
#2 4 b 1.7 larg
#3 6 c 3.1 med
#4 10 d 2.2 med
b)
> data %>% group_by(b) %>% slice(which.min(c))
#Source: local data frame [4 x 4]
#Groups: b
#
# a b c d
#1 1 a 1.2 small
#2 4 b 1.7 larg
#3 6 c 3.1 med
#4 10 d 2.2 med
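To make the difference with tied minima concrete, here is a small sketch on a made-up data frame (the ties object and its values are hypothetical, not the data from the question). Group "a" has two rows sharing the minimum of c.

```r
library(dplyr)

# Hypothetical data: group "a" has two rows tied for the minimum of c.
ties <- data.frame(
  a = 1:4,
  b = c("a", "a", "a", "b"),
  c = c(1.2, 1.2, 3.0, 1.7)
)

# a) filter keeps every row equal to the group minimum (both tied rows in "a").
res_filter <- ties %>% group_by(b) %>% filter(c == min(c))

# b) slice with which.min keeps only the first minimum in each group.
res_slice <- ties %>% group_by(b) %>% slice(which.min(c))

nrow(res_filter)  # 3 rows
nrow(res_slice)   # 2 rows
```

If exactly one row per group is required, b) is the safer choice; if all tied rows matter, use a).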
How can I keep columns when grouping/summarizing?
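You can do this in base R with aggregate. The formula B ~ . averages B within each combination of all the other columns, so those columns are kept in the result:
aggregate(B ~ ., data = df1, FUN = mean)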
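As a rough illustration of what the formula does, the sketch below runs the same call on a made-up df1 (the column names A and C and all values are assumptions; only B and df1 come from the answer).

```r
# Hypothetical data: A and C are the columns to keep, B is the value to average.
df1 <- data.frame(
  A = c("x", "x", "y", "y"),
  C = c("u", "u", "v", "w"),
  B = c(1, 3, 5, 7)
)

# B ~ . puts B on the left-hand side and uses every other column
# (here A and C) as grouping variables.
res <- aggregate(B ~ ., data = df1, FUN = mean)

# One row per distinct (A, C) combination, with B averaged within it.
res
```

Note that every column other than B becomes a grouping variable, so this only fits when the columns to keep are constant within each intended group.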
Grouping and summarizing by keeping other columns in R
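Try indexing the other columns by the position of the maximum within each group:
summarize(MM_group,
          rank = which.max(Yield),     # position of the highest Yield in the group
          Year_rank = Year[rank],      # Year on that row
          County_rank = County[rank])  # County on that row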
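The question's data is not shown here, so the following sketch builds a hypothetical stand-in for MM_group (the State grouping column and all values are invented) to show how the positional index carries the other columns along.

```r
library(dplyr)

# Hypothetical stand-in for MM_group: yields grouped by a made-up State column.
MM_group <- data.frame(
  State  = c("MN", "MN", "IA", "IA"),
  County = c("Polk", "Clay", "Linn", "Story"),
  Year   = c(2001, 2002, 2001, 2003),
  Yield  = c(150, 170, 180, 160)
) %>%
  group_by(State)

# which.max gives the position of the top Yield within each group;
# indexing Year and County with it pulls the values from that same row.
top <- summarize(MM_group,
                 rank = which.max(Yield),
                 Year_rank = Year[rank],
                 County_rank = County[rank])
```

Like slice(which.max(...)), this picks only the first row when several rows tie for the maximum.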
Applying group_by and summarise(sum) but keep columns with non-relevant conflicting data?
Here's the data.table solution. I'm assuming you want the mean() of Proportion, since these grouped proportions are likely not additive.
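library(data.table)
setDT(df)  # convert df to a data.table by reference
# Chain [order(Label)] directly after the closing ]; a line that starts with [ is a syntax error.
df[, .(Type = paste(Type, collapse = "_"),
       Proportion = mean(Proportion), N = sum(N), C = sum(C)),
   by = .(Label, Code)][order(Label)]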
Label Code Type Proportion N C
1: 203c c wholefish 1.000000 1 1
2: 203c a flesh 1.000000 2 2
3: 204a a flesh_formula 0.499995 8 8
4: 204a b fleshdelip_formuladelip 0.499995 10 10
5: 204a c formula_wholefish 0.499995 16 16
6: 204a d formuladelip_wholefishdelip 0.499995 18 18
I'm not sure this is the cleanest dplyr solution, but it works:
df %>% group_by(Label, Code) %>%
mutate(Type = paste(Type,collapse="_")) %>%
group_by(Label,Type,Code) %>%
summarise(N=sum(N),C=sum(C),Proportion=mean(Proportion))
Note that the key here is to re-group once you have created the combined Type column.
Label Type Code N C Proportion
<fctr> <chr> <fctr> <int> <int> <dbl>
1 203c flesh a 2 2 1.000000
2 203c wholefish c 1 1 1.000000
3 204a flesh_formula a 8 8 0.499995
4 204a fleshdelip_formuladelip b 10 10 0.499995
5 204a formula_wholefish c 16 16 0.499995
6 204a formuladelip_wholefishdelip d 18 18 0.499995
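One possible simplification, offered only as a sketch: the combined Type can be built inside summarise itself, which avoids the second group_by. The data frame below is a hypothetical cut-down version of the question's data, and the .groups = "drop" argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical, cut-down data with the same column names as the question.
df <- data.frame(
  Label = c("204a", "204a", "203c"),
  Code = c("a", "a", "c"),
  Type = c("flesh", "formula", "wholefish"),
  Proportion = c(0.4, 0.6, 1),
  N = c(3, 5, 1),
  C = c(3, 5, 1)
)

# Collapse Type and aggregate the numeric columns in a single summarise call.
one_step <- df %>%
  group_by(Label, Code) %>%
  summarise(Type = paste(Type, collapse = "_"),
            N = sum(N), C = sum(C),
            Proportion = mean(Proportion),
            .groups = "drop")
```

On this small example the result has one row per Label and Code pair, with Type joined as "flesh_formula" for the 204a group.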
Applying group_by and summarise(sum) but keep a large number of additional columns
We can overwrite count with its group total using mutate, and then apply distinct to collapse each group to a single row while keeping the other columns:
library(dplyr)
df %>%
group_by(location) %>%
mutate(count = sum(count)) %>% select(-date) %>%
distinct(location, important_1, important_30, .keep_all = TRUE)
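A minimal run of this pattern on made-up data may help; the values below are hypothetical, and only the column names follow the answer. The ungroup() step is an addition in this sketch so that the result is a plain, ungrouped tibble.

```r
library(dplyr)

# Hypothetical data: two dated rows for location "A", one for "B".
df <- data.frame(
  location = c("A", "A", "B"),
  date = as.Date(c("2020-01-01", "2020-01-02", "2020-01-01")),
  count = c(1, 2, 5),
  important_1 = c("p", "p", "q"),
  important_30 = c("r", "r", "s")
)

res <- df %>%
  group_by(location) %>%
  mutate(count = sum(count)) %>%   # every row now carries its group total
  ungroup() %>%
  select(-date) %>%                # drop the column that varies within a group
  distinct(location, important_1, important_30, .keep_all = TRUE)
```

Because mutate keeps all rows and columns, distinct is what reduces each location to one row; columns that differ within a group keep the value from the first row.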
If there are many such columns, we can also select their names programmatically, convert them to symbols with syms, and splice them into the call with !!!:
df %>%
group_by(location) %>%
mutate(count = sum(count)) %>% select(-date) %>%
distinct(location, !!! rlang::syms(names(.)[startsWith(names(.), 'important')]), .keep_all = TRUE)
Keep columns after summarising using tidyverse in R
We can use slice_max to return the full row with the max value of 'year' in each group:
library(dplyr)
dat %>%
group_by(group, month) %>%
slice_max(year)
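One detail worth checking against your data: by default slice_max keeps all rows tied for the maximum. The sketch below uses a hypothetical dat (values invented) to compare the default with with_ties = FALSE, which returns a single row per group.

```r
library(dplyr)

# Hypothetical data: group g1 / month 1 has two rows tied at year 2020.
dat <- data.frame(
  group = c("g1", "g1", "g1", "g2"),
  month = c(1, 1, 1, 2),
  year  = c(2019, 2020, 2020, 2018),
  value = c(10, 20, 30, 40)
)

# Default: every row tied for the maximum year is returned.
all_ties <- dat %>%
  group_by(group, month) %>%
  slice_max(year)

# with_ties = FALSE: exactly one row per group.
one_each <- dat %>%
  group_by(group, month) %>%
  slice_max(year, with_ties = FALSE)

nrow(all_ties)  # 3 rows
nrow(one_each)  # 2 rows
```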