Using the result of summarise (dplyr) to mutate the original dataframe
As @beetroot points out in the comments, you can accomplish this with a join:
library(dplyr)
# Compute the per-year limits, then join them back onto every row of span
limits <- span %>%
  group_by(YEAR) %>%
  summarise(minDOY = min(DOY[DLS]), maxDOY = max(DOY[DLS])) %>%
  inner_join(span, by = "YEAR")
# YEAR minDOY maxDOY date DOY DLS
# 1 2000 93 303 2000-01-01 00:00:00 1 FALSE
# 2 2000 93 303 2000-01-01 01:00:00 1 FALSE
# 3 2000 93 303 2000-01-01 02:00:00 1 FALSE
# 4 2000 93 303 2000-01-01 03:00:00 1 FALSE
# 5 2000 93 303 2000-01-01 04:00:00 1 FALSE
# 6 2000 93 303 2000-01-01 05:00:00 1 FALSE
# 7 2000 93 303 2000-01-01 06:00:00 1 FALSE
# 8 2000 93 303 2000-01-01 07:00:00 1 FALSE
# 9 2000 93 303 2000-01-01 08:00:00 1 FALSE
# 10 2000 93 303 2000-01-01 09:00:00 1 FALSE
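To try the join approach without the original data, here is a minimal sketch on a made-up `span` table. The values are invented for illustration and are not the asker's data; only the column names follow the answer above.

```r
library(dplyr)

# Hypothetical toy data: two years, three rows each; DLS flags the rows of interest
span <- data.frame(
  YEAR = c(2000, 2000, 2000, 2001, 2001, 2001),
  DOY  = c(1, 93, 303, 1, 91, 301),
  DLS  = c(FALSE, TRUE, TRUE, FALSE, TRUE, TRUE)
)

# One row per YEAR: first and last DOY where DLS is TRUE
limits <- span %>%
  group_by(YEAR) %>%
  summarise(minDOY = min(DOY[DLS]), maxDOY = max(DOY[DLS]))

# Join the per-year limits back onto every original row
joined <- inner_join(limits, span, by = "YEAR")
```

Every row of `span` is kept, and each one now carries the `minDOY` and `maxDOY` of its own year.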
Use of mutate in Summarise function using R
I find the use of mutate inside summarize very confusing, and don't really know what to expect of it (I'm honestly surprised it even works). If I understand correctly, what you want to do is best expressed as (Scenario - 3):
data %>%
  group_by(identifier) %>%
  summarize(shift_back_max = -min(shift_back_max, na.rm = TRUE),
            shift_forward_max = min(shift_forward_max, na.rm = TRUE)) %>%
  ungroup() %>%
  mutate(across(starts_with("shift"), ~ ifelse(is.infinite(.x), 30 * sign(.x), .x)))
(meaning you first summarize by identifier, then you apply a treatment to the whole result)
You can compare the results of the different approaches with all.equal(). I'd expect all of them to give the same result, though the others are less clear to the reader.
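As a rough illustration of that comparison, the sketch below runs the summarise-then-mutate version against a variant that caps infinite values inside summarise. The toy `data` and the helper `cap()` are assumptions made up for this example; they do not come from the question.

```r
library(dplyr)

# Hypothetical data: identifier "b" has only NA values,
# so min(..., na.rm = TRUE) returns Inf (with a warning)
data <- data.frame(
  identifier        = c("a", "a", "b"),
  shift_back_max    = c(5, 3, NA),
  shift_forward_max = c(7, 2, NA)
)

# Approach 1: summarise first, then replace infinite values in a separate mutate
res1 <- suppressWarnings(
  data %>%
    group_by(identifier) %>%
    summarize(shift_back_max    = -min(shift_back_max, na.rm = TRUE),
              shift_forward_max =  min(shift_forward_max, na.rm = TRUE)) %>%
    ungroup() %>%
    mutate(across(starts_with("shift"),
                  ~ ifelse(is.infinite(.x), 30 * sign(.x), .x)))
)

# Approach 2: same logic, but the replacement happens inside summarise
cap <- function(x) ifelse(is.infinite(x), 30 * sign(x), x)  # illustrative helper
res2 <- suppressWarnings(
  data %>%
    group_by(identifier) %>%
    summarize(shift_back_max    = cap(-min(shift_back_max, na.rm = TRUE)),
              shift_forward_max = cap(min(shift_forward_max, na.rm = TRUE)))
)

# TRUE when both approaches agree
same <- isTRUE(all.equal(as.data.frame(res1), as.data.frame(res2)))
```

The results are converted with `as.data.frame()` before comparing so that the check does not depend on how a particular dplyr version handles `all.equal()` on tibbles.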
After summarize, reinsert calculated values into original dataframe dplyr
If you just want the mean of any group that has more than one row, you don't need to separate the groups out, since taking the mean of a single-row group leaves its value unchanged. Here, I add max for variable_2 so that it returns only one value per group and is retained in the output.
library(tidyverse)
df %>%
  group_by(id, variable_1, place) %>%
  dplyr::summarise(value = mean(value), variable_2 = max(variable_2))
Output
  id    variable_1 place     value variable_2
  <chr> <chr>      <chr>     <dbl> <chr>
1 01_01 a          Australia   0.6 cat
2 01_02 a          France      0.8 pig
3 01_03 a          Belguim     0.2 dog
4 01_04 a          Germany     1.7 chicken
Or, if you do want to keep the groups split up, add a summary for variable_2 as well so that it doesn't get dropped.
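# Groups with two rows: collapse each to its mean
df2 <- df %>%
  group_by(id, variable_1, place) %>%
  filter(n() == 2) %>%
  dplyr::summarise(value = mean(value), variable_2 = max(variable_2))
# Groups with one row: keep as-is, then append the summarised groups
df <- df %>%
  group_by(id, variable_1, place) %>%
  filter(n() == 1) %>%
  bind_rows(., df2)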
Mutate a grouped value (like a conditional mean)
Use group_by before mutate to create the mean column by group, instead of creating a summarised dataset and then joining it to the original data:
library(dplyr)
mtcars %>%
  group_by(cyl, carb) %>%
  mutate(var1 = mean(mpg)) %>%
  ungroup() %>%
  head()
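To see that the grouped mutate gives the same column as the summarise-and-join route, here is a small check on `mtcars`. The object names `via_mutate`, `group_means`, and `via_join` are chosen for this sketch only.

```r
library(dplyr)

# Grouped mutate: every row gets the mean mpg of its (cyl, carb) group
via_mutate <- mtcars %>%
  group_by(cyl, carb) %>%
  mutate(var1 = mean(mpg)) %>%
  ungroup()

# Alternative: summarise to one row per group, then join back to the original rows
group_means <- mtcars %>%
  group_by(cyl, carb) %>%
  summarise(var1 = mean(mpg), .groups = "drop")

via_join <- left_join(mtcars, group_means, by = c("cyl", "carb"))

# Both routes should produce the same per-row group mean
same_means <- isTRUE(all.equal(via_mutate$var1, via_join$var1))
```

`left_join()` with `mtcars` on the left keeps the original row order, which is what makes the column-by-column comparison meaningful.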
Create new column for mean by group in original dataframe in R
We can use mutate instead of summarise:
library(dplyr)
df <- df %>%
  group_by(unit_id) %>%
  mutate(mean = mean(outcome))
Adding Summarized Fields to Data Frame R
You may combine the two summary outputs.
library(dplyr)
bind_rows(
  df %>%
    group_by(Description) %>%
    summarise(Amt = sum(Amount)),
  df %>%
    group_by(Category) %>%
    summarise(Amt = sum(Amount)) %>%
    rename(Description = Category)
) %>%
  arrange(Description)
# Description Amt
# <chr> <dbl>
# 1 A 4700
# 2 A.a 900
# 3 A.b 1200
# 4 A.c 2600
# 5 B 7400
# 6 B.a 3500
# 7 B.b 3000
# 8 B.c 400
# 9 C 1220
#10 C.a 1580
#11 C.b 50
#12 C.c 90
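The same pattern can be tried on a small made-up table. The data below is hypothetical (it is not the asker's data); it only assumes the three columns used above: Category, Description, and Amount.

```r
library(dplyr)

# Hypothetical input: each Description belongs to one Category
df <- data.frame(
  Category    = c("A", "A", "B"),
  Description = c("A.a", "A.b", "B.a"),
  Amount      = c(100, 200, 50)
)

combined <- bind_rows(
  # Totals per Description
  df %>%
    group_by(Description) %>%
    summarise(Amt = sum(Amount)),
  # Totals per Category, renamed so both summaries share one key column
  df %>%
    group_by(Category) %>%
    summarise(Amt = sum(Amount)) %>%
    rename(Description = Category)
) %>%
  arrange(Description)
```

Renaming Category to Description before binding is what lets the category totals interleave with the detail rows once the result is sorted.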
How to change an element of the original dataframe using dplyr
Using the function case_when:
library('tibble')
df <- tibble(
  ticker = c("first", "second", "third"),
  status = c(TRUE, TRUE, TRUE)
)
library(tidyverse)
df %>%
  mutate(status = case_when(
    ticker == "first" ~ FALSE,  # change only the row for "first"
    TRUE ~ TRUE                 # every other row is set to TRUE
  ))
This is the output:
# A tibble: 3 x 2
  ticker status
  <chr>  <lgl>
1 first  FALSE
2 second TRUE
3 third  TRUE