How to combine ggplot and dplyr into a function?
Before v3.0.0, ggplot2 did not support tidy eval syntax (you couldn't use !!), so you needed more traditional standard-evaluation calls. You can use aes_q in ggplot to help with this.
get_charts1 <- function(data, mygroup){
  quo_var <- enquo(mygroup)
  df_agg <- data %>%
    group_by(!!quo_var) %>%
    summarize(mean = mean(value, na.rm = TRUE),
              count = n()) %>%
    ungroup()
  ggplot(df_agg, aes_q(x = quote(count), y = quote(mean), color = quo_var, group = quo_var)) +
    geom_point() +
    geom_line()
}
get_charts1(dataframe, group)
Programming with ggplot2 and dplyr
ggplot2 v3.0.0, released in July 2018, supports !! (bang bang), !!!, and :=.
facet_wrap() and facet_grid() support vars() inputs. The first two arguments of facet_grid() become rows and cols. facet_grid(vars(cyl), vars(am, vs)) is equivalent to facet_grid(cyl ~ am + vs), and facet_grid(cols = vars(am, vs)) is equivalent to facet_grid(. ~ am + vs).
So your example can be modified as follows:
library(rlang)
library(tidyverse)
foo <- function(df, y, gr, t = 4) {
  y <- enquo(y)
  gr <- enquo(gr)
  df %>%
    filter(!!y > t) %>%
    ggplot(aes(!!y)) +
    geom_histogram() +
    facet_grid(cols = vars(!!gr))
}
foo(mtcars, y = cyl, gr = vs)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Created on 2018-04-04 by the reprex package (v0.2.0).
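For readers on newer package versions, the same idea can be written with the embrace operator {{ }}, which combines enquo() and !! in one step. This is only a sketch: it assumes rlang >= 0.4.0 and ggplot2 >= 3.0.0, and the name foo2 is hypothetical, not part of the original answer.

```r
library(dplyr)
library(ggplot2)

# foo2 is a hypothetical name; {{ }} requires rlang >= 0.4.0 (assumption)
foo2 <- function(df, y, gr, t = 4) {
  df %>%
    filter({{ y }} > t) %>%               # keep rows where y exceeds the threshold
    ggplot(aes({{ y }})) +                # {{ }} forwards the bare column name to aes()
    geom_histogram(bins = 30) +
    facet_grid(cols = vars({{ gr }}))     # and to vars() for faceting
}

p <- foo2(mtcars, y = cyl, gr = vs)
```

Printing p draws the plot. Setting bins explicitly avoids the stat_bin() message shown above.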
dplyr and ggplot in a function: use reorder in aes function
If you are going to use aes_string, then the whole value must be a string, not just part of it. You can use paste0() to help build the expression you want to use for x. For example:
f <- function(the_data, the_column){
  # use the function argument the_data, not a global object
  the_data %>% group_by_(the_column) %>%
    tally(sort = TRUE) %>%
    ggplot(aes_string(x = paste0("reorder(", the_column, ", n)"), y = 'n')) +
    geom_bar(stat = "identity")
}
Or you could use expressions rather than strings
f <- function(the_data, the_column){
  # use the function argument the_data, not a global object
  the_data %>% group_by_(the_column) %>%
    tally(sort = TRUE) %>%
    ggplot(aes_q(x = substitute(reorder(x, n), list(x = as.name(the_column))), y = quote(n))) +
    geom_bar(stat = "identity")
}
but the general idea is that you need to be careful when mixing strings and raw language elements (like names or expressions).
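With more recent package versions, one way to avoid building expressions by hand is the .data pronoun, which looks up a column from a string. The sketch below assumes dplyr >= 1.0.0 (for across() and all_of()) and ggplot2 >= 3.0.0; the name f2 is hypothetical and this is not the original answer's method.

```r
library(dplyr)
library(ggplot2)

# f2 is a hypothetical name; the_column is a string such as "cyl"
f2 <- function(the_data, the_column) {
  the_data %>%
    group_by(across(all_of(the_column))) %>%   # group by a column given as a string
    tally(sort = TRUE) %>%                     # adds the count column n
    ggplot(aes(x = reorder(.data[[the_column]], n), y = n)) +
    geom_col() +
    xlab(the_column)                           # readable axis label instead of the expression
}

p <- f2(mtcars, "cyl")
```

Here the string never has to be pasted into code, so there is no mixing of strings and language elements.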
R: ggplot: Combine geom_points using dplyr with group averages
Is this what you want?
library(tidyverse)
theme_set(theme_bw())
# Mean per Species
plt1 <- iris %>%
  filter(Species %in% c("setosa", "virginica")) %>%
  group_by(Species) %>%
  mutate(mean_length = mean(Petal.Length, na.rm = TRUE),
         mean_width = mean(Petal.Width, na.rm = TRUE)) %>%
  ungroup() %>%
  ggplot(., aes(x = Petal.Width, y = Petal.Length, color = Species)) +
  geom_point() +
  facet_grid(~ Species, scales = "free_x") +
  geom_hline(aes(yintercept = mean_length, linetype = c("Mean length"))) +
  geom_vline(aes(xintercept = mean_width, linetype = c("Mean width")),
             show.legend = FALSE)
plt1 + scale_linetype_manual(NULL,
                             values = c(5, 3),
                             labels = c("Mean length", "Mean width")) +
  guides(color = guide_legend(order = 1)) +
  # Move legends closer to each other
  theme(
    legend.spacing.y = unit(0.05, "cm"),
    legend.margin = margin(0, 0, 0, 0),
    legend.box.margin = margin(0, 0, 0, 0))
# Mean for all Species
plt2 <- iris %>%
  filter(Species %in% c("setosa", "virginica")) %>%
  mutate(mean_length = mean(Petal.Length, na.rm = TRUE),
         mean_width = mean(Petal.Width, na.rm = TRUE)) %>%
  ggplot(., aes(x = Petal.Width, y = Petal.Length, color = Species)) +
  geom_point() +
  geom_hline(aes(yintercept = mean_length, linetype = c("Mean length"))) +
  geom_vline(aes(xintercept = mean_width, linetype = c("Mean width")),
             show.legend = FALSE)
plt2 + scale_linetype_manual(NULL,
                             values = c(1, 4),
                             labels = c("Mean length (all)", "Mean width (all)")) +
  guides(color = guide_legend(order = 1)) +
  theme(
    legend.spacing.y = unit(0.05, "cm"),
    legend.margin = margin(0, 0, 0, 0),
    legend.box.margin = margin(0, 0, 0, 0))
Created on 2018-05-23 by the reprex package (v0.2.0).
pass function arguments to both dplyr and ggplot
Tidy evaluation is now fully supported in ggplot2 v3.0.0, so it's not necessary to use aes_ or aes_string anymore.
library(rlang)
library(tidyverse)
diamond_plot <- function (data, group, metric) {
  # group and metric are strings; sym() turns them into symbols
  quo_group <- sym(group)
  quo_metric <- sym(metric)
  data %>%
    group_by(!! quo_group) %>%
    # name the summary column after metric so aes() below can find it
    summarise(!! metric := mean(!! quo_metric)) %>%
    ggplot(aes(x = !! quo_group, y = !! quo_metric)) +
    geom_col()
}
diamond_plot(diamonds, "clarity", "price")
Created on 2018-04-16 by the reprex package (v0.2.0).
how to create factor variables from quosures in functions using ggplot and dplyr?
Here is a possibility using other rlang functions.
get_charts1 <- function(data, mygroup){
  quo_var <- enquo(mygroup)
  df_agg <- data %>%
    group_by(!!quo_var) %>%
    summarize(mean = mean(value, na.rm = TRUE),
              count = n()) %>%
    ungroup()
  cc <- rlang::expr(factor(!!(rlang::get_expr(quo_var))))
  # or just cc <- expr(factor(!!get_expr(quo_var))) if you include library(rlang)
  ggplot(df_agg, aes_q(x = quote(count), y = quote(mean), color = cc)) +
    geom_point() +
    geom_line()
}
We build the expression factor(group) using the expr() function. We use get_expr() to extract the symbol name "group" from the quosure quo_var. Once we've built the expression, we can pass it on to aes_q.
Hopefully ggplot will soon be tidy-eval-friendly and this will no longer be necessary.
dplyr do multiple plots in an anonymous function
I don't think you need the return value to be a frame. Try this:
plots <- df %>%
  group_by(gene) %>%
  do(plot = {
    p <- ggplot(., aes(position, score)) +
      geom_point()
    if (all(.$strand == "-")) p <- p + scale_y_reverse()
    p
  })
plots
# Source: local data frame [2 x 2]
# Groups: <by row>
# # A tibble: 2 x 2
# gene plot
# * <fct> <list>
# 1 alpha <S3: gg>
# 2 beta <S3: gg>
Your conditional logic is fine; I think the issue is that you did not name the block within do(...).
You can view one of them with:
plots$plot[[1]]
If you want to dump all plots (e.g., in a markdown document), just do plots$plot and they will be cycled through rather quickly (not as useful on the console).
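If you would rather end up with a plain named list of plots, a similar result can be sketched with base split() and lapply(). The toy data frame and the helper name make_plot below are hypothetical; only the column names gene, position, score, and strand come from the code above.

```r
library(ggplot2)

# Hypothetical toy data using the same column names as above
df <- data.frame(
  gene     = rep(c("alpha", "beta"), each = 3),
  position = rep(1:3, times = 2),
  score    = c(1.2, 3.4, 2.1, 0.5, 2.8, 1.9),
  strand   = rep(c("+", "-"), each = 3)
)

# make_plot is a hypothetical helper: one scatter plot per gene,
# with the y axis reversed when every row is on the "-" strand
make_plot <- function(d) {
  p <- ggplot(d, aes(position, score)) +
    geom_point()
  if (all(d$strand == "-")) p <- p + scale_y_reverse()
  p
}

# split() yields one data frame per gene; lapply() keeps the gene names
plots <- lapply(split(df, df$gene), make_plot)
```

An individual plot is then available by name, for example plots$alpha or plots[["beta"]].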
Function with dplyr, tidyr and ggplot
The development version of tidyr, tidyr_0.6.3.9000, now uses tidyeval, so if you want to update to that you could use !! as you did in group_by.
plot_nice_chart <- function(df, param_col) {
  enq_param_col <- enquo(param_col)
  # column name as a string, for aes_string()
  str_param_col <- deparse(substitute(param_col))
  df %>%
    group_by(!!enq_param_col, date_col) %>%
    summarise(val_col = sum(val_col)) %>%
    ungroup() %>%
    complete(!!enq_param_col, date_col) %>%
    ggplot(aes_string("date_col", "val_col", color = str_param_col)) +
    geom_line()
}
Using the current version, you can use complete_ with variables as strings.
plot_nice_chart <- function(df, param_col) {
  enq_param_col <- enquo(param_col)
  str_param_col <- deparse(substitute(param_col))
  df %>%
    group_by(!!enq_param_col, date_col) %>%
    summarise(val_col = sum(val_col)) %>%
    ungroup() %>%
    complete_( c(str_param_col, "date_col") ) %>%
    ggplot(aes_string("date_col", "val_col", color = str_param_col)) +
    geom_line()
}
Combine with() and ggplot2
Your code doesn't work because the first argument of ggplot() is data. You need to specifically say that you want to use the argument mapping.
df <- data.frame(a= 1:10, b= 1:10)
with(df, ggplot(mapping = aes(a, b)) + geom_point())
Or you can do this:
df <- data.frame(a= 1:10, b= 1:10)
with(df, ggplot() + geom_point(aes(a, b)))
The second method works because the first argument for geom_* is mapping.