Programming with ggplot2 and dplyr

ggplot2 v3.0.0, released in July 2018, supports the tidy evaluation operators !! (bang-bang), !!!, and :=.

facet_wrap() and facet_grid() support vars() inputs. The first two arguments of facet_grid() become rows and cols. facet_grid(vars(cyl), vars(am, vs)) is equivalent to facet_grid(cyl ~ am + vs) and facet_grid(cols = vars(am, vs)) is equivalent to facet_grid(. ~ am + vs).
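
As a small illustration of the equivalence described above, the sketch below builds the same faceted plots from mtcars, once with the formula interface and once with vars(). The dataset and the mpg/wt mapping are choices made here for illustration; they are not part of the original answer.

```r
library(ggplot2)

base_plot <- ggplot(mtcars, aes(mpg, wt)) + geom_point()

# Formula interface: rows ~ columns
p_formula <- base_plot + facet_grid(cyl ~ am + vs)

# vars() interface: rows first, then cols
p_vars <- base_plot + facet_grid(rows = vars(cyl), cols = vars(am, vs))

# Columns only: both forms leave the rows unfaceted
p_cols_formula <- base_plot + facet_grid(. ~ am + vs)
p_cols_vars <- base_plot + facet_grid(cols = vars(am, vs))
```

Printing p_formula next to p_vars (or the two columns-only plots) should show the same panel layout.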

So your example can be modified as follows:

library(rlang)
library(tidyverse)

foo <- function(df, y, gr, t = 4) {
  # Capture the bare column names supplied by the caller
  y <- enquo(y)
  gr <- enquo(gr)

  df %>%
    filter(!!y > t) %>%              # keep rows where y exceeds the threshold t
    ggplot(aes(!!y)) +
    geom_histogram() +
    facet_grid(cols = vars(!!gr))    # one facet column per value of gr
}

foo(mtcars, y = cyl, gr = vs)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Created on 2018-04-04 by the reprex package (v0.2.0).

How to combine ggplot and dplyr into a function?

Before version 3.0.0, ggplot2 did not support tidy eval syntax (you couldn't use !!), so you needed more traditional standard evaluation calls. With those older versions, aes_q() helps with this; it has since been deprecated in favor of unquoting directly inside aes().

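get_charts1 <- function(data, mygroup){

  # Capture the bare grouping column name
  quo_var <- enquo(mygroup)

  # The data are expected to contain a numeric column named value
  df_agg <- data %>%
    group_by(!!quo_var) %>%
    summarize(mean = mean(value, na.rm = TRUE),
              count = n()) %>%
    ungroup()

  # aes_q() takes quoted expressions, so the quosure can be passed as-is
  ggplot(df_agg, aes_q(x = quote(count), y = quote(mean), color = quo_var, group = quo_var)) +
    geom_point() +
    geom_line()
}

get_charts1(dataframe, group)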
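
With ggplot2 3.0.0 or later, the same function can be sketched without aes_q() by unquoting inside aes(). The name get_charts2 and the toy data frame below are hypothetical, introduced only to make the example runnable; the function still assumes a numeric column named value.

```r
library(dplyr)
library(ggplot2)

# Hypothetical variant of get_charts1 that unquotes directly inside aes()
get_charts2 <- function(data, mygroup) {
  quo_var <- enquo(mygroup)

  df_agg <- data %>%
    group_by(!!quo_var) %>%
    summarize(mean = mean(value, na.rm = TRUE),
              count = n()) %>%
    ungroup()

  ggplot(df_agg, aes(x = count, y = mean, color = !!quo_var, group = !!quo_var)) +
    geom_point() +
    geom_line()
}

# Toy data, made up for illustration
toy <- data.frame(group = c("a", "a", "b", "b", "b"),
                  value = c(1, 3, 2, 4, 6))
p <- get_charts2(toy, group)
```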

variable use in dplyr and ggplot

aes_string() has been deprecated; the preferred way now is to use the .data pronoun, which also works in filter().

library(dplyr)
library(ggplot2)

# Assumption: data is a data frame with carb, mpg, and hp columns, such as mtcars
data <- mtcars

remove_col <- "carb"
remove_val <- 4

x_value <- "mpg"
y_value <- "hp"

data %>%
  filter(.data[[remove_col]] != remove_val) %>%
  ggplot() + geom_point(aes(x = .data[[x_value]], y = .data[[y_value]],
                            color = .data[[remove_col]])) +
  ggtitle("Variables for `geom_point with aes` and for value to remove from `carb`")

You can also use sym() with !!:

data %>%
  filter(!!sym(remove_col) != remove_val) %>%
  ggplot() + geom_point(aes(x = !!sym(x_value), y = !!sym(y_value), color = !!sym(remove_col))) +
  ggtitle("Variables for `geom_point with aes` and for value to remove from `carb`")
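
The .data approach carries over to a function that takes column names as strings. The following is a minimal sketch; the helper name scatter_without and its argument names are hypothetical, and mtcars is used only as example input.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: column names arrive as strings, remove_val as a value
scatter_without <- function(df, x_value, y_value, remove_col, remove_val) {
  df %>%
    filter(.data[[remove_col]] != remove_val) %>%
    ggplot(aes(x = .data[[x_value]], y = .data[[y_value]],
               color = .data[[remove_col]])) +
    geom_point()
}

p <- scatter_without(mtcars, "mpg", "hp", "carb", 4)
```

Because the column names are ordinary strings, the function can be called in a loop or with lapply() over several column combinations.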

Ggplot subset data functions and dplyr

After reading the documentation of %>% more carefully, I found the solution:

Using the dot-place holder as lhs
When the dot is used as lhs, the result will be a functional sequence, i.e. a function which applies the entire chain of right-hand sides in turn to its input. See the examples.

Therefore, the nicest way to formulate the example, incorporating the suggestions above as well, is:

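library(dplyr)
library(ggplot2)

db <- diamonds
template <- ggplot(db, aes(x = carat, y = price, color = cut)) +
  geom_point() +
  # The functional sequence is applied to the plot data, so only color J is smoothed
  geom_smooth(data = . %>% filter(color == "J")) +
  labs(caption = "Smooths only for J color")
ggsave("global.png", template)   # ggsave() takes the file name first, then the plot
db %>% group_by(cut) %>% do(
  ggsave(paste0(.$cut[1], ".png"), plot = template %+% .)
)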
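
The functional-sequence behaviour quoted from the %>% documentation can be checked in isolation. The sketch below assumes the diamonds data from ggplot2; the names keep_j and j_only are introduced here for illustration.

```r
library(dplyr)
library(ggplot2)

# A pipeline that starts with the dot is a function, not a value
keep_j <- . %>% filter(color == "J")

is.function(keep_j)

# Calling it is the same as diamonds %>% filter(color == "J")
j_only <- keep_j(diamonds)

# A layer accepts such a function as its data argument
# and applies it to the plot data when the plot is built
p <- ggplot(diamonds, aes(x = carat, y = price, color = cut)) +
  geom_point() +
  geom_smooth(data = keep_j)
```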

Why do you have to use . when combining dplyr with ggplot?

No, you don't need to use the dot. The pipe passes its left-hand side as the first argument of ggplot(), like this:

fulldata %>% ggplot(aes(x = FLYTT)) + geom_bar() + coord_flip()
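
To see that the dot is not needed, the piped call can be compared with the explicit one. This sketch substitutes mtcars and factor(cyl) for fulldata and FLYTT, which are not available here.

```r
library(dplyr)
library(ggplot2)

# Piped form: mtcars becomes the first argument (data) of ggplot()
p_piped <- mtcars %>% ggplot(aes(x = factor(cyl))) + geom_bar() + coord_flip()

# Explicit form without the pipe
p_plain <- ggplot(mtcars, aes(x = factor(cyl))) + geom_bar() + coord_flip()
```

Both objects carry the same data and layers, so they draw the same chart.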

Put dplyr & ggplot in Loop/Apply

You were probably trying to do this:

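library(dplyr)
library(ggplot2)
library(rlang)

cols <- c('col1', 'col2')
plot_list <- lapply(cols, function(i)
  data %>%
    group_by(!!sym(i), ID) %>%    # sym() turns the column name string into a symbol
    summarise(Rev = sum(TotalRevenue)) %>%
    # summarise() keeps only the grouping columns and Rev, so AgeGroup
    # must be one of the grouping columns to be available here
    ggplot(aes(x = AgeGroup, y = Rev, fill = AgeGroup)) +
    geom_col(alpha = 0.9) + theme_minimal())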

This returns a list of plots, which can be accessed as plot_list[[1]], plot_list[[2]], etc. Also look into facets to combine multiple plots.
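
Since the original data are not shown, here is a self-contained sketch of the same pattern on mtcars. The columns cyl, gear, and mpg stand in for the real ones, and the looped column is also used for the x axis and fill, which is an assumption about the intent.

```r
library(dplyr)
library(ggplot2)

cols <- c("cyl", "gear")

plot_list <- lapply(cols, function(i) {
  mtcars %>%
    group_by(!!sym(i)) %>%
    summarise(mean_mpg = mean(mpg)) %>%
    ggplot(aes(x = factor(!!sym(i)), y = mean_mpg, fill = factor(!!sym(i)))) +
    geom_col(alpha = 0.9) +
    labs(x = i, fill = i) +
    theme_minimal()
})

# Naming the list makes individual plots easier to retrieve
names(plot_list) <- cols
```

plot_list[["cyl"]] then prints the first chart and plot_list[["gear"]] the second.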

dplyr and ggplot in a function: use reorder in aes function

If you are going to use aes_string(), then the whole value must be a string, not just partially a string. You can use paste() to help build the expression you want to use for x. For example:

f <- function(the_data, the_column){
  # Use the function argument the_data; the_column is a column name given as a string
  the_data %>% group_by_(the_column) %>%
    tally(sort = TRUE) %>%
    ggplot(aes_string(x = paste0("reorder(", the_column, ", n)"), y = 'n')) +
    geom_bar(stat = "identity")
}

Or you could use expressions rather than strings:

f <- function(the_data, the_column){
  the_data %>% group_by_(the_column) %>%
    tally(sort = TRUE) %>%
    # substitute() swaps the symbol built from the_column into reorder(x, n)
    ggplot(aes_q(x = substitute(reorder(x, n), list(x = as.name(the_column))), y = quote(n))) +
    geom_bar(stat = "identity")
}

but the general idea is that you need to be careful when mixing strings and raw language elements (like names or expressions).
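
group_by_(), aes_string(), and aes_q() are deprecated in current dplyr and ggplot2 releases. A rough equivalent using the .data pronoun might look like the sketch below; the name f2 is hypothetical, geom_col() is swapped in for geom_bar(stat = "identity"), and mtcars serves only as example input.

```r
library(dplyr)
library(ggplot2)

# Hypothetical replacement for f(); the_column is a column name as a string
f2 <- function(the_data, the_column) {
  the_data %>%
    group_by(across(all_of(the_column))) %>%
    tally(sort = TRUE) %>%
    ggplot(aes(x = reorder(.data[[the_column]], n), y = n)) +
    geom_col() +
    labs(x = the_column)
}

p <- f2(mtcars, "cyl")
```

Here the string never has to be pasted into an expression: .data[[the_column]] looks the column up by name inside reorder().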

Functional programming with Tidyr's gather and Ggplot2 for More Rapid Visual Data Exploration

As long as you are looking at the same set of columns, you can wrap this in a function by converting to the standard evaluation versions of each of the functions you use. Here is your same code, just tweaked to run in a function:

plotLook <- function(thisCol = "happy"){
  happy %>%
    select(sex, happy, marital, health, degree) %>%
    gather_("key", "value",
            names(.)[names(.) != thisCol]) %>%
    count_(c(thisCol, "key", "value")) %>%
    na.omit() %>%
    mutate(perc = round(n / sum(n), 2)) %>%
    ggplot() +
    geom_col(aes_string(x = "value", y = "perc", fill = thisCol)) +
    facet_wrap(~key, scales = "free") +
    geom_text(aes_string(x = "value", y = "perc",
                         label = "perc", group = thisCol),
              position = position_stack(vjust = .05))
}

Now this call generates the plot with sex as the fill variable:

plotLook("sex")

Even better, you can then use lapply to generate all of the plots in one go:

lapply(c("sex", "happy", "marital", "health", "degree"), plotLook)

and either save the output to use/modify, or just let them print to the screen.
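
gather_(), count_(), and aes_string() are deprecated in current tidyr, dplyr, and ggplot2 releases. The following is a rough sketch of the same idea with pivot_longer() and the .data pronoun, not a drop-in replacement: the function name plot_look, the extra df argument, and the toy data frame are all hypothetical, and the geom_text() labels are left out for brevity.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical variant of plotLook that takes the data frame as an argument
plot_look <- function(df, this_col) {
  df %>%
    # Convert everything to character so the columns can share one value column
    mutate(across(everything(), as.character)) %>%
    pivot_longer(cols = -all_of(this_col),
                 names_to = "key", values_to = "value") %>%
    count(!!sym(this_col), key, value) %>%
    na.omit() %>%
    mutate(perc = round(n / sum(n), 2)) %>%
    ggplot(aes(x = value, y = perc)) +
    geom_col(aes(fill = .data[[this_col]])) +
    facet_wrap(vars(key), scales = "free")
}

# Toy data, made up for illustration
toy <- data.frame(sex    = c("m", "f", "m", "f"),
                  health = c("good", "good", "poor", "poor"),
                  degree = c("hs", "ba", "hs", "hs"))

p <- plot_look(toy, "sex")
```

As before, lapply(names(toy), function(col) plot_look(toy, col)) would produce one plot per column.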


