Addressing X and Y in Aes by Variable Number

A variation on @Shadow's answer, using features introduced in ggplot2 3.0.0:

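library(ggplot2)

showplot <- function(indata, inx, iny) {
  # look up the column names at the requested positions
  nms <- names(indata)
  x <- nms[inx]
  y <- nms[iny]
  # ensym() turns each string into a symbol; !! unquotes it inside aes()
  p <- ggplot(indata, aes(x = !!ensym(x), y = !!ensym(y)))
  p + geom_point(size = 4, alpha = 0.5)
}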

testdata <- data.frame(v1=rnorm(100), v2=rnorm(100), v3=rnorm(100), v4=rnorm(100), v5=rnorm(100))
names(testdata) <- c("a-b", "c-d", "e-f", "g-h", "i-j")
showplot(indata=testdata, inx=2, iny=3)

ensym() creates a symbol from the string stored in a variable (which is why the function first assigns the column names to x and y); !! then unquotes that symbol, so aes() behaves as if you had typed the raw column names.

!! only works inside functions designed to support it, usually tidyverse functions; elsewhere it is just a double negation ("not not"), which behaves much like as.logical().
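
To make the distinction concrete, here is a minimal sketch, assuming the rlang package is installed (ggplot2 depends on it). The names col, s, e and x are purely illustrative and do not come from the answer above.

```r
library(rlang)

col <- "c-d"            # a column name held as a string
s <- sym(col)           # sym() converts the string into a symbol
e <- expr(mean(!!s))    # expr() supports !!, so the symbol is spliced into the call

# Outside a function that supports unquoting, !! is plain double negation:
x <- !!5                # same as !(!5)
```

Here e holds a call to mean() whose argument is the symbol built from the string, while x is just a logical value.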

R pass variable column indices to ggplot2

You can use aes_string() instead of aes() to pass strings rather than bare column names, i.e.:

myplot = function(df, x_string, y_string) {
  ggplot(df, aes_string(x = x_string, y = y_string)) + geom_point()
}
myplot(df, "A", "B")
myplot(df, "B", "A")

Specifying column with its index rather than name

As pointed out by @baptiste, you should use aes_string() instead of aes() to define the x and y values from strings. You should also put value and variable inside quotes.

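PropBarPlot <- function(df, mytitle = "") {
  # melt() comes from the reshape2 package
  melteddf <- melt(df, id = names(df)[1], na.rm = TRUE)
  ggplot(melteddf, aes_string(x = names(df)[1], y = "value", fill = "variable")) +
    # stat = "identity" is needed because a y aesthetic is supplied
    geom_bar(stat = "identity", position = "fill") +
    theme(axis.text.x = element_text(angle = 90, vjust = 1)) +
    labs(title = mytitle)
}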

How to use column names starting with numbers in ggplot functions

One way to do this is a combination of aes_ and as.name():

plot_boxplot <- function(data, group, value) {
  data <- data[c(group, value)]
  # [[ ]] extracts the column as a vector, which also works when data is a tibble
  data[[group]] <- as.factor(data[[group]])

  plot <- ggplot(data, aes_(x = as.name(group), y = as.name(value))) +
    geom_boxplot()

  return(plot)
}

And passing in strings for group and value:

plot_boxplot(dfx, "1ev", "2ev")

It's not the same plot you show above, but it looks to align with the data.
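
Why as.name() helps here can be seen without ggplot2 at all. The following is a small base-R sketch; the data frame dfx below is made up for illustration and is not the data from the question.

```r
# check.names = FALSE keeps column names that start with a digit
dfx <- data.frame(`1ev` = c("a", "b", "a"), `2ev` = c(1.5, 2.5, 3.5),
                  check.names = FALSE)

group_sym <- as.name("1ev")          # a symbol, even though 1ev is not a syntactic name
group_col <- eval(group_sym, dfx)    # evaluating the symbol inside the data finds the column
```

A string-parsing approach such as aes_string() has to parse "1ev" as R code, which fails, whereas a symbol built with as.name() is looked up directly by name.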

variable use in dplyr and ggplot

aes_string has been deprecated; the preferred way now is to use the .data pronoun, which can also be used in filter.

library(dplyr)
library(ggplot2)

# assumption: data is not defined in the original; the column names match the built-in mtcars
data <- mtcars

remove_col <- "carb"
remove_val <- 4

x_value <- "mpg"
y_value <- "hp"

data %>%
  filter(.data[[remove_col]] != remove_val) %>%
  ggplot() +
  geom_point(aes(x = .data[[x_value]], y = .data[[y_value]],
                 color = .data[[remove_col]])) +
  ggtitle("Variables for `geom_point with aes` and for value to remove from `carb`")

You can also use sym with !!:

data %>%
  filter(!!sym(remove_col) != remove_val) %>%
  ggplot() +
  geom_point(aes(x = !!sym(x_value), y = !!sym(y_value), color = !!sym(remove_col))) +
  ggtitle("Variables for `geom_point with aes` and for value to remove from `carb`")

Variable column names in the pipe

To pass character strings as variable names, you have to use the standard-evaluation version of each function: aes_string for aes, and filter_ for filter. See the NSE vignette for more details. Note that both functions are deprecated in current ggplot2 and dplyr releases.

Your function could look like:

test <- function(colname) {
  tibble %>%
    filter_(.dots = paste0(colname, "!= 16")) %>%
    ggplot(aes_string(x = "first_column", y = colname)) +
    geom_line()
}
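
With current dplyr and ggplot2 releases, the same idea can be sketched with the .data pronoun instead of the underscore variants. This is only an illustration: plot_col, its drop_value argument and the demo data frame are hypothetical names, and the data frame is passed in explicitly rather than taken from the global tibble object.

```r
library(dplyr)
library(ggplot2)

plot_col <- function(df, colname, drop_value = 16) {
  df %>%
    # .data[[colname]] looks the column up by the string held in colname
    filter(.data[[colname]] != drop_value) %>%
    ggplot(aes(x = .data[["first_column"]], y = .data[[colname]])) +
    geom_line()
}

# made-up data with two rows equal to 16 that should be dropped
demo <- data.frame(first_column = 1:5, second_column = c(10, 16, 12, 16, 14))
p <- plot_col(demo, "second_column")
```

Printing p draws the line through the three remaining points.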

How to override an aes color (controlled by a variable) based on a condition?

You could use scale_color_manual with a custom palette in which your level of interest (in your example, where test equals 5) is set to black. Below I use palettes from RColorBrewer, extend them if necessary to the number of levels needed, and set the last color to black.

library(RColorBrewer) # provides several great palettes

createPalette <- function(n, colors = 'Greens') {
  max_colors <- brewer.pal.info[colors, ]$maxcolors # Get maximum colors in palette
  palette <- brewer.pal(min(max_colors, n), colors) # Get RColorBrewer palette
  if (n > max_colors) {
    palette <- colorRampPalette(palette)(n) # make it longer if n > max_colors
  }

  # assume that n-th color should be black
  palette[n] <- "#000000"

  # return palette
  palette[1:n]
}

# create a palette with 5 levels using the Spectral palette
# change from 5 to the needed number of levels in your real data.
mypalette <- createPalette(5, 'Spectral') # palettes from RColorBrewer

We can then use mypalette with scale_color_manual(values=mypalette) to color points and lines according to the test variable.
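
createPalette assumes that the level to highlight is the last one. A possible alternative, sketched here under assumptions, is to name the palette entries, since scale_color_manual() matches a named values vector to the factor levels by name. The sketch uses hcl.colors() from base R (R 3.6 or later) rather than RColorBrewer, and the level names are assumed rather than taken from the real data.

```r
test_levels <- c("1", "2", "3", "4", "5")   # assumed levels of as.factor(test)

# one colour per level, named by level
named_palette <- setNames(hcl.colors(length(test_levels), palette = "Spectral"),
                          test_levels)

# pin the level of interest to black, wherever it sits in the ordering
named_palette["5"] <- "#000000"
```

named_palette can then be passed as scale_color_manual(values = named_palette) in place of mypalette.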

Please note that I have updated geom_point and stat_smooth so that they use aes(color=as.factor(test)). I have also changed the call to power_eqn so that it only uses data points where df$test==5. The black points, lines and equation should now be based on the same data.

plot1 <- ggplot(df, aes(x = as.numeric(reorder(xdata, -ydata)), y = ydata)) +
  geom_point(aes(color = as.factor(test)), shape = 1) +
  stat_smooth(aes(color = as.factor(test)), method = 'nls', formula = 'y~a*x^b',
              method.args = list(start = c(a = 1, b = 1)), se = FALSE, fullrange = TRUE) +
  geom_text(x = quantile(df$xdata)[4], y = max(df$ydata),
            label = power_eqn(df[df$test == 5, ]), parse = TRUE, size = 4, color = "black") +
  theme(legend.position = "none", axis.ticks.x = element_blank()) +
  labs(x = "xdata", y = "ydata", title = "test") +
  scale_color_manual(values = mypalette)

plot1

See the resulting figure here (I do not have enough reputation to embed it).

I hope you find my answer useful.

How to make a bar-chart by using two variables on x-axis and a grouped variable on y-axis?

After speaking to the OP I found his data source and came up with this solution. Apologies if it's a bit messy, I have only been using R for 6 months. For ease of reproducibility I have preselected the variables used from the original dataset.

data <- structure(list(wkhtot = c(40, 8, 50, 40, 40, 50, 39, 48, 45, 
16, 45, 45, 52, 45, 50, 37, 50, 7, 37, 36), happy = c(7, 8, 10,
10, 7, 7, 7, 6, 8, 10, 8, 10, 9, 6, 9, 9, 8, 8, 9, 7), stflife = c(8,
8, 10, 10, 7, 7, 8, 6, 8, 10, 9, 10, 9, 5, 9, 9, 8, 8, 7, 7)), row.names = c(NA,
-20L), class = c("tbl_df", "tbl", "data.frame"))

Here are the packages required.

require(dplyr)
require(ggplot2)
require(tidyverse)

Here I have manipulated the data and commented my reasoning.

data <- data %>%
  select(wkhtot, happy, stflife) %>% # Select the wanted variables
  rename(Happy = happy) %>% # Rename for graphical sake
  rename("Life Satisfied" = stflife) %>%
  na.omit() %>% # Remove NA values
  group_by(WorkingHours = cut(wkhtot, c(-Inf, 27, 32, 36, 42, Inf))) %>% # Create the ranges
  select(WorkingHours, Happy, "Life Satisfied") %>% # Select the variables again
  pivot_longer(cols = c(`Happy`, `Life Satisfied`), names_to = "Criterion", values_to = "score") %>% # Pivot the df longer for plotting
  group_by(WorkingHours, Criterion)

data$Criterion <- as.factor(data$Criterion) #Make criterion a factor for graphical reasons

A bit more data prep

# Creating the percentage
data.plot <- data %>%
  group_by(WorkingHours, Criterion) %>%
  summarise_all(sum) %>% # get the sums for score by working hours and criterion
  group_by(WorkingHours) %>%
  mutate(tot = sum(score)) %>%
  mutate(freq = round(score / tot * 100, digits = 2)) # get percentage
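
To see what the percentage step computes, here is a small worked sketch on made-up numbers (the toy tibble and its values are not from the survey data). Within each WorkingHours group, every score is divided by the group total.

```r
library(dplyr)

# made-up summed scores: two working-hours groups, two criteria each
toy <- tibble(
  WorkingHours = c("low", "low", "high", "high"),
  Criterion    = c("Happy", "Life Satisfied", "Happy", "Life Satisfied"),
  score        = c(30, 10, 20, 20)
)

toy_pct <- toy %>%
  group_by(WorkingHours) %>%
  mutate(tot  = sum(score),                            # total per working-hours group
         freq = round(score / tot * 100, digits = 2))  # share of that total, in percent
toy_pct <- ungroup(toy_pct)
```

In the "low" group the total is 40, so the two criteria get 75 and 25 percent; in the "high" group both get 50 percent.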

Creating the plot.

# Plotting
ggplot(data.plot, aes(x = WorkingHours, y = freq, fill = Criterion)) +
  geom_col(position = "dodge") +
  geom_text(aes(label = freq),
            position = position_dodge(width = 0.9),
            vjust = 1) +
  xlab("Working Hours") +
  ylab("Percentage")

Please let me know if there is a more concise or easier way!!

B

DataSource: https://www.europeansocialsurvey.org/downloadwizard/?fbclid=IwAR2aVr3kuqOoy4mqa978yEM1sPEzOaghzCrLCHcsc5gmYkdAyYvGPJMdRp4


