What Type of Graph Is This? And Can It Be Created Using ggplot2

What type of graph is this? And can it be created using ggplot2?

Use geom_path(), which connects the points in the order they appear in the data, i.e.

library(ggplot2)
ggplot(data, aes(x = Weekly_Hours_Per_Person, y = GDP_Per_Hour)) +
  geom_point() +
  geom_path()
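
To see why geom_path matters here, compare it with geom_line on a small example. This is only a sketch: the Year column and all the numbers are made up for illustration, and only the two axis column names come from the question.

```r
library(ggplot2)

# Hypothetical data: one row per year, already sorted by time
data <- data.frame(
  Year = 1990:1995,
  Weekly_Hours_Per_Person = c(40, 39, 39.5, 38, 38.5, 37),
  GDP_Per_Hour = c(30, 32, 31, 34, 36, 35)
)

# geom_path joins the rows in data order (1990 -> 1991 -> ...),
# so the line can double back on itself along the x axis
p_path <- ggplot(data, aes(x = Weekly_Hours_Per_Person, y = GDP_Per_Hour)) +
  geom_point() +
  geom_path()

# geom_line joins the same points in order of their x value instead
p_line <- ggplot(data, aes(x = Weekly_Hours_Per_Person, y = GDP_Per_Hour)) +
  geom_point() +
  geom_line()

p_path
```

If the rows are not already in time order, sort the data frame first, since geom_path follows the row order as-is.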

How to draw the following graph in R? And what are these types of graphs called?

This solution plots sets of three bars faceted by x. The height of the bars within each set is determined by dividing value by 3 and spreading the remainder over the first bars. Horizontal spacing comes from the natural spacing between bars and facets. Vertical spacing is created using white gridlines drawn on top of the bars.

library(ggplot2)
library(reshape2)

Data

dataset <- data.frame('x' = 1:5, 'value' = c(10, 5, 7, 4, 12))

Since every value is supposed to be represented by three bars, we will add 3 columns to the dataset and distribute the magnitude of the value among them using integer division, then hand out the remainder one unit at a time starting from col1:

dataset[, c('col1', 'col2', 'col3')] <- floor(dataset$value / 3)
r <- dataset$value %% 3
dataset[r == 1, 'col1'] <- dataset[r == 1, 'col1'] + 1
dataset[r == 2, c('col1', 'col2')] <- dataset[r == 2, c('col1', 'col2')] + 1
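
As a sanity check on this step, the three columns should add back up to the original value. The sketch below is an alternative, vectorised way to write the same distribution (it is not the code used in the rest of this answer), applied to the same sample values:

```r
dataset <- data.frame(x = 1:5, value = c(10, 5, 7, 4, 12))

base <- dataset$value %/% 3   # integer division: the share every bar receives
r    <- dataset$value %% 3    # remainder: 0, 1 or 2 leftover units

dataset$col1 <- base + (r >= 1)  # first leftover unit goes to col1
dataset$col2 <- base + (r == 2)  # second leftover unit goes to col2
dataset$col3 <- base

# e.g. value 10 -> 4, 3, 3 and value 5 -> 2, 2, 1
stopifnot(all(dataset$col1 + dataset$col2 + dataset$col3 == dataset$value))
```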

Now, we will melt the dataframe for the purposes of plotting:

dataset <- melt(dataset, id.vars = c('x', 'value'))
colnames(dataset)[4] <- 'magnitude' # avoiding colnames conflict
dataset$variable <- as.character(dataset$variable) # column ordering within a facet

Plot

First, we will make a regular bar graph. We can move facet labels to the bottom of the plot area using the switch parameter.

plt <- ggplot(data = dataset)
plt <- plt + geom_col(aes(x=variable, y = magnitude), fill = 'black')
plt <- plt + facet_grid(.~x, switch="both")

Then we will use theme_minimal() and add a few tweaks to the parameters that govern the appearance of gridlines. Specifically, we will make sure that minor XY gridlines and major X gridlines are blank, whereas major Y gridlines are white and plotted on top of the data.

plt <- plt + theme_minimal()
plt <- plt + theme(panel.grid.major.x = element_blank(),
                   # linewidth replaces size in ggplot2 >= 3.4.0; use size on older versions
                   panel.grid.major.y = element_line(colour = "white", linewidth = 1.5),
                   panel.grid.minor = element_blank(),
                   panel.ontop = TRUE)

We can add value labels using geom_text(). We will only use the col2 records so that the value is printed once per set rather than over every bar (col2 happens to be the middle bar).

plt <- plt + geom_text(data = dataset[dataset$variable == 'col2', ],
                       aes(label = value, x = variable, y = magnitude + 0.5))
plt <- plt + theme(axis.text.x=element_blank()) # removing the 'col' labels
plt + xlab('x') + ylab('value')

Can this chart be created in R using ggplot2?

Yes, you can, although I think your year labels aren't correct. Here's the code that generates my version of the plot:

library(tidyverse)

df1 %>%
  gather(year, value, X2016:X2019) %>%
  mutate(JANUARY = JANUARY %>% fct_rev() %>% fct_relevel('D-150')) %>%
  group_by(JANUARY) %>%
  mutate(y_pos = min(value) / 2) %>%
  ggplot(aes(
    x = JANUARY,
    y = value,
    fill = JANUARY,
    group = year
  )) +
  geom_col(
    position = position_dodge(.65),
    width = .5
  ) +
  geom_text(aes(
    y = value + max(value) * .03,
    label = round(value * 100) %>% str_c('%')
  ),
  position = position_dodge(.65)
  ) +
  geom_text(aes(
    y = y_pos,
    label = str_remove(year, 'X')
  ),
  color = 'white',
  angle = 90,
  fontface = 'bold',
  position = position_dodge(.65)
  ) +
  scale_y_continuous(
    breaks = seq(0, .9, .1),
    labels = function(x) round(x * 100) %>% str_c('%')
  ) +
  scale_fill_manual(values = c(
    rgb(47, 85, 151, maxColorValue = 255),
    rgb(84, 130, 53, maxColorValue = 255),
    rgb(244, 177, 131, maxColorValue = 255),
    rgb(112, 48, 160, maxColorValue = 255)
  )) +
  theme(
    plot.title = element_text(hjust = .5),
    panel.background = element_blank(),
    panel.grid.major.y = element_line(color = rgb(.9, .9, .9)),
    axis.ticks = element_blank(),
    legend.position = 'none'
  ) +
  xlab('') +
  ylab('') +
  ggtitle('Month of JANUARY')

Create graphs by group using ggplot in R

To achieve your desired result:

  1. Split your data frame by group, e.g. using split.
  2. Use lapply to loop over the list of split data frames and create one plot per group. If you want to add the group labels to the titles, loop over names(df_split) instead.

Note: I converted the id variable to a factor for the color aesthetic. Also, you usually have to map id to the group aesthetic to get one line per id. However, as your x variable is numeric and color is already mapped to a discrete variable, there is actually no need for the group aesthetic here.

library(ggplot2)

df_split <- split(df, df$group)

# one plot per group, without titles
lapply(df_split, function(df) {
  ggplot(df, aes(x = x, y = y, group = id, color = factor(id))) +
    geom_line()
})

# one plot per group, with the group label in the title
lapply(names(df_split), function(i) {
  ggplot(df_split[[i]], aes(x = x, y = y, group = id, color = factor(id))) +
    geom_line() +
    labs(title = paste("group =", i))
})

Although I would recommend lapply, the same could be achieved using a for loop like so:

for (i in names(df_split)) {
  # print() is needed because plots are not auto-printed inside a loop
  print(
    ggplot(df_split[[i]], aes(x = x, y = y, group = id, color = factor(id))) +
      geom_line() +
      labs(title = paste("group =", i))
  )
}

Combining Different Types of Graphs Together (R)

The code you posted above fails because you are trying to use the variable n, but the result of your summarise(n = n()) step for the pie chart data was never assigned to anything.

You can either pipe the summarised data straight into ggplot, or assign the intermediate steps with something like this:

library(dplyr)
library(ggplot2)

# counts per group for 2014 only
Pie_2014 <- data %>%
  filter(year == "2014") %>%
  group_by(group) %>%
  summarise(n = n())

Pie_2014_graph <- ggplot(Pie_2014, aes(x = "", y = n, fill = group)) +
  geom_bar(stat = "identity", width = 1) +
  coord_polar("y", start = 0) +
  ggtitle("Pie Chart 2014")

# counts per group for 2015 only
Pie_2015 <- data %>%
  filter(year == "2015") %>%
  group_by(group) %>%
  summarise(n = n())

Pie_2015_graph <- ggplot(Pie_2015, aes(x = "", y = n, fill = group)) +
  geom_bar(stat = "identity", width = 1) +
  coord_polar("y", start = 0) +
  ggtitle("Pie Chart 2015")

# counts per group across all years
Pie_total <- data %>%
  group_by(group) %>%
  summarise(n = n())

Pie_total_graph <- ggplot(Pie_total, aes(x = "", y = n, fill = group)) +
  geom_bar(stat = "identity", width = 1) +
  coord_polar("y", start = 0) +
  ggtitle("Pie Chart Average")
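
For completeness, the other option (piping the summarised data straight into ggplot) could look roughly like the sketch below. The small data frame is a made-up stand-in for your data; I am only assuming it has year and group columns, as in your code.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the question's `data` (columns: year, group)
data <- data.frame(
  year  = c("2014", "2014", "2014", "2015", "2015"),
  group = c("a", "b", "b", "a", "a")
)

# No intermediate object: the summarised tibble is passed on as the
# first argument of ggplot(), so `n` exists when aes() is evaluated
Pie_2014_graph <- data %>%
  filter(year == "2014") %>%
  group_by(group) %>%
  summarise(n = n()) %>%
  ggplot(aes(x = "", y = n, fill = group)) +
  geom_bar(stat = "identity", width = 1) +
  coord_polar("y", start = 0) +
  ggtitle("Pie Chart 2014")

Pie_2014_graph
```

Note the switch from %>% to + once ggplot() is reached: the data steps are piped, the plot layers are added.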

After that, arranging the subplots together is pretty straightforward with the patchwork package, e.g. something like this will get you close:

# combine plots

# install.packages('patchwork')
library(patchwork)

(Pie_2014_graph | Pie_2015_graph | Pie_total_graph) /
(Bar_years_plot | Bar_total_plot) /
(ts_1 | ts_2)

EDIT: Following a request for a non-patchwork alternative, here is a version to get you started using cowplot:

library(cowplot)

# arrange subplots in rows
top_row <- plot_grid(Pie_2014_graph, Pie_2015_graph, Pie_total_graph, nrow = 1)
middle_row <- plot_grid(Bar_years_plot, Bar_total_plot)
bottom_row <- plot_grid(ts_1, ts_2)

# arrange our new rows into combined plot
p <- plot_grid(top_row, middle_row, bottom_row, nrow = 3)
p

Create a side-by-side bar chart using ggplot2

It is easier to plot once you get the data in long format.

library(ggplot2)

long_data <- tidyr::pivot_longer(bank, cols = -Income_Category)
long_data$Income_Category <- factor(long_data$Income_Category,
  levels = c('Less than $40K', '$40K - $60K', '$60K - $80K', '$80K - $120K', '$120K +'))
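
The plots below use the columns name and value, which are the default names pivot_longer gives to the reshaped columns. A small sketch with made-up numbers (toy and its columns are hypothetical, not your bank data) shows the resulting layout:

```r
library(tidyr)

# Hypothetical wide data: one row per category, one column per customer type
toy <- data.frame(
  Income_Category = c("low", "high"),
  Attrited = c(10, 20),
  Existing = c(30, 40)
)

# Every column except Income_Category is stacked into name/value pairs,
# giving one row per category and customer type (4 rows here)
long_toy <- pivot_longer(toy, cols = -Income_Category)

names(long_toy)
#> [1] "Income_Category" "name"            "value"
```

If you prefer more descriptive column names, pivot_longer also accepts names_to and values_to arguments.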

Here are some options for plotting.

1.

ggplot(long_data) + aes(name, value, fill = Income_Category) + geom_col()

2.

ggplot(long_data) + aes(Income_Category, value, fill = name) + geom_col()

3.

ggplot(long_data) + aes(Income_Category, value, fill = name) + geom_col(position = 'dodge') 

4.

ggplot(long_data) + aes(Income_Category, value) + geom_col() + facet_wrap(~name)


data

It is easier to help if you share data in a format which is easier to copy.

bank <- data.frame(Income_Category = c('$120K +', '$40K - $60K', '$60K - $80K', '$80K - $120K', 'Less than $40K'),
                   "Attrited Customer" = c(126, 271, 189, 242, 612),
                   "Existing Customer" = c(601, 1519, 1213, 1293, 29))

