Population pyramid plot with ggplot2 and dplyr (instead of plyr)

You can avoid the error by passing a filtered data frame to the data argument of each geom_bar() call:

library(ggplot2)

ggplot(data = test, aes(x = as.factor(v), fill = g)) +
  # females: plain counts on the positive side
  geom_bar(data = dplyr::filter(test, g == "F")) +
  # males: negated counts on the negative side
  geom_bar(data = dplyr::filter(test, g == "M"), aes(y = after_stat(count) * (-1))) +
  scale_y_continuous(breaks = seq(-40, 40, 10), labels = abs(seq(-40, 40, 10))) +
  coord_flip()

Simpler population pyramid in ggplot2

Here is a solution without faceting. First, create a data frame. I used values from 1 to 20 so that none of the values is negative (population pyramids never contain negative counts or ages).

test <- data.frame(v = sample(1:20, 1000, replace = TRUE), g = c('M', 'F'))

Then combine two geom_bar() calls, one for each value of g. For F the counts are used as they are; for M they are multiplied by -1 so that the bars point in the opposite direction. scale_y_continuous() then labels both sides of the axis with positive values.

require(ggplot2)
require(plyr)
# subset = .() needs plyr and only works in older ggplot2 releases
ggplot(data = test, aes(x = as.factor(v), fill = g)) +
  geom_bar(subset = .(g == "F")) +
  geom_bar(subset = .(g == "M"), aes(y = ..count.. * (-1))) +
  scale_y_continuous(breaks = seq(-40, 40, 10), labels = abs(seq(-40, 40, 10))) +
  coord_flip()

UPDATE

Because the subset = .() argument is deprecated in recent ggplot2 versions, the same result can be achieved by passing data filtered with subset() to each layer.

ggplot(data = test, aes(x = as.factor(v), fill = g)) +
  geom_bar(data = subset(test, g == "F")) +
  geom_bar(data = subset(test, g == "M"), aes(y = after_stat(count) * (-1))) +
  scale_y_continuous(breaks = seq(-40, 40, 10), labels = abs(seq(-40, 40, 10))) +
  coord_flip()
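
To see what the two layers compute, the counts can also be built by hand. The following is only a sketch in base R: it re-creates the sample data (the seed and the column names Freq, signed, limit are my own additions, not part of the answer) and derives the mirrored counts and the absolute-value labels.

```r
# Re-create the sample data; the seed is an assumption added for reproducibility
set.seed(1)
test <- data.frame(v = sample(1:20, 1000, replace = TRUE), g = c("M", "F"))

# Count rows per value/gender; as.data.frame() yields columns v, g and Freq
counts <- as.data.frame(table(v = test$v, g = test$g))

# Mirror the male counts onto the negative side of the axis
counts$signed <- ifelse(counts$g == "M", -counts$Freq, counts$Freq)

# Symmetric breaks in steps of 10, labelled by their absolute value
limit  <- ceiling(max(counts$Freq) / 10) * 10
breaks <- seq(-limit, limit, 10)
labels <- abs(breaks)
```

A table like counts could be drawn with a single geom_col() layer using y = signed, instead of the two geom_bar() layers.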

age pyramid in R (using group data)

You could summarise the data beforehand and then pass it to ggplot, as below:

library(dplyr)
library(ggplot2)

# one row per gender/age group; s_age is the sum of age within the group
df1 <- df %>% group_by(gender, age) %>% summarise(s_age = sum(age))

ggplot(data = df1, aes(x = age, y = s_age, fill = gender)) +
  geom_bar(data = filter(df1, gender == "F"), stat = "identity") +
  geom_bar(data = filter(df1, gender == "M"), stat = "identity", aes(y = -s_age)) +
  coord_flip()
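
If the bars should show head counts rather than summed ages, a variant along the following lines might work. It is a sketch under assumptions: the data frame df below is made up (one row per person, columns gender and age), and n_signed is a name chosen for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical individual-level data: one row per person
set.seed(42)
df <- data.frame(
  gender = sample(c("F", "M"), 500, replace = TRUE),
  age    = sample(seq(20, 60, 10), 500, replace = TRUE)
)

# Head count per gender/age cell, with male counts mirrored below zero
df1 <- df %>%
  count(gender, age) %>%
  mutate(n_signed = ifelse(gender == "M", -n, n))

# A single geom_col() layer is enough once the sign is stored in the data
p <- ggplot(df1, aes(x = factor(age), y = n_signed, fill = gender)) +
  geom_col() +
  scale_y_continuous(labels = abs) +  # show positive labels on both sides
  coord_flip() +
  labs(x = "age", y = "count")
```

Calling print(p) draws the pyramid; storing the sign in the data avoids filtering the same data frame twice.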

drawing pyramid plot using R and ggplot2

This is essentially a back-to-back barplot, something like the ones generated using ggplot2 in the excellent learnr blog: http://learnr.wordpress.com/2009/09/24/ggplot2-back-to-back-bar-charts/

You can use coord_flip with one of those plots, but I'm not sure how you get it to share the y-axis labels between the two plots like what you have above. The code below should get you close enough to the original:

First create a sample data frame, then convert the Age column to an ordered factor with the required break points:

require(ggplot2)
df <- data.frame(Type = sample(c('Male', 'Female', 'Female'), 1000, replace = TRUE),
                 Age = sample(18:60, 1000, replace = TRUE))

# bin ages into [18,20], (20,25], ..., (55,60]
AgesFactor <- ordered(cut(df$Age, breaks = c(18, seq(20, 60, 5)),
                          include.lowest = TRUE))

df$Age <- AgesFactor
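
As a quick check of the binning, here is a small sketch that applies the same cut() call to a few hand-picked ages (the vector ages is made up for illustration). It shows that include.lowest = TRUE closes the first interval on the left, so age 18 is not dropped.

```r
# Hand-picked ages at and around the break points
ages <- c(18, 20, 21, 60)

bins <- ordered(cut(ages, breaks = c(18, seq(20, 60, 5)),
                    include.lowest = TRUE))

as.character(bins)
# "[18,20]" "[18,20]" "(20,25]" "(55,60]"

nlevels(bins)
# 9 age brackets in total
```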

Now start building the plot: create the male and female plots with the corresponding subset of the data, suppressing legends, etc.

gg <- ggplot(data = df, aes(x = Age))

gg.male <- gg +
  # share of each age bracket among males
  geom_bar(data = subset(df, Type == 'Male'),
           aes(y = after_stat(count / sum(count)), fill = Age)) +
  scale_y_continuous('', labels = scales::percent) +
  theme(legend.position = 'none',
        axis.text.y = element_blank(),
        axis.title.y = element_blank(),
        plot.title = element_text(size = 10)) +
  ggtitle('Male') +
  coord_flip()

For the female plot, reverse the 'Percent' axis using trans = "reverse"...

gg.female <- gg +
  # share of each age bracket among females
  geom_bar(data = subset(df, Type == 'Female'),
           aes(y = after_stat(count / sum(count)), fill = Age)) +
  scale_y_continuous('', labels = scales::percent, trans = 'reverse') +
  theme(legend.position = 'none',
        axis.text.y = element_blank(),
        axis.title.y = element_blank(),
        plot.title = element_text(size = 10)) +
  ggtitle('Female') +
  coord_flip()

Now create a plot just to display the age-brackets using geom_text, but also use a dummy geom_bar to ensure that the scaling of the "age" axis in this plot is identical to those in the male and female plots:

gg.ages <- gg +
  # invisible zero-height bars keep the age axis aligned with the other plots
  geom_bar(data = subset(df, Type == 'Male'), aes(y = 0),
           stat = 'identity', fill = NA) +
  geom_text(aes(y = 0, label = as.character(Age)), size = 3) +
  coord_flip() +
  ggtitle('Ages') +
  theme(legend.position = 'none',
        axis.text.y = element_blank(),
        axis.title.y = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks = element_blank(),
        plot.title = element_text(size = 10))

Finally, arrange the plots on a grid, using the method in Hadley Wickham's book:

library(grid)

grid.newpage()
# one row, three columns: female plot, age labels, male plot
pushViewport(viewport(layout = grid.layout(1, 3, widths = c(.4, .2, .4))))

# helper that selects one cell of the layout
vplayout <- function(x, y) viewport(layout.pos.row = x, layout.pos.col = y)

print(gg.female, vp = vplayout(1, 1))
print(gg.ages, vp = vplayout(1, 2))
print(gg.male, vp = vplayout(1, 3))
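
To get a feel for the layout without building the plots first, the same viewport logic can be tried with plain rectangles. This is only a sketch using the grid package: the labels in panels are placeholders, and pdf(NULL) is my addition so that nothing is written to disk.

```r
library(grid)

pdf(NULL)  # throwaway device; remove this line and dev.off() to draw on screen
grid.newpage()

# Relative column widths: 40% / 20% / 40%
widths <- c(0.4, 0.2, 0.4)
pushViewport(viewport(layout = grid.layout(1, 3, widths = widths)))

# Helper that selects one cell of the layout
vplayout <- function(x, y) viewport(layout.pos.row = x, layout.pos.col = y)

# Placeholder content for the three cells
panels <- c("female plot", "age labels", "male plot")
for (i in seq_along(panels)) {
  pushViewport(vplayout(1, i))          # enter cell i
  grid.rect(gp = gpar(col = "grey50"))  # outline the cell
  grid.text(panels[i])                  # label it
  popViewport()                         # back to the layout viewport
}

popViewport()
dev.off()
```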

population pyramid for different value in a data frame

Maybe this is what you are looking for:

library(ggplot2)

# male populations are negated so they are drawn on the opposite side
ggplot(data = mydata,
       aes(x = age, y = ifelse(gender == "male", -population, population), fill = gender)) +
  geom_col() +
  facet_wrap(~country) +
  coord_flip()

geom_bar ggplot2 stacked, grouped bar plot with positive and negative values - pyramid plot

Try this. Just as you position the bars with two statements (one for positive, one for negative), position the text in the same way. Then, fine-tune their positioning (inside the bar, or outside the bar) using vjust. Also, there is no 'label' variable in the data frame; the label, I assume, is value.

library(ggplot2)

## Using your df.m data frame
## (the y range is set with coord_cartesian() further down)
ggplot(df.m, aes(strain)) +
  geom_bar(data = subset(df.m, variable == "count.up"),
           aes(y = value, fill = condition), stat = "identity", position = "dodge") +
  geom_bar(data = subset(df.m, variable == "count.down"),
           aes(y = -value, fill = condition), stat = "identity", position = "dodge") +
  geom_hline(yintercept = 0, colour = "grey90")

## Add the labels: inside the top of the "up" bars, inside the bottom of the "down" bars
last_plot() +
  geom_text(data = subset(df.m, variable == "count.up"),
            aes(strain, value, group = condition, label = value),
            position = position_dodge(width = 0.9), vjust = 1.5, size = 4) +
  geom_text(data = subset(df.m, variable == "count.down"),
            aes(strain, -value, group = condition, label = value),
            position = position_dodge(width = 0.9), vjust = -.5, size = 4) +
  coord_cartesian(ylim = c(-500, 500))

Pyramid plot in Plotly

You can use the following code:

library(plotly)
library(dplyr)

data %>%
  # negate the male population so the bars extend to the left
  mutate(population = ifelse(test = gender == "M", yes = -population, no = population)) %>%
  # keep the positive value for the hover text
  mutate(abs_pop = abs(population)) %>%
  plot_ly(x = ~population, y = ~age, color = ~gender) %>%
  add_bars(orientation = 'h', hoverinfo = 'text', text = ~abs_pop) %>%
  layout(bargap = 0.1, barmode = 'overlay',
         xaxis = list(tickmode = 'array',
                      tickvals = c(-15000, -10000, -5000, 0, 5000, 10000, 15000),
                      ticktext = c('15000', '10000', '5000', '0', '5000', '10000', '15000')))
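
The tick values above are hard-coded for a population of up to 15000. As a sketch of how they might be derived from the data instead, the helper below (pyramid_ticks is a hypothetical name, not part of plotly) builds symmetric tick positions with base R's pretty() and labels them with their absolute values.

```r
# Hypothetical helper: symmetric tick positions with absolute-value labels
pyramid_ticks <- function(x, n = 6) {
  lim  <- max(abs(x), na.rm = TRUE)      # largest magnitude on either side
  vals <- pretty(c(-lim, lim), n = n)    # round, evenly spaced positions
  list(tickvals = vals,
       ticktext = format(abs(vals), scientific = FALSE, trim = TRUE))
}

# Example with made-up signed populations
ticks <- pyramid_ticks(c(-12000, 14000))
```

The result could then be passed to layout() as xaxis = list(tickmode = 'array', tickvals = ticks$tickvals, ticktext = ticks$ticktext).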
