Plotting Continuous and Discrete Series in Ggplot with Facet

The problem with your data is that value is numeric (continuous) in the data frame subm but a factor (discrete) in mcsm. You can't use the same scale for continuous and discrete values, so y values are drawn only for the last facet (the discrete one). It is also not possible to use two scale_y...() functions in one plot.

My approach would be to convert the mcsm value to numeric (saved as value2) and plot that instead; the quarters are then drawn as 1, 2, 3 and 4. To fix the legend, use scale_color_discrete() and pass breaks= in the order you need.

library(ggplot2)
mcsm$value2 <- as.numeric(mcsm$value)
ggplot(subm, aes(date, value, col = variable, group = 1)) +
  geom_line() +
  facet_grid(variable ~ ., scales = 'free_y') +
  geom_step(data = mcsm, aes(date, value2)) +
  scale_color_discrete(breaks = c('psavert', 'uempmed', 'unemploy', 'q'))
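
As a small illustration of what as.numeric() does to a factor, the sketch below uses a made-up vector of quarter labels (q is hypothetical and only stands in for mcsm$value). The numbers it returns are the positions of the levels, not parsed values, so the level order decides what ends up on the y axis.

```r
# Hypothetical quarter labels standing in for mcsm$value
q <- factor(c("Q1", "Q2", "Q3", "Q4", "Q1"))

# as.numeric() on a factor returns the level codes (positions in levels(q))
codes <- as.numeric(q)
print(codes)

# The codes can be mapped back to the original labels
labels_back <- levels(q)[codes]
print(labels_back)
```

If the factor levels were themselves numbers stored as text, as.numeric(as.character(x)) would be needed to recover the values rather than the codes.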

UPDATE - solution using grobs

Another approach is to use grobs and library gridExtra to plot your data as separate plots.

First, save the plot with all legends and data (code as above) as object p. Then use ggplot_build() and ggplot_gtable() to convert the plot to a grob object gp. From gp, extract only the part that draws the legend (saved as gp.leg); in this case it is list element number 17, though the index can differ for other plots.

library(gridExtra)
p <- ggplot(subm, aes(date, value, col = variable, group = 1)) +
  geom_line() +
  facet_grid(variable ~ ., scales = 'free_y') +
  geom_step(data = mcsm, aes(date, value2)) +
  scale_color_discrete(breaks = c('psavert', 'uempmed', 'unemploy', 'q'))
gp <- ggplot_gtable(ggplot_build(p))
gp.leg <- gp$grobs[[17]]
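
The hard-coded index 17 depends on how this particular plot is laid out. A possibly more robust sketch, shown on the built-in mtcars data rather than the data above, looks the legend up by its layout name instead. It assumes that the legend cells in the gtable layout have names starting with "guide-box", which may vary between ggplot2 versions; p_demo, gt, idx and leg are names made up for this illustration.

```r
library(ggplot2)

# Stand-in plot with a colour legend (mtcars is used only for illustration)
p_demo <- ggplot(mtcars, aes(mpg, hp, colour = factor(gear))) + geom_point()
gt <- ggplotGrob(p_demo)  # shorthand for ggplot_gtable(ggplot_build(p_demo))

# Assumption: legend cells are named "guide-box" or "guide-box-<position>"
idx <- grep("guide-box", gt$layout$name)

# Skip empty placeholder grobs, if any, and keep the first real legend
is_empty <- vapply(gt$grobs[idx], function(g) inherits(g, "zeroGrob"), logical(1))
leg <- gt$grobs[[idx[!is_empty][1]]]
```

The resulting leg could then be used in place of gp.leg when arranging the plots.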

Make two new plots, p1 and p2: the first plots only the data of subm and the second only the data of mcsm. Use scale_color_manual() to set the same colors as used for plot p. For the first plot, remove the x axis title, texts and ticks, and use plot.margin= to set the lower margin to a negative number. For the second plot, set the upper margin to a negative number. facet_grid() should be used for both plots to get the faceted look.

p1 <- ggplot(subm, aes(date, value, col = variable, group = 1)) +
  geom_line() +
  facet_grid(variable ~ ., scales = 'free_y') +
  theme(plot.margin = unit(c(0.5, 0.5, -0.25, 0.5), "lines"),
        axis.text.x = element_blank(),
        axis.title.x = element_blank(),
        axis.ticks.x = element_blank()) +
  scale_color_manual(values = c("#F8766D", "#00BFC4", "#C77CFF"), guide = "none")

p2 <- ggplot(data = mcsm, aes(date, value, group = 1, col = variable)) +
  geom_step() +
  facet_grid(variable ~ ., scales = 'free_y') +
  theme(plot.margin = unit(c(-0.25, 0.5, 0.5, 0.5), "lines")) +
  ylab("") +
  scale_color_manual(values = "#7CAE00", guide = "none")

Convert both plots p1 and p2 to grob objects and then give both plots the same widths.

gp1 <- ggplot_gtable(ggplot_build(p1))
gp2 <- ggplot_gtable(ggplot_build(p2))
maxWidth = grid::unit.pmax(gp1$widths[2:3],gp2$widths[2:3])
gp1$widths[2:3] <- as.list(maxWidth)
gp2$widths[2:3] <- as.list(maxWidth)

With functions grid.arrange() and arrangeGrob() arrange both plots and legend in one plot.

grid.arrange(arrangeGrob(arrangeGrob(gp1,gp2,heights=c(3/4,1/4),ncol=1),
gp.leg,widths=c(7/8,1/8),ncol=2))

Using ggplot2 and facet_grid for continuous and categorical variables together (R)

This is possible to do entirely within ggplot, but it's pretty hacky. Facets are really a way of showing extra dimensions of the same data set. They are not intended to be a way of arbitrarily stitching different plots together, so an entirely ggplot-based solution requires manipulating your data and the axis labels to produce the appearance of stitching plots together.

First, we get the unique levels of the barplot variables as character strings:

levs    <- sort(unique(c(as.character(f$var_2), as.character(f$var_3))))

Now, we convert the factors to numbers:

f$var_2 <- as.numeric(factor(f$var_2, levs)) + ceiling(max(f$var_1)) + 10
f$var_3 <- as.numeric(factor(f$var_3, levs)) + ceiling(max(f$var_1)) + 10

We will now construct the breaks and labels that we will use for our x axis:

breaks  <- c(pretty(range(f$var_1)), sort(unique(c(f$var_2, f$var_3))))
labs <- c(pretty(range(f$var_1)), levs)
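
To make the offset trick concrete, here is a minimal sketch on made-up vectors (var_1 and var_2 below are hypothetical stand-ins for the columns of f). The categorical codes are shifted past the top of the continuous range, so the two kinds of values can share one numeric axis without colliding, and the breaks and labels stay the same length.

```r
# Hypothetical stand-ins for f$var_1 (continuous) and f$var_2 (categorical)
var_1 <- c(0.3, 2.7, 4.1)
var_2 <- c("a", "b", "a")

levs <- sort(unique(var_2))

# Shift the level codes well beyond the largest continuous value
offset <- ceiling(max(var_1)) + 10
var_2_num <- as.numeric(factor(var_2, levs)) + offset
print(var_2_num)

# Numeric breaks for both parts, with text labels for the categorical part
breaks <- c(pretty(range(var_1)), sort(unique(var_2_num)))
labs <- c(pretty(range(var_1)), levs)
print(data.frame(breaks = breaks, labs = labs))
```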

Now we can safely pivot our data frame:

f <- pivot_longer(f, cols = c("var_1", "var_2", "var_3")) 

For our plot, we will use appropriately subsetted groups from the data frame for the density plot and the bar plots. We then facet with free scales and label the x axis with our pre-defined breaks and labels:

ggplot(f, aes(x = value)) +
geom_density(data = subset(f, name == "var_1")) +
geom_bar(data = subset(f, name != "var_1"), aes(fill = name)) +
facet_wrap(cluster~name, ncol = 3, scales = "free") +
scale_x_continuous(breaks = breaks, labels = labs) +
scale_fill_manual(values = c("deepskyblue4", "gold"), guide = guide_none())

Using ggplot2 facet grid to explore large dataset with continuous and categorical variables

Exploring our data is arguably the most interesting and intellectually challenging part of our research, so I encourage you to do some more reading into this topic.

Visualisation is of course important. @Parfait has suggested reshaping your data to long format, which makes plotting easier. Your mix of continuous and categorical data is a bit tricky. Beginners often try very hard to avoid reshaping their data, but there is no need to fret! On the contrary, you will find that most questions require a specific shape of your data, and in most cases you will not find a "one size fits all" shape.

So the real challenge is how to shape your data before plotting. There are obviously many ways of doing this. Below is one way, which should help "automatically" reshape the columns that are continuous and those that are categorical. See the comments in the code.

As a side note, when loading your data into R, I'd try to avoid storing categorical data as factors, and convert to factors only when you need them. How to do this depends on how you load your data. If it is from a csv, you can for example use read.csv('your.csv', stringsAsFactors = FALSE) (FALSE has been the default since R 4.0.0).
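
A minimal sketch of that side note, using a tiny made-up CSV written to a temporary file (the file contents and the column name colour are purely illustrative): the text column is kept as character on import and converted to a factor only at the point where an explicit level order is wanted.

```r
# Write a tiny made-up CSV to a temporary file (purely illustrative data)
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,colour", "1,red", "2,blue", "3,red"), tmp)

# Keep text columns as character on import
d <- read.csv(tmp, stringsAsFactors = FALSE)
class_on_load <- class(d$colour)
print(class_on_load)

# Convert to a factor only where needed, choosing the level order explicitly
d$colour <- factor(d$colour, levels = c("red", "blue"))
print(levels(d$colour))
```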

library(tidyverse)

# gathering numeric columns (without ID, which is numeric).
# [I'd recommend against numeric IDs!!]
data_num <-
  mydf %>%
  select(-ID) %>%
  pivot_longer(cols = which(sapply(., is.numeric)), names_to = 'key', values_to = 'value')

# No need to use facet here
ggplot(data_num) +
  geom_boxplot(aes(key, value, color = group))

# selecting categorical columns is a bit more tricky in this example,
# because your group is also categorical.
# One way:
# first convert all categorical columns to character,
# then turn your "group" into factor
# then gather the character columns:

# I use simple count() and mutate() to create a summary data frame with the proportions and geom_col, which equals geom_bar(stat = 'identity')
# There may be neater ways, but this is pretty straightforward

data_cat <-
  mydf %>%
  select(-ID) %>%
  mutate_if(.predicate = is.factor, .funs = as.character) %>%
  mutate(group = factor(group)) %>%
  pivot_longer(cols = which(sapply(., is.character)), names_to = 'key', values_to = 'value') %>%
  count(group, key, value) %>%
  group_by(group, key) %>%
  mutate(percent = n / sum(n)) %>%
  ungroup() # I always 'ungroup' after my data manipulations, in order to avoid unexpected effects

ggplot(data_cat) +
  geom_col(aes(group, percent, fill = key)) +
  facet_grid(~ value)

Created on 2020-01-07 by the reprex package (v0.3.0)

Credit for how to gather conditionally goes to this answer from @H1

ggplot: Generate facet grid plot with multiple series

One idea would be to create a new grouping variable:

x.df.melt$var <- ifelse(x.df.melt$variable == "x" | x.df.melt$variable == "y", "A", "B")
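
With more than two series per panel, the chain of == comparisons grows quickly. A hedged alternative is %in%, sketched here on a made-up character vector (variable below is hypothetical and stands in for x.df.melt$variable); both forms give the same grouping.

```r
# Hypothetical stand-in for x.df.melt$variable
variable <- c("x", "y", "p", "q", "x")

# Grouping as written above, with explicit comparisons
var_or <- ifelse(variable == "x" | variable == "y", "A", "B")

# Same grouping using set membership
var_in <- ifelse(variable %in% c("x", "y"), "A", "B")

print(var_in)
```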

You can use it for facetting while using variable for grouping:

ggplot(x.df.melt, aes(Quarter, value, col = variable, group = variable)) +
  geom_line() +
  facet_grid(var ~ ., scales = 'free_y') +
  scale_color_discrete(breaks = c('x', 'y', 'p', 'q'), guide = "none")

Plotting each year as separate series using ggplot2 and faceting

I think what you're missing is a grouping by year. Assuming your data.frame is df:

require(ggplot2)
require(reshape2)
df1 <- read.csv("~/Downloads/testseries.csv")
df <- melt(df1,id=c("date"))
df$date <- as.Date(df$date)

# get `year` first
# older alternative: df$year <- as.POSIXlt(df$date)$year + 1900
df$year <- format(df$date, '%Y') # following @agstudy's comment.
p <- ggplot(data = df, aes(x=date, y=value))
# group/colour by year
p <- p + geom_line(aes(colour=factor(year)))
p <- p + scale_colour_brewer(palette="Set3")
p <- p + facet_wrap(~ variable, scales="free", ncol=3)
p <- p + xlab("Date") + ylab("Discharge(cms)")
p

This gives one panel per variable, with a separate coloured line for each year.
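
Both ways of deriving the year mentioned in the code comments can be checked on a few made-up dates (the dates vector is only for illustration). format() returns character strings, while the POSIXlt route returns numbers; either works as a grouping variable once wrapped in factor().

```r
# A few made-up dates for illustration
dates <- as.Date(c("2019-12-31", "2020-01-01", "2021-06-15"))

# Year as character, via format()
year_chr <- format(dates, "%Y")
print(year_chr)

# Year as a number, via POSIXlt (whose year field counts from 1900)
year_num <- as.POSIXlt(dates)$year + 1900
print(year_num)

# Both give the same groups once converted to a factor
print(levels(factor(year_chr)))
```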

Edit 2: If this is not what you're looking for, then maybe you require facetting with 2 variables with facet_grid as follows:

df$year <- factor(as.POSIXlt(df$date)$year + 1900)
p <- ggplot(data = df, aes(x=date, y=value))
p <- p + geom_line()
p <- p + facet_grid(variable ~ year)
p <- p + xlab("Date") + ylab("Discharge(cms)")
p

This gives a dense graph, with one panel for each combination of variable and year.

Is it possible to have 2 legends for variables when one is continuous and the other is discrete?

The easiest approach would be to map it to a different aesthetic than you already use:

library(ggplot2)

ggplot(mtcars, aes(x = mpg, y = hp)) +
geom_point(aes(colour = as.factor(gear), size = cyl)) +
geom_smooth(method = "loess", aes(linetype = "fit"))

There are also specialised packages for adding additional colour legends:

library(ggplot2)
library(ggnewscale)

ggplot(mtcars, aes(x = mpg, y = hp)) +
geom_point(aes(colour = as.factor(gear), size = cyl)) +
new_scale_colour() +
geom_smooth(method = "loess", aes(colour = "fit"))

Beware that if you want to tweak colours via a colour scale, you must add that scale before calling new_scale_colour(), i.e.:

ggplot(mtcars, aes(x = mpg, y = hp)) +
geom_point(aes(colour = as.factor(gear), size = cyl)) +
scale_colour_manual(values = c("red", "green", "blue")) +
new_scale_colour() +
geom_smooth(method = "loess", aes(colour = "fit")) +
scale_colour_manual(values = "purple")

EDIT: To address the comment: yes, it is possible with a line that is independent of the data; I was just re-using the data to keep the example short. See below for an arbitrary line (this should also work with the ggnewscale approach):

ggplot(mtcars, aes(x = mpg, y = hp)) +
geom_point(aes(colour = as.factor(gear), size = cyl)) +
geom_line(data = data.frame(x = 1:30, y = rnorm(30, 200, 10)),
aes(x, y, linetype = "arbitrary line"))

Dates with month and day in time series plot in ggplot2 with facet for years

You are very close. You want the x-axis to be a measure of where in the year you are, but you have it as a character vector and so are getting every single point labelled. If you instead make a continuous variable represent this, you could have better results. One continuous variable would be the day of the year.

df$DayOfYear <- as.numeric(format(df$Date, "%j"))
ggplot(data = df,
mapping = aes(x = DayOfYear, y = Y, shape = Year, colour = Year)) +
geom_point() +
geom_line() +
facet_grid(facets = Year ~ .) +
theme_bw()
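
To see what the "%j" conversion produces, here is a minimal check on three made-up dates (d is hypothetical, not the df from the question). Note that the same calendar day gets a different day-of-year number in a leap year.

```r
# Made-up dates: March 1st in a leap year and in a non-leap year, plus a year end
d <- as.Date(c("2020-03-01", "2021-03-01", "2021-12-31"))

# "%j" formats the day of the year as a zero-padded string; convert to numeric
doy <- as.numeric(format(d, "%j"))
print(doy)
```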

The axis could be formatted more date-like with an appropriate label function, but the breaks are still not being found in a very date-aware way. (And on top of that, there is an NA problem as well.)

ggplot(data = df,
mapping = aes(x = DayOfYear, y = Y, shape = Year, colour = Year)) +
geom_point() +
geom_line() +
facet_grid(facets = Year ~ .) +
scale_x_continuous(labels = function(x) format(as.Date(as.character(x), "%j"), "%d-%b")) +
theme_bw()

To get nice date breaks, a different variable can be used: one that has the same day of the year as the original data but always the same year. Here that year is 2000, since it was a leap year. The problems with this mostly have to do with leap days, but if you don't care about that (March 1st of a non-leap year would align with February 29th of a leap year, etc.) you can use:

df$CommonDate <- as.Date(paste0("2000-",format(df$Date, "%j")), "%Y-%j")
ggplot(data = df,
mapping = aes(x = CommonDate, y = Y, shape = Year, colour = Year)) +
geom_point() +
geom_line() +
facet_grid(facets = Year ~ .) +
scale_x_date(labels = function(x) format(x, "%d-%b")) +
theme_bw()
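
The leap-day caveat can be seen directly on a few made-up dates (d is hypothetical, not the df from the question). The sketch applies the same paste0() and "%Y-%j" conversion and shows that March 1st of a non-leap year lands on February 29th of the common year 2000.

```r
# Made-up dates around the end of February in a non-leap and a leap year
d <- as.Date(c("2021-02-28", "2021-03-01", "2020-03-01"))

# Re-anchor each date on year 2000 using its day-of-year number
common <- as.Date(paste0("2000-", format(d, "%j")), "%Y-%j")
print(common)
```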

ggplot facet different Y axis order based on value

The functions reorder_within and scale_*_reordered from the tidytext package might come in handy.

reorder_within recodes the values into a factor with strings in the form of "VARIABLE___WITHIN". This factor is ordered by the values in each group of WITHIN.
scale_*_reordered removes the "___WITHIN" suffix when plotting the axis labels.
Add scales = "free_y" in facet_wrap to make it work as expected.

Here is an example with generated data:

library(tidyverse)

# Generate data
df <- expand.grid(
year = 2019:2021,
group = paste("Group", toupper(letters[1:8]))
)
set.seed(123)
df$value <- rnorm(nrow(df), mean = 10, sd = 2)

df %>%
mutate(group = tidytext::reorder_within(group, value, within = year)) %>%
ggplot(aes(value, group)) +
geom_point() +
tidytext::scale_y_reordered() +
facet_wrap(vars(year), scales = "free_y")

