
# Plotting Continuous and Discrete Series in Ggplot with Facet

## Plotting continuous and discrete series in ggplot with facet

The problem with your data is that `value` is numeric (continuous) in data frame `subm` but a factor (discrete) in `mcsm`. You can't use the same scale for numeric and factor values, so you get y values only for the last facet (the discrete one). It is also not possible to use two `scale_y...()` functions in one plot.

My approach would be to convert `mcsm$value` to numeric (saved as `value2`) and plot that instead; the quarters will then be plotted as 1, 2, 3 and 4. To fix the legend, use `scale_color_discrete()` and provide `breaks=` in the order you need.

```r
mcsm$value2 <- as.numeric(mcsm$value)

ggplot(subm, aes(date, value, col = variable, group = 1)) +
  geom_line() +
  facet_grid(variable ~ ., scales = 'free_y') +
  geom_step(data = mcsm, aes(date, value2)) +
  scale_color_discrete(breaks = c('psavert', 'uempmed', 'unemploy', 'q'))
```
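As a quick illustration of why the conversion plots quarters as 1, 2, 3 and 4: `as.numeric()` on a factor returns the integer level codes, not the labels. The sketch below uses a small made-up factor `q` (an assumption for illustration, not the `mcsm` data from the question) and also shows the usual caveat for factors whose labels are themselves numbers.

```r
# Hypothetical quarter factor, standing in for mcsm$value
q <- factor(c("Q1", "Q2", "Q3", "Q4", "Q1"))

# as.numeric() returns the position of each value among the factor levels
codes <- as.numeric(q)
print(codes)  # 1 2 3 4 1

# Caveat: if the labels are numbers, go through as.character() to keep them
f <- factor(c("10", "20", "10"))
print(as.numeric(f))                # 1 2 1 (level codes)
print(as.numeric(as.character(f)))  # 10 20 10 (original values)
```

For quarter labels the level codes are exactly what is wanted, which is why the plain `as.numeric()` call is enough here.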

#### UPDATE - solution using grobs

Another approach is to use grobs and the `gridExtra` library to plot your data as separate plots.

First, save the plot with all legends and data (code as above) as object `p`. Then, with the functions `ggplot_build()` and `ggplot_gtable()`, save the plot as a grob object `gp`. From `gp`, extract only the part that draws the legend (saved as object `gp.leg`); in this case it is list element number 17.

```r
library(gridExtra)

p <- ggplot(subm, aes(date, value, col = variable, group = 1)) +
  geom_line() +
  facet_grid(variable ~ ., scales = 'free_y') +
  geom_step(data = mcsm, aes(date, value2)) +
  scale_color_discrete(breaks = c('psavert', 'uempmed', 'unemploy', 'q'))

gp <- ggplot_gtable(ggplot_build(p))
gp.leg <- gp$grobs[[17]]
```

Make two new plots, `p1` and `p2`: the first plots the data of `subm` and the second only the data of `mcsm`. Use `scale_color_manual()` to set the same colors as used in plot `p`. For the first plot, remove the x axis title, text and ticks, and use `plot.margin=` to set the lower margin to a negative number. For the second plot, set the upper margin to a negative number. `facet_grid()` should be used for both plots to get the faceted look.

```r
p1 <- ggplot(subm, aes(date, value, col = variable, group = 1)) +
  geom_line() +
  facet_grid(variable ~ ., scales = 'free_y') +
  theme(plot.margin = unit(c(0.5, 0.5, -0.25, 0.5), "lines"),
        axis.text.x = element_blank(),
        axis.title.x = element_blank(),
        axis.ticks.x = element_blank()) +
  scale_color_manual(values = c("#F8766D", "#00BFC4", "#C77CFF"), guide = "none")

p2 <- ggplot(data = mcsm, aes(date, value, group = 1, col = variable)) +
  geom_step() +
  facet_grid(variable ~ ., scales = 'free_y') +
  theme(plot.margin = unit(c(-0.25, 0.5, 0.5, 0.5), "lines")) +
  ylab("") +
  scale_color_manual(values = "#7CAE00", guide = "none")
```

Save both plots `p1` and `p2` as grob objects and then set the same widths for both.

```r
gp1 <- ggplot_gtable(ggplot_build(p1))
gp2 <- ggplot_gtable(ggplot_build(p2))
maxWidth <- grid::unit.pmax(gp1$widths[2:3], gp2$widths[2:3])
gp1$widths[2:3] <- as.list(maxWidth)
gp2$widths[2:3] <- as.list(maxWidth)
```
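To see what `unit.pmax()` does in the width-matching step, here is a minimal sketch with two made-up width vectors (`w1` and `w2` are illustrative assumptions, not taken from the plots above). It takes the element-wise maximum, so each column ends up as wide as the wider of the two plots needs.

```r
library(grid)

# Hypothetical column widths for two plots
w1 <- unit(c(1, 3), "cm")
w2 <- unit(c(2, 1), "cm")

# Element-wise maximum: each position gets the larger of the two widths
w_max <- unit.pmax(w1, w2)

# Convert to plain numbers (in cm) for inspection
w_cm <- convertWidth(w_max, "cm", valueOnly = TRUE)
print(w_cm)
```

Assigning the same maximum back to both gtables is what makes the panel edges line up when the plots are stacked.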

Use `grid.arrange()` and `arrangeGrob()` to arrange both plots and the legend in one plot.

```r
grid.arrange(arrangeGrob(arrangeGrob(gp1, gp2, heights = c(3/4, 1/4), ncol = 1),
                         gp.leg, widths = c(7/8, 1/8), ncol = 2))
```

## Using ggplot2 and facet_grid for continuous and categorical variables together (R)

This is possible to do entirely within ggplot, but it's pretty hacky. Facets are really a way of showing extra dimensions of the same data set. They are not intended to be a way of arbitrarily stitching different plots together, so an entirely ggplot-based solution requires manipulating your data and the axis labels to produce the appearance of stitching plots together.

First, we get the unique levels of the barplot variables as character strings:

```r
levs <- sort(unique(c(as.character(f$var_2), as.character(f$var_3))))
```

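Now, we convert the factors to numbers, offsetting them beyond the largest value of `var_1` so that the bar positions cannot overlap the range of the continuous variable:

```r
f$var_2 <- as.numeric(factor(f$var_2, levs)) + ceiling(max(f$var_1)) + 10
f$var_3 <- as.numeric(factor(f$var_3, levs)) + ceiling(max(f$var_1)) + 10
```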

We will now construct the breaks and labels that we will use for our x axis:

```r
breaks <- c(pretty(range(f$var_1)), sort(unique(c(f$var_2, f$var_3))))
labs   <- c(pretty(range(f$var_1)), levs)
```
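The recoding steps so far can be checked on a tiny example. The sketch below uses a made-up data frame (the values are assumptions; only the column names follow the question) to show that the shifted codes sit well above the continuous range and that every break gets exactly one label.

```r
# Hypothetical toy data standing in for the data frame `f` in the question
f <- data.frame(
  var_1 = c(1.2, 3.4, 5.6, 7.8),
  var_2 = c("a", "b", "a", "c"),
  var_3 = c("b", "b", "c", "a")
)

levs   <- sort(unique(c(as.character(f$var_2), as.character(f$var_3))))
offset <- ceiling(max(f$var_1)) + 10   # 8 + 10 = 18

# Level codes 1, 2, 3 are shifted to 19, 20, 21, well above max(var_1)
f$var_2 <- as.numeric(factor(f$var_2, levs)) + offset
f$var_3 <- as.numeric(factor(f$var_3, levs)) + offset
print(f$var_2)  # 19 20 19 21

# One break per axis position, one label per break
breaks <- c(pretty(range(f$var_1)), sort(unique(c(f$var_2, f$var_3))))
labs   <- c(pretty(range(f$var_1)), levs)
stopifnot(length(breaks) == length(labs))
```

Because the same `levs` vector is used for both columns, a given category lands on the same x position in both bar-plot facets.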

Now we can safely pivot our data frame:

```r
library(tidyr)  # provides pivot_longer()
f <- pivot_longer(f, cols = c("var_1", "var_2", "var_3"))
```

For our plot, we will use appropriately subsetted groups from the data frame for the density plot and the bar plots. We then facet with free scales and label the x axis with our pre-defined breaks and labels:

```r
ggplot(f, aes(x = value)) +
  geom_density(data = subset(f, name == "var_1")) +
  geom_bar(data = subset(f, name != "var_1"), aes(fill = name)) +
  facet_wrap(cluster ~ name, ncol = 3, scales = "free") +
  scale_x_continuous(breaks = breaks, labels = labs) +
  scale_fill_manual(values = c("deepskyblue4", "gold"), guide = guide_none())
```

## Using ggplot2 facet grid to explore large dataset with continuous and categorical variables

Exploring our data is arguably the most interesting and intellectually challenging part of our research, so I encourage you to do some more reading into this topic.

Visualisation is of course important. @Parfait has suggested reshaping your data to long format, which makes plotting easier. Your mix of continuous and categorical data is a bit tricky. Beginners often try very hard to avoid reshaping their data, but there is no need to fret! On the contrary, you will find that most questions require a specific shape of your data, and in most cases you will not find a "one size fits all" shape.

So the real challenge is how to shape your data before plotting. There are obviously many ways of doing this. Below is one way, which should help to "automatically" reshape the continuous columns and the categorical columns separately. See the comments in the code.

As a side note, when loading your data into R, I'd try to avoid storing categorical data as factors, and convert to factors only when you need to. How to do this depends on how you load your data. If it is from a csv, you can for example use `read.csv('your.csv', stringsAsFactors = FALSE)`.
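A minimal sketch of that side note, using an inline CSV string instead of a file so it runs on its own (the column names and values are made up for illustration). Note that since R 4.0.0, `stringsAsFactors = FALSE` is already the default for `read.csv()`; passing it explicitly keeps the behaviour the same on older versions.

```r
# Hypothetical inline CSV, used instead of a file so the example is self-contained
csv_text <- "id,group,colour\n1,a,red\n2,b,blue\n3,a,red"

raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)
print(class(raw$group))  # "character"

# Convert to factor only where a fixed set or order of levels is needed
raw$group <- factor(raw$group, levels = c("b", "a"))
print(levels(raw$group))  # "b" "a"
```

Keeping the other categorical columns as character makes it easy to select them later with `is.character`.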

```r
library(tidyverse)

# gathering numeric columns (without ID, which is numeric).
# [I'd recommend against numeric IDs!!]
data_num <-
  mydf %>%
  select(-ID) %>%
  pivot_longer(cols = which(sapply(., is.numeric)), names_to = 'key', values_to = 'value')

# No need to use facet here
ggplot(data_num) +
  geom_boxplot(aes(key, value, color = group))
```

```r
# selecting categorical columns is a bit more tricky in this example,
# because your group is also categorical.
# One way:
# first convert all categorical columns to character,
# then turn your "group" into factor,
# then gather the character columns.
# I use simple count() and mutate() to create a summary data frame with the
# proportions and geom_col, which equals geom_bar(stat = 'identity')
# There may be neater ways, but this is pretty straightforward
data_cat <-
  mydf %>%
  select(-ID) %>%
  mutate_if(.predicate = is.factor, .funs = as.character) %>%
  mutate(group = factor(group)) %>%
  pivot_longer(cols = which(sapply(., is.character)), names_to = 'key', values_to = 'value') %>%
  count(group, key, value) %>%
  group_by(group, key) %>%
  mutate(percent = n / sum(n)) %>%
  ungroup() # I always 'ungroup' after my data manipulations, in order to avoid unexpected effects

ggplot(data_cat) +
  geom_col(aes(group, percent, fill = key)) +
  facet_grid(~ value)
```

Created on 2020-01-07 by the reprex package (v0.3.0)

Credit for how to gather conditionally goes to an answer from @H1.

## ggplot: Generate facet grid plot with multiple series

One idea would be to create a new grouping variable:

```r
x.df.melt$var <- ifelse(x.df.melt$variable == "x" | x.df.melt$variable == "y", "A", "B")
```

You can use it for faceting while using `variable` for grouping:

```r
ggplot(x.df.melt, aes(Quarter, value, col = variable, group = variable)) +
  geom_line() +
  facet_grid(var ~ ., scales = 'free_y') +
  scale_color_discrete(breaks = c('x', 'y', 'p', 'q'), guide = "none")
```
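The grouping column can also be built with `%in%`, which reads a little better once more than two series share a facet. This is only a sketch on a made-up vector (`variable` below stands in for `x.df.melt$variable`); it checks that both forms give the same result.

```r
# Hypothetical series names, standing in for x.df.melt$variable
variable <- c("x", "y", "p", "q", "x")

# Original form: one comparison per name, joined with |
var_or <- ifelse(variable == "x" | variable == "y", "A", "B")

# Equivalent form with %in%, easier to extend to more names
var_in <- ifelse(variable %in% c("x", "y"), "A", "B")

print(var_in)  # "A" "A" "B" "B" "A"
stopifnot(identical(var_or, var_in))
```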

## Plotting each year as separate series using ggplot2 and faceting

I think what you're missing is a grouping by `year`. Assuming your `data.frame` is `df`:

```r
library(ggplot2)
library(reshape2)

df1 <- read.csv("~/Downloads/testseries.csv")
df <- melt(df1, id = c("date"))
df$date <- as.Date(df$date)

# get `year` first
# df$year <- as.POSIXlt(df$date)$year + 1900 (old code)
df$year <- format(df$date, '%Y') # following @agstudy's comment.

p <- ggplot(data = df, aes(x = date, y = value))
# group/colour by year
p <- p + geom_line(aes(colour = factor(year)))
p <- p + scale_colour_brewer(palette = "Set3")
p <- p + facet_wrap(~ variable, scales = "free", ncol = 3)
p <- p + xlab("Date") + ylab("Discharge(cms)")
p
```


Edit 2: If this is not what you're looking for, then maybe you need faceting on two variables with `facet_grid`, as follows:

```r
df$year <- factor(as.POSIXlt(df$date)$year + 1900)
p <- ggplot(data = df, aes(x = date, y = value))
p <- p + geom_line()
p <- p + facet_grid(variable ~ year)
p <- p + xlab("Date") + ylab("Discharge(cms)")
p
```

This gives a dense graph.
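Both snippets above derive a `year` column from a `Date`, once with `format()` and once via `POSIXlt`. A small sketch on made-up dates (the `dates` vector is an assumption for illustration) shows how the two relate:

```r
# Hypothetical dates, standing in for df$date
dates <- as.Date(c("2011-12-31", "2012-01-01", "2012-06-15"))

# format() returns the year as a character string
year_chr <- format(dates, "%Y")

# POSIXlt stores years since 1900, so 1900 must be added back
year_num <- as.POSIXlt(dates)$year + 1900

print(year_chr)  # "2011" "2012" "2012"
print(year_num)  # 2011 2012 2012

# Either form yields the same grouping once wrapped in factor()
stopifnot(identical(levels(factor(year_chr)), levels(factor(year_num))))
```

Since the year is only used for grouping, colouring or faceting, the character and numeric versions are interchangeable here.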

## Is it possible to have 2 legends for variables when one is continuous and the other is discrete?

The easiest approach would be to map it to a different aesthetic than the one you already use:

```r
library(ggplot2)

ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(aes(colour = as.factor(gear), size = cyl)) +
  geom_smooth(method = "loess", aes(linetype = "fit"))
```

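Alternatively, if you want to keep using the colour aesthetic for both, the ggnewscale package lets you add a second colour scale:

```r
library(ggplot2)
library(ggnewscale)

ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(aes(colour = as.factor(gear), size = cyl)) +
  new_scale_colour() +
  geom_smooth(method = "loess", aes(colour = "fit"))
```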

Beware that if you want to tweak colours via a colour scale, you must add that scale before calling `new_scale_colour()`, i.e.:

```r
ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(aes(colour = as.factor(gear), size = cyl)) +
  scale_colour_manual(values = c("red", "green", "blue")) +
  new_scale_colour() +
  geom_smooth(method = "loess", aes(colour = "fit")) +
  scale_colour_manual(values = "purple")
```

EDIT: To address the comment: yes, it is possible with a line that is independent of the data; I was just re-using the data for brevity. See below for an arbitrary line (this should also work with the ggnewscale approach):

```r
ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(aes(colour = as.factor(gear), size = cyl)) +
  geom_line(data = data.frame(x = 1:30, y = rnorm(10, 200, 10)),
            aes(x, y, linetype = "arbitrary line"))
```

## Dates with month and day in time series plot in ggplot2 with facet for years

You are very close. You want the x-axis to be a measure of where in the year you are, but you have it as a character vector and so are getting every single point labelled. If you instead make a continuous variable represent this, you could have better results. One continuous variable would be the day of the year.

```r
df$DayOfYear <- as.numeric(format(df$Date, "%j"))

ggplot(data = df,
       mapping = aes(x = DayOfYear, y = Y, shape = Year, colour = Year)) +
  geom_point() +
  geom_line() +
  facet_grid(Year ~ .) +
  theme_bw()
```

The axis could be formatted more date-like with an appropriate label function, but the breaks are still not being found in a very date-aware way. (And on top of that, there is an `NA` problem as well.)

```r
ggplot(data = df,
       mapping = aes(x = DayOfYear, y = Y, shape = Year, colour = Year)) +
  geom_point() +
  geom_line() +
  facet_grid(Year ~ .) +
  scale_x_continuous(labels = function(x) format(as.Date(as.character(x), "%j"), "%d-%b")) +
  theme_bw()
```

To get the goodness of nice date breaks, a different variable can be used: one that has the same day of the year as the original data, but placed in a single common year. Here that year is 2000, since it was a leap year. The problems with this mostly have to do with leap days, but if you don't care about that (March 1st of a non-leap year would align with February 29th of a leap year, etc.) you can use:

```r
df$CommonDate <- as.Date(paste0("2000-", format(df$Date, "%j")), "%Y-%j")

ggplot(data = df,
       mapping = aes(x = CommonDate, y = Y, shape = Year, colour = Year)) +
  geom_point() +
  geom_line() +
  facet_grid(Year ~ .) +
  scale_x_date(labels = function(x) format(x, "%d-%b")) +
  theme_bw()
```
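The leap-day caveat is easiest to see on two concrete dates. The sketch below uses made-up dates (not the question's data) and applies the same day-of-year and common-year conversions as above.

```r
# Hypothetical dates: 1 March in a non-leap year and in a leap year
d <- as.Date(c("2019-03-01", "2020-03-01"))

# Day of the year: 1 March is day 60 in 2019 but day 61 in 2020
doy <- as.numeric(format(d, "%j"))
print(doy)  # 60 61

# Re-anchor each day of the year in 2000 (a leap year)
common <- as.Date(paste0("2000-", format(d, "%j")), "%Y-%j")
print(common)  # "2000-02-29" "2000-03-01"
```

So 1 March 2019 is drawn at the position of 29 February, one day to the left of 1 March 2020, which is usually invisible at the scale of a full-year axis.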

## ggplot facet different Y axis order based on value

The functions `reorder_within` and `scale_*_reordered` from the tidytext package might come in handy.

`reorder_within` recodes the values into a factor with strings in the form of "VARIABLE___WITHIN". This factor is ordered by the values in each group of WITHIN.
`scale_*_reordered` removes the "___WITHIN" suffix when plotting the axis labels.
Add `scales = "free_y"` in `facet_wrap` to make it work as expected.

Here is an example with generated data:

```r
library(tidyverse)

# Generate data
df <- expand.grid(
  year = 2019:2021,
  group = paste("Group", toupper(letters[1:8]))
)
set.seed(123)
df$value <- rnorm(nrow(df), mean = 10, sd = 2)

df %>%
  mutate(group = tidytext::reorder_within(group, value, within = year)) %>%
  ggplot(aes(value, group)) +
  geom_point() +
  tidytext::scale_y_reordered() +
  facet_wrap(vars(year), scales = "free_y")
```