Q-Q plot with ggplot2::stat_qq, colours, single group

You could calculate the quantiles yourself and then plot using geom_point:

# qqnorm() with plot.it=FALSE returns a list with x (theoretical) and y (sample) quantiles
dda = cbind(dda, setNames(qqnorm(dda$.resid, plot.it=FALSE), c("Theoretical", "Sample")))

ggplot(dda) +
  geom_point(aes(x=Theoretical, y=Sample, colour=f))

Ah, I guess I should have read to the end of your question. This is the manual solution you were referring to, right? Although you could just package it as a function:

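# colour.var is the name (as a string) of the column to colour by
my_stat_qq = function(data, colour.var) {

  data = cbind(data, setNames(qqnorm(data$.resid, plot.it=FALSE), c("Theoretical", "Sample")))

  # aes_string() is deprecated; .data[[name]] looks a column up by its string name
  ggplot(data) +
    geom_point(aes(x=Theoretical, y=Sample, colour=.data[[colour.var]]))

}

my_stat_qq(dda, "f")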
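
For reference, here is a minimal base R sketch of what qqnorm computes when you "calculate the quantiles yourself". The vector `resid` is made-up data standing in for `dda$.resid`. The theoretical quantiles are `qnorm(ppoints(n))`, handed out to the observations by rank.

```r
# Hypothetical residuals standing in for dda$.resid
set.seed(42)
resid <- rnorm(25)

# What qqnorm() returns when plotting is switched off
qq <- qqnorm(resid, plot.it = FALSE)

# The same theoretical quantiles computed by hand:
# ppoints() gives the plotting positions, qnorm() maps them to normal quantiles,
# and rank() assigns each quantile to the observation of matching rank
n <- length(resid)
theoretical <- qnorm(ppoints(n))[rank(resid)]

# Compare the two; this assumes no ties and no missing values
all.equal(qq$x, theoretical)
```

With ties or NAs the rank-based shortcut no longer applies directly, so in that case it is probably safer to keep calling qqnorm.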

Q-Q plot with ggplot2::stat_qq, colours, multiple groups with Q-Q lines

Turning the code to calculate the qqlines into a function and then using lapply to create a separate data.frame for your qqlines is one approach.

library(dplyr)
library(ggplot2)
library(broom) ## for augment()

set.seed(1001)
N <- 1000
G <- 3
# tibble() replaces the deprecated data_frame()
dd <- tibble(x = runif(N),
             group = factor(sample(LETTERS[1:G], size=N, replace=TRUE)),
             y = rnorm(N) + 2*x + as.numeric(group))
m1 <- lm(y~x, data=dd)
dda <- cbind(augment(m1), group=dd$group)
sample_var <- "y"
group_var <- "group"

# code to compute the slope and the intercept of the qq-line

qqlines <- function(vec, group) {
  x <- qnorm(c(0.25, 0.75))
  y <- quantile(vec[!is.na(vec)], c(0.25, 0.75))
  slope <- diff(y)/diff(x)
  int <- y[1] - slope * x[1]
  data.frame(slope, int, group)
}

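# one row of slope/intercept per group; [[ ]] returns a plain vector
# whether dda is a data.frame or a tibble
slopedf <- do.call(rbind, lapply(unique(dda$group), function(grp) {
  qqlines(dda[[sample_var]][dda$group == grp], grp)
}))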

# now plot with ggplot2
# aes_string() is deprecated; .data[[name]] looks a column up by its string name
p <- ggplot(dda) +
  stat_qq(aes(sample = .data[[sample_var]], colour = .data[[group_var]])) +
  geom_abline(data = slopedf, aes(slope = slope, intercept = int, colour = group))
p
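
If your ggplot2 is version 3.0.0 or later, stat_qq_line may save you the manual slope computation: by default it draws a line through the first and third quartiles, computed separately for each group. A minimal sketch, assuming ggplot2 is installed and using made-up data (`toy` and `p2` are hypothetical names, not part of the answer above):

```r
library(ggplot2)

# Made-up data: three groups with shifted means
set.seed(1001)
toy <- data.frame(
  group = factor(sample(LETTERS[1:3], size = 300, replace = TRUE))
)
toy$y <- rnorm(300) + as.numeric(toy$group)

# One set of Q-Q points and one Q-Q line per group, coloured by group
p2 <- ggplot(toy, aes(sample = y, colour = group)) +
  stat_qq() +
  stat_qq_line()
p2
```

The manual qqlines approach is still useful if you need the slopes and intercepts themselves, for example to report them.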

How to colour points on QQ plot in ggplot

The issue is that we cannot color the points via the color aesthetic, because that splits the data into groups: we get a separate Q-Q plot for each group instead of one for the whole sample.

To overcome this problem, my approach colors the points via the color argument instead. This requires some manual work, though. First, set up the color vector according to test$y. Second, to get the right colors in the plot, order the color vector by test$x, because geom_qq plots the sample in sorted order. Hard work, but at least it works. (; Try this:

library(ggplot2)

set.seed(1967)
test <- data.frame(x = rnorm(100, 5, 2), y = c(rep(1, 30), rep(2, 70)))

# Vector of colors
col <- ifelse(test$y == 1, scales::hue_pal()(2)[1], scales::hue_pal()(2)[2])
# Order by x
col <- col[order(test$x)]

ggplot(test) +
  geom_qq(aes(sample=x), col=col)
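
The reordering matters because a Q-Q plot draws the sorted sample, so the k-th point is the k-th smallest x, not the k-th row. A small base R sketch of that alignment (the data frame `toy` and the helper names are made up for illustration):

```r
# Made-up data: 10 observations, the first 3 in group 1, the rest in group 2
set.seed(1967)
toy <- data.frame(x = rnorm(10, 5, 2), y = rep(1:2, times = c(3, 7)))

# Permutation that sorts the sample
ord <- order(toy$x)

# Coordinates in the order the points are drawn
qq <- data.frame(
  theoretical = qnorm(ppoints(nrow(toy))),  # normal quantiles, already increasing
  sample      = toy$x[ord],                 # sorted sample
  y           = toy$y[ord]                  # group label carried along with its observation
)

# Each plotted point still carries the label of the row it came from
all(qq$y == toy$y[match(qq$sample, toy$x)])
```

Any per-point vector passed as a fixed argument (color, size, shape) would need the same `[order(x)]` treatment.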

PS: Thanks to @Tjebo for checking and pointing out that I don't need geom_point to get the right colors.

Multiple qqplots on one graph and single abline ggplot2 R

This is straightforward in ggplot2: reshape your data from wide to long, then use stat_qq.

library(tidyverse)

set.seed(10)
dat <- data.frame(Observed = rnorm(20), sim1= rnorm(20), sim2 = rnorm(20),sim3 = rnorm(20),sim4 = rnorm(20),sim5 = rnorm(20),sim6 = rnorm(20))

plot <- dat %>%
  gather(variable, value, -Observed) %>%
  ggplot(aes(sample = value, color = variable)) +
  geom_abline() +
  stat_qq()

# All in one
plot
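
In tidyr 1.0.0 and later, pivot_longer is the suggested replacement for gather, which still works but is no longer being developed. A sketch of the same wide-to-long step, assuming tidyr is installed (the smaller `dat` and the name `long` are just for illustration):

```r
library(tidyr)

# Smaller made-up version of the data above
set.seed(10)
dat <- data.frame(Observed = rnorm(20), sim1 = rnorm(20), sim2 = rnorm(20))

# Every column except Observed becomes a (variable, value) pair
long <- pivot_longer(dat, cols = -Observed,
                     names_to = "variable", values_to = "value")

# One row per combination of original row and sim column
dim(long)
head(long)
```

The result can be piped into the same ggplot call, since the column names match the ones gather produced.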

It might be beneficial if you look at making facets or small multiples along your comparison variable.

# Facets!
plot +
  facet_wrap(~variable)

If you want to compare against your own observed values, then rather than doing anything fancy, let qqplot do the heavy lifting: with plot.it = FALSE it returns a list of x/y coordinates for the Q-Q plot. A little iteration with purrr::map_dfr, and you can do:

library(tidyverse)

set.seed(10)
dat <- data.frame(Observed = rnorm(20), sim1 = rnorm(20), sim2 = rnorm(20),sim3 = rnorm(20),sim4 = rnorm(20),sim5 = rnorm(20),sim6 = rnorm(20))

plot_data <- map_dfr(names(dat)[-1], ~as_tibble(qqplot(dat[[.x]], dat$Observed, plot.it = FALSE)) %>%
                       mutate(id = .x))

ggplot(plot_data, aes(x, y, color = id)) +
  geom_point() +
  geom_abline() +
  facet_wrap(~id)

Created on 2018-11-25 by the reprex package (v0.2.1)

Quantile-Quantile plot using two vectors with ggplot

Is this what you need?

ggplot() + geom_point(data=df, aes(x=sort(Obs), y=sort(Model))) + xlab('Obs') + ylab('Model')

Or maybe this...

# sort Obs and Model within each type, then stack the pieces back together
df.ord = do.call('rbind', lapply(split(df, df$type), function(.d) {
  transform(.d, Obs = sort(Obs), Model = sort(Model))
}))
ggplot() + geom_point(data=df.ord, aes(x=Obs, y=Model, col=type)) + facet_grid(~type)
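
Sorting both columns works here because the two vectors have the same length. When they don't, base R's qqplot interpolates the longer sample down to the length of the shorter one. A minimal sketch with made-up vectors (`obs` and `model` are hypothetical and not the columns of df):

```r
# Two made-up samples of different lengths
set.seed(1)
obs   <- rnorm(30)
model <- rnorm(50)

# plot.it = FALSE returns the coordinates instead of drawing them
qq <- as.data.frame(qqplot(obs, model, plot.it = FALSE))

nrow(qq)                     # number of points matches the shorter vector
all.equal(qq$x, sort(obs))   # the shorter sample is simply sorted
```

The resulting data frame has columns x and y, so it can go straight into ggplot with geom_point(aes(x = x, y = y)).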

Coloring points in a geom_qq plot

geom_qq doesn't seem to allow this. In theory, if you could change this line in the stat_qq source from

data.frame(sample, theoretical)

to

data.frame(sample, theoretical, data)

it would probably work, but the easiest way to attempt that isn't obvious to me.

Instead, I recommend you just calculate the values yourself. It's pretty simple. You can use a function like this:

make_qq <- function(dd, x) {
  dd <- dd[order(dd[[x]]), ]
  dd$qq <- qnorm(ppoints(nrow(dd)))
  dd
}

And then you can make the plot like this:

ggplot(make_qq(mtcars, "mpg")) +
  geom_point(aes(x=qq, y=mpg, color=hp)) +
  labs(x="Theoretical", y="Observed")
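
The same idea extends to reference distributions other than the normal by swapping qnorm for another quantile function. The sketch below is a hypothetical generalisation: `make_qq_dist` and its `qfun` argument are names invented here, and the exponential comparison is only an illustration of the mechanics.

```r
# Hypothetical generalisation of make_qq(): qfun is any quantile function,
# and ... passes extra parameters (rate, df, ...) through to it
make_qq_dist <- function(dd, x, qfun = qnorm, ...) {
  dd <- dd[order(dd[[x]]), ]              # sort rows by the sample column
  dd$qq <- qfun(ppoints(nrow(dd)), ...)   # theoretical quantiles, same order
  dd
}

# Compare mpg against an exponential distribution with a rate estimated from the data
qq_exp <- make_qq_dist(mtcars, "mpg", qfun = qexp, rate = 1 / mean(mtcars$mpg))
head(qq_exp[, c("mpg", "qq")])
```

With the default `qfun = qnorm` this should give the same qq column as make_qq above, so the plotting code does not need to change.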
