Plot Linear Regression Lines Without Interaction in ggplot2

A workaround is to fit the model outside ggplot(), then make predictions from this model and add the result to the original data frame. This adds the columns fit, lwr and upr.

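# Additive model: one common hp slope, a separate intercept per cyl level
mod <- lm(mpg ~ factor(cyl) + hp, data = mtcars)
# Append the fit, lwr and upr columns to the data
mtcars <- cbind(mtcars, predict(mod, interval = "confidence"))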

Now you can use geom_line() with the fit values as y to add the three regression lines, and geom_ribbon() with lwr and upr to add the confidence bands.

ggplot(mtcars, aes(hp, mpg, group = cyl)) +
  geom_point() +
  geom_line(aes(y = fit)) +
  geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.4)
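To see why the three lines come out parallel, it may help to read the slope and intercepts straight off the model coefficients. This is only a minimal sketch and not part of the original answer: lines_df is a name introduced here for illustration, and the coefficient names assume R's default treatment contrasts with cyl = 4 as the baseline.

```r
library(ggplot2)

# Same additive model as above: a shared hp slope, one intercept per cyl level
mod <- lm(mpg ~ factor(cyl) + hp, data = mtcars)
b <- coef(mod)

# One row per line; lines_df is an illustrative name, not from the original answer
lines_df <- data.frame(
  cyl = c(4, 6, 8),
  intercept = c(b[["(Intercept)"]],
                b[["(Intercept)"]] + b[["factor(cyl)6"]],
                b[["(Intercept)"]] + b[["factor(cyl)8"]]),
  slope = b[["hp"]]  # identical for every group, hence parallel lines
)

p <- ggplot(mtcars, aes(hp, mpg)) +
  geom_point() +
  geom_abline(data = lines_df,
              aes(intercept = intercept, slope = slope, colour = factor(cyl)))
```

Unlike geom_line() on the fitted values, geom_abline() extends each line across the whole panel, so this is better suited to checking the model than to a final figure.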

Trying to graph different linear regression models with ggplot and equation labels

If you are regressing Y on both X and Z, and these are both numerical variables (as they are in your example), then a linear regression without an interaction term represents a flat 2D plane in 3D space, not a line in 2D space. Adding an interaction term means that your regression represents a curved surface in 3D space. This can be difficult to represent in a simple plot, though there are some ways to do it: the colored lines in the smoking / cycling example you show are slices through the regression surface at various (arbitrary) values of the Z variable, which is a reasonable way to display this type of model.
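The idea of "slices" can be made concrete with a little arithmetic: at a fixed Z, the interaction model collapses to an ordinary straight line in X. The sketch below uses simulated data as a stand-in for the question's data set (sim, fit_int, z_slices and slices are all names made up here), so treat it as an illustration rather than the original analysis.

```r
set.seed(1)

# Simulated stand-in data; only the variable names X, Z and Y mirror the text
sim <- data.frame(X = runif(100, 10, 20), Z = runif(100, 0, 50))
sim$Y <- 5 + 2 * sim$X + 0.5 * sim$Z - 0.1 * sim$X * sim$Z + rnorm(100)

fit_int <- lm(Y ~ X * Z, data = sim)
b <- coef(fit_int)

# At a fixed Z the model is a straight line in X:
#   Y = (b0 + bZ * Z) + (bX + bXZ * Z) * X
z_slices <- c(10, 25, 40)  # arbitrary values of Z to slice at
slices <- data.frame(
  Z = z_slices,
  intercept = b[["(Intercept)"]] + b[["Z"]] * z_slices,
  slope = b[["X"]] + b[["X:Z"]] * z_slices
)
slices
```

Each row of slices describes one of the colored lines you would draw; because the slope changes with Z, the lines are not parallel, which is exactly what the interaction term encodes.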

Although ggplot has some great shortcuts for plotting simple models, I find people often tie themselves in knots because they try to do all their modelling inside ggplot. The best thing to do when you have a more complex model to plot is work out what exactly you want to plot using the right tools for the job, then plot it with ggplot.

For example, if you make a prediction data frame for your interaction model:

model2 <- lm(Y ~ X * Z, data = hw_data)

predictions <- expand.grid(X = seq(min(hw_data$X), max(hw_data$X), length.out = 5),
                           Z = seq(min(hw_data$Z), max(hw_data$Z), length.out = 5))

predictions$Y <- predict(model2, newdata = predictions)

Then you can plot your interaction model very simply:

ggplot(hw_data, aes(X, Y)) +
  geom_point() +
  geom_line(data = predictions, aes(color = factor(Z))) +
  labs(color = "Z")

You can easily work out the formula from the coefficients table and stick it together with paste:

labs <- trimws(format(coef(model2), digits = 2))
form <- paste("Y =", labs[1], "+", labs[2], "* X +",
              labs[3], "* Z + (", labs[4], "* X * Z)")
form
#> [1] "Y = -69.07 + 5.58 * X + 2.00 * Z + ( -0.13 * X * Z)"

This can be added as an annotation to your plot using geom_text() or annotate().


Update

For a complete solution with only 3 levels of Z, effectively "high", "medium" and "low", you could do something like:

library(ggplot2)

model2 <- lm(Y ~ X * Z, data = hw_data)

# Predict at the min, median and max of X for three representative Z values
predictions <- expand.grid(X = quantile(hw_data$X, c(0, 0.5, 1)),
                           Z = quantile(hw_data$Z, c(0.1, 0.5, 0.9)))

predictions$Y <- predict(model2, newdata = predictions)

# Build the equation label from the fitted coefficients
labs <- trimws(format(coef(model2), digits = 2))
form <- paste("Y =", labs[1], "+", labs[2], "* X +",
              labs[3], "* Z + (", labs[4], "* X * Z)")

form <- paste(form, " R\u00B2 =",
              format(summary(model2)$r.squared, digits = 2))

ggplot(hw_data, aes(X, Y)) +
  geom_point() +
  geom_line(data = predictions, aes(color = factor(Z))) +
  geom_text(x = 15, y = 25, label = form, check_overlap = TRUE,
            fontface = "italic") +
  labs(color = "Z")
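Because geom_text() inherits the plot data, it draws the label once per row, and check_overlap = TRUE is what hides the duplicates. A possible alternative is annotate(), which draws the label exactly once. The sketch below is an assumption-laden illustration: demo and fit are made-up names, the data are simulated in place of hw_data, and the model is a simple one-predictor fit to keep the example short.

```r
library(ggplot2)

set.seed(42)
# Hypothetical stand-in data; swap in hw_data and your own `form` string
demo <- data.frame(X = runif(50, 10, 20))
demo$Y <- 3 + 1.5 * demo$X + rnorm(50)

fit <- lm(Y ~ X, data = demo)
labs <- trimws(format(coef(fit), digits = 2))
form <- paste("Y =", labs[1], "+", labs[2], "* X")

p <- ggplot(demo, aes(X, Y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  # x = -Inf, y = Inf pins the label to the top-left corner of the panel;
  # hjust and vjust nudge it inwards so it is not clipped
  annotate("text", x = -Inf, y = Inf, label = form,
           hjust = -0.1, vjust = 1.5, fontface = "italic")
```

Pinning to -Inf / Inf avoids hard-coding data coordinates such as x = 15, y = 25, which only work for one particular data range.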

How to plot a single regression line but colour points by a different factor in ggplot2 R?

If I understand you correctly, you can set group = 1 in aes() to plot just one regression line. You can use the following code:

library(tidyverse)
library(ggpmisc)

my.formula <- y ~ x

ggplot(df, aes(x = x, y = y, color = z, group = 1)) +
  geom_point() +
  # points are colored via the color aesthetic, so use a color scale;
  # two values assume z has exactly two levels
  scale_color_manual(values = c("purple", "blue")) +
  geom_smooth(method = "lm", formula = my.formula) +
  stat_poly_eq(formula = my.formula,
               aes(label = paste(after_stat(eq.label), after_stat(rr.label), sep = "~~~")),
               parse = TRUE, size = 2.5, col = "black") +
  theme_classic()
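Another way to get a single line, sketched here as an alternative rather than as the answer's own method, is to map color only inside geom_point(), so that geom_smooth() never sees the grouping variable. The data frame demo_df is simulated and its name is made up for this example.

```r
library(ggplot2)

set.seed(1)
# Simulated stand-in for df, with a two-level factor z
demo_df <- data.frame(x = 1:40, z = rep(c("a", "b"), each = 20))
demo_df$y <- 2 + 0.5 * demo_df$x + rnorm(40)

p <- ggplot(demo_df, aes(x, y)) +
  geom_point(aes(color = z)) +                  # color applies to the points only
  geom_smooth(method = "lm", formula = y ~ x)   # no grouping, so one line
```

With this layout there is no need for group = 1, since the smoothing layer only inherits x and y.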

How to plot two independent linear regressions on the same plot in R using GGplot2?

Approach

Pivot the data to long format, then map the pivoted variable name to group so that geom_smooth() fits a separate lm for each variable.

Code

library(dplyr)
library(tidyr)
library(ggplot2)

df %>%
  mutate(Bird.Plastic.Mass = as.numeric(trimws(Bird.Plastic.Mass)),
         Year = factor(Year)) %>%
  na.omit() %>%
  pivot_longer(cols = Bird.Plastic.Mass:Signy.Plastic.Mass,
               names_to = "var", values_to = "val") %>%
  ggplot(aes(Year, val, col = var, group = var)) +
  geom_point() +
  geom_smooth(method = "lm")
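If the pivot step is unfamiliar, a toy example may help show what it does and why grouping then yields independent fits. Everything here is invented for illustration (the wide data frame and its columns a and b); it is a sketch of the technique, not the original data.

```r
library(tidyr)

# Toy wide data: one row per year, one column per measured series
wide <- data.frame(
  year = 2001:2010,
  a = c(3.1, 2.9, 3.4, 2.2, 2.8, 2.0, 1.9, 2.1, 1.5, 1.2),
  b = c(0.5, 1.1, 0.9, 1.8, 1.6, 2.4, 2.2, 3.0, 2.9, 3.5)
)

# Long format: one row per (year, series) pair
long <- pivot_longer(wide, cols = a:b, names_to = "var", values_to = "val")

# Fitting one lm() per group of the long data ...
fits_long <- lapply(split(long, long$var),
                    function(d) coef(lm(val ~ year, data = d)))

# ... gives the same coefficients as regressing each wide column on its own
fit_a <- coef(lm(a ~ year, data = wide))
fit_b <- coef(lm(b ~ year, data = wide))
```

This is what geom_smooth(method = "lm") does per group behind the scenes, which is why the two lines in the plot are independent regressions.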

Result (not exactly the same as the Excel plot, possibly because less data was available)

Data

df <- structure(list(Year = c("1     1991 ", "2     1992 ", "3     1993 ", 
"4 1994 ", "5 1995 ", "6 1996 ", "7 1997 ", "8 1998 ",
"9 1999 ", "10 2000 ", "11 2001 ", "12 2002 ", "13 2003 ",
"14 2004 ", "15 2005 ", "16 2006 ", "17 2007 ", "18 2008 ",
"19 2009 ", "20 2010 ", "21 2011 ", "22 2012 ", "23 2013 ",
"24 2014 ", "25 2015 ", "26 2016 ", "27 2017 ", "28 2018 ",
"29 2019 "), Bird.Plastic.Mass = c(" NA ", " NA ",
" NA ", " NA ", " NA ",
" 6.43 ", " 19.86", " 4.89 ",
" 2.97 ", " 3.10 ", " 3.30 ",
" 4.45 ", " 4.05 ", " 2.18 ",
" 4.88 ", " 4.39 ", " 4.27 ",
" 4.40 ", " 1.63 ", " 1.70 ",
" 1.64 ", " 2.16 ", " 3.05 ",
" 1.34 ", " 3.66 ", " 0.87 ",
" 1.10 ", " 2.29 ", " 1.44 "
), Signy.Plastic.Mass = c(2.384, 8.34, 2.68, 1.45, 1.94, 0.57,
1.17, 2.01, 1.41, 1.69, 0.35, 9.28, 16.75, 4.33, 0.26, 13.5,
6.27, 9.03, 3.86, 22.1, 1.15, 13.08, 0.14, 0.01, 0, 0, 7.01,
1.74, 80.79)), class = "data.frame", row.names = c(NA, -29L))

Adding a regression line on a ggplot

In general, to provide your own formula you should write it in terms of x and y, which refer to the aesthetics you mapped in ggplot(); in this case x will be interpreted as x.plot and y as y.plot. You can find more information about smoothing methods and formulas on the help page of stat_smooth(), as it is the default stat used by geom_smooth().

# mean_cl_normal needs the Hmisc package to be installed
ggplot(data, aes(x.plot, y.plot)) +
  stat_summary(fun.data = mean_cl_normal) +
  geom_smooth(method = "lm", formula = y ~ x)

If you are using the same x and y values that you supplied in the ggplot() call and need to plot the linear regression line then you don't need to use the formula inside geom_smooth(), just supply the method="lm".

ggplot(data, aes(x.plot, y.plot)) +
  stat_summary(fun.data = mean_cl_normal) +
  geom_smooth(method = "lm")
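The point about x and y being placeholders for the mapped aesthetics becomes clearer with a formula other than the default. As a small sketch using the built-in mtcars data (chosen here only for convenience, it is not the data from the question), a quadratic fit could look like this:

```r
library(ggplot2)

# In the formula, x and y stand for the mapped aesthetics (hp and mpg here),
# not for columns literally named x and y
p <- ggplot(mtcars, aes(hp, mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2))
```

Writing formula = mpg ~ poly(hp, 2) instead would not refer to the aesthetics, so the formula is best kept in terms of x and y.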

How do I plot two regression lines on the same plot with different x and y?

I think I may have a different set of data than you, but the principle is the same. Let's run a linear regression of sons' heights on fathers' heights, then repeat it the other way round:

father_x <- lm(son ~ father, data = galton_heights)
son_x <- lm(father ~ son, data = galton_heights)

coef(father_x)
#> (Intercept) father
#> 33.886604 0.514093

coef(son_x)
#> (Intercept) son
#> 34.10745 0.48890

Now, obviously the coefficients are different. The formula for son's heights based on father's heights is:

son = 0.514093 * father + 33.886604

But if we take the other regression, we can rearrange it to solve for son's heights based on fathers' heights too:

father = 0.48890 * son + 34.10745

son = (father - 34.10745)/0.48890

son = 2.045408 * father - 69.76365

This gives us plotting coefficients for our two lines:

ggplot(galton_heights, aes(x = father, y = son)) +
  geom_point() +
  geom_abline(aes(slope = 0.514093, intercept = 33.886604,
                  colour = "son height regressed\non father height"),
              size = 2) +
  geom_abline(aes(slope = 2.045408, intercept = -69.76365,
                  color = "father height regressed\non son height"),
              size = 2) +
  theme_bw()

Notice the symmetry when we flip co-ordinates:

ggplot(galton_heights, aes(x = father, y = son)) +
  geom_point() +
  geom_abline(aes(slope = 0.514093, intercept = 33.886604,
                  colour = "son height regressed\non father height"),
              size = 2) +
  geom_abline(aes(slope = 2.045408, intercept = -69.76365,
                  color = "father height regressed\non son height"),
              size = 2) +
  theme_bw() +
  coord_flip()
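Rather than typing the rearranged coefficients by hand as in the plots above, they can be computed from the fitted models. The sketch below assumes simulated data in place of galton_heights; heights, fit_son, fit_father, inv_slope and inv_intercept are names made up for this illustration.

```r
library(ggplot2)

set.seed(1)
# Simulated stand-in for galton_heights (heights in inches)
heights <- data.frame(father = rnorm(200, mean = 69, sd = 2.5))
heights$son <- 35 + 0.5 * heights$father + rnorm(200, sd = 2)

fit_son    <- lm(son ~ father, data = heights)  # son regressed on father
fit_father <- lm(father ~ son, data = heights)  # father regressed on son

# Rearranging father = a + b * son gives son = -a / b + (1 / b) * father
a <- coef(fit_father)[["(Intercept)"]]
b <- coef(fit_father)[["son"]]
inv_slope <- 1 / b
inv_intercept <- -a / b

p <- ggplot(heights, aes(father, son)) +
  geom_point() +
  geom_abline(slope = coef(fit_son)[["father"]],
              intercept = coef(fit_son)[["(Intercept)"]]) +
  geom_abline(slope = inv_slope, intercept = inv_intercept, linetype = "dashed")
```

The two lines coincide only when the correlation is perfect: the product of the two regression slopes equals the squared correlation, so the rearranged line is always at least as steep as the direct one.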

Created on 2022-02-12 by the reprex package (v2.0.1)


