ggplot2: geom_smooth confidence band does not extend to edge of graph, even with fullrange=TRUE
You probably need to add coord_cartesian in addition to scale_x/y_continuous. scale_x/y_continuous removes points that are outside the range of the graph, whereas coord_cartesian only zooms the view and keeps all of the data, even if some of it is not visible in the graph. In your plot, the confidence band for the red points ends where the top of the band exceeds the y-range of the graph.
There's no actual "data" in the extended range of your graph, but geom_smooth treats the points it generates for plotting the confidence bands as "data" for the purposes of deciding what to plot.
Take a look at the examples below. The first plot uses only scale_x/y_continuous. The second adds coord_cartesian, but note that the confidence bands are still not plotted. In the third plot, we still use coord_cartesian, but we expand the scale_y_continuous range downward so that points in the confidence band below zero are included in the y-range. coord_cartesian then determines the range that's actually plotted, without excluding anything that falls outside it.
I actually find this behavior confusing. I would have thought that you could just use coord_cartesian alone with the desired x and y ranges and still have the confidence bands and regression lines plotted all the way to the edges of the graph. In any case, hopefully this will get you what you're looking for.
library(ggplot2)

# Scale limits only: band values outside the limits are dropped
p1 = ggplot(mtcars, aes(wt, mpg, colour=factor(am))) +
  geom_smooth(fullrange=TRUE, method="lm") +
  scale_x_continuous(expand=c(0,0), limits=c(0,10)) +
  scale_y_continuous(expand=c(0,0), limits=c(0,100)) +
  ggtitle("scale_x/y_continuous")

# Same scale limits, plus coord_cartesian with the same range
p2 = ggplot(mtcars, aes(wt, mpg, colour=factor(am))) +
  geom_smooth(fullrange=TRUE, method="lm") +
  scale_x_continuous(expand=c(0,0), limits=c(0,10)) +
  scale_y_continuous(expand=c(0,0), limits=c(0,100)) +
  coord_cartesian(xlim=c(0,10), ylim=c(0,100)) +
  ggtitle("Add coord_cartesian; same y-range")

# Wider y scale limits keep the band; coord_cartesian sets the visible range
p3 = ggplot(mtcars, aes(wt, mpg, colour=factor(am))) +
  geom_smooth(fullrange=TRUE, method="lm") +
  scale_x_continuous(expand=c(0,0), limits=c(0,10)) +
  scale_y_continuous(expand=c(0,0), limits=c(-50,100)) +
  coord_cartesian(xlim=c(0,10), ylim=c(0,100)) +
  ggtitle("Add coord_cartesian; expanded y-range")

gridExtra::grid.arrange(p1, p2, p3)
ggplot: geom_smooth() lines don't extend to left edge at x=0 with log10 transformation
Try setting the expand argument in the x and y scales:
scale_x_continuous(trans='log10', expand = c(0, 0)) +
scale_y_continuous(expand = c(0, 0))
geom_smooth is not spanning the whole range of data
Your problem is with geom_jitter. Looking at the mpg dataset, it appears there are only two years, 1999 and 2008. geom_jitter makes the range appear much wider than it is, but geom_smooth only draws a line within the range of the data. For example, using
ggplot(mpg, aes(year, cty)) + geom_point() + geom_smooth(method = "lm", se = TRUE, span=3, fullrange=TRUE)
gives us a plot like this instead
geom_jitter is jittering not just the y values (cty) but also the x values (year), which makes it appear as though the date range of the data is wider than it actually is. Since geom_smooth only interpolates inside the range, it doesn't span the whole plot like you want.
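As a quick check of the point above, the sketch below confirms that year takes only two values and that the fitted line spans exactly that range. The jitter amounts (width = 0.3, height = 0) are illustrative values, not taken from the question; jittering horizontally only just makes it easier to see that the apparent spread is cosmetic.

```r
library(ggplot2)

# The mpg dataset contains only two distinct model years
years <- sort(unique(mpg$year))

p <- ggplot(mpg, aes(year, cty)) +
  # jitter horizontally only; the amounts are illustrative
  geom_jitter(width = 0.3, height = 0) +
  geom_smooth(method = "lm", formula = y ~ x)

# x-range of the fitted line (layer 2), taken from the built plot
smooth_range <- range(layer_data(p, 2)$x)
```

Comparing years with smooth_range shows that the line stops at the first and last year, however far the jittered points appear to spread.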
geom_smooth moves based on scale_y_continuous limits
Limits set in a scale (scale_*) first set the values outside of the limits to missing and then calculate the geom.
Limits set in a coord (coord_*) first calculate the geoms and then plot only the information within the limits.
See http://rpubs.com/INBOstats/zoom_in for some reproducible examples.
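A minimal sketch of the difference, using mtcars rather than the asker's data (the limits 15 and 25 and the helper slope() are made up for illustration). It extracts the fitted line from each plot and compares the slopes.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# Scale limits: cars with mpg outside [15, 25] become NA before the fit
p_scale <- base + scale_y_continuous(limits = c(15, 25))

# Coord limits: the fit uses every car; only the view is cropped
p_coord <- base + coord_cartesian(ylim = c(15, 25))

# Extract the computed smooth (layer 2); ggplot2 may warn about dropped rows
fit_scale <- suppressWarnings(layer_data(p_scale, 2))
fit_coord <- layer_data(p_coord, 2)

# Hypothetical helper: slope of the fitted line stored in a layer's data
slope <- function(d) coef(lm(y ~ x, data = d))[["x"]]
slopes <- c(scale = slope(fit_scale), coord = slope(fit_coord))
```

The coord version should reproduce the slope of lm(mpg ~ wt, data = mtcars), while the scale version is fitted to the truncated data and so moves when the limits change.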
fullrange = TRUE ignored in stat_smooth
You have to add + xlim(0,200)!
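A small sketch of why this works, assuming mtcars and an illustrative xlim(0, 10) instead of the asker's data: fullrange = TRUE extends the fit to the limits of the x scale, so the scale has to be wider than the data for it to have any visible effect.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  xlim(0, 10)

# Without fullrange the line covers only the observed wt values
p_data <- base + stat_smooth(method = "lm", formula = y ~ x)

# With fullrange = TRUE the line covers the x scale limits set by xlim()
p_full <- base + stat_smooth(method = "lm", formula = y ~ x, fullrange = TRUE)

range_data <- range(layer_data(p_data, 2)$x)
range_full <- range(layer_data(p_full, 2)$x)
```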
extend geom_smooth in a single direction
In the internal workings of stat_smooth, predictdf is called to create the smoothed line. The difficulty here is that this is an S3 method that is not exported. It also doesn't take ... parameters, so it is really difficult to extend.
The idea here is to create new dummy classes lm_right and lm_left that call the default lm method.
## decorate lm object with a new class lm_right
lm_right <- function(formula,data,...){
  mod <- lm(formula,data)
  class(mod) <- c('lm_right',class(mod))
  mod
}
## decorate lm object with a new class lm_left
lm_left <- function(formula,data,...){
  mod <- lm(formula,data)
  class(mod) <- c('lm_left',class(mod))
  mod
}
Then for each class we create a predictdf specialization that truncates the x values on the opposite side.
predictdf.lm_right <-
  function(model, xseq, se, level){
    ## here the main code: truncate to x values at the right
    init_range = range(model$model$x)
    xseq <- xseq[xseq >=init_range[1]]
    ggplot2:::predictdf.default(model, xseq[-length(xseq)], se, level)
  }
Same thing for the left extension:
predictdf.lm_left <-
  function(model, xseq, se, level){
    init_range = range(model$model$x)
    ## here the main code: truncate to x values at the left
    xseq <- xseq[xseq <=init_range[2]]
    ggplot2:::predictdf.default(model, xseq[-length(xseq)], se, level)
  }
Finally, a usage example:
library(ggplot2)
library(gridExtra)
## fullrange must be set to TRUE
p1 <- ggplot(mtcars, aes(y=wt, x=mpg)) + xlim(0,50) + geom_point() +
  stat_smooth(method="lm_left", fullrange=TRUE,col='green')
p2 <- ggplot(mtcars, aes(y=wt, x=mpg)) + xlim(0,50) + geom_point() +
  stat_smooth(method="lm_right", fullrange=TRUE,col='red')
grid.arrange(p1,p2)
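Because the approach above relies on an unexported ggplot2 function, it may break between versions. A different technique (not the one used in this answer) is to compute the predictions yourself with predict() and draw them with geom_ribbon and geom_line. The sketch below extends the fit to the left only; the grid bounds and the names grid and band are chosen for illustration.

```r
library(ggplot2)

fit <- lm(wt ~ mpg, data = mtcars)

# Prediction grid that extends left to 0 but stops at the largest observed mpg
grid <- data.frame(mpg = seq(0, max(mtcars$mpg), length.out = 80))
band <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)
grid <- cbind(grid, as.data.frame(band))  # adds columns fit, lwr, upr

p <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point() +
  # inherit.aes = FALSE because grid has no wt column
  geom_ribbon(data = grid, aes(x = mpg, ymin = lwr, ymax = upr),
              inherit.aes = FALSE, alpha = 0.2) +
  geom_line(data = grid, aes(x = mpg, y = fit), inherit.aes = FALSE) +
  xlim(0, 50)
```

Swapping the seq() bounds to start at min(mtcars$mpg) and end at 50 would give the right-hand extension instead.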
How to prevent line to extend across whole graph
You could use geom_segment instead of geom_abline if you want to manually define the line. If your slope is derived from the dataset you are plotting from, the easiest thing to do is use stat_smooth with method = "lm".
Here is an example with some toy data:
set.seed(16)
x = runif(100, 1, 9)
y = -8.3 + (1/1.415)*x + rnorm(100)
dat = data.frame(x, y)
Estimate intercept and slope:
coef(lm(y~x))
(Intercept) x
-8.3218990 0.7036189
First make the plot with geom_abline for comparison:
ggplot(dat, aes(x, y)) +
  geom_point() +
  geom_abline(intercept = -8.32, slope = 0.704) +
  xlim(1, 9)
Using geom_segment instead, you have to define the start and end of the line for both x and y. Make sure the line is truncated between 1 and 9 on the x axis.
ggplot(dat, aes(x, y)) +
  geom_point() +
  geom_segment(aes(x = 1, xend = 9, y = -8.32 + .704, yend = -8.32 + .704*9)) +
  xlim(1, 9)
Using stat_smooth. This will draw the line only within the range of the explanatory variable by default.
ggplot(dat, aes(x, y)) +
  geom_point() +
  stat_smooth(method = "lm", se = FALSE, color = "black") +
  xlim(1, 9)
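If you go the manual route, one way to avoid typing rounded coefficients is to compute the segment endpoints from the fitted model. This is a sketch along those lines, recreating the toy data so it runs on its own; fit, ends and the use of annotate() are choices made here for illustration, not part of the original answer.

```r
library(ggplot2)

set.seed(16)
x <- runif(100, 1, 9)
y <- -8.3 + (1 / 1.415) * x + rnorm(100)
dat <- data.frame(x, y)

fit <- lm(y ~ x, data = dat)

# Predicted y at the two x positions where the segment should start and end
ends <- data.frame(x = c(1, 9))
ends$y <- as.numeric(predict(fit, newdata = ends))

p <- ggplot(dat, aes(x, y)) +
  geom_point() +
  # annotate() draws a single segment rather than one per data row
  annotate("segment", x = ends$x[1], xend = ends$x[2],
           y = ends$y[1], yend = ends$y[2])
```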
R : confidence interval being partially displayed with ggplot2 (using geom_smooth())
For the first three segments of the confidence interval, the top end of the range is at least partially out of bounds (the bounds being [-1, 1], not the slightly expanded range on the axes). ggplot's default behavior is to not display any object that is partially out of bounds. You can fix this by adding oob=scales::rescale_none to scale_y_continuous:
library(ggplot2)
library(scales)
graph <- ggplot(df.m, aes(group=1,disciplines,value,colour=variable,shape=variable)) +
  geom_point() +
  geom_smooth(stat="smooth", method=loess, level=0.95) +
  scale_x_discrete(name="Disciplines") +
  scale_y_continuous(limits=c(-1,1), name="Measurement", oob=rescale_none)
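Since df.m comes from the question and is not shown here, this is a self-contained sketch of the same idea on mtcars. It assumes a scales version that provides oob_censor and oob_keep; as far as I know, oob_keep is the newer name for rescale_none. The limits and the helper make_plot() are made up for illustration.

```r
library(ggplot2)

# Hypothetical helper: same plot, differing only in the oob handler
make_plot <- function(oob) {
  ggplot(mtcars, aes(wt, mpg)) +
    geom_smooth(method = "lm", formula = y ~ x, fullrange = TRUE) +
    scale_x_continuous(limits = c(0, 10)) +
    scale_y_continuous(limits = c(0, 40), oob = oob)
}

# Default handler: band values outside the limits are censored to NA
censored <- suppressWarnings(layer_data(make_plot(scales::oob_censor)))

# oob_keep leaves them untouched; the panel clips whatever falls outside
kept <- layer_data(make_plot(scales::oob_keep))

na_counts <- c(censored = sum(is.na(censored$ymax)),
               kept = sum(is.na(kept$ymax)))
```

The NA values in the censored version are what make the ribbon stop short; with oob_keep there are none, so the band is drawn up to the panel edge.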