How to plot a contour line showing where 95% of values fall within, in R and in ggplot2
Unfortunately, the accepted answer currently fails with Error: Unknown parameters: breaks on ggplot2 2.1.0. I cobbled together an alternative approach based on the code in this answer, which uses the ks package for computing the kernel density estimate:
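library(ggplot2)

set.seed(1001)
d <- data.frame(x = rnorm(1000), y = rnorm(1000))

# Kernel density estimate; compute.cont = TRUE also computes the contour levels
kd <- ks::kde(d, compute.cont = TRUE)

# cont["5%"] is the density level whose contour encloses about 95% of the mass;
# [[1]] keeps only the first contour piece
contour_95 <- with(kd, contourLines(x = eval.points[[1]], y = eval.points[[2]],
                                    z = estimate, levels = cont["5%"])[[1]])
contour_95 <- data.frame(contour_95)

ggplot(data = d, aes(x, y)) +
  geom_point() +
  geom_path(aes(x, y), data = contour_95) +
  theme_bw()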
Here's the result:
TIP: The ks package depends on the rgl package, which can be a pain to compile manually. Even if you're on Linux, it's much easier to get a precompiled version, e.g. sudo apt install r-cran-rgl on Ubuntu if you have the appropriate CRAN repositories set up.
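If installing ks (and rgl) is not an option, a similar contour level can be approximated with MASS::kde2d alone. The following is only a sketch of a different technique from the answer above, not the ks method: it sorts the gridded densities, accumulates their share of the total, and picks the level at which 95% of the mass on the grid is covered. The grid size (n = 200) and the padding of one unit around the data range are assumptions chosen for illustration.

```r
library(MASS)  # kde2d()

set.seed(1001)
d <- data.frame(x = rnorm(1000), y = rnorm(1000))

# Pad the grid beyond the data range so little density mass is cut off (assumed padding)
lims <- c(range(d$x) + c(-1, 1), range(d$y) + c(-1, 1))
kd <- kde2d(d$x, d$y, n = 200, lims = lims)

# Sort grid densities from highest to lowest and accumulate their share of the total
z_sorted <- sort(as.vector(kd$z), decreasing = TRUE)
mass <- cumsum(z_sorted) / sum(z_sorted)

# Lowest density level needed to cover 95% of the mass on the grid
level_95 <- z_sorted[which(mass >= 0.95)[1]]

# List of contour pieces, each holding x and y coordinates
contour_95 <- contourLines(kd$x, kd$y, kd$z, levels = level_95)
```

The first piece can be turned into a data frame with data.frame(contour_95[[1]]) and passed to geom_path() in the same way as in the ks version. Because kde2d picks its bandwidth differently from ks::kde, the two contours should be expected to differ somewhat.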
Contour levels corresponding to variable levels in ggplot2
The solution I settled on was to use the akima package for interpolation and then ggplot2 for the final plotting. This is the method I used:
library(ggplot2)
library(akima)
library(dplyr)

# Interpolate z onto a regular grid, averaging duplicated (x, y) points
interpdf <- interp2xyz(
  interp(x = iris$Petal.Width, y = iris$Petal.Length, z = iris$Sepal.Width,
         duplicate = "mean"),
  data.frame = TRUE
)

interpdf %>%
  filter(!is.na(z)) %>%   # drop grid cells outside the interpolated region
  as_tibble() %>%         # replaces the deprecated tbl_df()
  ggplot(aes(x = x, y = y, z = z, fill = z)) +
  geom_tile() +
  geom_contour(color = "white", alpha = 0.05) +
  scale_fill_distiller(palette = "Spectral", na.value = "white") +
  theme_bw()
Plot a point in a contour plot ggplot2
You can provide the points directly to geom_point():
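library(ggplot2)
set.seed(1000)
x = rnorm(1000)
g = ggplot(as.data.frame(x), aes(x = x))
# The extra point comes from its own data frame, passed via the data argument
g + stat_bin() + geom_point(data = data.frame(x = -1, y = 40), aes(x = x, y = y))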
Create %-contour in a 3d kernel density and find which points are within that contour
Rather than trying to find which points are within a contour, I would try to evaluate the density at each point, and colour the points according to how that value compares to the level of the contour. It might come to a different decision for a few points near the boundary, but should be pretty close.
To do that evaluation, you could use the oce::approx3d function on the density estimate.
The other thing I'd do is to choose the contour based on the quantiles of the observed densities, rather than trying to simulate a 3-d integral of the estimated density.
Here's code to do all of that:
library(MASS)
library(misc3d)
library(rgl)
library(oce)
#> Loading required package: testthat
#> Loading required package: gsw
# Create dataset
set.seed(42)
Sigma <- matrix(c(15, 8, 5, 8, 15, .2, 5, .2, 15), 3, 3)
mv <- data.frame(mvrnorm(400, c(100, 100, 100),Sigma))
### 3d ###
# Create kernel density
dens3d <- kde3d(mv[,1], mv[,2], mv[,3], n = 40)
# Find the estimated density at each observed point
datadensity <- approx3d(dens3d$x, dens3d$y, dens3d$z, dens3d$d,
                        mv[,1], mv[,2], mv[,3])
# Find the contours
prob <- .5
levels <- quantile(datadensity, probs = prob, na.rm = TRUE)
# Plot it
colours <- c("gray", "orange")
cuts <- cut(datadensity, c(0, levels, Inf))
for (i in seq_along(levels(cuts))) {
  gp <- as.numeric(cuts) == i
  spheres3d(mv[gp,1], mv[gp,2], mv[gp,3], col = colours[i], radius = 0.2)
}
box3d(col = "gray")
contour3d(dens3d$d, level = levels, x = dens3d$x, y = dens3d$y, z = dens3d$z, #exp(-12)
          alpha = .1, color = "red", color2 = "gray", add = TRUE)
title3d(xlab = "x", ylab = "y", zlab = "z")
And here is the plot that was produced:
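The core idea (evaluate the density at each observed point, then take a quantile of those values as the contour level) can also be checked in two dimensions without rgl or oce. This is a simplified sketch under stated assumptions, not the code from the answer: it uses MASS::kde2d, and instead of interpolating it looks up the density at the lower-left grid node of the cell containing each point, which is a crude stand-in for approx3d. The names xy, pointdensity, level and inside are illustrative.

```r
library(MASS)  # mvrnorm(), kde2d()

set.seed(42)
xy <- mvrnorm(400, c(0, 0), diag(2))

# 2-d kernel density on a 100 x 100 grid spanning the data range
dens <- kde2d(xy[, 1], xy[, 2], n = 100)

# Grid cell containing each point (lower-left node; no interpolation)
ix <- findInterval(xy[, 1], dens$x, all.inside = TRUE)
iy <- findInterval(xy[, 2], dens$y, all.inside = TRUE)
pointdensity <- dens$z[cbind(ix, iy)]

# Contour level chosen as a quantile of the observed densities
prob <- 0.5
level <- quantile(pointdensity, probs = prob)

# Points whose estimated density is at least the contour level
inside <- pointdensity >= level
mean(inside)  # fraction of points classified as inside
```

Because the level is a quantile of the densities at the data points themselves, the share of points classified as inside is close to 1 - prob by construction, which is the reason for preferring this over integrating the estimated density.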