How to Correctly Interpret ggplot's stat_density2d

How to correctly interpret ggplot's stat_density2d

HPDregionplot in the emdbook package is meant to do exactly that. It uses MASS::kde2d but normalizes the result. Its main disadvantage, to my mind, is that it requires an mcmc object.

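library(MASS)
library(coda)     # provides mcmc()
library(emdbook)  # provides HPDregionplot()

# df is assumed to be a data frame with numeric columns x and y
HPDregionplot(mcmc(data.matrix(df)), prob = 0.8)
with(df, points(x, y))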
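
To make the "normalizes the result" step concrete, here is a minimal sketch of how an 80% highest-density contour can be derived by hand from MASS::kde2d output. It illustrates the general idea only and is not HPDregionplot's actual implementation; the simulated df and the names mass, cum_mass and level_80 are assumptions made for this example.

```r
library(MASS)

set.seed(42)
# Simulated data, assumed only for this illustration
df <- data.frame(x = rnorm(200), y = rnorm(200))

# Kernel density estimate on a regular 100 x 100 grid
dens <- kde2d(df$x, df$y, n = 100)

# All grid cells have the same area, so dividing by the total
# turns the density heights into approximate per-cell probabilities
mass <- dens$z / sum(dens$z)

# Visit cells from highest to lowest density and accumulate their mass
ord <- order(dens$z, decreasing = TRUE)
cum_mass <- cumsum(mass[ord])

# Density height at which the accumulated mass first reaches 80%
level_80 <- dens$z[ord][which(cum_mass >= 0.8)[1]]

# Draw that single contour and overlay the points
contour(dens, levels = level_80, drawlabels = FALSE)
points(df$x, df$y)
```

The contour drawn this way should enclose roughly 80% of the estimated probability mass, which is the interpretation a raw density level does not give you.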

Follow up to stat_contour_2d bins - interpretation

I'm not sure this fully answers your question, but the behaviour changed between ggplot2 v3.2.1 and v3.3.0 because of the way the contour bins are calculated. In the earlier version, the bins are calculated in StatContour$compute_group, whereas in the later version StatContour$compute_group delegates this task to the unexported function contour_breaks. In contour_breaks, the bin width is the density range divided by bins - 1, whereas in the earlier version it is the range divided by bins.
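
As a small worked example of the two rules, the sketch below computes both bin widths for the same input. The density range z_range is a made-up value chosen for illustration, not one taken from the plots in this answer.

```r
# Hypothetical density range, assumed for illustration
z_range <- c(0, 0.12)
bins <- 5

# Rule described for v3.2.1: range divided by bins
binwidth_old <- diff(z_range) / bins        # 0.024

# Rule described for v3.3.0: range divided by bins - 1
binwidth_new <- diff(z_range) / (bins - 1)  # 0.03

print(c(old = binwidth_old, new = binwidth_new))
```

With the same bins value, the two versions space their breaks differently, so the contour lines sit at different density levels.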

We can restore the earlier behaviour by temporarily patching the contour_breaks function.

Before

ggplot() +
  stat_density_2d(data = foo, aes(x, y), bins = 5, color = "black") +
  geom_point(data = foo, aes(x = x, y = y)) +
  geom_polygon(data = df_contours, aes(x = x, y = y, color = prob), fill = NA) +
  scale_color_brewer(name = "Probs", palette = "Set1")

Now change the divisor in contour_breaks from bins - 1 to bins:

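# Copy the unexported function so its body can be edited
my_fun <- ggplot2:::contour_breaks
# Index path to (bins - 1) in the v3.3.0 body; other versions may differ
body(my_fun)[[4]][[3]][[2]][[3]][[3]] <- quote(bins)
# Overwrite the function in the ggplot2 namespace for this session
assignInNamespace("contour_breaks", my_fun, ns = "ggplot2", pos = "package:ggplot2")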

After

Using exactly the same code as produced the first plot:

ggplot() +
  stat_density_2d(data = foo, aes(x, y), bins = 5, color = "black") +
  geom_point(data = foo, aes(x = x, y = y)) +
  geom_polygon(data = df_contours, aes(x = x, y = y, color = prob), fill = NA) +
  scale_color_brewer(name = "Probs", palette = "Set1")

Why is bins parameter unknown for the stat_density2d function? (ggmap)

Okay, I'm adding this as a second answer because I think the descriptions and comments in the first answer are useful and I don't feel like merging them. Basically, I figured there must be an easy way to restore the lost functionality. After a while, and after learning some ggplot2 basics, I got this to work by overriding some ggplot2 functions:

library(ggmap)
library(ggplot2)

# -------------------------------
# start copy from stat-density-2d.R

stat_density_2d <- function(mapping = NULL, data = NULL, geom = "density_2d",
                            position = "identity", contour = TRUE,
                            n = 100, h = NULL, na.rm = FALSE, bins = 0,
                            show.legend = NA, inherit.aes = TRUE, ...) {
  layer(
    data = data,
    mapping = mapping,
    stat = StatDensity2d,
    geom = geom,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      na.rm = na.rm,
      contour = contour,
      n = n,
      h = h,        # forward the bandwidth; otherwise the h argument is silently ignored
      bins = bins,  # new: forward bins to the stat
      ...
    )
  )
}

stat_density2d <- stat_density_2d

StatDensity2d <- 
  ggproto("StatDensity2d", Stat,
          default_aes = aes(colour = "#3366FF", size = 0.5),

          required_aes = c("x", "y"),

          compute_group = function(data, scales, na.rm = FALSE, h = NULL,
                                   contour = TRUE, n = 100, bins = 0) {
            if (is.null(h)) {
              h <- c(MASS::bandwidth.nrd(data$x), MASS::bandwidth.nrd(data$y))
            }

            dens <- MASS::kde2d(
              data$x, data$y, h = h, n = n,
              lims = c(scales$x$dimension(), scales$y$dimension())
            )
            df <- data.frame(expand.grid(x = dens$x, y = dens$y), z = as.vector(dens$z))
            df$group <- data$group[1]

            if (contour) {
              # Passing `...` here fails because compute_group has no dots,
              # so forward bins by name, and only when the caller set it
              if (bins > 0) {
                StatContour$compute_panel(df, scales, bins = bins)
              } else {
                StatContour$compute_panel(df, scales)
              }
            } else {
              names(df) <- c("x", "y", "density", "group")
              df$level <- 1
              df$piece <- 1
              df
            }
          }
  )

# end copy from stat-density-2d.R
# -------------------------------

set.seed(1)
n <- 100

df <- data.frame(x = rnorm(n, 0, 1), y = rnorm(n, 0, 1))

TestData <- ggplot(data = df) +
  stat_density2d(aes(x = x, y = y, fill = as.factor(..level..)), bins = 5, geom = "polygon") +
  geom_point(aes(x = x, y = y)) +
  scale_fill_manual(values = c("yellow", "red", "green", "royalblue", "black"))
print(TestData)

This yields the expected plot. Note that varying the bins parameter now has the desired effect, which cannot be replicated by varying the n parameter.
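
To see why n is not a substitute for bins, the following sketch calls MASS::kde2d directly (the estimator used inside the stat above) with two grid sizes. The variable names coarse and fine are assumptions for this example. It only shows that n changes the resolution of the density grid; it says nothing about how many contour levels get drawn.

```r
library(MASS)

set.seed(1)
x <- rnorm(100)
y <- rnorm(100)

# Same data and default bandwidth, two different grid resolutions
coarse <- kde2d(x, y, n = 25)
fine <- kde2d(x, y, n = 100)

# n sets the number of grid points per axis
print(dim(coarse$z))
print(dim(fine$z))

# Compare the estimated density ranges on the two grids
print(range(coarse$z))
print(range(fine$z))
```

Because n only refines the grid on which the same density surface is evaluated, the number of contour bands has to be controlled separately, which is what the restored bins parameter does.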

colouring density of stat_density2d in ggplot with ggmap

I had to figure this out just last week for work. geom_density2d creates a path, which by definition has no fill. To get a fill, you need a polygon. So instead of geom_density2d, you need to call stat_density2d(geom = "polygon").

The dataframe is just random data that gave a nice density; you should be able to adapt the same for your map (I was making a very similar map for work, so using stat_density2d should be fine).

Also note that calc(level) was the development-version name for the replacement of ..level..; it was released as stat(level) in ggplot2 3.0.0 and superseded by after_stat(level) in 3.3.0. The code below uses after_stat(level), so on an older version just swap that for stat(level) or the original ..level..

library(tidyverse)

set.seed(1234)
df <- tibble(
    x = c(rnorm(100, 1, 3), rnorm(50, 2, 2.5)),
    y = c(rnorm(80, 5, 4), rnorm(30, 15, 2), rnorm(40, 2, 1)) %>% sample(150)
)

ggplot(df, aes(x = x, y = y)) +
    stat_density2d(aes(fill = after_stat(level)), geom = "polygon") +
    geom_point()

Created on 2018-04-26 by the reprex package (v0.2.0).

ggplot stat_density2d can't plot contour with tile geoms

In ggplot, geoms and stats must be paired in each layer added to the plot, so if you want both rasters/tiles and contour lines, you need to make two calls:

library(ggplot2)

ggplot(faithful, aes(x = eruptions, y = waiting)) + 
    stat_density2d(aes(fill = ..density..), geom = "raster", contour = FALSE) + 
    stat_density2d()

If you're aiming instead for filled contours, it's really hard without extending ggplot. Happily, that has already been done in the metR package:

ggplot(faithfuld, aes(eruptions, waiting, z = density)) + 
    metR::geom_contour_fill()

Note that I switched to faithfuld, which already has the density computed, as geom_contour_fill, like geom_contour, is designed to work with raster data. It may be possible to get geom_contour_fill to do the 2D density estimation for you, but it may be more straightforward to call MASS::kde2d (what stat_density2d uses) yourself and unpack the results to a data frame suitable for geom_contour_fill.
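
The "call kde2d yourself and unpack the results" route could look roughly like the sketch below. The grid size n = 50 and the names dens and dens_df are assumptions for this example, and the plotting step only runs if ggplot2 and metR happen to be installed.

```r
library(MASS)

# Estimate the 2D density of the raw faithful data on a 50 x 50 grid
dens <- kde2d(faithful$eruptions, faithful$waiting, n = 50)

# Unpack the grid into one row per cell; expand.grid varies x fastest,
# which matches the column-major order of as.vector(dens$z)
dens_df <- data.frame(
  expand.grid(eruptions = dens$x, waiting = dens$y),
  density = as.vector(dens$z)
)

# Optional plotting step, skipped when the packages are not available
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("metR", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dens_df, aes(eruptions, waiting, z = density)) +
    metR::geom_contour_fill()
  print(p)
}
```

On ggplot2 3.3.0 or later, geom_density_2d_filled() may be a simpler route, since it estimates the density and draws filled bands in a single layer.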
