Fitting a Density Curve to a Histogram in R

If I understand your question correctly, then you probably want a density estimate along with the histogram:

X <- c(rep(65, times=5), rep(25, times=5), rep(35, times=10), rep(45, times=4))
hist(X, prob=TRUE) # prob=TRUE plots densities rather than counts
lines(density(X)) # add a density estimate with defaults
lines(density(X, adjust=2), lty="dotted") # add another "smoother" density
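
To see why the second curve is smoother: density() multiplies its chosen bandwidth by adjust, so adjust=2 doubles the bandwidth. A minimal check, reusing X from above (the names d1 and d2 are only for illustration):

```r
X <- c(rep(65, times=5), rep(25, times=5), rep(35, times=10), rep(45, times=4))

d1 <- density(X)            # default bandwidth
d2 <- density(X, adjust=2)  # same bandwidth rule, multiplied by 2

d1$bw                        # bandwidth actually used by the default fit
d2$bw                        # twice as large
all.equal(d2$bw, 2 * d1$bw)  # TRUE
```

Values of adjust below 1 go the other way and give a wigglier curve.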

Edit a long while later:

Here is a slightly more dressed-up version:

X <- c(rep(65, times=5), rep(25, times=5), rep(35, times=10), rep(45, times=4))
hist(X, prob=TRUE, col="grey") # prob=TRUE plots densities rather than counts
lines(density(X), col="blue", lwd=2) # add a density estimate with defaults
lines(density(X, adjust=2), lty="dotted", col="darkgreen", lwd=2)

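along with the graph it produces:

[Figure: grey histogram of X with the default density estimate in blue and the smoother estimate as a dotted dark green line]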

How to fit a curve to a histogram

OK, so you are just struggling with the fact that the density estimate extends beyond the "natural range" of the data. In that case, just set cut = 0. For the reason behind this behaviour, you may want to read the question: plot.density extends “xlim” beyond the range of my data. Why and how to fix it? In that answer I used from and to; here I am using cut instead.

## consider a mixture, that does not follow any parametric distribution family
## note, by construction, this is a strictly positive random variable
set.seed(0)
x <- rbeta(1000, 3, 5) + rexp(1000, 0.5)

## (kernel) density estimation offers a flexible nonparametric approach
d <- density(x, cut = 0)

## you can plot histogram and density on the density scale
hist(x, prob = TRUE, breaks = 50)
lines(d, col = 2)

[Figure: histogram of x with the kernel density estimate overlaid]

Note that with cut = 0, the density estimate is evaluated strictly within range(x). The curve is simply not drawn outside this range, so it stops at the smallest and largest observations.
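
A quick way to see the effect of cut is to compare the evaluation grids. By default cut = 3, so the grid runs 3 bandwidths past the data on each side. This is a small sketch on the same simulated x; d_default and d_cut0 are illustrative names:

```r
set.seed(0)
x <- rbeta(1000, 3, 5) + rexp(1000, 0.5)

d_default <- density(x)           # default cut = 3
d_cut0    <- density(x, cut = 0)  # grid restricted to the data range

range(x)
range(d_cut0$x)     # identical to range(x)
range(d_default$x)  # extends 3 * bandwidth beyond the data on each side
```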

Fitting Density Curves to Histograms put into a Pairs Plot in R

Using the iris data set that comes with R and simplifying your panel function, this seems to work:

data(iris)
panel.hist = function(x, ...) {
  usr = par("usr"); on.exit(par(usr = usr)) # restore the panel's coordinates on exit
  par(usr = c(usr[1:2], 0, 1.5))            # keep the x range, set the y range to 0..1.5
  hist(x, freq = FALSE, col="cyan", add=TRUE)
  lines(density(x))
}
pairs(iris[, 1:4], upper.panel = panel.smooth, diag.panel = panel.hist)

You did not provide your panel.cor or panel.smooth functions, so this example uses the panel.smooth that ships with the graphics package.

[Figure: pairs plot of the four iris measurements, with histograms and density curves on the diagonal]

Overlay normal curve to histogram in R

Here's a nice easy way I found:

# g is your numeric data vector
h <- hist(g, breaks = 10, density = 10,
          col = "lightgray", xlab = "Accuracy", main = "Overall")
xfit <- seq(min(g), max(g), length.out = 40)
yfit <- dnorm(xfit, mean = mean(g), sd = sd(g))
# rescale from density to counts: multiply by bin width and sample size
yfit <- yfit * diff(h$mids[1:2]) * length(g)

lines(xfit, yfit, col = "black", lwd = 2)
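
The scaling factor works because, for equal-width bins, count = density × bin width × n. Here is a minimal check of that identity; the simulated g is only a hypothetical stand-in for your data:

```r
set.seed(1)
g <- rnorm(200, mean = 0.8, sd = 0.1)    # hypothetical stand-in for the real data

h <- hist(g, breaks = 10, plot = FALSE)  # compute the bins without drawing
bin_width <- diff(h$mids[1:2])           # bins are equal width, so any adjacent pair works

# density-scale heights times bin width times n recover the counts
all.equal(h$density * bin_width * length(g), as.numeric(h$counts))
```

This is why the normal curve needs the same factor when the histogram shows counts, and no factor at all when it is drawn with freq = FALSE.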

Overlay histogram with density curve

Here you go!

library(ggplot2)

# create some data to work with
x = rnorm(1000)

# overlay histogram, empirical density and normal density
p0 = qplot(x, geom = 'blank') +
  geom_line(aes(y = ..density.., colour = 'Empirical'), stat = 'density') +
  stat_function(fun = dnorm, aes(colour = 'Normal')) +
  geom_histogram(aes(y = ..density..), alpha = 0.4) +
  scale_colour_manual(name = 'Density', values = c('red', 'blue')) +
  theme(legend.position = c(0.85, 0.85))

print(p0)

Fit curve to histogram ggplot

Depending on your goals, something like this may work by just scaling the density curve using multiplication:

ggplot(df, aes(x=x)) + geom_histogram() + geom_density(aes(y=..density..*10))

or

ggplot(df, aes(x=x)) + geom_histogram() + geom_density(aes(y=..count../10))

Choose other values (instead of 10) if you want to scale things differently.
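
If the aim is for the curve to sit on the histogram's count scale, one option is to use n × bin width as the factor rather than a trial value. The sketch below assumes ggplot2 3.3.0 or later (for after_stat()) and relies on the count variable computed by the density stat, which is density times the number of observations. The data frame df and the variable bin_width are hypothetical:

```r
library(ggplot2)

set.seed(42)
df <- data.frame(x = rnorm(1000))  # hypothetical data
bin_width <- 0.25                  # hypothetical bin width, shared by both layers

p <- ggplot(df, aes(x = x)) +
  geom_histogram(binwidth = bin_width, alpha = 0.4) +
  # count = density * number of observations; times bin width matches the bar heights
  geom_density(aes(y = after_stat(count * bin_width)), colour = "red")

print(p)
```

Fixing binwidth explicitly matters here, since the factor is only right for the bin width the histogram actually uses.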

Edit:

Since you are defining your scaling factor in the global environment, you can define it within aes:

ggplot(df, aes(x=x)) + geom_histogram() + geom_density(aes(n=n, y=..density..*n)) 
# or
ggplot(df, aes(x=x, n=n)) + geom_histogram() + geom_density(aes(y=..density..*n))

or another, less nice way using get:

ggplot(df, aes(x=x)) + 
geom_histogram() +
geom_density(aes(y=..density.. * get("n", pos = .GlobalEnv)))

