Stacked Histograms Like in Flow Cytometry


library(ggplot2)
library(plyr)

# Four groups of 1000 normal draws, shifted by 0, 2, 3 and 4; V2 is the group label
my.data <- as.data.frame(rbind(cbind(rnorm(1e3), 1), cbind(rnorm(1e3) + 2, 2), cbind(rnorm(1e3) + 3, 3), cbind(rnorm(1e3) + 4, 4)))
my.data$V2 <- as.factor(my.data$V2)

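Calculate the density separately for each level of V2:

# One density() object per level of V2
res <- dlply(my.data, .(V2), function(x) density(x$V1))
# Flatten the density objects into one data frame (ldply adds the V2 column)
dd <- ldply(res, function(z) {
  data.frame(Values = z[["x"]],
             V1_density = z[["y"]],
             V1_count = z[["y"]] * z[["n"]])  # density scaled by group size
})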

Add an offset that depends on V2:

dd$offset <- -as.numeric(dd$V2) * 0.2  # adapt the 0.2 value as you need
dd$V1_density_offset <- dd$V1_density + dd$offset

And plot:

ggplot(dd, aes(Values, V1_density_offset, color = V2)) +
  geom_line() +
  geom_ribbon(aes(Values, ymin = offset, ymax = V1_density_offset, fill = V2), alpha = 0.3) +
  scale_y_continuous(breaks = NULL)

results
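The density-and-offset table can also be built without plyr. The following is a minimal base R sketch, not the original answer's code. The names step and dd_base are made up for illustration, and set.seed(1) is only there to make the run repeatable.

```r
# Base R sketch of the per-group density + offset table (no plyr).
set.seed(1)
my.data <- data.frame(
  V1 = c(rnorm(1e3), rnorm(1e3) + 2, rnorm(1e3) + 3, rnorm(1e3) + 4),
  V2 = factor(rep(1:4, each = 1e3))
)

step <- 0.2  # hypothetical name: vertical distance between baselines

dd_base <- do.call(rbind, lapply(split(my.data, my.data$V2), function(g) {
  d <- density(g$V1)                     # kernel density for this group
  offset <- -as.numeric(g$V2[1]) * step  # baseline for this group
  data.frame(V2 = g$V2[1],
             Values = d$x,
             offset = offset,
             V1_density_offset = d$y + offset)
}))

# One baseline per group
print(tapply(dd_base$offset, dd_base$V2, unique))
```

The resulting columns mirror those of dd, so the same geom_line() and geom_ribbon() layers apply once the column names are matched.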

Stacked Histograms in R

I think you might have the best luck with the 'ggplot2' package, and the chart you're looking for is a "stacked bar chart" and not a histogram.

Setup: Create some sample data.

data <- data.frame(age=sample(c("15-19", "20-24", "25-29","30-34"),100,replace=TRUE), ratio=rnorm(100,mean=1,sd=0.3))

Plot it: We can just use the 'qplot' function here. Because ratio is continuous, the bars have to come from binning, so use geom="histogram"; since ggplot2 2.0.0, geom="bar" counts distinct x values and no longer bins.

library(ggplot2)
# qplot() still works but is deprecated as of ggplot2 3.4.0
qplot(ratio, data=data, geom="histogram", fill=age, binwidth=0.1)

Here, we tell 'qplot' to take [ratio] from our [data] data frame and to draw it as binned bars. The bars are split and colored by [age] (fill=age), and each bin is 0.1 wide. You should be able to adjust this to your needs.
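To see what the stacked bars represent, the counts behind them can be tabulated directly. This is a base R sketch under assumptions: the bin edges are taken as multiples of 0.1, which need not coincide with the edges ggplot2 picks, and set.seed(42) and the names bin and counts are illustrative only.

```r
set.seed(42)
data <- data.frame(
  age = sample(c("15-19", "20-24", "25-29", "30-34"), 100, replace = TRUE),
  ratio = rnorm(100, mean = 1, sd = 0.3)
)

binwidth <- 0.1
# Lower edge of the bin each ratio falls into
bin <- floor(data$ratio / binwidth) * binwidth

# Rows are age groups (the fill colors), columns are bins (the bars)
counts <- table(age = data$age, bin = bin)
print(counts)

# Height of each stacked bar = column total
print(colSums(counts))
```

Each column of counts corresponds to one bar, and the rows within a column are the colored segments stacked inside it.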

Vertically stack density plots with ggplot2

I would use facet_grid instead of facet_wrap to achieve this; it is the easiest method in ggplot2.

Here's a working example:

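library(dplyr)    # provides %>% and filter()
library(ggplot2)  # provides the diamonds data set

diamonds %>%
  filter(cut %in% c('Ideal', 'Premium', 'Very Good')) %>%
  ggplot(aes(carat)) +
  geom_density() +
  facet_grid(cut ~ .)  # one row of the grid per cut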

This should give one density panel per cut, stacked vertically (as of ggplot2 3.3.0).

Stacked Histograms Using R Base Graphics

You can generate both plots with barplot(), based on a frequency table of Species and Sepal.Length.

# Create frequency table
tab <- table(iris$Species, iris$Sepal.Length)

# Stacked barplot
barplot(tab)


# Stacked percent barplot
barplot(prop.table(tab, 2)) # margin = 2 converts each column to proportions that sum to 1
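What prop.table(tab, 2) does can be checked by hand. A small sketch, using only base R and the same iris table; the names pct and manual are illustrative:

```r
tab <- table(iris$Species, iris$Sepal.Length)

# margin = 2: divide every cell by its column total
pct <- prop.table(tab, margin = 2)

# The same thing written out by hand
manual <- sweep(tab, 2, colSums(tab), "/")

print(all.equal(as.vector(pct), as.vector(manual)))
print(range(colSums(pct)))  # every column sums to 1
```

Because every column sums to 1, all bars in the percent barplot have the same total height and only the split between species varies.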


Plot staggered histograms/lines as in FACS

Is this the sort of thing you want?

stacked plots

What I did was define the y-distance between the baselines of the curves. For the ith curve, I calculated the minimum Y-value, then shifted the whole curve so that this minimum sits at i times the y-distance. I used a decreasing z-order to ensure that the filled parts of the curves are not obscured by the baselines.

Here's the code:

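import numpy as np
import matplotlib.pyplot as plt

# `data` is the list of 1000-point curves built by the dummy-data code below
delta_Y = .5  # vertical distance between the baselines of successive curves

zorder = 0
for i, Y in enumerate(data):
    baseline = min(Y)
    # change needed for minimum of Y to be delta_Y above previous curve
    y_change = delta_Y * i - baseline
    Y = Y + y_change
    plt.fill_between(np.linspace(0, 1000, 1000), Y, np.ones(1000) * delta_Y * i, zorder=zorder)
    zorder -= 1  # later (higher) curves are drawn behind earlier ones

plt.show()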

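Code that generates the dummy data (run it before the plotting loop, which reads data):

def gauss(X):
    return np.exp(-X**2 / 2.0)

# create data: ten curves, each a Gaussian bump in a different 100-sample window
X = np.linspace(-10, 10, 100)
data = []
for i in range(10):  # range replaces Python 2's xrange
    arr = np.zeros(1000)
    arr[i * 100: i * 100 + 100] = gauss(X)
    data.append(arr)
data.reverse()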

how to mimic histogram plot from flowjo in R using flowCore?

The reason for the "shift" is that the x-axis is logarithmic (base 10) in the FlowJo graph. To achieve the same result in R, add

+ scale_x_log10()

after the existing code. This might interact weirdly with the axis limits you've set, so bear that in mind.
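The effect of a log axis can be checked on simulated values before touching the real data. The sketch below uses made-up log-normal intensities (not the asker's UV-379-A channel); the parameters and the names intensity and log_int are assumptions chosen only for illustration.

```r
set.seed(1)
# Made-up positive intensities centred around 1000 on a log scale
intensity <- rlnorm(1e4, meanlog = log(1000), sdlog = 1)

# Linear scale: long right tail, so the mean sits well above the median
print(c(mean = mean(intensity), median = median(intensity)))

# log10 scale: roughly symmetric, mean and median nearly coincide
log_int <- log10(intensity)
print(c(mean = mean(log_int), median = median(log_int)))

# Values <= 0 have no finite log10, so a log axis cannot show them
print(sum(intensity <= 0))
```

If the real channel contains values at or below zero, they cannot be placed on a log10 axis, which is one way the axis limits and the log scale can interact.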

To make the y-axis "count" rather than density, you can change the first line of your ggcyto() call to:

aes(x= `UV-379-A`, y = after_stat(count)) 

Let me know if that works - I don't have your data to hand so that's all from memory!

Purely aesthetic changes are relatively easy to look up.

Histograms and Density Plots do not match up

While there is no data sample to reproduce the error, you could try to make sure that the data and aesthetics used by geom_density are correct by specifying them explicitly. You can also try moving the geom_density line to just after geom_histogram. Also, the y-axis label is probably wrong: it is currently set to counts, while the values suggest that it is in fact density.
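The link between a count axis and a density axis is just a rescaling, which can be verified with base R's hist(). This sketch uses simulated durations rather than df.half, and the names time and binwidth are placeholders.

```r
set.seed(1)
time <- rexp(200, rate = 0.2)  # made-up positive durations
binwidth <- 1
breaks <- seq(0, ceiling(max(time)), by = binwidth)

h <- hist(time, breaks = breaks, plot = FALSE)

# density = counts / (n * binwidth), so the bar areas sum to 1
print(sum(h$density * binwidth))

# Converting back recovers the counts
print(all.equal(as.numeric(h$counts), h$density * length(time) * binwidth))
```

A density curve drawn over a count histogram therefore looks flattened unless both layers share one y scale: either map both to density, or multiply the density by n * binwidth.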

How would I specify density explicitly?

You can specify the density parameters explicitly by passing data and aes (and position, where needed) directly in the geom_density call, so that it uses these instead of inherited arguments:

# after_stat(density) replaces the older ..density.. notation (ggplot2 >= 3.3.0)
ggplot() +
  geom_histogram(data = df.half, aes(x = time, y = after_stat(density)), position = "identity", alpha = 0.5, binwidth = 1) +
  geom_density(data = df.half, aes(x = time, y = after_stat(density))) +
  # in ggplot2 >= 3.4.0, use linewidth instead of size for lines
  geom_vline(data = sumy.df.half, aes(xintercept = grp.mean), color = "blue", linetype = "dashed", size = 1) +
  facet_grid(SUB_NUMBER ~ .)

I do not understand how it occurred in the first place

I think that in your initial code for geom_density you explicitly specified only the alpha argument. For all the other parameters it needed (data, aes, position, etc.) it fell back on inherited arguments, and apparently these were not the ones you intended. Possibly it ended up with a different data set or mapping than expected (for example sumy.df.half, which is meant only for geom_vline), or the "..density.." syntax in the argument was not applied as intended.


