Generate Correlated Random Numbers from Binomial Distributions

You can generate correlated uniforms using the copula package, then use the qbinom function to convert those to binomial variables. Here is one quick example:

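library(copula)

# Bivariate normal (Gaussian) copula with correlation parameter 0.75
tmp <- normalCopula( 0.75, dim=2 )
# Draw 1000 pairs of correlated uniforms; rCopula(n, copula) replaces the older rcopula(copula, n)
x <- rCopula(1000, tmp)
# Map each uniform column through a binomial quantile function
x2 <- cbind( qbinom(x[,1], 10, 0.5), qbinom(x[,2], 15, 0.7) )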

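Now x2 is a 1000 x 2 matrix whose columns are two correlated binomial variables, Binomial(10, 0.5) and Binomial(15, 0.7). Note that 0.75 is the parameter of the normal copula, not the Pearson correlation of the resulting binomial columns, which will generally differ somewhat.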
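If the copula package is not available, the same construction can be sketched in base R by building the normal copula by hand: draw two standard normals with correlation 0.75, push them through pnorm to get correlated uniforms, then apply qbinom. The seed and the variable names z1, z2 and u are illustrative choices, not part of the original answer.

```r
set.seed(42)                  # arbitrary seed, only for reproducibility
n   <- 1000
rho <- 0.75                   # correlation of the latent normals (the copula parameter)

# Two standard normals with correlation rho
z1 <- rnorm(n)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)

# Correlated uniforms, i.e. a sample from the normal copula
u <- cbind(pnorm(z1), pnorm(z2))

# Same binomial marginals as in the example above
x2 <- cbind(qbinom(u[, 1], 10, 0.5), qbinom(u[, 2], 15, 0.7))

colMeans(x2)      # should be close to 10 * 0.5 = 5 and 15 * 0.7 = 10.5
cor(x2)[1, 2]     # sample Pearson correlation of the two binomial columns
```

Checking colMeans and cor on the result is a quick way to confirm that the marginals are right and to see what correlation the binomial columns actually end up with.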

Can't generate correlated random numbers from binomial distributions in R using rmvbin

This is a math/stats "problem" rather than an R problem (and not really a problem at all, but a consequence of the model).

Short version: for bivariate binary data, the marginal probabilities constrain the correlations that can be observed. You can see this with a bit of boring juggling of the marginal probabilities $p_A$ and $p_B$ and the joint probability $p_{AB}$. In other words, the marginal probabilities restrict the range of allowed correlations (and vice versa), and your call violates this restriction.
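To make the juggling concrete: for binary $A$ and $B$ the correlation is

$$\rho = \frac{p_{AB} - p_A p_B}{\sqrt{p_A (1 - p_A)\, p_B (1 - p_B)}},$$

and since $\max(0,\, p_A + p_B - 1) \le p_{AB} \le \min(p_A, p_B)$, the correlation is confined to the interval obtained by plugging in those two extremes. A minimal R sketch of that range follows; the function name binary_cor_range is made up for illustration and is not part of bindata or any other package.

```r
# Range of Pearson correlations attainable by two binary (0/1) variables
# with marginal success probabilities pA and pB.
# Hypothetical helper for illustration only.
binary_cor_range <- function(pA, pB) {
  denom   <- sqrt(pA * (1 - pA) * pB * (1 - pB))
  pAB_min <- max(0, pA + pB - 1)   # smallest feasible joint probability P(A = 1, B = 1)
  pAB_max <- min(pA, pB)           # largest feasible joint probability
  c(lower = (pAB_min - pA * pB) / denom,
    upper = (pAB_max - pA * pB) / denom)
}

binary_cor_range(0.5, 0.5)
binary_cor_range(0.2, 0.8)
```

With both marginals equal to 0.5 the full range from -1 to 1 is available, while marginals of 0.2 and 0.8 cap the correlation at 0.25. Requesting a correlation outside the computed range for your marginals is the kind of call that cannot be satisfied.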

For bivariate Gaussian random variables the marginals and the correlations are separate and can be specified independently of each other.

The question should probably be moved to stats exchange.

How to generate correlated data for arbitrary distributions

This is a tricky problem, but you can do it in three steps: (1) find the Spearman rank correlation you need, (2) generate values from a uniform distribution with this pairwise correlation, then (3) use the values from this sample as ranks (quantile levels) in your arbitrary distributions to generate values from those distributions. See my paper using this technique at http://ee.hawaii.edu/~mfripp/papers/Fripp_2011_Wind_Reserves.pdf (section 2.2).
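The three steps can be sketched in base R, assuming a Gaussian copula as the way to produce the correlated uniforms (one choice among several; the linked paper may do it differently). For a bivariate normal pair with Pearson correlation r, the Spearman correlation is (6/pi) * asin(r/2), so the latent correlation is set to 2 * sin(pi * rho_s / 6). The target value 0.6 and the gamma and beta marginals are illustrative assumptions.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
n <- 10000

# (1) Target Spearman rank correlation (illustrative value)
rho_s <- 0.6

# (2) Correlated uniforms via a Gaussian copula.
#     Convert the Spearman target into the Pearson correlation of the latent normals.
r  <- 2 * sin(pi * rho_s / 6)
z1 <- rnorm(n)
z2 <- r * z1 + sqrt(1 - r^2) * rnorm(n)
u  <- cbind(pnorm(z1), pnorm(z2))

# (3) Use the uniforms as quantile levels in the target distributions
y1 <- qgamma(u[, 1], shape = 2, rate = 1)
y2 <- qbeta(u[, 2], shape1 = 2, shape2 = 5)

cor(y1, y2, method = "spearman")  # should be close to rho_s
```

Because quantile functions of continuous distributions are strictly increasing, the rank correlation of the uniforms carries over unchanged to y1 and y2. For discrete marginals, ties mean the match is only approximate.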

If you need more than the right pairwise rank correlation, you may be able to do it by generating uniformly distributed tuples (one element for each random variable), using some technique to nudge them into the right correlation structure, and then using them as ranks for the arbitrary distributions. That is in the area of copula methods.
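As one simple instance of the copula approach for more than two variables, the sketch below uses a Gaussian copula with a Cholesky factor to impose a whole matrix of rank correlations at once. This is a swapped-in technique, not necessarily what the answer has in mind, and it only controls pairwise structure. The matrix S and the three marginals are illustrative assumptions; note that the converted matrix R is not guaranteed to be positive definite for every S, in which case chol() will fail.

```r
set.seed(2)                               # arbitrary seed, only for reproducibility
n <- 20000

# Target Spearman rank correlation matrix (illustrative values)
S <- matrix(c(1.0, 0.5, 0.3,
              0.5, 1.0, 0.4,
              0.3, 0.4, 1.0), nrow = 3, byrow = TRUE)

# Pearson correlations of the latent normals that give these Spearman values
R <- 2 * sin(pi * S / 6)

# Correlated normals: rows of Z %*% chol(R) have correlation matrix R
Z <- matrix(rnorm(n * 3), ncol = 3) %*% chol(R)

# Uniform tuples with the desired rank-correlation structure
U <- pnorm(Z)

# Use each column as quantile levels in its own (arbitrary) distribution
Y <- cbind(qexp(U[, 1], rate = 2),
           qgamma(U[, 2], shape = 3, rate = 1),
           qbeta(U[, 3], shape1 = 2, shape2 = 2))

round(cor(Y, method = "spearman"), 2)     # should be close to S
```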


