Random Number Generator That Produces a Power-Law Distribution

Random number generator that produces a power-law distribution?

This page at Wolfram MathWorld discusses how to get a power-law distribution from a uniform distribution (which is what most random number generators provide).

The short answer (derivation at the above link):

x = [(x1^(n+1) - x0^(n+1))*y + x0^(n+1)]^(1/(n+1))

where y is a uniform variate, n is the distribution power, x0 and x1 define the range of the distribution, and x is your power-law distributed variate.
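The formula above can be sketched directly in code; a minimal Python version using only the standard library (the function name and example parameters are illustrative, not from the original answer):

```python
import random

def power_law_variate(x0, x1, n, rng=random.random):
    """Draw one sample from a power-law density p(x) ~ x^n on [x0, x1],
    using the inverse-CDF formula quoted above."""
    y = rng()  # uniform variate in [0, 1)
    return ((x1**(n + 1) - x0**(n + 1)) * y + x0**(n + 1)) ** (1.0 / (n + 1))

# Example: five samples with exponent n = -2 on [1, 100]
samples = [power_law_variate(1.0, 100.0, -2) for _ in range(5)]
```

Note the formula assumes n != -1; for n = -1 the antiderivative is logarithmic and a different inversion is needed.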

(in excel) randomly generating a power law distribution

For the (type I) Pareto distribution, if the parameters are a min value xm and an exponent alpha then the cdf is given by

p = 1 - (xm/x)^alpha

This gives the probability, p, that the random variable takes on a value which is <= x. This is easy to invert, so you can use inverse sampling to generate random variables which follow that distribution:

x = xm/(1-p)^(1/alpha) = xm*(1-p)^(-1/alpha)

If p is uniform over [0,1] then so is 1-p, so in the above you can just use RAND() in place of 1-p. Thus, in Excel, if you wanted to simulate, e.g., a type-I Pareto distribution with xm = 2 and alpha = 3, you would use the formula:

= 2 * RAND()^(-1/3)
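Outside Excel the same draw is a one-liner; a Python sketch of the formula above, mirroring `= 2 * RAND()^(-1/3)` (the helper name is illustrative):

```python
import random

def pareto_variate(xm, alpha):
    # Inverse-CDF sampling for the type-I Pareto distribution:
    # x = xm * (1 - p)^(-1/alpha); since p is uniform on [0, 1),
    # so is 1 - p, and one uniform draw suffices.
    u = random.random()
    return xm * (1.0 - u) ** (-1.0 / alpha)

x = pareto_variate(2.0, 3.0)  # analogous to the Excel formula above
```

Because `random.random()` returns values in [0, 1), the base `1 - u` is never zero, so the negative power is always finite.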

If you are going to be doing this sort of thing a lot with different distributions, you might want to consider using R, which can be called directly from Excel using the RExcel add-in. R has a very large number of built-in distributions that it can sample from directly (and it also uses a better underlying random number generator than Excel does).

Drawing random numbers from a power law distribution in R

So there's a few things going on here.

  1. As you hinted at in your question, if you want to compare distributions, you need to truncate moby, so moby = moby[moby >= m_pl$getXmin()]

  2. Using density() is a bit fraught. This is a kernel density smoother that draws Normal distributions over discrete points. As the power law has a very long tail, this is suspect.

  3. Comparing the tails of two powerlaw distributions is tricky (simulate some data and see).

Anyway, if you run

set.seed(1)
x = dist_rand(m_pl, n = length(moby))
# Cut off the tail for visualisation
moby = moby[moby >= m_pl$getXmin() & moby < 100]
plot(density(moby), log = "xy")
x = x[ x < 100]
lines(density(x), col = 2)

Gives something fairly similar.

Generating power-law distributed numbers from uniform distribution – found 2 approaches: which one is correct?

If y is a uniform random variable between 0 and 1, then so is 1-y. Thereby, letting z = 1-y, you can transform your formula (1) as:

x = [x1^(n+1) - (x1^(n+1) - x0^(n+1))*z]^(1/(n+1))

which is then the same as your formula (2) except for the change n -> (-n).

So I suppose that the only difference between these two formulas is in the notation for how n relates to the power-law decay (unfortunately the link you gave for the Wolfram MathWorld formula is invalid, so I cannot check which notation they use).
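A quick numeric check (plain Python arithmetic, values chosen arbitrarily for illustration) confirms that the two forms are algebraically identical once z = 1 - y is substituted:

```python
# Substituting z = 1 - y into formula (1) should reproduce the
# rearranged form above, term by term.
x0, x1, n = 1.0, 100.0, -2.5
for y in (0.1, 0.5, 0.9):
    z = 1.0 - y
    f1 = ((x1**(n + 1) - x0**(n + 1)) * y + x0**(n + 1)) ** (1.0 / (n + 1))
    f2 = (x1**(n + 1) - (x1**(n + 1) - x0**(n + 1)) * z) ** (1.0 / (n + 1))
    assert abs(f1 - f2) < 1e-9  # agree to floating-point precision
```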

Generating a power-law distribution in C and testing it with python

I think you have to make a histogram. I just rewrote your code a bit, and it fits very well now:

#include <math.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>

/* uniform variate in [0, 1] */
double rndm() {
    return (double)rand() / (double)RAND_MAX;
}

/* inverse-CDF sample from p(x) ~ x^degree on [xmin, xmax] */
double power_sample(double xmin, double xmax, int degree) {
    double pmin = pow(xmin, degree + 1);
    double pmax = pow(xmax, degree + 1);
    double v = pmin + (pmax - pmin) * rndm();
    return pow(v, 1.0 / (degree + 1));
}

int main() {
    unsigned int seed = 32345U;
    srand(seed);

    int xmin = 1;
    int xmax = 100;

    /* bins are indexed by the integer part of the sample, so allocate
       xmax + 1 entries (rndm() can return exactly 1, giving v == xmax) */
    double* hist = malloc((xmax + 1) * sizeof(double));
    memset(hist, 0, (xmax + 1) * sizeof(double));

    // sampling
    int nsamples = 100000000;
    for (int k = 0; k != nsamples; ++k) {
        double v = power_sample(xmin, xmax, 2);
        int idx = (int)v;
        hist[idx] += 1.0;
    }

    // normalization
    for (int k = xmin; k != xmax; ++k) {
        hist[k] /= (double)nsamples;
    }

    // output: bin centres vs. normalized counts
    for (int k = xmin; k != xmax; ++k) {
        double x = k + 0.5;
        printf(" %e %e\n", x, hist[k]);
    }

    free(hist); // cleanup

    return 0;
}

and fitting code

import numpy
from scipy.optimize import curve_fit
import pylab

def funzione(x, a, b):
    return a * numpy.power(x, b)

num, freq = pylab.loadtxt("q.dat", unpack=True)

pars, covm = curve_fit(funzione, num, freq, absolute_sigma=True)
pylab.plot(num, funzione(num, pars[0], pars[1]), color='red')
pylab.errorbar(num, freq, linestyle='', marker='.', color='black')
pylab.show()
print(pars)
and it produced

[  3.00503372e-06   1.99961571e+00]

which is pretty close to the expected values (normalisation 3/(100^3 - 1) ≈ 3.0e-06, exponent 2).

Pseudorandom Number Generator - Exponential Distribution

Since you have access to a uniform random number generator, generating a random number from any other distribution whose CDF you know is easy using the inversion method.

So, generate a uniform random number, u, in [0,1), then calculate x by:

x = log(1-u)/(-λ),

where λ is the rate parameter of the exponential distribution. Now, x is a random number with an exponential distribution. Note that log above is ln, the natural logarithm.
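The same inversion can be sketched in Python; a minimal version with a sanity check on the sample mean (function name and parameters are illustrative):

```python
import math
import random

def exp_variate(lam):
    # Inverse-CDF method: F(x) = 1 - exp(-lam*x)  =>  x = log(1 - u) / (-lam)
    u = random.random()  # uniform in [0, 1)
    return math.log(1.0 - u) / (-lam)

# Sanity check: the mean of many draws should approach 1/lam
random.seed(0)
n = 200_000
mean = sum(exp_variate(2.0) for _ in range(n)) / n  # expect roughly 0.5
```

Using log(1 - u) rather than log(u) avoids taking the logarithm of zero, since u can be exactly 0 but never exactly 1.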

Power Law distribution for a given exponent in C# using MathNet

To generate values of a distribution of your choice, you could use the inverse cumulative distribution function, as mentioned on Wikipedia.

Step by step, this would look like:

  1. Choose a distribution function that you like.
  2. Calculate the inverse cumulative distribution function using pen and paper.
  3. Generate a value based on a uniform distribution on the unit interval. Random will do nicely here.
  4. Input the value into your ICDF.

The result is a value chosen at random using your chosen distribution function.

If you run into problems with step 2, maybe the folks at https://mathoverflow.net/ can help you.
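The four steps above can be sketched generically; in this Python sketch the ICDF is passed in as a function (all names are illustrative, not from the original answer):

```python
import random

def sample_via_icdf(icdf, rng=random.random):
    """Step 3: draw uniformly on [0, 1); step 4: feed the draw to the ICDF."""
    return icdf(rng())

# Example for step 2, worked out by hand: the ICDF of a type-I Pareto
# distribution with xm = 1 and alpha = 2 is xm * (1 - u)^(-1/alpha).
pareto_icdf = lambda u: 1.0 * (1.0 - u) ** (-1.0 / 2.0)
value = sample_via_icdf(pareto_icdf)
```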

Edit: If you just need any power law distribution with exponent gamma, this will generate one value:

var rng = new Random(); // reuse one instance; constructing a new Random per call can repeat seeds
double r = 1.0 / Math.Pow(1 - rng.NextDouble(), 1.0 / (gamma + 1));

