Random number generator that produces a power-law distribution?
This page at Wolfram MathWorld discusses how to get a power-law distribution from a uniform distribution (which is what most random number generators provide).
The short answer (derivation at the above link):
x = [(x1^(n+1) - x0^(n+1))*y + x0^(n+1)]^(1/(n+1))
where y is a uniform variate, n is the distribution power, x0 and x1 define the range of the distribution, and x is your power-law distributed variate.
(in excel) randomly generating a power law distribution
For the (type I) Pareto distribution, if the parameters are a min value xm
and an exponent alpha
then the cdf is given by
p = 1 - (xm/x)^alpha
This gives the probability, p
, that the random variable takes on a value which is <= x
. This is easy to invert, so you can use inverse sampling to generate random variables which follow that distribution:
x = xm/(1-p)^(1/alpha) = xm*(1-p)^(-1/alpha)
If p
is uniform over [0,1] then so is 1-p
, so in the above you can just use RAND()
to simulate 1/p. Thus, in Excel if you wanted to e.g. simulate a type-1 Pareto distribution with xm = 2
and alpha = 3
, you would use the formula:
= 2 * RAND()^(-1/3)
If you are going to be doing this sort of thing a lot with different distributions, you might want to consider using R, which can be called directly from Excel using the REXcel add-in. R has a very large number of built-in distributions that it can directly sample from (and it also uses a better underlying random number generator than Excel does).
Drawing random numbers from a power law distribution in R
So there's a few things going on here.
As you hinted at in your question, if you want to compare distributions, you need to truncate moby, so
moby = moby[moby >= m_pl$getXmin()]
Using
density()
is a bit fraught. This is a kernel density smoother, that draws Normal distributions over discrete points. As the powerlaw has a very long tail, this is suspectComparing the tails of two powerlaw distributions is tricky (simulate some data and see).
Anyway, if you run
set.seed(1)
x = dist_rand(m_pl, n = length(moby))
# Cut off the tail for visualisation
moby = moby[moby >= m_pl$getXmin() & moby < 100]
plot(density(moby), log = "xy")
x = x[ x < 100]
lines(density(x), col = 2)
Gives something fairly similar.
Generating power-law distributed numbers from uniform distribution – found 2 approaches: which one is correct?
If y is a uniform random variable between 0 and 1, then 1-y also is. Thereby letting z = 1-y you can transform your formula (1) as :
x = [(x_1^{n+1}-(x_1^{n+1}-x_0^{n+1}) z]^{1/(n+1)}
which is then the same as your formula (2) except for the change n -> (-n).
So I suppose that the only difference between these two formula in the notation on how n relates to the power law decay (unfortunately the link you gave for the Wolfram alpha formula is invalid so I cannot check which notation they use).
Generating a power-law distribution in C and testing it with python
I think you have to make a histogram. I just rewrote your code a bit and it fits very well now
#include <math.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
double rndm() {
return (double)rand()/(double)RAND_MAX;
}
double power_sample(double xmin, double xmax, int degree) {
double pmin = pow(xmin, degree + 1);
double pmax = pow(xmax, degree + 1);
double v = pmin + (pmax - pmin)*rndm();
return pow(v, 1.0/(degree + 1));
}
int main() {
unsigned int seed = 32345U;
srand(seed);
int xmin = 1;
int xmax = 100;
double* hist = malloc((xmax-xmin + 1)*sizeof(double));
memset(hist, 0, (xmax-xmin + 1)*sizeof(double));
// sampling
int nsamples = 100000000;
for(int k = 0; k != nsamples; ++k) {
double v = power_sample(xmin, xmax, 2);
int idx = (int)v;
hist[idx] += 1.0;
}
// normalization
for(int k = xmin; k != xmax; ++k) {
hist[k] /= (double)nsamples;
}
// output
for(int k = xmin; k != xmax; ++k) {
double x = k + 0.5;
printf(" %e %e\n", x, hist[k]);
}
free(hist); // cleanup
return 0;
}
and fitting code
import numpy
from scipy.optimize import curve_fit
import pylab
def funzione(x, a,b):
return a * numpy.power(x, b)
num, freq = pylab.loadtxt("q.dat", unpack=True)
pars, covm = curve_fit(funzione, num, freq, absolute_sigma=True)
pylab.plot(num, funzione(num, pars[0], pars[1]), color='red')
pylab.errorbar(num, freq, linestyle='', marker='.',color='black')
pylab.show()
print(pars)
and it produced
[ 3.00503372e-06 1.99961571e+00]
which is pretty close
Pseudorandom Number Generator - Exponential Distribution
Since you have access to a uniform random number generator, generating a random number distributed with other distribution whose CDF you know is easy using the inversion method.
So, generate a uniform random number, u
, in [0,1)
, then calculate x
by:
x = log(1-u)/(-λ)
,
where λ
is the rate parameter of the exponential distribution. Now, x
is a random number with an exponential distribution. Note that log
above is ln
, the natural logarithm.
Power Law distribution for a given exponent in C# using MathNet
To generate values of a distribution of your choice, you could use the inverse cumulative distribution function, as mentioned in the wikipedia.
Step by step, this would look like:
- Choose a distribution function that you like.
- Calculate the inverse cumulative distribution function using pen and paper.
- Generate a value based on a uniform distribution on the unit interval.
Random
will do nicely here. - Input the value into your ICDF.
The result is a value chosen at random using your chosen distribution function.
If you run into problems with step 2, maybe the folks at https://mathoverflow.net/ can help you.
Edit: If you just need any power law distribution with exponent gamma, this will generate one value:
double r = 1.0 / Math.Pow(1-new Random().NextDouble(), 1.0/(gamma+1));
Related Topics
Using Boost Thread and a Non-Static Class Function
Receiving Rtsp Stream Using Ffmpeg Library
Compiling with G++ Using Multiple Cores
How to Print to Console When Using Qt
Libraries in /Usr/Local/Lib Not Found
Unique Hardware Id in MAC Os X
How to Avoid Code Duplication Implementing Const and Non-Const Iterators
Converting an Int to Std::String
Easy Rule to Read Complicated Const Declarations
How to Convert from Int to Char*
Why Do We Require Requires Requires
In C++, What Does & Mean After a Function's Return Type
Gnu C++ How to Check When -Std=C++0X Is in Effect
Why Would I Prefer Using Vector to Deque
How to Implement a BéZier Curve in C++
Std::Unique_Ptr, Deleters and the Win32 API
What Is a Reference-To-Pointer
Will a "Variablename;" C++ Statement Be a No-Op at All Times