Random Number Generation in C++11: How to Generate, How Does It Work

Random number generation in C++11: how to generate, how does it work?

The question is way too broad for a complete answer, but let me cherry-pick a couple of interesting points:

Why "equally likely"

Suppose you have a simple random number generator that generate the numbers 0, 1, ..., 10 each with equal probability (think of this as the classic rand()). Now you want a random number in the range 0, 1, 2, each with equal probability. Your knee-jerk reaction would be to take rand() % 3. But wait, the remainders 0 and 1 occur more often than the remainder 2, so this isn't correct!

This is why we need proper distributions, which take a source of uniform random integers and turn them into our desired distribution, like Uniform[0,2] in the example. Best to leave this to a good library!

Engines

Thus at the heart of all randomness is a good pseudo-random number generator that generates a sequence of numbers that uniformly distributed over a certain interval, and which ideally have a very long period. The standard implementation of rand() isn't often the best, and thus it's good to have a choice. Linear-congruential and the Mersenne twister are two good choices (LG is actually often used by rand(), too); again, it's good to let the library handle that.

How it works

Easy: first, set up an engine and seed it. The seed fully determines the entire sequence of "random" numbers, so a) use a different one (e.g. taken from /dev/urandom) each time, and b) store the seed if you wish to recreate a sequence of random choices.

#include <random>

typedef std::mt19937 MyRNG; // the Mersenne Twister with a popular choice of parameters
uint32_t seed_val; // populate somehow

MyRNG rng; // e.g. keep one global instance (per thread)

void initialize()
{
rng.seed(seed_val);
}

Now we can create distributions:

std::uniform_int_distribution<uint32_t> uint_dist;         // by default range [0, MAX]
std::uniform_int_distribution<uint32_t> uint_dist10(0,10); // range [0,10]
std::normal_distribution<double> normal_dist(mean, stddeviation); // N(mean, stddeviation)

...And use the engine to create random numbers!

while (true)
{
std::cout << uint_dist(rng) << " "
<< uint_dist10(rng) << " "
<< normal_dist(rng) << std::endl;

}

Concurrency

One more important reason to prefer <random> over the traditional rand() is that it is now very clear and obvious how to make random number generation threadsafe: Either provide each thread with its own, thread-local engine, seeded on a thread-local seed, or synchronize access to the engine object.

Misc

  • An interesting article on TR1 random on codeguru.
  • Wikipedia has a good summary (thanks, @Justin).
  • In principle, each engine should typedef a result_type, which is the correct integral type to use for the seed. I think I had a buggy implementation once which forced me to force the seed for std::mt19937 to uint32_t on x64, eventually this should be fixed and you can say MyRNG::result_type seed_val and thus make the engine very easily replaceable.

Generate random numbers using C++11 random library

Stephan T. Lavavej(stl) from Microsoft did a talk at Going Native about how to use the new C++11 random functions and why not to use rand(). In it, he included a slide that basically solves your question. I've copied the code from that slide below.

You can see his full talk here:

#include <random>
#include <iostream>

int main() {
std::random_device rd;
std::mt19937 mt(rd());
std::uniform_real_distribution<double> dist(1.0, 10.0);

for (int i=0; i<16; ++i)
std::cout << dist(mt) << "\n";
}

We use random_device once to seed the random number generator named mt. random_device() is slower than mt19937, but it does not need to be seeded because it requests random data from your operating system (which will source from various locations, like RdRand for example).


Looking at this question / answer, it appears that uniform_real_distribution returns a number in the range [a, b), where you want [a, b]. To do that, our uniform_real_distibution should actually look like:

std::uniform_real_distribution<double> dist(1, std::nextafter(10, DBL_MAX));

How to Understand C++11 random number generator

random_device is slow but genuinely random, it's used to generate the 'seed' for the random number sequence.

mt19937 is fast but only 'pseudo random'. It needs a 'seed' to start generating a sequence of numbers. That seed can be random (as in your example) so you get a different sequence of random numbers each time. But it could be a constant, so you get the same sequence of numbers each time.

uniform_int_distribution is a way of mapping random numbers (which could have any values) to the numbers you're actually interested in, in this case a uniform distribution of integers from 1 to 6.

As is often the case with OO programming, this code is about division of responsibilities. Each class contributes a small piece to the overall requirement (the generation of dice rolls). If you wanted to do something different it's easy because you've got all the pieces in front of you.

If this is too much then all you need to do is write a function to capture the overall effect, for instance

int dice_roll()
{
static std::random_device rd;
static std::mt19937 gen(rd());
static std::uniform_int_distribution<> dis(1, 6);
return dis(gen);
}

dis is an example of a function object or functor. It's an object which overloads operator() so it can be called as if it was a function.

Efficient random number generation with C++11 <random>

One thing you can do is to have a permanent distribution object so that you only create the param_type object each time like this:

template<typename Integral>
Integral randint(Integral min, Integral max)
{
using param_type =
typename std::uniform_int_distribution<Integral>::param_type;

// only create these once (per thread)
thread_local static std::mt19937 eng {std::random_device{}()};
thread_local static std::uniform_int_distribution<Integral> dist;

// presumably a param_type is cheaper than a uniform_int_distribution
return dist(eng, param_type{min, max});
}

C++ 11 random number generation not working

The issue here is that std::random_device does not have to really be a random device. It can be a wrapper around an unseeded rand which would give you the same value every time you use it. This means your seed for engine would be the same which means the pseudo-random sequence it generates would be the same as well.

One way you could get around this is to use the current as a seed like

auto seed = std::chrono::system_clock::now().time_since_epoch().count();
mt19937 engine {seed};

But this can be manipulated via external processes and isn't very fined grained so multiple instances seeded at the same time could all make the same sequence.

How to generate a random number in C++?

The most fundamental problem of your test application is that you call srand once and then call rand one time and exit.

The whole point of srand function is to initialize the sequence of pseudo-random numbers with a random seed.

It means that if you pass the same value to srand in two different applications (with the same srand/rand implementation) then you will get exactly the same sequence of rand() values read after that in both applications.

However in your example application pseudo-random sequence consists only of one element - the first element of a pseudo-random sequence generated from seed equal to current time of 1 sec precision. What do you expect to see on output then?

Obviously when you happen to run application on the same second - you use the same seed value - thus your result is the same of course (as Martin York already mentioned in a comment to the question).

Actually you should call srand(seed) one time and then call rand() many times and analyze that sequence - it should look random.

AMENDMENT 1 - example code:

OK I get it.
Apparently verbal description is not enough (maybe language barrier or something... :) ).

Old-fashioned C code example based on the same srand()/rand()/time() functions that was used in the question:

#include <stdlib.h>
#include <time.h>
#include <stdio.h>

int main(void)
{
unsigned long j;
srand( (unsigned)time(NULL) );

for( j = 0; j < 100500; ++j )
{
int n;

/* skip rand() readings that would make n%6 non-uniformly distributed
(assuming rand() itself is uniformly distributed from 0 to RAND_MAX) */
while( ( n = rand() ) > RAND_MAX - (RAND_MAX-5)%6 )
{ /* bad value retrieved so get next one */ }

printf( "%d,\t%d\n", n, n % 6 + 1 );
}

return 0;
}

^^^ THAT sequence from a single run of the program is supposed to look random.

Please NOTE that I don't recommend to use rand/srand functions in production for the reasons explained below and I absolutely don't recommend to use function time as a random seed for the reasons that IMO already should be quite obvious. Those are fine for educational purposes and to illustrate the point sometimes but for any serious use they are mostly useless.

AMENDMENT 2 - detailed explanation:

It is important to understand that as of now there is none C or C++ standard features (library functions or classes) producing actually random data definitively (i.e. guaranteed by the standard to be actually random). The only standard feature that approaches this problem is std::random_device that unfortunately still does not provide guarantees of actual randomness.

Depending on the nature of application you should first decide if you really need truly random (unpredictable) data. Notable case when you do most certainly need true randomness is information security - e.g. generating symmetric keys, asymmetric private keys, salt values, security tokens, etc.

However security-grade random numbers is a separate industry worth a separate article. I'm briefly discussing them in this answer of mine.

In most cases Pseudo-Random Number Generator is sufficient - e.g. for scientific simulations or games. In some cases consistently defined pseudo-random sequence is even required - e.g. in games you may choose to generate exactly same maps in runtime to avoid storing lots of data in your installation pack.

The original question and reoccurring multitude of identical/similar questions (and even many misguided "answers" to them) indicate that first and foremost it is important to distinguish random numbers from pseudo-random numbers AND to understand what is pseudo-random number sequence in the first place AND to realize that pseudo-random number generators are NOT used the same way you could use true random number generators.

Intuitively when you request random number - the result returned shouldn't depend on previously returned values and shouldn't depend if
anyone requested anything before and shouldn't depend in what moment
and by what process and on what computer and from what generator and
in what galaxy it was requested. That is what word "random" means
after all - being unpredictable and independent of anything -
otherwise it is not random anymore, right? With this intuition it is
only natural to search the web for some magic spells to cast to get
such random number in any possible context.

^^^ THAT kind of intuitive expectations IS VERY WRONG and harmful in all cases involving Pseudo-Random Number Generators - despite being reasonable for true random numbers.

While the meaningful notion of "random number" exists (kind of) - there is no such thing as "pseudo-random number". A Pseudo-Random Number Generator actually produces pseudo-random number sequence.

Pseudo-random sequence is in fact always deterministic (predetermined by its algorithm and initial parameters) - i.e. there is actually nothing random about it.

When experts talk about quality of PRNG they actually talk about statistical properties of the generated sequence (and its notable sub-sequences). For example if you combine two high quality PRNGs by using them both in turns - you may produce bad resulting sequence - despite them generating good sequences each separately (those two good sequences may simply correlate to each other and thus combine badly).

Specifically rand()/srand(s) pair of functions provide a singular per-process non-thread-safe(!) pseudo-random number sequence generated with implementation-defined algorithm. Function rand() produces values in range [0, RAND_MAX].

Quote from C11 standard (ISO/IEC 9899:2011):

The srand function uses the argument as a seed for a new sequence of
pseudo-random numbers to be returned by subsequent calls to rand. If
srand is then called with the same seed value, the sequence of
pseudo-random numbers shall be repeated. If rand is called before any
calls to srand have been made, the same sequence shall be generated as
when srand is first called with a seed value of 1.

Many people reasonably expect that rand() would produce a sequence of semi-independent uniformly distributed numbers in range 0 to RAND_MAX. Well it most certainly should (otherwise it's useless) but unfortunately not only standard doesn't require that - there is even explicit disclaimer that states "there is no guarantees as to the quality of the random sequence produced".
In some historical cases rand/srand implementation was of very bad quality indeed. Even though in modern implementations it is most likely good enough - but the trust is broken and not easy to recover.
Besides its non-thread-safe nature makes its safe usage in multi-threaded applications tricky and limited (still possible - you may just use them from one dedicated thread).

New class template std::mersenne_twister_engine<> (and its convenience typedefs - std::mt19937/std::mt19937_64 with good template parameters combination) provides per-object pseudo-random number generator defined in C++11 standard. With the same template parameters and the same initialization parameters different objects will generate exactly the same per-object output sequence on any computer in any application built with C++11 compliant standard library. The advantage of this class is its predictably high quality output sequence and full consistency across implementations.

Also there are more PRNG engines defined in C++11 standard - std::linear_congruential_engine<> (historically used as fair quality srand/rand algorithm in some C standard library implementations) and std::subtract_with_carry_engine<>. They also generate fully defined parameter-dependent per-object output sequences.

Modern day C++11 example replacement for the obsolete C code above:

#include <iostream>
#include <chrono>
#include <random>

int main()
{
std::random_device rd;
// seed value is designed specifically to make initialization
// parameters of std::mt19937 (instance of std::mersenne_twister_engine<>)
// different across executions of application
std::mt19937::result_type seed = rd() ^ (
(std::mt19937::result_type)
std::chrono::duration_cast<std::chrono::seconds>(
std::chrono::system_clock::now().time_since_epoch()
).count() +
(std::mt19937::result_type)
std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::high_resolution_clock::now().time_since_epoch()
).count() );

std::mt19937 gen(seed);

for( unsigned long j = 0; j < 100500; ++j )
/* ^^^Yes. Generating single pseudo-random number makes no sense
even if you use std::mersenne_twister_engine instead of rand()
and even when your seed quality is much better than time(NULL) */
{
std::mt19937::result_type n;
// reject readings that would make n%6 non-uniformly distributed
while( ( n = gen() ) > std::mt19937::max() -
( std::mt19937::max() - 5 )%6 )
{ /* bad value retrieved so get next one */ }

std::cout << n << '\t' << n % 6 + 1 << '\n';
}

return 0;
}

The version of previous code that uses std::uniform_int_distribution<>

#include <iostream>
#include <chrono>
#include <random>

int main()
{
std::random_device rd;
std::mt19937::result_type seed = rd() ^ (
(std::mt19937::result_type)
std::chrono::duration_cast<std::chrono::seconds>(
std::chrono::system_clock::now().time_since_epoch()
).count() +
(std::mt19937::result_type)
std::chrono::duration_cast<std::chrono::microseconds>(
std::chrono::high_resolution_clock::now().time_since_epoch()
).count() );

std::mt19937 gen(seed);
std::uniform_int_distribution<unsigned> distrib(1, 6);

for( unsigned long j = 0; j < 100500; ++j )
{
std::cout << distrib(gen) << ' ';
}

std::cout << '\n';
return 0;
}

Generate random numbers in C++-11 in different parts of a code using the same seed

Since you want to use the same engine, you have to use the same engine. (That much is a singleton.) Pass a reference to RNG, store and use a reference within RNG. Minor changes to your code make this so (one of the comments already pointed this out):

ENG ŋ
RNG(ENG &eng_, int imin, int imax)
: idist(imin, imax), eng(eng_) {}

void myfunc(ENG &eng_, int imin, int imax, int N)

But I like it better if the engine is hidden in RNG like this:

class RNG {
private:
static ENG eng;
iDIST idist;
public:
static void seed(int s) { eng.seed(s); }
RNG(int imin, int imax) : idist(imin, imax) {}
int generate() { return idist(eng); }
};

// the one generator, stored as RNG::eng
ENG RNG::eng;

// print some generated numbers from a range
void printRandomNumbers(int imin, int imax, int N){
std::cout << "Range = [" << imin << "," << imax << "]" << std::endl;
RNG myirn(imin, imax);
for (int i = 0; i < N; i++){
std::cout << myirn.generate() << std::endl;
}
return;
}

int main()
{
//Seed globally
int myseed = 1;
RNG::seed(myseed);
printRandomNumbers(1, 10, 5);
printRandomNumbers(11, 20, 5);
printRandomNumbers(21, 30, 5);
return 0;
}

C++11 random numbers

This is how to use the C++11 random number generation for this purpose (adjusted from http://en.cppreference.com/w/cpp/numeric/random/uniform_int_distribution):

#include <random>
#include <iostream>
int main()
{
/* Initialise. Do this once (not for every
random number). */
std::random_device rd;
std::mt19937_64 gen(rd());

/* This is where you define the number generator for unsigned long long: */
std::uniform_int_distribution<unsigned long long> dis;

/* A few random numbers: */
for (int n=0; n<10; ++n)
std::cout << dis(gen) << ' ';
std::cout << std::endl;
return 0;
}

Instead of unsigned long long, you could use std::uintmax_t from cstdint to get the largest possible integer range (without using an actual big-integer library).



Related Topics



Leave a reply



Submit