C++11 Cross Compiler/Standard Library Random Distribution Reproducibility

C++11 cross compiler/standard library random distribution reproducibility

Unfortunately, as of N3936 (C++14 final draft), no standard provided random distribution has a requirement like this. And it's easy to see why. There are many valid ways of writing a distribution function. Some better than others. And algorithms for even something as basic as the normal distribution are getting better and are the subject of active research. Mandating use of a single one would unnecessarily roadblock the implementation of future algorithms.

Fortunately, you can write your own. The specification for the headers of the various distribution classes are located under §26.5.8. But there is no reason why yours needs to follow this structure necessarily.

(Please note, I have not tested this code thoroughly, and there might be bad behavior with certain engines, or with overflow, though I have taken some pains to avoid the latter, this is intended more as an illustrative example than a canonical source of awesome uniform distribution. That being said, if you find anything wrong with it, let me know in the comments and I'll be happy to correct it.)

#include <random>
#include <tuple>
#include <iostream>

template<class IntType = int>
class my_uniform_int_distribution {
public:
// types
typedef IntType result_type;
typedef std::pair<int, int> param_type;

// constructors and reset functions
explicit my_uniform_int_distribution(IntType a = 0, IntType b = std::numeric_limits<IntType>::max());
explicit my_uniform_int_distribution(const param_type& parm);
void reset();

// generating functions
template<class URNG>
result_type operator()(URNG& g);
template<class URNG>
result_type operator()(URNG& g, const param_type& parm);

// property functions
result_type a() const;
result_type b() const;
param_type param() const;
void param(const param_type& parm);
result_type min() const;
result_type max() const;

private:
typedef typename std::make_unsigned<IntType>::type diff_type;

IntType lower;
IntType upper;
};

template<class IntType>
my_uniform_int_distribution<IntType>::my_uniform_int_distribution(IntType a, IntType b) {
param({a, b});
}

template<class IntType>
my_uniform_int_distribution<IntType>::my_uniform_int_distribution(const param_type& parm) {
param(parm);
}

template<class IntType>
void my_uniform_int_distribution<IntType>::reset() {}

template<class IntType>
template<class URNG>
auto my_uniform_int_distribution<IntType>::operator()(URNG& g) -> result_type {
return operator()(g, param());
}

template<class IntType>
template<class URNG>
auto my_uniform_int_distribution<IntType>::operator()(URNG& g, const param_type& parm) -> result_type {
diff_type diff = (diff_type)parm.second - (diff_type)parm.first + 1;
if (diff == 0) // If the +1 overflows we are using the full range, just return g()
return g();

diff_type badDistLimit = std::numeric_limits<diff_type>::max() / diff;
do {
diff_type generatedRand = g();

if (generatedRand / diff < badDistLimit)
return (IntType)((generatedRand % diff) + (diff_type)parm.first);
} while (true);
}

template<class IntType>
auto my_uniform_int_distribution<IntType>::a() const -> result_type {
return lower;
}

template<class IntType>
auto my_uniform_int_distribution<IntType>::b() const -> result_type {
return upper;
}

template<class IntType>
auto my_uniform_int_distribution<IntType>::param() const -> param_type {
return {lower, upper};
}

template<class IntType>
void my_uniform_int_distribution<IntType>::param(const param_type& parm) {
std::tie(lower, upper) = parm;
if (upper < lower)
throw std::exception();
}

template<class IntType>
auto my_uniform_int_distribution<IntType>::min() const -> result_type {
return lower;
}

template<class IntType>
auto my_uniform_int_distribution<IntType>::max() const -> result_type {
return upper;
}

int main() {
std::mt19937 foo;
my_uniform_int_distribution<int> bar(0,1000);

for (int i=0; i<99; ++i) {
bar(foo);
}

std::cout << bar(foo) << std::endl;

return 0;
}

This code prints out 490 on all platforms I've tested.

Mersenne Twister Reproducibility across Compilers

The numbers that engines generate are guaranteed to be reproducible across implementations, but the distributions are not. (source: rand() considered harmful).

The N3337 standard draft says this about normal_distribution (26.5.8.5.1):

A normal_distribution random number distribution produces random numbers x distributed according to the probability density function

Sample Image

The distribution parameters µ and σ are also known as this distribution’s mean and standard deviation

And... that's it. It doesn't specify the order of the generated numbers, nor algorithm, nor example outputs.

The standard is very elaborate about mersenne_twister_engine (26.5.3.2), it specifies the state transition function, initial seeding algorithm and so on.

Why is the new random library better than std::rand()?

Pretty much any implementation of "old" rand() use an LCG; while they are generally not the best generators around, usually you are not going to see them fail on such a basic test - mean and standard deviation is generally got right even by the worst PRNGs.

Common failings of "bad" - but common enough - rand() implementations are:

  • low randomness of low-order bits;
  • short period;
  • low RAND_MAX;
  • some correlation between successive extractions (in general, LCGs produce numbers that are on a limited number of hyperplanes, although this can be somehow mitigated).

Still, none of these are specific to the API of rand(). A particular implementation could place a xorshift-family generator behind srand/rand and, algoritmically speaking, obtain a state of the art PRNG with no changes of interface, so no test like the one you did would show any weakness in the output.

Edit: @R. correctly notes that the rand/srand interface is limited by the fact that srand takes an unsigned int, so any generator an implementation may put behind them is intrinsically limited to UINT_MAX possible starting seeds (and thus generated sequences). This is true indeed, although the API could be trivially extended to make srand take an unsigned long long, or adding a separate srand(unsigned char *, size_t) overload.


Indeed, the actual problem with rand() is not much of implementation in principle but:

  • backwards compatibility; many current implementations use suboptimal generators, typically with badly chosen parameters; a notorious example is Visual C++, which sports a RAND_MAX of just 32767. However, this cannot be changed easily, as it would break compatibility with the past - people using srand with a fixed seed for reproducible simulations wouldn't be too happy (indeed, IIRC the aforementioned implementation goes back to Microsoft C early versions - or even Lattice C - from the mid-eighties);
  • simplistic interface; rand() provides a single generator with the global state for the whole program. While this is perfectly fine (and actually quite handy) for many simple use cases, it poses problems:

    • with multithreaded code: to fix it you either need a global mutex - which would slow down everything for no reason and kill any chance of repeatability, as the sequence of calls becomes random itself -, or thread-local state; this last one has been adopted by several implementations (notably Visual C++);
    • if you want a "private", reproducible sequence into a specific module of your program that doesn't impact the global state.

Finally, the rand state of affairs:

  • doesn't specify an actual implementation (the C standard provides just a sample implementation), so any program that is intended to produce reproducible output (or expect a PRNG of some known quality) across different compilers must roll its own generator;
  • doesn't provide any cross-platform method to obtain a decent seed (time(NULL) is not, as it isn't granular enough, and often - think embedded devices with no RTC - not even random enough).

Hence the new <random> header, which tries to fix this mess providing algorithms that are:

  • fully specified (so you can have cross-compiler reproducible output and guaranteed characteristics - say, range of the generator);
  • generally of state-of-the-art quality (from when the library was designed; see below);
  • encapsulated in classes (so no global state is forced upon you, which avoids completely threading and nonlocality problems);

... and a default random_device as well to seed them.

Now, if you ask me I would have liked also a simple API built on top of this for the "easy", "guess a number" cases (similar to how Python does provide the "complicated" API, but also the trivial random.randint & Co. using a global, pre-seeded PRNG for us uncomplicated people who'd like not to drown in random devices/engines/adapters/whatever every time we want to extract a number for the bingo cards), but it's true that you can easily build it by yourself over the current facilities (while building the "full" API over a simplistic one wouldn't be possible).


Finally, to get back to your performance comparison: as others have specified, you are comparing a fast LCG with a slower (but generally considered better quality) Mersenne Twister; if you are ok with the quality of an LCG, you can use std::minstd_rand instead of std::mt19937.

Indeed, after tweaking your function to use std::minstd_rand and avoid useless static variables for initialization

int getRandNum_New() {
static std::minstd_rand eng{std::random_device{}()};
static std::uniform_int_distribution<int> dist{0, 5};
return dist(eng);
}

I get 9 ms (old) vs 21 ms (new); finally, if I get rid of dist (which, compared to the classic modulo operator, handles the distribution skew for output range not multiple of the input range) and get back to what you are doing in getRandNum_Old()

int getRandNum_New() {
static std::minstd_rand eng{std::random_device{}()};
return eng() % 6;
}

I get it down to 6 ms (so, 30% faster), probably because, unlike the call to rand(), std::minstd_rand is easier to inline.


Incidentally, I did the same test using a hand-rolled (but pretty much conforming to the standard library interface) XorShift64*, and it's 2.3 times faster than rand() (3.68 ms vs 8.61 ms); given that, unlike the Mersenne Twister and the various provided LCGs, it passes the current randomness test suites with flying colors and it's blazingly fast, it makes you wonder why it isn't included in the standard library yet.

C++11: How to set seed using random

The point of having a seed_seq is to increase the entropy of the generated sequence. If you have a random_device on your system, initializing with multiple numbers from that random device may arguably do that. On a system that has a pseudo-random number generator I don't think there is an increase in randomness, i.e. generated sequence entropy.

Building on that your approach:

If your system does provide a random device then you can use it like this:

  std::random_device r;
// std::seed_seq ssq{r()};
// and then passing it to the engine does the same
default_random_engine eng{r()};
uniform_real_distribution<double> urd(0, 1);
cout << "Uniform [0, 1): " << urd(eng);

If your system does not have a random device then you can use time(0) as a seed to the random_engine

  default_random_engine eng{static_cast<long unsigned int>(time(0))};
uniform_real_distribution<double> urd(0, 1);
cout << "Uniform [0, 1): " << urd(eng);

If you have multiple sources of randomness you can actually do this (e.g. 2)

std::seed_seq seed{ r1(), r2() };
default_random_engine eng{seed};
uniform_real_distribution<double> urd(0, 1);
cout << "Uniform [0, 1): " << urd(eng);

where r1 , r2 are different random devices , e.g. a thermal noise or quantum source .

Ofcourse you could mix and match

std::seed_seq seed{ r1(), static_cast<long unsigned int>(time(0)) };
default_random_engine eng{seed};
uniform_real_distribution<double> urd(0, 1);
cout << "Uniform [0, 1): " << urd(eng);

Finally, I like to initialize with an one liner:

  auto rand = std::bind(std::uniform_real_distribution<double>{0,1},
std::default_random_engine{std::random_device()()});
std::cout << "Uniform [0,1): " << rand();

If you worry about the time(0) having second precision you can overcome this by playing with the high_resolution_clock either by requesting the time since epoch as designated firstly by bames23 below:

static_cast<long unsigned int>(std::chrono::high_resolution_clock::now().time_since_epoch().count()) 

or maybe just play with CPU randomness

long unsigned int getseed(int const K)
{

typedef std::chrono::high_resolution_clock hiclock;

auto gett= [](std::chrono::time_point<hiclock> t0)
{
auto tn = hiclock::now();
return static_cast<long unsigned int>(std::chrono::duration_cast<std::chrono::microseconds>(tn-t0).count());
};

long unsigned int diffs[10];
diffs[0] = gett(hiclock::now());
for(int i=1; i!=10; i++)
{
auto last = hiclock::now();
for(int k=K; k!=0; k--)
{
diffs[i]= gett(last);
}
}

return *std::max_element(&diffs[1],&diffs[9]);
}

Consistent pseudo-random numbers across platforms

Something like a Mersenne Twister (from Boost.Random) is deterministic.

How do you generate a random double uniformly distributed between 0 and 1 from C++?

In C++11 and C++14 we have much better options with the random header. The presentation rand() Considered Harmful by Stephan T. Lavavej explains why we should eschew the use of rand() in C++ in favor of the random header and N3924: Discouraging rand() in C++14 further reinforces this point.

The example below is a modified version of the sample code on the cppreference site and uses the std::mersenne_twister_engine engine and the std::uniform_real_distribution which generates numbers in the [0,1) range (see it live):

#include <iostream>
#include <iomanip>
#include <map>
#include <random>

int main()
{
std::random_device rd;

std::mt19937 e2(rd());

std::uniform_real_distribution<> dist(0, 1);

std::map<int, int> hist;
for (int n = 0; n < 10000; ++n) {
++hist[std::round(dist(e2))];
}

for (auto p : hist) {
std::cout << std::fixed << std::setprecision(1) << std::setw(2)
<< p.first << ' ' << std::string(p.second/200, '*') << '\n';
}
}

output will be similar to the following:

0 ************************
1 *************************

Since the post mentioned that speed was important then we should consider the cppreference section that describes the different random number engines (emphasis mine):

The choice of which engine to use involves a number of tradeoffs*: the
**linear congruential engine is moderately fast
and has a very small
storage requirement for state. The lagged Fibonacci generators are
very fast even on processors without advanced arithmetic instruction

sets, at the expense of greater state storage and sometimes less
desirable spectral characteristics. The Mersenne twister is slower and
has greater state storage requirements
but with the right parameters
has the longest non-repeating sequence with the most desirable
spectral characteristics (for a given definition of desirable).

So if there is a desire for a faster generator perhaps ranlux24_base or ranlux48_base are better choices over mt19937.

rand()

If you forced to use rand() then the C FAQ for a guide on How can I generate floating-point random numbers?, gives us an example similar to this for generating an on the interval [0,1):

#include <stdlib.h>

double randZeroToOne()
{
return rand() / (RAND_MAX + 1.);
}

and to generate a random number in the range from [M,N):

double randMToN(double M, double N)
{
return M + (rand() / ( RAND_MAX / (N-M) ) ) ;
}

RcppArmadillo gamma distribution differs between platforms with same seed

Summarizing the discussion in the comments:

  • For the gamma distribution, Armadillo uses std::gamma_distribution from C++11's random header, c.f. https://gitlab.com/conradsnicta/armadillo-code/blob/9.300.x/include/armadillo_bits/arma_rng_cxx11.hpp#L165
  • The algorithms for producing the standard random number distributions in C++ are implementation-defined.
  • If you need cross-platform reproducible results the easiest solution is using the gamma distribution implemented in R via R::rgamma or Rcpp::rgamma.


Related Topics



Leave a reply



Submit