C++11 Random Number Distributions Are Not Consistent Across Platforms -- What Alternatives Are There

I have created my own C++11 distributions:

#include <cmath>   // std::sqrt, std::log, std::sin

template <typename T>
class UniformRealDistribution
{
public:
    typedef T result_type;

public:
    UniformRealDistribution(T _a = 0.0, T _b = 1.0)
        : m_a(_a),
          m_b(_b)
    {}

    void reset() {}

    template <class Generator>
    T operator()(Generator &_g)
    {
        double dScale = (m_b - m_a) / ((T)(_g.max() - _g.min()) + (T)1);
        return (_g() - _g.min()) * dScale + m_a;
    }

    T a() const {return m_a;}
    T b() const {return m_b;}

protected:
    T m_a;
    T m_b;
};

template <typename T>
class NormalDistribution
{
public:
    typedef T result_type;

public:
    NormalDistribution(T _mean = 0.0, T _stddev = 1.0)
        : m_mean(_mean),
          m_stddev(_stddev)
    {}

    void reset()
    {
        m_distU1.reset();
    }

    template <class Generator>
    T operator()(Generator &_g)
    {
        // Use the Box-Muller transform. Note: u1 == 0 would make log(u1)
        // blow up; with a 32-bit engine this happens with probability 2^-32.
        const double pi = 3.14159265358979323846264338327950288419716939937511;
        double u1 = m_distU1(_g);
        double u2 = m_distU1(_g);
        double r = std::sqrt(-2.0 * std::log(u1));
        return m_mean + m_stddev * r * std::sin(2.0 * pi * u2);
    }

    T mean() const {return m_mean;}
    T stddev() const {return m_stddev;}

protected:
    T m_mean;
    T m_stddev;
    UniformRealDistribution<T> m_distU1;
};

The uniform distribution seems to deliver good results and the normal distribution delivers very good results:

100000 values -> 68.159% within 1 sigma; 95.437% within 2 sigma; 99.747% within 3 sigma

The normal distribution uses the Box-Muller method, which according to what I have read so far is not the fastest method, but it runs more than fast enough for my application.

Both the uniform and normal distributions should work with any C++11 engine (tested with std::mt19937) and produce the same sequence on all platforms, which is exactly what I wanted.

c++ normal_distribution gives different results on different platforms

std::mt19937 and siblings are very specific algorithms. The standard requires, for example, that "[t]he 10000th consecutive invocation of a default-constructed object of type mt19937 shall produce the value 4123659995". There's no wiggle room here.

std::normal_distribution and siblings, by contrast, are only required to produce results distributed according to a certain probability density function. There is no requirement that they use any particular algorithm to do so.

Consistent pseudo-random numbers across platforms

Something like a Mersenne Twister (from Boost.Random) is deterministic.

Why does std::uniform_int_distribution not give the same answer across different systems?

The algorithm for distributions is not specified, so different implementations may produce different results.

(The only components the standard fully specifies are the raw random bit engines; those produce predictable results.)

Can I use a random generator without specifying a number distribution?

std::mt19937 produces uniform random numbers in the range [0, 2^32 - 1]. It implements the Mersenne Twister algorithm and is guaranteed to provide reproducible results across implementations.

If you need a different range, you need to somehow reduce [0, 2^32 - 1] to your desired range. std::uniform_int_distribution is a convenience tool for doing that (but provides no guarantee of portability across implementations).

Consistent random number generation across platforms with boost::random

It seems boost::random does not guarantee that you get the same sequence of numbers for a given seed across different versions of Boost.

E.g. in version 1.56 they changed the algorithm for generating normally distributed random numbers from the Box-Muller method to the Ziggurat method:

https://github.com/boostorg/random/commit/f0ec97ba36c05ef00f2d29dcf66094e3f4abdcde

This method is faster but also produces different number sequences.

Similar changes have probably been made to the other distributions. The uniform distribution still produces the same results, as that is typically just the raw output of the base RNG, which is a Mersenne Twister (mt19937) by default.

Is generate_canonical output consistent across platforms?

The difficulties encountered in the linked question point to the basic problem with consistency: rounding mode. The clear intent of the mathematical definition of generate_canonical in the standard is that the URNG be called several times, each call producing a non-overlapping block of entropy to fill the result with; that would be entirely consistent across platforms. The problem is, no indication is given as to what to do with the extra bits below the LSB. Depending on rounding mode and summation order, these can round upwards, spilling into the next block (which is what allows for a 1.0 result).

Now, the precise wording is "the instantiation’s results...are distributed as uniformly as possible as specified below". If the rounding mode is round-to-nearest, an implementation which produces 1.0 is not as uniform as possible (because 1-eps is less likely than 1-2*eps). But it's still "as specified below". So depending on how you parse that sentence, generate_canonical is either fully specified and consistent, or has delegated some extra un-discussed bits to the implementation.

In any case, the fact that certain implementations produce 1.0 makes it quite clear that the current behavior is not cross-platform consistent. If you want that, it seems like the most straightforward approach would be to wrap your URNG in an independent_bits_engine producing some multiple of bits bits (where bits is generate_canonical's precision parameter), so there's never anything to round.

Using one random engine for multiple distributions in C++11

It's OK.

Reasons not to share the generator:

  • threading (standard RNG implementations are not thread-safe)
  • determinism of random sequences:

    If you wish to be able (for testing/bug hunting) to control the exact sequences generated, you will likely have fewer troubles by isolating the RNGs used, especially when not all RNG consumption is deterministic.


