How to Succinctly, Portably, and Thoroughly Seed the Mt19937 Prng

If we seed c++11 mt19937 as the same on different machines, will we get the same sequence of random numbers

The generator will generate the same values.

The distributions may not, at least with different compilers or library versions. The standard did not specify their behaviour to that level of detail. If you want stability between compilers and library versions, you have to roll your own distribution.

Barring library/compiler changes, that will return the same values in the same sequence. But if you care write your own distribution.

...

All PRNGs have patterns and periods. mt19937 is named after its period of 2^19937-1, which is unlikely to be a problem. But other patterns can develop. MT PRNGs are robust against many statistical tests, but they are not crytographically secure PRNGs.

So it being a problem if you run for months will depend on specific details of what you'd find to be a problem. However, mt19937 is going to be a better PRNG than anything you are likely to write yourself. But assume attackers can predict its future behaviour from past evidence.

Does std::mt19937 require warmup?

Mersenne Twister is a shift-register based pRNG (pseudo-random number generator) and is therefore subject to bad seeds with long runs of 0s or 1s that lead to relatively predictable results until the internal state is mixed up enough.

However the constructor which takes a single value uses a complicated function on that seed value which is designed to minimize the likelihood of producing such 'bad' states. There's a second way to initialize mt19937 where you directly set the internal state, via an object conforming to the SeedSequence concept. It's this second method of initialization where you may need to be concerned about choosing a 'good' state or doing warmup.


The standard includes an object conforming to the SeedSequence concept, called seed_seq. seed_seq takes an arbitrary number of input seed values, and then performs certain operations on these values in order to produce a sequence of different values suitable for directly setting the internal state of a pRNG.

Here's an example of loading up a seed sequence with enough random data to fill the entire std::mt19937 state:

std::array<int, 624> seed_data;
std::random_device r;
std::generate_n(seed_data.data(), seed_data.size(), std::ref(r));
std::seed_seq seq(std::begin(seed_data), std::end(seed_data));

std::mt19937 eng(seq);

This ensures that the entire state is randomized. Also, each engine specifies how much data it reads from the seed_sequence so you may want to read the docs to find that info for whatever engine you use.

Although here I load up the seed_seq entirely from std::random_device, seed_seq is specified such that just a few numbers that aren't particularly random should work well. For example:

std::seed_seq seq{1, 2, 3, 4, 5};
std::mt19937 eng(seq);

In the comments below Cubbi indicates that seed_seq works by performing a warmup sequence for you.

Here's what should be your 'default' for seeding:

std::random_device r;
std::seed_seq seed{r(), r(), r(), r(), r(), r(), r(), r()};
std::mt19937 rng(seed);

Correctly seeding random number generator (Mersenne twister) c++

The Random object is essentially state information that you need to preserve. You can use all the normal techniques: You could have it as a global variable or pass it around as a parameter. If a particular class needs random numbers you can keep a Random object as a class member to provide randomness for that class.


The C++ <random> library is similar in that it requires the construction of an object as the source of randomness/RNG state. This is a good design because it allows the program to control access to the state and, for example, guarantee good behavior with multiple threads. The C++ <random> library even includes mersenne twister algorithm.

Here's an example showing saving a RNG state as a class member (using std::mt19937 instead of Random)

#include <random> // for mt19937
#include <algorithm> // for std::shuffle
#include <vector>

struct Deck {
std::vector<Cards> m_cards;
std::mt19937 eng; // save RNG state as class member so we don't have to keep creating one

void shuffle() {
std::shuffle(std::begin(m_cards), std::end(m_cards), eng);
}
};

int main() {
Deck d;
d.shuffle();
d.shuffle(); // this reuses the RNG state as it was at the end of the first shuffle, no reseeding
}

What is the benefit of using static thread_local with std::mt19937?

If it wasn't thread_local, then calling get_rand from multiple threads would cause a data race and therefore undefined behavior.

Although the static initialization is always safe, even with multiple threads calling it, the call to the generator in distribution(generator), which modifies the generator's internal state, is not.

With thread_local each thread has its own generator and so there is no issue with them calling the function unsynchronized.


Also note that time(0) is a bad idea. General issue with that as a seed aside, if multiple threads call the function at a similar time, they are likely to be seeded with the same value and then the random numbers in the threads will all be identical (which is very likely not what you want).

A somewhat better seeding would be

thread_local std::mt19937 generator(std::random_device{}());

instead, assuming your platform implements std::random_device as a proper source of randomness (which should generally be the case, but e.g. is not on some versions of MinGW). (Be careful though if your platform does not properly implement std::random_device. In that case it might always produce the same sequence of numbers.) For more details on seeding the standard library random number generators see e.g. this question. Also note that even with platform support, this is still a mediocre seeding since it (typically) uses only 32bits of entropy. See e.g. this blog post for some details about these issues. Correct seeding is a quite non-trivial problem.

For a declaration at block scope static is redundant by the way when using thread_local.



Related Topics



Leave a reply



Submit