Does Std::Mt19937 Require Warmup

Does std::mt19937 require warmup?

Mersenne Twister is a shift-register based pRNG (pseudo-random number generator) and is therefore subject to bad seeds with long runs of 0s or 1s that lead to relatively predictable results until the internal state is mixed up enough.

However the constructor which takes a single value uses a complicated function on that seed value which is designed to minimize the likelihood of producing such 'bad' states. There's a second way to initialize mt19937 where you directly set the internal state, via an object conforming to the SeedSequence concept. It's this second method of initialization where you may need to be concerned about choosing a 'good' state or doing warmup.


The standard includes an object conforming to the SeedSequence concept, called seed_seq. seed_seq takes an arbitrary number of input seed values, and then performs certain operations on these values in order to produce a sequence of different values suitable for directly setting the internal state of a pRNG.

Here's an example of loading up a seed sequence with enough random data to fill the entire std::mt19937 state:

std::array<int, 624> seed_data;
std::random_device r;
std::generate_n(seed_data.data(), seed_data.size(), std::ref(r));
std::seed_seq seq(std::begin(seed_data), std::end(seed_data));

std::mt19937 eng(seq);

This ensures that the entire state is randomized. Also, each engine specifies how much data it reads from the seed_sequence so you may want to read the docs to find that info for whatever engine you use.

Although here I load up the seed_seq entirely from std::random_device, seed_seq is specified such that just a few numbers that aren't particularly random should work well. For example:

std::seed_seq seq{1, 2, 3, 4, 5};
std::mt19937 eng(seq);

In the comments below Cubbi indicates that seed_seq works by performing a warmup sequence for you.

Here's what should be your 'default' for seeding:

std::random_device r;
std::seed_seq seed{r(), r(), r(), r(), r(), r(), r(), r()};
std::mt19937 rng(seed);

Mersenne twister warm up vs. reproducibility

Two options:

  1. Follow the proposal you have, but instead of using std::random_device r; to generate your seed sequence for MT, use a different PRNG seeded with m. Choose one that doesn't suffer like MT does from needing a warmup when used with small seed data: I suspect an LCG will probably do. For massive overkill, you could even use a PRNG based on a secure hash. This is a lot like "key stretching" in cryptography, if you've heard of that. You could in fact use a standard key stretching algorithm, but you're using it to generate a long seed sequence rather than large key material.

  2. Continue using m to seed your MT, but discard a large constant amount of data before starting the simulation. That is to say, ignore the advice to use a strong seed and instead run the MT long enough for it to reach a decent internal state. I don't know off-hand how much data you need to discard, but I expect the internet does.

How to avoid same random results using std::mt19937?

This is because f1 is copied into generate_n() function invocation. Use std::ref(f1) instead and it will return different results for v0 and v1:

std::generate_n(std::back_inserter(v0), 3, std::ref(f1));
std::generate_n(std::back_inserter(v1), 3, std::ref(f1));

If we seed c++11 mt19937 as the same on different machines, will we get the same sequence of random numbers

The generator will generate the same values.

The distributions may not, at least with different compilers or library versions. The standard did not specify their behaviour to that level of detail. If you want stability between compilers and library versions, you have to roll your own distribution.

Barring library/compiler changes, that will return the same values in the same sequence. But if you care write your own distribution.

...

All PRNGs have patterns and periods. mt19937 is named after its period of 2^19937-1, which is unlikely to be a problem. But other patterns can develop. MT PRNGs are robust against many statistical tests, but they are not crytographically secure PRNGs.

So it being a problem if you run for months will depend on specific details of what you'd find to be a problem. However, mt19937 is going to be a better PRNG than anything you are likely to write yourself. But assume attackers can predict its future behaviour from past evidence.

Random integers from a function always return the same number - why?

You're creating a new std::random_device and std::mt19937 generator every time you call guessGetRandomIndex. Mark them as static or pass them to the function via reference to avoid re-instantiating them:

int guessGetRandomIndex() {
static std::random_device rd; // obtain a random number from hardware
static std::mt19937 eng(rd()); // seed the generator
std::uniform_int_distribution<> distr(0, 2);
return distr(eng);
}

In order to generate pseudorandom numbers, the generators provided by the standard library contain an internal status that varies every time they generate a value. If you keep re-instantiating a generator, the internal state will always match its initial one, thus always generating the same number.

Should I use a random engine seeded from std::random_device or use std::random_device every time

The standard practice, as far as I am aware, is to seed the random number generator with a number that is not calculated by the computer (but comes from some external, unpredictable source). That should be the case with your rd() function. From then on, you call the pseudo-random number generator(PRNG) for each and every pseudo-random number that you need.

If you are worried about the numbers not being random enough, then you should pick a different PRNG. Entropy is a scarce and precious resource and should be treated as such. Although, you may not be needing that many random numbers right now, you may in the future; or other applications could need them. You want that entropy to be available whenever an application asks for it.

It sounds like, for your application, that the mersenne twister will suit your needs just fine. No one who plays your game will ever feel like the teams that are loaded aren't random.

std::mt19937 and std::uniform_real_distribution returning boundary value every time

According to C++14 section 26.5.8.2.2 paragraph 2:

Requires: a ≤ b and b − a ≤ numeric_limits<RealType>::max().

In your case, b-a is greater than the allowed range.



Related Topics



Leave a reply



Submit