How to Generate Thread-Safe Uniform Random Numbers

How do I generate thread-safe uniform random numbers?

Have you tried this?

int intRand(const int & min, const int & max) {
static thread_local std::mt19937 generator;
std::uniform_int_distribution<int> distribution(min,max);
return distribution(generator);
}

Distributions are extremely cheap (they will be completely inlined by the optimiser so that the only remaining overhead is the actual random number rescaling). Don’t be afraid to regenerate them as often as you need – in fact, resetting them would conceptually be no cheaper (which is why that operation doesn’t exist).

The actual random number generator, on the other hand, is a heavy-weight object carrying a lot of state and requiring quite some time to be constructed, so that should only be initialised once per thread (or even across threads, but then you’d need to synchronise access which is more costly in the long run).

C++ thread-safe uniform distribution random number generation

Based on the already mentioned solution, here is a version adapted to your specific needs:

double doubleRand(double min, double max) {
thread_local std::mt19937 generator(std::random_device{}());
std::uniform_real_distribution<double> distribution(min, max);
return distribution(generator);
}

c++ generate thread safe random numbers

I would probably generate the seeds in main, and pass a seed to each thread function. I wouldn't use the output of std::random_device directly either--I'd put numbers into something like an std::set or std::unordered_set until I got as many seeds as I wanted, to assure that I didn't give two threads the same seed (which would obviously be a waste of time).

Something along this general line:

int do_work(unsigned long long seed) {

//Get random number generators
typedef std::mt19937 MyRNG;

//seed generator
MyRNG rng(seed);

//make my uniform distributions for each parameter
std::uniform_real_distribution<> param1(-1,1);
std::uniform_real_distribution<> param2(-1,1);

double x,y;
//Do my scan
for (int i = 0; i < N; i++) {
x = param1(rng);
y = param2(rng);

//Do things with x and y*
}
}

static const int num_threads = 4;

int main() {
std::set<unsigned long long> seeds;

while (seeds.size() < num_threads)
seeds.insert(std::random_device()());

std::vector<std::thread> threads;

for (auto const seed: seeds)
threads.emplace_back(std::thread(do_work, seed));

for (auto &t : threads)
t.join();
}

As an aside, using a single result from random_device to seed an std::mt19937 restricts the generator quite a bit--you're giving it only 32 (or possibly 64) bits of seed, but it actually has 19937 bits of seed material. std::seed_seq attempts to ameliorate this to at least some degree (among other things, you can use a number of outputs from std::random_device to create the seed.

Oh, and given that your two instances of uniform_real_distribution use the same parameters, there's probably not a whole lot of need for two separate distribution objects either.

Safe random number generation using rand() or std::random_device in C++

Random number generator engines are not thread safe.

Using thread_local to initialize one engine per thread makes it thread safe, so I would say your solution is good.

If you have doubts about the thread safety of random_device, maybe make it thread_local also:

int randomInRange(const int min, const int max)
{
thread_local std::random_device rd;
thread_local std::mt19937 rng(rd());
thread_local std::uniform_int_distribution<int> uid;
return uid(rng, decltype(uid)::param_type{min,max});
}

There are more threads that talk about this that weren't mentioned:

C++ thread-safe uniform distribution random number generation

C++11 Thread safety of Random number generators

I'm sure that there are more.

Why isn't this random number generator thread-safe?

There are three observations about your test output I would make:

  • It has much stronger variance than a good random source's average should provide. You observed this yourself by comparing to the single thread results.

  • The calculated average decreases with thread count and never reaches the original 0.5 (i.e. it's not just higher variance but also changed mean).

  • There is a temporal relation in the data, particularly visible in the 4 thread case.

All this is explained by the race condition present in your code: You assign to SUM from multiple threads. Incrementing a double is not an atomic operation (even on x86, where you'll probably get atomic reads and writes on registers). Two threads may read the current value (e.g. 10), increment it (e.g. both add 0.5) and then write the value back to memory. Now you have two threads writing a 10.5 instead of the correct 11.

The more threads try to write to SUM concurrently (without synchronization), the more of their changes are lost. This explains all observations:

  • How hard the threads race each other in each individual run determines how many results are lost, and this can vary from run to run.

  • The average is lower with more races (for example more threads) because more results are lost. You can never exceeed the statistical 0.5 average because you only ever lose writes.

  • As the threads and scheduler "settle in", the variance is reduced. This is a similar reason to why you should "warm up" your tests when benchmarking.

Needless to say, this is undefined behavior. It just shows benign behavior on x86 CPUs, but this is not something the C++ standard guarantees you. For all you know, the individual bytes of a double might be written to by different threads at the same time resulting in complete garbage.

The proper solution would be adding the doubles thread-locally and then (with synchronization) adding them together in the end. OMP has reduction clauses for this specific purpose.

For integral types, you could use std::atomic<IntegralType>::fetch_add(). std::atomic<double> exists but (before C++20) the mentioned function (and others) are only available for integral types.

Thread-safe uniform random number generator

A good Pseudorandom number generator for Fortran90 can be found in the Intel Math Kernel Vector Statistical Library. They are thread safe. Also, why does it need to be threadsafe? If you want each thread to get the same list, instantiate a new PRNG for each thread with the same seed.

Thread-safe using of random

You're never meant to use a PRNG for long enough that you can see a perfectly uniform distribution from it. You should only see a random approximation of a uniform distribution. The fact that giving every thread its own PRNG means it would take numThreads times longer for every thread to see that perfect distribution does not matter.

PRNGs are subject to mathematical analysis to confirm that they produce each possible output the same number of times, but this does not reflect how they're meant to be used.

If they were used this way it would represent a weakness. If you watch it for long enough you know that having seen output x n times, and every other output n+1 times, the next output must be x. This is uniform but it is obviously not random.

A true RNG will never produce a perfectly uniform output, but it should also never show the same bias twice. A PRNG known to have non-uniform output will have the same non-uniform output every time it's used. This means that if you run simulations for longer to average out the noise, the PRNG's own bias will eventually become the most statistically significant factor. So an ideal PRNG should eventually emit a perfectly uniform distribution (actually often not perfect, but within a known, very small margin) over a sufficiently long period.

Your random seed will pick a random point somewhere in that sequence and your random number requests will proceed some way along that sequence, but if you ever find yourself going all the way around and coming back to your own start point (or even within 1/10th that distance) then you need to get a bigger PRNG.

That said, there is another consideration. Often PRNGs are used specifically so that their results will be predictable.

If you're in a situation where you need a mutex to safely access a single PRNG, then you've probably already lost your predictable behaviour because you can't tell which thread got the next [predictable] random number. Each thread still sees an unpredictable sequence because you can't be sure which subset of PRNG results it got.

If every thread has its own PRNG, then (so long as other ordering constraints are met as necessary) you can still get predictable results.



Related Topics



Leave a reply



Submit