<Random> Generates Same Number in Linux, But Not in Windows

random generates same number in Linux, but not in Windows

Here's what's going on:

  • default_random_engine in libstdc++ (GCC's standard library) is minstd_rand0, which is a simple linear congruential engine:

    typedef linear_congruential_engine<uint_fast32_t, 16807, 0, 2147483647> minstd_rand0;

  • The way this engine generates random numbers is x_{i+1} = (16807 · x_i + 0) mod 2147483647.

  • Therefore, if the seeds are different by 1, then most of the time the first generated number will differ by 16807.

  • The range of this generator is [1, 2147483646]. The way libstdc++'s uniform_int_distribution maps it to an integer in the range [1, 100] is essentially this: generate a number n. If the number is not greater than 2147483600, then return (n - 1) / 21474836 + 1; otherwise, try again with a new number.

    It should be easy to see that in the vast majority of cases, two values of n that differ by only 16807 will yield the same number in [1, 100] under this procedure. In fact, with a seed that advances by one each second (such as time(NULL)), one would expect the generated number to increase by one about every 21474836 / 16807 ≈ 1278 seconds, or 21.3 minutes, which agrees pretty well with your observations.

MSVC's default_random_engine is mt19937, which doesn't have this problem.
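
The procedure described in the list above can be sketched by hand, without relying on any particular standard library. This is only an illustration of the recurrence and the divide-and-retry mapping as described; `lcg_next` and `first_draw_1_to_100` are hypothetical helper names, not libstdc++ functions.

```cpp
#include <cassert>
#include <cstdint>

// Parameters of minstd_rand0 as given in the typedef above.
constexpr std::uint64_t kMultiplier = 16807;
constexpr std::uint64_t kModulus = 2147483647;

// One step of x_{i+1} = (16807 * x_i) mod 2147483647.
// 64-bit arithmetic keeps the product from overflowing.
std::uint64_t lcg_next(std::uint64_t x) {
    return (kMultiplier * x) % kModulus;
}

// First value in [1, 100] obtained from a seed, following the
// "divide by 21474836, retry if too large" mapping described above.
int first_draw_1_to_100(std::uint64_t seed) {
    assert(seed % kModulus != 0);  // a zero state would stay zero in this sketch
    std::uint64_t n = lcg_next(seed % kModulus);
    while (n > 2147483600ULL) {    // reject values past the last full bucket
        n = lcg_next(n);
    }
    return static_cast<int>((n - 1) / 21474836ULL) + 1;
}
```

Under these assumptions, seeds 1 through 1277 all give 1 as their first draw, and seed 1278 is the first one to give 2.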

different generated random numbers on Windows and Linux for a specific PRNG

It is a bug in randtoolbox: the C code doesn't use fixed-width integer types like uint32_t and has some subtle bugs when int and long differ in width (as on 64-bit Linux, but not on Windows). For example, in the file mt19937ar.c an int is assigned to an unsigned long, which causes sign extension:

static unsigned long mt[N]; /* the array for the state vector  */
...
void putMersenneTwister(int *init, int *res, int *state)
{
    ...
    for (i=0; i<N; i++)
        mt[i] = state[i+1];

If you replace the last line with:

        mt[i] = state[i+1] & 0xffffffffUL;

the bug disappears.

Just download the source code, extract it, patch it and execute:

R CMD INSTALL randtoolbox

in the package parent directory.
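
The sign-extension effect can be reproduced in isolation. Below is a minimal sketch that uses fixed-width types so the result does not depend on how wide `long` is on a given platform; `widen_plain` and `widen_masked` are hypothetical names and are not part of randtoolbox.

```cpp
#include <cassert>
#include <cstdint>

// Widen a 32-bit signed value to 64 bits the way a plain assignment does:
// negative inputs are sign-extended, so the upper 32 bits become all ones.
std::uint64_t widen_plain(std::int32_t word) {
    return static_cast<std::uint64_t>(word);
}

// Widen and keep only the low 32 bits, mirroring the `& 0xffffffffUL` fix.
std::uint64_t widen_masked(std::int32_t word) {
    const std::uint64_t result = static_cast<std::uint64_t>(word) & 0xffffffffULL;
    assert(result <= 0xffffffffULL);  // the upper half is always clear after masking
    return result;
}
```

For an input of -1, `widen_plain` yields 0xffffffffffffffff while `widen_masked` yields 0xffffffff; for non-negative inputs the two agree, which is why the bug only shows up for some state words.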

Unix Shell - Why are the same $RANDOM numbers repeated?

This is due to a zsh bug / "behaviour" for RANDOM in subshells. This bug doesn't appear in bash.

echo $RANDOM # changes at every run
echo `echo $RANDOM` # always returns the same value until you run the first line again

This happens because RANDOM is seeded by its last value, but a value drawn in a subshell does not update the generator state in the main shell.

In man zshparam:

RANDOM <S>
A pseudo-random integer from 0 to 32767, newly generated each
time this parameter is referenced. The random number generator
can be seeded by assigning a numeric value to RANDOM.

The values of RANDOM form an intentionally-repeatable
pseudo-random sequence; subshells that reference RANDOM will
result in identical pseudo-random values unless the value of
RANDOM is referenced or seeded in the parent shell in between
subshell invocations.

It gets even stranger, because piping the output to uniq creates a subshell:

for i in {1..10}; do echo $RANDOM; done # changes at every run 
for i in {1..10}; do echo $RANDOM; done | uniq # always the same 10 numbers

Source: Debian bug report 828180
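
The mechanism is not specific to zsh: any copy of a generator's state replays the same values unless the original is advanced in between. What follows is a rough C++ analogy, not zsh's actual implementation; `draw_in_copy` is a hypothetical helper standing in for a subshell, and the seed 12345 is arbitrary.

```cpp
#include <random>

// Stand-in for a subshell: it receives a *copy* of the parent's generator,
// so drawing here never advances the parent's state.
unsigned long draw_in_copy(std::minstd_rand0 parent_state) {
    return parent_state();
}

// Two "subshell" draws with nothing drawn in the parent in between.
bool repeated_without_parent_draw() {
    std::minstd_rand0 parent(12345);
    return draw_in_copy(parent) == draw_in_copy(parent);
}

// The parent draws once between the two "subshells", advancing its state.
bool repeated_with_parent_draw() {
    std::minstd_rand0 parent(12345);
    const unsigned long first = draw_in_copy(parent);
    parent();  // referencing the value in the parent moves the sequence on
    const unsigned long second = draw_in_copy(parent);
    return first == second;
}
```

Here `repeated_without_parent_draw()` returns true and `repeated_with_parent_draw()` returns false, which mirrors the man page wording about referencing RANDOM in the parent shell between subshell invocations.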

how to generate the same random number in two different environments?

You can use the Mersenne Twister: its output is reproducible because the algorithm is standardized.
Use the same seed on both machines and you're good to go.

#include <random>
#include <iostream>

int main()
{
    std::mt19937 engine;

    engine.seed(1);

    for (std::size_t n = 0; n < 10; ++n)
    {
        std::cout << engine() << std::endl;
    }
}

You can verify it here: https://godbolt.org/z/j5r6ToGY7. Just select different compilers and check the output.
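
The reproducibility rests on a guarantee in the C++ standard: the 10000th consecutive invocation of a default-constructed std::mt19937 must produce 4123659995. Here is a small sketch that checks this on whatever compiler you have at hand; `nth_output` is a hypothetical helper name.

```cpp
#include <cassert>
#include <random>

// Value returned by the n-th call (1-based) to a default-constructed engine.
std::mt19937::result_type nth_output(unsigned long long n) {
    assert(n >= 1);
    std::mt19937 engine;     // default-constructed, so it uses the default seed
    engine.discard(n - 1);   // skip the first n-1 outputs
    return engine();
}
```

`nth_output(10000)` should be 4123659995 on any conforming implementation. Note that this guarantee covers the engine only: the standard does not fix the algorithm behind distributions such as std::uniform_int_distribution, so values drawn through a distribution may still differ between standard libraries.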

std::random - On Windows always I get the same numbers

The problem here is that the std::random_device object (which you are using to seed your std::mt19937) may produce the same seed each time (although it doesn't on my Windows 10 + Visual Studio test platform).

From cppreference (bolding mine):

std::random_device may be implemented in terms of an
implementation-defined pseudo-random number engine if a
non-deterministic source (e.g. a hardware device) is not available to
the implementation. In this case each std::random_device object may
generate the same number sequence.

Here's a possible solution using the 'classic' call to time(nullptr) as the seed, which also avoids the use of the 'intermediate' std::random_device object to generate that seed (though there will be other options to get that seed):

#include <iostream>
#include <random>
#include <ctime>

int main()
{
    std::mt19937 gen(static_cast<unsigned int>(time(nullptr)));
    std::uniform_int_distribution<> distribut(1, 6);
    for (int i = 0; i < 10; i++) {
        std::cout << distribut(gen) << ' ';
    }
    return 0;
}
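
One of the "other options" for the seed is to mix several sources, so that the seed still varies if std::random_device turns out to be repetitive or if two runs start within the same second. This is a minimal sketch of that idea, not a recommendation for security-sensitive code; `make_engine` and `roll_die` are hypothetical helper names.

```cpp
#include <chrono>
#include <cstdint>
#include <random>

// Build an mt19937 seeded from both std::random_device and a clock reading,
// so the seed still changes between runs if one source is repetitive.
std::mt19937 make_engine() {
    std::random_device device;
    const auto ticks =
        std::chrono::high_resolution_clock::now().time_since_epoch().count();
    const auto wide_ticks = static_cast<std::uint64_t>(ticks);
    std::seed_seq seq{
        static_cast<std::uint32_t>(device()),
        static_cast<std::uint32_t>(wide_ticks),         // low 32 bits of the clock
        static_cast<std::uint32_t>(wide_ticks >> 32)};  // high 32 bits of the clock
    return std::mt19937(seq);
}

// Roll a six-sided die with the given engine.
int roll_die(std::mt19937& engine) {
    std::uniform_int_distribution<int> die(1, 6);
    return die(engine);
}
```

A caller would create the engine once with `make_engine()` and then call `roll_die(engine)` repeatedly; creating a new engine for every roll would defeat the purpose.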

random.random() generates same number in multiprocessing

You need to use shared memory if you want to share variables across processes, because child processes do not share their memory space with the parent. The simplest way to do this here is to use a managed list and delete the line where you set a fixed seed. That line is what causes the same number to be generated, because every child process starts from the same seed. To get different random numbers, either don't set a seed or pass a different seed to each process:

import time, random
from multiprocessing import Manager, Process

def foo(epoch, lst):
    extra = random.random()
    lst.append(epoch + extra)

def optimization(loop_time, iter_time, lst):
    start = time.time()
    epoch = 0
    while time.time() <= start + loop_time:
        proc = Process(target=foo, args=(epoch, lst))
        proc.start()
        proc.join(iter_time)
        if proc.is_alive():  # if the process is not terminated within time limit
            print("Time out!")
            proc.terminate()
    print(lst)

if __name__ == '__main__':
    manager = Manager()
    lst = manager.list()
    optimization(10, 2, lst)

Output

[0.2035898948744943, 0.07617925389396074, 0.6416754412198231, 0.6712193790613651, 0.419777147554235, 0.732982735576982, 0.7137712131028766, 0.22875414425414997, 0.3181113880578589, 0.5613367673646847, 0.8699685474084119, 0.9005359611195111, 0.23695341111251134, 0.05994288664062197, 0.2306562314450149, 0.15575356275408125, 0.07435292814989103, 0.8542361251850187, 0.13139055891993145, 0.5015152768477814, 0.19864873743952582, 0.2313646288041601, 0.28992667535697736, 0.6265055915510219, 0.7265797043535446, 0.9202923318284002, 0.6321511834038631, 0.6728367262605407, 0.6586979597202935, 0.1309226720786667, 0.563889613032526, 0.389358766191921, 0.37260564565714316, 0.24684684162272597, 0.5982042933298861, 0.896663326233504, 0.7884030244369596, 0.6202229004466849, 0.4417549843477827, 0.37304274232635715, 0.5442716244427301, 0.9915536257041505, 0.46278512685707873, 0.4868394190894778, 0.2133187095154937]

Keep in mind that using managers will affect the performance of your code. Alternatively, you could use multiprocessing.Array, which is faster than managers but less flexible in what data it can store, or a Queue.
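
The "different seed per process" option can be made reproducible by deriving each worker's seed from a shared base seed plus the worker's index. The sketch below shows the idea in C++ (to match the other examples on this page) rather than with Python's multiprocessing; `engine_for_worker` and `first_draws` are hypothetical helper names.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Engine for one worker, derived from a shared base seed plus the worker's id.
std::mt19937 engine_for_worker(std::uint32_t base_seed, std::uint32_t worker_id) {
    std::seed_seq seq{base_seed, worker_id};
    return std::mt19937(seq);
}

// First `count` values a given worker would draw in [0, 1).
std::vector<double> first_draws(std::uint32_t base_seed,
                                std::uint32_t worker_id,
                                int count) {
    std::mt19937 engine = engine_for_worker(base_seed, worker_id);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    std::vector<double> draws;
    for (int i = 0; i < count; ++i) {
        draws.push_back(unit(engine));
    }
    return draws;
}
```

With this scheme, the same (base seed, worker id) pair always replays the same draws, while workers with different ids get different sequences.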

Same seed, different OS, different random numbers in R

From the documentation for Random:

RNGversion can be used to set the random generators as they were in an earlier R version (for reproducibility).

So try this on all systems:

set.seed(10, kind = "Mersenne-Twister", normal.kind = "Inversion"); rnorm(1)
[1] 0.01874617

Pseudo random number generator gives same first output but then behaves as expected

I'm not sure what's going wrong (yet!), but you can still initialize by time as follows without hitting the problem (borrowed from here).

#include <ctime>
#include <iostream>
#include <random>
#include <chrono>

using namespace std;

unsigned seed1 = std::chrono::system_clock::now().time_since_epoch().count();

default_random_engine gen(seed1); //gen(time(NULL));
uniform_int_distribution<int> dist(10,200);

int main()
{
    for(int i = 0; i < 5; i++)
        cout<<dist(gen)<<endl;

    return 0;
}

You can also use the random device, which is non-deterministic (it steals timing information from your key strokes, mouse movements, and other sources to generate unpredictable numbers). This is the strongest seed you can choose, but the computer's clock is the better way to go if you don't need strong guarantees, because the computer can run out of "randomness" if you use it too often (it takes many key strokes and mouse movements to generate a single truly random number).

std::random_device rd;
default_random_engine gen(rd());

Running

cout<<time(NULL)<<endl;
cout<<std::chrono::system_clock::now().time_since_epoch().count()<<endl;
cout<<rd()<<endl;

on my machine generates

1413844318
1413844318131372773
3523898368

so the chrono library is providing a significantly larger and more rapidly changing number (it is in nanoseconds) than the ctime library. The random_device is producing non-deterministic numbers which are all over the map. So it seems as though the seeds ctime is producing may be too close together somehow and thus map partly to the same internal state?

I made another program which looks like this:

#include <ctime>    // time()
#include <iostream>
#include <random>
using namespace std;

int main(){
    int oldval = -1;
    unsigned int oldseed = -1;

    cout<<"Seed\tValue\tSeed Difference"<<endl;
    // Try every seed up to the current time and report each change in the first value.
    for(unsigned int seed=0;seed<time(NULL);seed++){
        default_random_engine gen(seed);
        uniform_int_distribution<int> dist(10,200);
        int val = dist(gen);
        if(val!=oldval){
            cout<<seed<<"\t"<<val<<"\t"<<(seed-oldseed)<<endl;
            oldval = val;
            oldseed = seed;
        }
    }
}

As you can see, this simply prints out the first output value for every possible random seed up to the current time along with the seed and number of previous seeds which had the same value. An excerpt of the output looks like this:

Seed  Value Seed Difference
0 10 1
669 11 669
1338 12 669
2007 13 669
2676 14 669
3345 15 669
4014 16 669
4683 17 669
5352 18 669
6021 19 669
6690 20 669
7359 21 669
8028 22 669
8697 23 669
9366 24 669
10035 25 669
10704 26 669
11373 27 669
12042 28 669
12711 29 669
13380 30 669
14049 31 669

So for every new first number there are 669 seeds which give that first number. Because the second and third numbers are different we are still generating unique internal states. I think we would have to understand much more about the default_random_engine in order to understand what is special about 669 (which can be factored into 3 and 223).

Given this, it's clear why the chrono and random_device libraries work better: the seeds they generate are almost always more than 669 apart. Keep in mind that even if the first number is the same, what matters in many programs is that the sequences of numbers generated are distinct.
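
The figure 669 can be accounted for with a little arithmetic, assuming that default_random_engine is minstd_rand0 and that the distribution uses the divide-and-retry mapping described in the first answer on this page. The sketch below only reproduces that arithmetic; `bucket_width` and `seeds_per_value` are hypothetical names, not library functions.

```cpp
#include <cassert>
#include <cstdint>

// Width of each output bucket when the generator's span [1, 2147483646] is
// split evenly over the integers lo..hi (the divide-and-retry mapping).
std::uint64_t bucket_width(int lo, int hi) {
    assert(lo <= hi);
    const std::uint64_t generator_span = 2147483646ULL - 1ULL;  // max - min
    const std::uint64_t outcomes = static_cast<std::uint64_t>(
        static_cast<std::int64_t>(hi) - static_cast<std::int64_t>(lo) + 1);
    return generator_span / outcomes;
}

// Approximate number of consecutive small seeds that share a first draw:
// adjacent seeds move the first raw output by the multiplier 16807.
double seeds_per_value(int lo, int hi) {
    return static_cast<double>(bucket_width(lo, hi)) / 16807.0;
}
```

For the range [10, 200] the bucket width is 11243369, which gives about 668.97 seeds per value and agrees with the observed spacing of 669. For [1, 100] the same calculation gives about 1277.7, the figure behind the 21.3 minutes mentioned earlier.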


