Vary range of uniform_int_distribution

Distribution objects are lightweight: simply construct a new distribution whenever you need a random number. I use this approach in a game engine, and after benchmarking it, found it comparable to using good old rand().

Also, I asked how to vary the range of a distribution on the GoingNative 2013 live stream, and Stephen T. Lavavej, a member of the standards committee, suggested simply creating new distributions, as it shouldn't be a performance issue.

Here's how I would write your code:

#include <random>

using uint32 = unsigned int;

class Random {
public:
    Random() = default;
    Random(std::mt19937::result_type seed) : eng(seed) {}
    uint32 DrawNumber(uint32 min, uint32 max);

private:
    std::mt19937 eng{std::random_device{}()};
};

uint32 Random::DrawNumber(uint32 min, uint32 max)
{
    return std::uniform_int_distribution<uint32>{min, max}(eng);
}

Use std::uniform_int_distribution and define its range later

You can just do

group.dis = std::uniform_int_distribution<>(0,19);

or

group.dis.param(std::uniform_int_distribution<>::param_type(0,19));

Another way would be to add a method to your struct

struct group_s {
    int k;
    std::uniform_int_distribution<> dis;
    void set(int a, int b) { dis = std::uniform_int_distribution<>(a, b); }
} group;

group.set(0,19);

C++11 Generating random numbers from frequently changing range

I would do

int getIntFromRange1(int from, int to)
{
    std::uniform_int_distribution<int> dist(from, to);
    return dist(mt);  // 'mt' is a random engine (e.g. std::mt19937) kept alive elsewhere
}

Distributions and internal state

Interesting question.

So I was wondering if interfering with how the distribution works by constantly resetting it (i.e. recreating the distribution at every call of get_int_from_range) I get properly distributed results.

I've written code to test this with uniform_int_distribution and poisson_distribution. It's easy enough to extend it to other distributions if you wish. The answer seems to be yes.

Boiler-plate code:

#include <random>
#include <chrono>

typedef std::mt19937_64 engine_type;

inline size_t get_seed()
{ return std::chrono::system_clock::now().time_since_epoch().count(); }

engine_type& engine_singleton()
{
    // Function-local static: constructed once on first use, thread-safe since C++11.
    static engine_type eng( get_seed() );
    return eng;
}

// ------------------------------------------------------------------------

#include <cmath>
#include <cstdio>
#include <vector>
#include <string>
#include <algorithm>

void plot_distribution( const std::vector<double>& D, size_t mass = 200 )
{
    const size_t n = D.size();
    for ( size_t i = 0; i < n; ++i )
    {
        printf( "%02zu: %s\n", i,
            std::string( static_cast<size_t>(D[i]*mass), '*' ).c_str() );
    }
}

double maximum_difference( const std::vector<double>& x, const std::vector<double>& y )
{
    const size_t n = x.size();

    double m = 0.0;
    for ( size_t i = 0; i < n; ++i )
        m = std::max( m, std::abs(x[i]-y[i]) );

    return m;
}

Code for the actual tests:

#include <iostream>
#include <vector>
#include <cstdio>
#include <random>
#include <string>
#include <cmath>

void compare_uniform_distributions( int lo, int hi )
{
    const size_t sample_size = 1e5;

    // Initialize histograms
    std::vector<double> H1( hi-lo+1, 0.0 ), H2( hi-lo+1, 0.0 );

    // Initialize distribution
    auto U = std::uniform_int_distribution<int>(lo,hi);

    // Count!
    for ( size_t i = 0; i < sample_size; ++i )
    {
        // Fresh engine each iteration vs. one persistent engine
        engine_type E(get_seed());

        H1[ U(engine_singleton())-lo ] += 1.0;
        H2[ U(E)-lo ] += 1.0;
    }

    // Normalize histograms to obtain "densities"
    for ( size_t i = 0; i < H1.size(); ++i )
    {
        H1[i] /= sample_size;
        H2[i] /= sample_size;
    }

    printf("Engine singleton:\n"); plot_distribution(H1);
    printf("Engine creation :\n"); plot_distribution(H2);
    printf("Maximum difference: %.3f\n", maximum_difference(H1,H2) );
    std::cout << std::string(50,'-') << std::endl << std::endl;
}

void compare_poisson_distributions( double mean )
{
    const size_t sample_size = 1e5;
    const size_t nbins = static_cast<size_t>(std::ceil(2*mean));

    // Initialize histograms
    std::vector<double> H1( nbins, 0.0 ), H2( nbins, 0.0 );

    // Initialize distribution
    auto U = std::poisson_distribution<int>(mean);

    // Count!
    for ( size_t i = 0; i < sample_size; ++i )
    {
        engine_type E(get_seed());
        int u1 = U(engine_singleton());
        int u2 = U(E);

        // Poisson values are non-negative; cast avoids a signed/unsigned comparison
        if ( static_cast<size_t>(u1) < nbins ) H1[u1] += 1.0;
        if ( static_cast<size_t>(u2) < nbins ) H2[u2] += 1.0;
    }

    // Normalize histograms to obtain "densities"
    for ( size_t i = 0; i < H1.size(); ++i )
    {
        H1[i] /= sample_size;
        H2[i] /= sample_size;
    }

    printf("Engine singleton:\n"); plot_distribution(H1);
    printf("Engine creation :\n"); plot_distribution(H2);
    printf("Maximum difference: %.3f\n", maximum_difference(H1,H2) );
    std::cout << std::string(50,'-') << std::endl << std::endl;
}

// ------------------------------------------------------------------------

int main()
{
    compare_uniform_distributions( 0, 25 );
    compare_poisson_distributions( 12 );
}



Does the C++ standard make any guarantee regarding this topic?

Not that I know of. However, I would say the standard makes an implicit recommendation not to re-create the engine every time: for any distribution Distrib, Distrib::operator() takes its engine by non-const reference (URNG&). This is understandably required because the engine needs to update its internal state, but it also implies that code looking like this

auto U = std::uniform_int_distribution<int>(0,10);
for ( <something here> ) U(engine_type());  // error: a temporary cannot bind to URNG&

does not compile, which to me is a clear incentive not to write code like this.


I'm sure there is plenty of advice out there on how to use the random library properly. It does get complicated if you have to handle the possibility of using random_devices and allowing deterministic seeding for testing purposes, but I thought it might be useful to throw my own recommendation out there too:

#include <random>
#include <chrono>
#include <utility>
#include <functional>

inline size_t get_seed()
{ return std::chrono::system_clock::now().time_since_epoch().count(); }

template <class Distrib>
using generator_type = std::function< typename Distrib::result_type () >;

template <class Distrib, class Engine = std::mt19937_64, class... Args>
inline generator_type<Distrib> get_generator( Args&&... args )
{
    // std::bind copies both the distribution and a freshly seeded engine
    // into the std::function, so each generator owns independent state.
    return std::bind( Distrib( std::forward<Args>(args)... ), Engine(get_seed()) );
}

// ------------------------------------------------------------------------

#include <iostream>

int main()
{
    auto U = get_generator<std::uniform_int_distribution<int>>(0,10);
    std::cout << U() << std::endl;
}

Hope this helps!

EDIT: My first recommendation was a mistake, and I apologise for that; we can't use a singleton engine as in the tests above, because std::bind would copy the singleton's state into each generator, so two generators over the same distribution would produce the same random sequence. Instead I rely on the fact that std::bind copies the newly created engine, with its own seed, into the std::function. This yields the expected behaviour: different generators with the same distribution produce different random sequences.

bernoulli_distribution vs uniform_int_distribution

Some comments and answers suggest using uniform_real_distribution instead.

I tested uniform_real_distribution(0.0f, nextafter(1.0f, 20.f)) (the nextafter accounts for uniform_real_distribution producing values in a half-open range) against bernoulli_distribution, and bernoulli_distribution was faster by about 20%-25% regardless of the probability. It also gave more correct results: with a true probability of 1.0, my uniform_real_distribution implementation occasionally produced false negatives (granted, one or two out of five one-million runs), while bernoulli_distribution correctly produced none.

So, speed-wise: bernoulli_distribution is faster than uniform_real_distribution but slower than uniform_int_distribution.

Long story short: use the right tool for the job, don't reinvent the wheel, the STL is well built, and depending on the use case one is better than the other.

For yes-no probability (IsPercentChance(float probability)), bernoulli_distribution is faster and better.

For a pure "give me a random bool value", uniform_int_distribution is faster and better.


