What Is the Optimal Algorithm for Generating an Unbiased Random Integer Within a Range

The problem occurs when the number of outputs from the random number generator (RAND_MAX+1) is not evenly divisible by the size of the desired range (max-min+1). Since there will be a consistent mapping from a random number to an output, some outputs will be mapped to more random numbers than others. This holds regardless of how the mapping is done: you can use modulo, division, conversion to floating point, or whatever voodoo you can come up with, and the basic problem remains.

The magnitude of the problem is very small, and undemanding applications can generally get away with ignoring it. The smaller the range and the larger RAND_MAX is, the less pronounced the effect will be.
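To get a feel for the size of the effect, here is a small sketch that counts how many raw generator values land on each result when a plain modulo is used. The function name `modulo_bucket_counts` and the choice of a generator with 256 outputs mapped onto 10 results are assumptions made for illustration.

```python
from collections import Counter

def modulo_bucket_counts(generator_outputs, n):
    """Count how many raw generator values map to each result under `value % n`."""
    return Counter(value % n for value in range(generator_outputs))

# Hypothetical generator with 256 outputs mapped onto a range of 10 results.
counts = modulo_bucket_counts(256, 10)
for result in sorted(counts):
    print(result, counts[result])
print("max/min", max(counts.values()) / min(counts.values()))
```

With 256 outputs and 10 results, six results receive 26 inputs and four receive 25. With a much larger generator range the same ratio moves far closer to 1, which is why the bias is often ignored.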

I took your example program and tweaked it a bit. First I created a special version of rand that only has a range of 0-255, to better demonstrate the effect. I made a few tweaks to rangeRandomAlg2. Finally I changed the number of "balls" to 1000000 to improve the consistency. You can see the results here: http://ideone.com/4P4HY

Notice that the floating-point version produces two tightly grouped probabilities, near either 0.101 or 0.097, nothing in between. This is the bias in action.

I think calling this "Java's algorithm" is a bit misleading - I'm sure it's much older than Java.

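#include <stdlib.h>

int rangeRandomAlg2 (int min, int max)
{
    int n = max - min + 1;              /* number of possible outputs */
    int remainder = RAND_MAX % n;
    int x;
    do
    {
        x = rand();
        /* Accept only x below the largest multiple of n that is <= RAND_MAX. */
    } while (x >= RAND_MAX - remainder);
    return min + x % n;
}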

How to generate an un-biased random number within an arbitrary range using the fewest bits

The common approach to eliminating bias is to throw away numbers that are outside of the desired range. As noted, this is wasteful. It's possible to minimize the waste by starting with a larger number of bits and generating multiple random numbers at the same time; this gives a better match between the range of inputs and the range of outputs.

For example, take the roll of a die. The output has 6 possibilities. The naive approach would take 3 random bits for each random number produced. The first example demonstrates the pigeon-hole problem.

import random

def pigeon_die(total_bit_count):
    # Squeeze 8 possible 3-bit values into 6 faces; two faces get two inputs each.
    for i in range(total_bit_count // 3):
        bits = random.getrandbits(3)
        yield 1 + bits * 6 // 8

1 : 832855
2 : 417835
3 : 416012
4 : 833888
5 : 416189
6 : 416554
total 3333333
max/min 2.00448063998

The second example is the wasteful approach commonly used. You can see that it generates fewer random numbers from the same number of random bits, but the bias is eliminated.

import random

def wasteful_die(total_bit_count):
    # Discard the 3-bit values 6 and 7 so that each face has exactly one input.
    for i in range(total_bit_count // 3):
        bits = random.getrandbits(3)
        if bits < 6:
            yield 1 + bits

1 : 417043
2 : 415812
3 : 417835
4 : 416012
5 : 416645
6 : 417243
total 2500590
max/min 1.00486517946

The final example takes 13 bits at a time and generates 5 random numbers from it. This generates even more numbers than the naive approach!

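import random

def optimized_die(total_bit_count):
    for i in range(total_bit_count // 13):
        bits = random.getrandbits(13)
        # 6**5 = 7776 of the 8192 possible values are usable; reject the rest.
        if bits < 6**5:
            # Peel off five base-6 digits, one die roll each.
            for j in range(5):
                yield 1 + bits % 6
                bits //= 6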

1 : 608776
2 : 608849
3 : 608387
4 : 608119
5 : 607855
6 : 608559
total 3650545
max/min 1.00163525841

The choice of 13 bits was made by taking the logarithm base 6 of powers of 2 and choosing the one that was closest to an integer.

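import math

def waste_list(n):
    # For each bit count, see how many base-n digits fit and how much is left over.
    for bit in range(1, 31):
        potential = math.log(2**bit, n)
        count = int(potential)
        if count > 0:
            waste = potential - count
            yield waste, bit, count

# Print the candidates in order of increasing waste, stopping at the simple 3-bit case.
for waste, bit, count in sorted(waste_list(6)):
    print(bit, count, waste)
    if bit == 3:
        break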

13 5 0.029086494049
26 10 0.0581729880981
8 3 0.0948224578763
21 8 0.123908951925
3 1 0.160558421704

As you can see, there are 4 choices better than the simple 3 bits.

How to generate a random integer in the range [0,n] from a stream of random bits without wasting bits?

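This is equivalent to finding a one-to-one correspondence between two finite sets of different cardinality, which is impossible. A fixed number k of random bits has 2^k equally likely outcomes, and these cannot be split evenly among n+1 results unless n+1 is a power of two.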
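One way to see this for a concrete case, assuming a target of 6 equally likely results: the 2**k bit strings of length k would have to split evenly into 6 groups. The helper name `leftover_outcomes` is made up for this sketch, and the check is an illustration rather than a proof.

```python
def leftover_outcomes(k, m):
    """Number of k-bit strings left over after sharing 2**k outcomes evenly among m results."""
    return (2 ** k) % m

# For m = 6 the leftover is never zero, because 2**k has no factor of 3.
for k in range(3, 14):
    print(k, 2 ** k, leftover_outcomes(k, 6))
```

Only when the number of results is itself a power of two (for example 8 results from 3 bits) does the leftover reach zero.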

Implement a function that generates a random number within a range given a biased random function

You can turn a biased random function into an unbiased one by drawing bits in pairs, keeping only the sequences 01 and 10, and ignoring 00 and 11. This gives a fair coin, because both sequences are equally likely: (1-p)*p == p*(1-p).

With this fair coin you can then roll 3 bits and output the rolled number; if you roll a 7 (111), just repeat the process.
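The two steps above can be sketched roughly as follows. The biased source `biased_bit`, its bias of 0.8, and the target range 0 to 6 are assumptions chosen for illustration; any source of independent bits with a fixed bias should work the same way.

```python
import random

def biased_bit(rng, p=0.8):
    """Hypothetical biased source: returns 1 with probability p, else 0."""
    return 1 if rng.random() < p else 0

def fair_bit(rng, p=0.8):
    """Draw pairs from the biased source until they differ; return the first bit."""
    while True:
        a = biased_bit(rng, p)
        b = biased_bit(rng, p)
        if a != b:
            return a

def fair_roll_0_to_6(rng, p=0.8):
    """Build a 3-bit number from fair bits; reject 7 (111) and retry."""
    while True:
        value = (fair_bit(rng, p) << 2) | (fair_bit(rng, p) << 1) | fair_bit(rng, p)
        if value < 7:
            return value

rng = random.Random(12345)
counts = [0] * 7
for _ in range(14000):
    counts[fair_roll_0_to_6(rng)] += 1
print(counts)
```

Each of the seven counts should come out near 2000, even though the underlying source returns 1 about 80% of the time.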

Generating a random integer within a range for each day

Algorithm:

1. Seed RNG with current day
2. Generate one random number
3. Mod y, add x

Replace step three with a smarter algorithm if you want uniform probabilities.

EDIT: ok, you don't have a PRNG. Then you might want to apply some hash algorithm to the current date and treat that as a random number.
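A minimal sketch of the hash-based variant, assuming SHA-256 as the hash and an inclusive range [low, high]. The function name `daily_random_int`, the retry counter, and the 64-bit truncation are choices made for this illustration. It also replaces the plain modulo with rejection so the result stays uniform.

```python
import datetime
import hashlib

def daily_random_int(day, low, high):
    """Deterministic integer in [low, high] derived from a date (hash-based sketch)."""
    n = high - low + 1
    limit = (2 ** 64 // n) * n   # largest multiple of n that fits in 64 bits
    counter = 0
    while True:
        # Hash the ISO date plus a retry counter; use the first 8 bytes as an integer.
        digest = hashlib.sha256(f"{day.isoformat()}:{counter}".encode()).digest()
        value = int.from_bytes(digest[:8], "big")
        if value < limit:        # reject the tail that would bias the modulo
            return low + value % n
        counter += 1

today = datetime.date.today()
print(daily_random_int(today, 1, 100))
```

Calling the function twice with the same date returns the same number, and a different date usually gives a different one.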

How to implement an unbiased random method for signed integers within a range?

Alright, I got it to work. I use a wrapping add around <$type>::MAX / 2 + 1 to map the range of a signed integer to an unsigned integer.

fn random_range<B: RangeBounds<Self>>(r: &mut R, bounds: B) -> Self {
    const SIGNED_MAPPING: u64 = <u64>::MAX / 2 + 1;
    let lower = match bounds.start_bound() {
        Bound::Included(lower) => *lower,
        Bound::Excluded(lower) => lower.saturating_add(1),
        Bound::Unbounded => <i64>::MIN
    };
    let upper = match bounds.end_bound() {
        Bound::Included(upper) => *upper,
        Bound::Excluded(upper) => upper.saturating_sub(1),
        Bound::Unbounded => <i64>::MAX,
    };
    let lower = (lower as u64).wrapping_add(SIGNED_MAPPING);
    let upper = (upper as u64).wrapping_add(SIGNED_MAPPING);
    assert!(upper >= lower, "{} >= {}", upper, lower);
    <u64>::random_range(r, lower..=upper).wrapping_add(SIGNED_MAPPING) as i64
}
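To make the mapping concrete, here is a small Python sketch of the same offset trick, using explicit 64-bit masks because Python integers do not wrap on their own. The helper names `to_unsigned` and `to_signed` are made up for this illustration and are not part of the Rust code above.

```python
OFFSET = 1 << 63          # same value as u64::MAX / 2 + 1
MASK = (1 << 64) - 1      # emulates 64-bit wrapping arithmetic

def to_unsigned(value):
    """Map an i64 value to u64 so that i64::MIN -> 0 and i64::MAX -> u64::MAX."""
    return ((value & MASK) + OFFSET) & MASK

def to_signed(value):
    """Inverse mapping: wrapping add of the same offset, then reinterpret as i64."""
    wrapped = (value + OFFSET) & MASK
    return wrapped - (1 << 64) if wrapped >= OFFSET else wrapped

for v in (-(1 << 63), -1, 0, 1, (1 << 63) - 1):
    print(v, to_unsigned(v), to_signed(to_unsigned(v)))
```

The mapping preserves ordering, which is what allows a signed range to be handed to the unsigned range sampler and mapped back afterwards.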

Unbiased random number generator using a biased one

The two mixed pairs of outcomes have probabilities p(1-p) and (1-p)p, so they are equiprobable. Taking them as 0 and 1 respectively and discarding the other two pairs of results (where both draws are equal), you get an unbiased random generator.

In code, this is as simple as:

int UnbiasedRandom()
{
    int x, y;

    do
    {
        x = BiasedRandom();
        y = BiasedRandom();
    } while (x == y);

    return x;
}

Generating random unsigned integers within a range biased toward the middle

If you want a biased random distribution over a set of values, you can use the rand crate's rand::distributions::WeightedIndex to get fine-grained control over the bias by assigning a weight to each item in the sample.

use rand::prelude::*;
use rand::distributions::WeightedIndex;

fn main() {
    let mut rng = thread_rng();
    // Each item is paired with its weight; weights rise toward the middle and fall toward the end.
    let sample_item = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', 3), ('f', 2), ('g', 1)];

    let weight_dist = WeightedIndex::new(sample_item.iter().map(|(_, weight)| weight)).unwrap();

    let mut pool = vec![];

    // Draw 100 samples; each draw returns an index into sample_item.
    for _ in 0..100 {
        let item = sample_item[weight_dist.sample(&mut rng)];
        pool.push(item.0);
    }
    // Print how often each item was drawn.
    println!("{:?}", pool.iter().filter(|x| **x == 'a').count());
    println!("{:?}", pool.iter().filter(|x| **x == 'b').count());
    println!("{:?}", pool.iter().filter(|x| **x == 'c').count());
    println!("{:?}", pool.iter().filter(|x| **x == 'd').count());
    println!("{:?}", pool.iter().filter(|x| **x == 'e').count());
    println!("{:?}", pool.iter().filter(|x| **x == 'f').count());
    println!("{:?}", pool.iter().filter(|x| **x == 'g').count());
}

