Weighted Random Numbers

There is a straightforward algorithm for picking an item at random, where items have individual weights:

1) calculate the sum of all the weights

2) pick a random number that is 0 or greater and is less than the sum of the weights

3) go through the items one at a time, subtracting each item's weight from your random number, until you reach an item whose weight is greater than what remains of the random number; that item is your pick

Pseudo-code illustrating this:

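// Pass 1: total weight of all choices.
int sum_of_weight = 0;
for(int i=0; i<num_choices; i++) {
   sum_of_weight += choice_weight[i];
}
// random(n) is assumed to return a uniform integer in [0, n).
int rnd = random(sum_of_weight);
// Pass 2: walk the choices, consuming weight until rnd falls inside one.
for(int i=0; i<num_choices; i++) {
  if(rnd < choice_weight[i])
    return i;
  rnd -= choice_weight[i];
}
assert(!"should never get here");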

This should be straightforward to adapt to your boost containers and such.

If your weights rarely change but you pick items often, there is an optimisation. It tends to pay off when the container stores pointers to the objects or holds more than a few dozen items, but you have to profile to know whether it helps or hinders:

By storing the cumulative weight sum in each item you can use a binary search to pick the item corresponding to the pick weight.
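
As a rough illustration of that optimisation, here is a minimal Python sketch. The class name CumulativeSampler and its methods are made up for this example, and it assumes non-negative integer weights that are fixed after construction:

```python
import bisect
import itertools
import random


class CumulativeSampler:
    """Weighted picker: O(n) to build, O(log n) per pick."""

    def __init__(self, weights):
        # Running totals, e.g. [3, 1, 6] -> [3, 4, 10].
        self.cumulative = list(itertools.accumulate(weights))
        if not self.cumulative or self.cumulative[-1] <= 0:
            raise ValueError("need at least one positive weight")

    def index_for(self, point):
        # First index whose running total is strictly greater than point.
        return bisect.bisect_right(self.cumulative, point)

    def pick(self, rng=random):
        # point is a uniform integer in [0, total).
        return self.index_for(rng.randrange(self.cumulative[-1]))


sampler = CumulativeSampler([3, 1, 6])
# Every possible random value mapped to the index it selects:
print([sampler.index_for(p) for p in range(10)])
# [0, 0, 0, 1, 2, 2, 2, 2, 2, 2]
```

The price is that changing any single weight means rebuilding the running totals, which is why this only suits weights that rarely change.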

If you do not know the number of items in the list, then there's a very neat algorithm called reservoir sampling that can be adapted to be weighted.
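
One common single-item form of weighted reservoir sampling can be sketched in Python as follows. The function name and the (item, weight) pair format are assumptions for this example, not something from the question:

```python
import random


def weighted_reservoir_pick(items_with_weights, rng=random):
    """Pick one item from a stream of (item, weight) pairs in a single pass."""
    total = 0.0
    chosen = None
    for item, weight in items_with_weights:
        if weight <= 0:
            continue  # zero-weight items can never be chosen
        total += weight
        # Replace the current pick with probability weight / total.
        if rng.random() * total < weight:
            chosen = item
    return chosen  # None if the stream had no positive weights


# Works on any iterable, including a generator of unknown length.
stream = ((name, w) for name, w in [("a", 1), ("b", 3), ("c", 6)])
print(weighted_reservoir_pick(stream))
```

Each item ends up chosen with probability weight / (sum of all weights), and only the running total and the current pick are kept in memory.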

How can I do weighted random number generation that favors some numbers over others, depending on user input?

It looks like you want std::discrete_distribution:

std::discrete_distribution produces random integers on the interval [0, n), where the probability of each individual integer i is defined as w_i/S, that is the weight of the ith integer divided by the sum of all n weights. [cppreference.com]

If your table is std::vector<unsigned int> weights, you can write

#include <random>

std::random_device rd;   // non-deterministic seed source
std::mt19937 gen(rd());  // Mersenne Twister engine seeded from rd
std::discrete_distribution<> d(weights.begin(), weights.end());

and then use d(gen) to get random numbers.

If the weights array has static size known at compile-time, you can use std::array with exactly the same syntax.
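
To make the weight-divided-by-sum rule concrete, here is a small worked example in Python rather than C++. It uses the standard library's random.choices as a rough analogue of std::discrete_distribution; the weights table is hypothetical:

```python
import random

weights = [2, 1, 1]  # hypothetical table
total = sum(weights)

# Probability of index i is its weight divided by the sum of all weights.
probabilities = [w / total for w in weights]
print(probabilities)  # [0.5, 0.25, 0.25]

rng = random.Random(42)  # fixed seed only so the run is repeatable
draws = rng.choices(range(len(weights)), weights=weights, k=10_000)

# Observed frequencies should land close to the probabilities above.
frequencies = [draws.count(i) / len(draws) for i in range(len(weights))]
print(frequencies)
```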

Weighted Random Number generator with updates

One possible solution comes from arithmetic coding and Fenwick trees.

If you have a list of non-negative numbers, [a_0, ..., a_n] of type T, the Fenwick tree data structure allows you to implement the following two functions in O(log n) time:

Index upper_bound(T p): for the given value p, calculate the smallest index i, such that the prefix sum a_0 + ... + a_i > p.
set(Index i, T p): Update a_i <- p.

The algorithm for generating a random i is simple: generate a random number k uniformly distributed in the range [0, sum a_i) and then use i = upper_bound(k) to find i.

Simple example:

i            0 1 2 3 4 5 6 7
a_i          0 1 0 0 3 4 0 2
prefix_sum   0 1 1 1 4 8 8 10

k                   0 1 2 3 4 5 6 7 8 9
i = upper_bound(k)  1 4 4 4 5 5 5 5 7 7
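
For illustration, the two operations and the table above can be reproduced with a minimal Python sketch. It is not the C++ implementation mentioned in this answer. The class name FenwickSampler and the helpers total and pick are made up here, and integer weights are assumed:

```python
import random


class FenwickSampler:
    """Fenwick tree over non-negative weights; set and upper_bound are O(log n)."""

    def __init__(self, weights):
        self.n = len(weights)
        self.values = [0] * self.n
        self.tree = [0] * (self.n + 1)  # 1-based internal array
        for i, w in enumerate(weights):  # simple O(n log n) build
            self.set(i, w)

    def set(self, i, weight):
        """Update a_i <- weight."""
        delta = weight - self.values[i]
        self.values[i] = weight
        j = i + 1
        while j <= self.n:
            self.tree[j] += delta
            j += j & -j  # move to the next node covering index i

    def total(self):
        """Sum of all weights."""
        s, j = 0, self.n
        while j > 0:
            s += self.tree[j]
            j -= j & -j
        return s

    def upper_bound(self, p):
        """Smallest index i with a_0 + ... + a_i > p (requires p < total)."""
        pos = 0
        step = 1 << self.n.bit_length()
        while step:
            nxt = pos + step
            # Skip over the block ending at nxt if its sum still fits in p.
            if nxt <= self.n and self.tree[nxt] <= p:
                pos = nxt
                p -= self.tree[nxt]
            step >>= 1
        return pos

    def pick(self, rng=random):
        """Draw an index with probability proportional to its weight."""
        return self.upper_bound(rng.randrange(self.total()))


tree = FenwickSampler([0, 1, 0, 0, 3, 4, 0, 2])
print([tree.upper_bound(k) for k in range(tree.total())])
# [1, 4, 4, 4, 5, 5, 5, 5, 7, 7]
```

After a call such as tree.set(5, 0), later picks reflect the new weight without rebuilding anything, which is the point of using this structure when weights are updated often.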

P. Fenwick. A New Data Structure for Cumulative Frequency Tables (PDF, 1994)

My C++ implementation of a Fenwick tree (not thoroughly tested)

Generate A Weighted Random Number

A lookup table (such as in your solution) is the first thing that comes to mind: build a table whose elements are repeated in proportion to their weights, then pick a random location in the table and return the element found there. As an implementation choice, I would write a higher-order function that takes a spec and returns a function that draws values according to the distribution in the spec; this way you avoid rebuilding the table on each call.

The downsides are that building the table takes time linear in the number of table entries, and memory usage can be large for big specs (or for those with very small or precise weights, e.g. {0:0.99999, 1:0.00001}). The upside is that picking a value takes constant time, which might be desirable if performance is critical. In JavaScript:

function weightedRand(spec) {
  var i, j, table=[];
  for (i in spec) {
    // The constant 10 below should be computed based on the
    // weights in the spec for a correct and optimal table size.
    // E.g. the spec {0:0.999, 1:0.001} will break this impl.
    for (j=0; j<spec[i]*10; j++) {
      table.push(i);
    }
  }
  return function() {
    return table[Math.floor(Math.random() * table.length)];
  }
}
var rand012 = weightedRand({0:0.8, 1:0.1, 2:0.1});
rand012(); // random in distribution...

Another strategy is to pick a random number in [0,1) and iterate over the weight specification, summing the weights as you go. As soon as the random number is less than the running sum, return the associated value. Of course, this assumes that the weights sum to one. This solution has no up-front cost, but its average running time is linear in the number of entries in the spec. For example, in JavaScript:

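function weightedRand2(spec) {
  var i, sum=0, r=Math.random();
  for (i in spec) {
    sum += spec[i];
    // Strict comparison, so a zero-weight entry is never returned.
    if (r < sum) return i;
  }
  // Rounding can leave sum slightly below 1; fall back to the last key.
  return i;
}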
weightedRand2({0:0.8, 1:0.1, 2:0.1}); // random in distribution...

How to constantly adjust the weight for random number generator so the results can be distributed more evenly?

It appears that you want to simulate a weighted choice described in my section "Weighted Choice Without Replacement (Multiple Copies)". In your case, it could work as follows:

1. Give each item the same weight, specified as a positive integer. For example, give a weight of 20 to each item.
2. Use a weighted-choice-with-replacement algorithm. Perhaps the simplest is rejection sampling, described as follows. Assume that the highest weight is max and each weight is 0 or greater. To choose an integer in [1, weights.Count] using rejection sampling:
   a. Choose a uniform random integer i in [1, weights.Count].
   b. With probability weights[i]/max, return i. Otherwise, go to step a.
   There are many other ways to make a weighted choice besides rejection sampling; see my note on weighted choice algorithms.
3. As each item is chosen, reduce its weight by 1 to make it less likely to be chosen.
4. If all the weights are 0, assign each item the same weight chosen in step 1 (in this example, 20).
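
The procedure above could be sketched in Python roughly as follows. The class name DepletingChooser is made up for this example, it uses 0-based indices rather than the 1-based ones in the description, and it takes the highest current weight as max:

```python
import random


class DepletingChooser:
    """Weighted choice whose weights shrink as items are picked."""

    def __init__(self, item_count, start_weight=20, rng=random):
        self.start_weight = start_weight
        self.weights = [start_weight] * item_count
        self.rng = rng

    def choose(self):
        # Once every weight has reached 0, start a new round.
        if not any(self.weights):
            self.weights = [self.start_weight] * len(self.weights)
        highest = max(self.weights)
        while True:
            # Rejection sampling: propose a uniform index, then accept it
            # with probability weights[i] / highest.
            i = self.rng.randrange(len(self.weights))
            if self.rng.randrange(highest) < self.weights[i]:
                self.weights[i] -= 1  # less likely to be chosen next time
                return i


chooser = DepletingChooser(item_count=3, start_weight=2)
picks = [chooser.choose() for _ in range(6)]
print(sorted(picks))  # [0, 0, 1, 1, 2, 2]: each item exactly twice per round
```

Within one round of item_count * start_weight picks, every item appears exactly start_weight times; only the order is random.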

Since you mention Unity, your goal is probably to make a game that controls which random numbers appear, to make the random outcomes appear "fairer" to players. However, you should also consider whether it may be better to make an independent uniform random choice instead, especially if you care whether players could gain an unfair advantage by predicting the random outcomes.

C# weighted random numbers

I agree with @Timothy: I'd go for a more maintainable solution, where you're not relying on magic numbers to split your probabilities. Also, it's personal preference, but I'd call it ratio rather than percent; otherwise "100" becomes another magic number, and you limit yourself to a minimum probability of 1%. This way you can split it 1:10:200 or however you please:

public static readonly int RATIO_CHANCE_A = 10;
public static readonly int RATIO_CHANCE_B = 30;
//                         ...
public static readonly int RATIO_CHANCE_N = 60;

public static readonly int RATIO_TOTAL = RATIO_CHANCE_A
                                       + RATIO_CHANCE_B
                                         // ...
                                       + RATIO_CHANCE_N;

Random random = new Random();
int x = random.Next(0, RATIO_TOTAL);

if ((x -= RATIO_CHANCE_A) < 0) // Test for A
{ 
     do_something1();
} 
else if ((x -= RATIO_CHANCE_B) < 0) // Test for B
{ 
     do_something2();
}
// ... etc
else // No need for final if statement
{ 
     do_somethingN();
}


Weighted random number generation in R

See ?sample.

For example:

sample(c(1, 2, 3), size = 100, replace = TRUE, prob = c(0.1, 0.5, 0.4))

Weighted random selection from array

Compute the discrete cumulative distribution function (CDF) of your list, or in simple terms the array of cumulative sums of the weights. Then generate a random number that is at least 0 and less than the sum of all weights (which might be 1 in your case), do a binary search for the first entry in the CDF array that is greater than this number, and take the value corresponding to that entry. This is your weighted random number.
