Generate a Weighted Random Number

Weighted random numbers

There is a straightforward algorithm for picking an item at random, where items have individual weights:

1) calculate the sum of all the weights

2) pick a random number that is 0 or greater and is less than the sum of the weights

3) go through the items one at a time, subtracting their weight from your random number, until you get the item where the random number is less than that item's weight

Pseudo-code illustrating this:

int sum_of_weight = 0;
for(int i=0; i<num_choices; i++) {
sum_of_weight += choice_weight[i];
}
int rnd = random(sum_of_weight);
for(int i=0; i<num_choices; i++) {
if(rnd < choice_weight[i])
return i;
rnd -= choice_weight[i];
}
assert(!"should never get here");

This should be straightforward to adapt to your boost containers and such.


If your weights are rarely changed but you often pick one at random, and as long as your container is storing pointers to the objects or is more than a few dozen items long (basically, you have to profile to know if this helps or hinders), then there is an optimisation:

By storing the cumulative weight sum in each item you can use a binary search to pick the item corresponding to the pick weight.


If you do not know the number of items in the list, then there's a very neat algorithm called reservoir sampling that can be adapted to be weighted.

How can I do "weighted" random number generation that favors some numbers over others, depending on user input?

It looks like you want std::discrete_distribution:

std::discrete_distribution produces random integers on the interval [0, n), where the probability of each individual integer i is defined as w_i = i/S, that is the weight of the ith integer divided by the sum of all n weights. [cppreference.com]

If your table is std::vector<unsigned int> weights, you can write

std::random_device rd;
std::mt19937 gen(rd());
std::discrete_distribution<> d(weights.begin(), weights.end());

and then use d(gen) to get random numbers.

If the weights array has static size known at compile-time, you can use std::array with exactly the same syntax.

Generate A Weighted Random Number

Rejection sampling (such as in your solution) is the first thing that comes to mind, whereby you build a lookup table with elements populated by their weight distribution, then pick a random location in the table and return it. As an implementation choice, I would make a higher order function which takes a spec and returns a function which returns values based on the distribution in the spec, this way you avoid having to build the table for each call. The downsides are that the algorithmic performance of building the table is linear by the number of items and there could potentially be a lot of memory usage for large specs (or those with members with very small or precise weights, e.g. {0:0.99999, 1:0.00001}). The upside is that picking a value has constant time, which might be desirable if performance is critical. In JavaScript:

function weightedRand(spec) {
var i, j, table=[];
for (i in spec) {
// The constant 10 below should be computed based on the
// weights in the spec for a correct and optimal table size.
// E.g. the spec {0:0.999, 1:0.001} will break this impl.
for (j=0; j<spec[i]*10; j++) {
table.push(i);
}
}
return function() {
return table[Math.floor(Math.random() * table.length)];
}
}
var rand012 = weightedRand({0:0.8, 1:0.1, 2:0.1});
rand012(); // random in distribution...

Another strategy is to pick a random number in [0,1) and iterate over the weight specification summing the weights, if the random number is less than the sum then return the associated value. Of course, this assumes that the weights sum to one. This solution has no up-front costs but has average algorithmic performance linear by the number of entries in the spec. For example, in JavaScript:

function weightedRand2(spec) {
var i, sum=0, r=Math.random();
for (i in spec) {
sum += spec[i];
if (r <= sum) return i;
}
}
weightedRand2({0:0.8, 1:0.1, 2:0.1}); // random in distribution...

Weighted Random Number generator with updates

One possible solution comes from the arithmetic coding and Fenwick trees.

If you have a list of non-negative numbers, [a_0, ... a_n] of type T, the Fenwick tree data structure allows you to implement the following two functions in O(log n) time:

  1. Index upper_bound(T p): for the given value p, calculate the smallest index i, such that the prefix sum a_0 + ... + a_i > p.
  2. set(Index i, T p): Update a_i <- p.

The algorithm of generating a random i is simple: generate a random number k uniformly distributed in the range [0, sum a_i) and then use i = upper_bound(k) to find i.

Simple example:

i            0 1 2 3 4 5 6 7
a_i 0 1 0 0 3 4 0 2
prefix_sum 0 1 1 1 4 8 8 10

k 0 1 2 3 4 5 6 7 8 9
i = upper_bound(k) 1 4 4 4 5 5 5 5 7 7

P.Fenwick. A New Data Structure for Cumulative Frequency Tables (PDF, 1994)

My C++ implementation of a Fenwick tree (not thoroughly tested)

How to generate a random weighted distribution of elements

C will appear eight times more often than D; D will appear 5 times less often than B; A will appear three times less often than C.

You can do that with a weighted array of your elements:

var elems = ["A", "B", "C", "D"];
var weights = [2, 5, 8, 1]; // weight of each element above
var totalWeight = weights.reduce(add, 0); // get total weight (in this case, 16)

function add(a, b) { return a + b; } // helper function

var weighedElems = [];
var currentElem = 0;
while (currentElem < elems.length) {
for (i = 0; i < weights[currentElem]; i++)
weighedElems[weighedElems.length] = elems[currentElem];
currentElem++;
}

console.log(weighedElems);

This will produce an array like

["A", "A", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "C", "C", "C", "D"]

so you can choose randomly from that like so

var rnd = Math.floor(Math.random() * totalWeight);
console.log(weighedElems[rnd]);

Resources:

  • Generate A Weighted Random Number
  • Weighted random number generation in Javascript
  • Algorithm for generating weighed random numbers

Generating weighted random numbers

Python doesn't have any weighted sampling functionality built in (NumPy/SciPy does), but for a really simple case like this, it's pretty easy:

import itertools
import random

probabilities = [0.3, 0.2, 0.5]
totals = list(itertools.accumulate(probabilities))

def sample():
n = random.uniform(0, totals[-1])
for i, total in enumerate(totals):
if n <= total:
return i

If you don't have Python 3.2+, you don't have the accumulate function; you can fake it with an inefficient one-liner if the list really is this short:

totals = [sum(probabilities[:i+1]) for i in range(len(probabilities))]

… or you can write an explicit loop, or an ugly reduce call, or copy the equivalent Python function from the docs.


Also, note that random.uniform(0, totals[-1]) is just a more complicated way of writing random.random() if you can be sure that your numbers add up to 1.0.


A quick way to test this:

>>> samples = [sample() for _ in range(100000)]
>>> samples.count(0)
29878
>>> samples.count(1)
19908
>>> samples.count(2)
50214

Those are pretty close to 30%, 20%, and 50% of 100000, respectively.

JAVA - How do I generate a random number that is weighted towards certain numbers?

public static void main(String[] args){
ArrayList<Integer> list = new ArrayList<Integer>();
list.add(5);list.add(2);// to be continued ...
int randomInt = list.get((int) (Math.random() * list.size()));
System.out.println(randomInt);
}

You'll get your List of values, and then you just need to pick randomly an index and get back the value

Math.random() will pick a value in [0;1[, if you multiply by the size of a List of 10 elements, you'll a value in [0;10[ , when you cast as an int you'll get 10 possibilities of index to look into the List

The ArrayList (implementation of List) allows duplicate element so :
if you add 1,2,2,3,4,5,6,7,8,9 (the order does not matter) you'll have 10 elements with each one to be find with 1/10 = 10/100 = 10% of probability (aka chance), but the element 2 which appears twice will have 20% chance to be choosen in the method, more the number appears more it's chance to be choosen increase, with as you said (number of time the number appears)/(total number of element) (this is basic maths)

Verilog - generate weighted random numbers

The SystemVerilog solution has a distribution method within randomize called dist. Weights are assigned by value_or_range := weight or value_or_range :/ distributed_weight. This exert from the IEEE Std 1800-2012 § 18.5.4 page 476 gives a clear example:

When weights are applied to ranges, they can be applied to each value in the range, or they can be applied to the range as a whole. For example:

x dist { [100:102] := 1, 200 := 2, 300 := 5}

means x is equal to 100, 101, 102, 200, or 300 with a weighted ratio of 1-1-1-2-5, and

x dist { [100:102] :/ 1, 200 := 2, 300 := 5}

means x is equal to one of 100, 101, 102, 200, or 300 with a weighted ratio of
1/3-1/3-1/3-2-5.

dist is used in randomization so it needs to be mare of a randomize() with (or a class constraint). randomize returns a success bit, therefore it should be in called within an assert, void'(), or the RHS of an assignment.

In your we can set the weight of 0 to 6 and the weight of 1 to 4, creating a total weight of 10 with a 60/40 distribution. Example:

reg R;
initial begin
assert( randomize(R) with { R dist { 0 := 6, 1 := 4 }; } );
end

From more about dist see IEEE Std 1800-2012 § 18.5.4 "Distribution".



Related Topics



Leave a reply



Submit