How to Pick an Item by Its Probability

How to pick an item by its probability?

So with each item store a number that marks its relative probability, for example if you have 3 items one should be twice as likely to be selected as either of the other two then your list will have:

 [{A,1},{B,1},{C,2}]

Then sum the numbers of the list (i.e. 4 in our case).
Now generate a random number and choose that index.
int index = rand.nextInt(4);
return the number such that the index is in the correct range.

Java code:

class Item {
    int relativeProb;
    String name;

    //Getters Setters and Constructor
}

...

class RandomSelector {
    List<Item> items = new List();
    Random rand = new Random();
    int totalSum = 0;

    RandomSelector() {
        for(Item item : items) {
            totalSum = totalSum + item.relativeProb;
        }
    }

    public Item getRandom() {

        int index = rand.nextInt(totalSum);
        int sum = 0;
        int i=0;
        while(sum < index ) {
             sum = sum + items.get(i++).relativeProb;
        }
        return items.get(Math.max(0,i-1));
    }
}

Picking a random item based on probabilities

"I have various different sizes of cups of coffee. The larger they are, the more I want to charge for them. I'm having trouble actually figuring out how to assign prices".

This isn't just a programming problem - you've specified that probability increases with value, but you haven't said how it increases with value. Normally, coffee shops don't charge in direct proportion to the amount of coffee. You can't assign probabilities in proportion to value, because some of your values are negative, but probabilities cannot be negative.

Sounds like you need to nail down the problem a bit more before you can write any code.

If you really don't care how probability relates to value, other than that they increase in order of value, then one easy way would be:

sort your array
assign a probability of 1 to the first element, 2 to the second, and so on.
now, your probabilities don't add up to 1, which is a problem. So divide each probability by the total of all the probabilities you have assigned: (1 + 2 + .. + n) = n(n+1)/2. This is called "normalization".

Given your list of probabilities, which add up to 1, the easiest way to repeatedly choose one is generally to calculate the cumulative probabilities, which I will demonstrate with an example:

value (sorted):           -12     -3      127    1000000
assigned probability:     0.1     0.2     0.3      0.4
cumulative probability:   0.1     0.3     0.6      1.0

The cumulative probability is defined as the sum of all the probabilities up to that point.

Now, from your random number generator you need a random (floating-point) value between 0 and 1. If it lies between 0 and 0.1, you've picked -12. If it lies between 0.1 and 0.3, you've picked -3, and so on. To figure out which range it lies in, you could walk linearly through your array, or you could do a binary search.

You could skip the normalization step and the use of floating-point, if you wanted. Assign "cumulative probabilities" (1, 3, 6, 10 ...) , but make it understood that the actual probability is the stored integer value divided by n(n+1)/2. Then choose a random integer from 0 to n(n+1)/2 - 1. If it's less than 1, you've selected the first value, else if less than 3 the second, and so on. This may or may not make the code clearer, and your RNG may or may not do well choosing integer values from a large range.

Note that you could have assigned probabilities (0.001, 0.002, 0.003, 0.994) instead of (0.1, 0.2, 0.3, 0.4), and still satisfied your requirement that "the higher the value, the higher the probability".

How to pick an element from an array with a probability distribution?

You can create an array of 100 elements, then for each number i in the array goals, put i into the new array n times where n equals the probability of i times the array size (100). Then randomly pick an element from the new array.

You can increase the array size (100) to get a more precise result. The most precise result will be when you pick the array size that makes every n a whole number.

Example:

https://jsfiddle.net/o5hmjxd2/28/

const x = [0,1,2,3,4,5,6,7,8,9];
const p = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
const l = 100;

let y = [];

for (const i = 0; i < x.length; i++) {
  y = y.concat(new Array(l*p[i]).fill(x[i]));
}

function getRandomInt(max) {
  return Math.floor(Math.random() * max);
}

// result
console.log(y);
console.log(y[getRandomInt(y.length - 1)])

Selecting an Item from a List based on Probabilities and Constraints

You can "weight" each item and total up the weights. Then generate a random number between 1 and the total of the weights. Then find the item that matches that random number. Here are the steps:

Step 1: Calculate the points (or weight) of each and then total up all of the points.

    Item RunningTotal
     A        5        -- 5 points
     B        6        -- 1 point
     C        6        -- 0 points 
     D        8        -- 2 points
-------------------------------------
    TOTAL                 8 points

Step 2: Use your language's random number function to generate a random number between 1 and the total points. Something like:

int match = Math.Random(1, 8); -- 8 is the total of the points.

Step 3: Find the item that matches the random number.

int runningTotal = 0;
int index;

for(index = 1; index < Items.Count(); index++)
{
    if(Items[index].Points == 0)
        continue;

    runningTotal += Items[index].Points;

    if(runningTotal >= match)
       break;
}

return Items[index];   // the winner

How to select items from a list based on probability

Use random.choices:

>>> import random
>>> print(random.choices(
...     ['apple', 'gun', 'pizza', 'sword', 'pasta', 'chicken', 'elephant'], 
...     [0.1, 0.3, 0.1, 0.2, 0.1, 0.1, 0.1],
...     k=3
... ))
['gun', 'pasta', 'sword']

Edit: To avoid replacement, you can remove the selected item from the population:

def choices_no_replacement(population, weights, k=1):
    population = list(population)
    weigths = list(weights)    
    result = []
    for n in range(k):
        pos = random.choices(
            range(len(population)), 
            weights,
            k=1
        )[0]
        result.append(population[pos])
        del population[pos], weights[pos]
    return result

Testing:

>>> print(choices_no_replacement(
...     ['apple', 'gun', 'pizza', 'sword', 'pasta', 'chicken', 'elephant'],
...     [0.1, 0.3, 0.1, 0.2, 0.1, 0.1, 0.1],
...     k=3
... ))
['gun', 'pizza', 'sword']

Selecting Random Item from List given probability of each item

Here's a more generic way to do it (meaning you don't need to assert that the probabilities add to 1):

static Random rand = new Random();

public NGram GetRandom(IEnumerable<NGram> pool)
{
     // get universal probability 
     double u = pool.Sum (p => p.Probability);

     // pick a random number between 0 and u
     double r = rand.NextDouble() * u;

     double sum = 0;
     foreach(NGram n in pool)
     {
         // loop until the random number is less than our cumulative probability
         if(r <= (sum = sum + n.Probability))
         {
            return n;
         }
     }
     // should never get here
     return null;
}

Pick random element from list with probability

Define a class for your items like this:

class Items<T>
{
    public double Probability { get; set; }
    public T Item { get; set; }
}

then initialize it

var initial = new List<Items<string>>
{
    new Items<string> {Probability = 74 / 100.0, Item = "A"},
    new Items<string> {Probability = 15 / 100.0, Item = "B"},
    new Items<string> {Probability = 7 / 100.0, Item = "C"},
    new Items<string> {Probability = 4 / 100.0, Item = "D"},
};

then you need to convert it to aggregate a sum of probabilities from 0 to 1

var converted = new List<Items<string>>(initial.Count);
var sum = 0.0;
foreach (var item in initial.Take(initial.Count - 1))
{
    sum += item.Probability;
    converted.Add(new Items<string> {Probability = sum, Item = item.Item});
}
converted.Add(new Items<string> {Probability = 1.0, Item = initial.Last().Item});

now you can pick an item from converted collection with respect to probability:

var rnd = new Random();
while (true)
{
    var probability = rnd.NextDouble();
    var selected = converted.SkipWhile(i => i.Probability < probability).First();
    Console.WriteLine($"Selected item = {selected.Item}");
}

NOTE: my implementation have O(n) complexity. You can optimize it with binary search (because values in converted collection are sorted)

How to choose an item in a list according to a specific probability?

I'm assuming that you have a pre-calculated list of probabilities (say probs) for each of the indexes in the list (say data) you want to choose from.

Further, probs and data obviously must have the same length and the entries of probs must be non-negative numbers summing to 1.

There is a neat but simple technique to randomly choose the indexes in data according to the distribution in probs that is known as the roulette wheel. In Python, I believe, it should look somehow like this

import random

data = ['A', 'B', 'C', 'D']

probs = [0.2, 0.4, 0.3, 0.1]

def roulette_wheel(probs):
    rand = random.random()
    for slot, prob in enumerate(probs):
        rand -= prob
        if rand < 0.0:
            return slot

Note that this can be generalized to a list of non-negative weights (that doesn't have to add up to 1) by multiplying rand by the term sum(weights). I believe, I first saw this cute idea in a book about Pascal programming some eons ago.

Edit:

As MadPhysicist suggested in a comment this can be made a lot more efficient if one needs to draw repeatedly from the same data. In that case, one can pre-compute the cumulative distribution function and then just do a binary search for the index such that cumulative prob. <= rand ~ U(0, 1). In Python, this could for example look somehow like the following

from random import random
from bisect import bisect_right

def cdf(probs):
    cdf = []
    total = 0.0
    for p in probs:
        total += p
        cdf.append(total)
    return cdf

def roulette_wheel_bisect(cdf):
    return bisect_right(cdf, random())

# compute cdf
cumsum = cdf(probs)

# randomly draw 10 indexes 
for i in range(0, 10):
    print(roulette_wheel_bisect(cumsum))

Disclaimer: I'm not a Python programmer by trade, so the code above should only illustrate the general idea. It may not be very robust for practical uses. You should always use a well-tested standard library, numpy for example, if you can.

Edit2:

I've just learned that numpy has numpy.random.choice which does exactly what you need. Example:

from numpy import random

data = ['A', 'B', 'C', 'D']
probs = [0.2, 0.4, 0.3, 0.1]

# randomly draw 10 list elements with replacement
for i in range(0, 10):
    print(random.choice(data, p=probs))

How to pick random items through cumulative probability?

I find it easier to work with integers in this type of problem, so I'll work with:

Food - 10
Weapons - 5
Enemy - 5
Trap - 3

That gives a total of 10 + 5 + 5 + 3 = 23 total possible options.

Most computer RNGs work from base 0, so split the 23 options (as in 0..22) like this:

Food - 0..9 giving 10 options.
Weapons - 10..14 giving 5 options.
Enemy - 15..19 giving 5 options.
Trap - 20..22 giving 3 options.

Work through the possibilities in order, stopping when you reach the selected option. I will use pseudocode as my C++ is very rusty:

function pickFWET()

  pick <- randomInRange(0 to 22);

  if (pick < 10) return FOOD;
  if (pick < 15) return WEAPONS;      
  if (pick < 20) return ENEMY;
  if (pick < 23) return TRAP;

  // If we reach here then there was an error.
  throwError("Wrong pick in pickFWET");

end function pickFWET

How to Pick an Item by Its Probability