Choose Specific Number with Probability

Choose specific number with probability

Take a look at sample function.

> set.seed(1)
> sample(c(0,1), size=10, replace=TRUE, prob=c(0.2,0.8))
 [1] 1 1 1 0 1 0 0 1 1 1

From the helpfile you can read:

sample takes a sample of the specified size from the elements of x using either with or without replacement.

and the argument prob in sample acts as ...

A vector of probability weights for obtaining the elements of the vector being sampled.

c++, how randomly with given probabilities choose numbers

This is very similar to the answer I gave for this question:

changing probability of getting a random number

You can do it like this:

double val = (double)rand() / RAND_MAX;

int random;
if (val < p_1)
    random = n_1;
else if (val < p_1 + p_2)
    random = n_2;
else if (val < p_1 + p_2 + p_3)
    random = n_3;
else
    random = n_4;

Of course, this approach only makes sense if p_1 + p_2 + p_3 + p_4 == 1.0.

This can easily be generalized to a variable number of outputs and probabilities with a couple of arrays and a simple loop.

selection of numbers from a set with equal probability

Something like this

for elem in S
  if random() < (k - S'.size)/S.size // This is float division
    S'.add(elem)

The first element is chosen with probability k/n, the second one with (n-k)/n * k/(n-1) + k/n * (k-1)/(n-1) which reduces to k/n, etc.

Python: Selecting numbers with associated probabilities

This might be what you're looking for. Extension to a solution in Generate random numbers with a given (numerical) distribution. Removes the selected item from the distribution, updates the probabilities and returns selected item, updated distribution. Not proven to work, but should give a good impression of the idea.

def random_distr(l):
    assert l # don't accept empty lists
    r = random.uniform(0, 1)
    s = 0
    for i in xrange(len(l)):
        item, prob = l[i]
        s += prob
        if s >= r:
            l.pop(i) # remove the item from the distribution
            break
    else: # Might occur because of floating point inaccuracies
        l.pop()
    # update probabilities based on new domain
    d = 1 - prob 
    for i in xrange(len(l)):
        l[i][1] /= d
    return item, l

dist = [[1, 0.5], [2, 0.25], [3, 0.05], [4, 0.01], [5, 0.09], [6, 0.1]]
while dist:
    val, dist = random_distr(dist)
    print val

How to pick an item by its probability?

So with each item store a number that marks its relative probability, for example if you have 3 items one should be twice as likely to be selected as either of the other two then your list will have:

 [{A,1},{B,1},{C,2}]

Then sum the numbers of the list (i.e. 4 in our case).
Now generate a random number and choose that index.
int index = rand.nextInt(4);
return the number such that the index is in the correct range.

Java code:

class Item {
    int relativeProb;
    String name;

    //Getters Setters and Constructor
}

...

class RandomSelector {
    List<Item> items = new List();
    Random rand = new Random();
    int totalSum = 0;

    RandomSelector() {
        for(Item item : items) {
            totalSum = totalSum + item.relativeProb;
        }
    }

    public Item getRandom() {

        int index = rand.nextInt(totalSum);
        int sum = 0;
        int i=0;
        while(sum < index ) {
             sum = sum + items.get(i++).relativeProb;
        }
        return items.get(Math.max(0,i-1));
    }
}

Generate random numbers with a given (numerical) distribution

scipy.stats.rv_discrete might be what you want. You can supply your probabilities via the values parameter. You can then use the rvs() method of the distribution object to generate random numbers.

As pointed out by Eugene Pakhomov in the comments, you can also pass a p keyword parameter to numpy.random.choice(), e.g.

numpy.random.choice(numpy.arange(1, 7), p=[0.1, 0.05, 0.05, 0.2, 0.4, 0.2])

If you are using Python 3.6 or above, you can use random.choices() from the standard library – see the answer by Mark Dickinson.

Random number with Probabilities

Yours is a pretty good way already and works well with any range.

Just thinking: another possibility is to get rid of the fractions by multiplying with a constant multiplier, and then build an array with the size of this multiplier. Multiplying by 10 you get

P(1) = 2
P(2) = 3
P(3) = 5

Then you create an array with the inverse values -- '1' goes into elements 1 and 2, '2' into 3 to 6, and so on:

P = (1,1, 2,2,2, 3,3,3,3,3);

and then you can pick a random element from this array instead.

(Add.) Using the probabilities from the example in kiruwka's comment:

int[] numsToGenerate           = new int[]    { 1,   2,    3,   4,    5   };
double[] discreteProbabilities = new double[] { 0.1, 0.25, 0.3, 0.25, 0.1 };

the smallest multiplier that leads to all-integers is 20, which gives you

2, 5, 6, 5, 2

and so the length of numsToGenerate would be 20, with the following values:

1 1
2 2 2 2 2
3 3 3 3 3 3
4 4 4 4 4
5 5

The distribution is exactly the same: the chance of '1', for example, is now 2 out of 20 -- still 0.1.

This is based on your original probabilities all adding up to 1. If they do not, multiply the total by this same factor (which is then going to be your array length as well).

Decreasing chance of choosing a number from a list of consecutive numbers

This seems like you would want to work with a generic probability distribution whose domain can be scaled to your liking. You could chose any function such that f(0) = 0 and f(1) = 1. For this example I will take f(x) = x^2.

To get a random numbers from here - with more values concentrated closer to 0 - we can do the following:

numbers = ceil(max * f(rand()))

where ceil is the ceiling function, max is the highest output you would like, f() is the function you chose, and rand() gives a random number between zero and one. Do note that the outputs of this function would be in a range from 1 to max and not 0 to max.

The following graph should give you some idea of why this actually works:

Graph of <code>ceil(f(x))</code> for <code>max == 10</code>)

Notice there is a diminishing chance of an integer being chosen as the integers grow larger - i.e. the ceil(max*f(x)) is equal to one the "longest" and 10 the "shortest".

If you would like a direct relationship between the number chosen and its magnitude you would simply need to chose a different f(x). However, at this point this is turning into more of a mathematics question than anything else. I will look for a proper f(x) - if i understand what you are looking for at least and get back to you. I am guessing as of now f(x) will be e^x but I will double check.

I hope this helps!

A quick code example:

public int weightedRandom(int max, Random rand) {
     return Math.ceil(((double) max) * Math.pow(rand.nextDouble(), 2));
}

I also printed a couple out in a java program and got the following list where max == 10:

2.0, 6.0, 8.0, 3.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 7.0, 1.0, 4.0, 1.0, 1.0, 6.0, 8.0, 9.0, 7.0, 5.0

Probability of selecting an element from a set

With repetitions, your distribution will be binomial. So let X be the number of times you select some fixed object, with M total selections

P{ X = x } = ( M choose x ) * (1/N)^x * (N-1/N)^(M-x)

You may find this difficult to compute for large N. It turns out that for sufficiently large N, this actually converges to a normal distribution with probability 1 (Central Limit theorem).

In case P{X=x} will be given by a normal distribution. The mean will be M/N and the variance will be M * (1/N) * ( N-1) / N.

Random number to select winners based on probability

I was thinking about this and I think my result makes sense:

sort the vector according to the probability: [a=0.1,b=0.2,c=0.3,d=0.4]
choose a random number (e.g. 0.5)
iterate from the beginning and sum the probability values and stop when it is higher:
answer = 0.1 + 0.2 + 0.3. So, 0.6 > 0.5, we value 'c'

my gist over this is that the values in the end of the vector should have a higher probability of being chosen. I have implemented in python:

values = [0.1,0.2,0.3,0.4]
count_values = len(values)*[0]
answer = len(values)*[0]

iterations = 10000 

for i in range(0,iterations):
    rand = float(random.randint(0,iterations))/iterations
    count = 0
    sum = 0
    while sum <= rand and count <= len(values):
        sum += values[count]
        count += 1
    count_values[count-1]+=1

for i in range(0,len(count_values)):
    answer[i] = float(count_values[i])/iterations

and running several times, I calculated the probability of all elements being chosen, that should match our initial probability:

[0.1043, 0.196, 0.307, 0.3927]
[0.1018, 0.2003, 0.2954, 0.4025]
[0.0965, 0.1997, 0.3039, 0.3999]

Choose Specific Number with Probability