A weighted version of random.choice
Since version 1.7.0, NumPy has had a choice function that supports probability distributions.
from numpy.random import choice
draw = choice(list_of_candidates, number_of_items_to_pick,
p=probability_distribution)
Note that probability_distribution is a sequence in the same order as list_of_candidates, and its entries must sum to 1. You can also pass the keyword replace=False so that drawn items are not replaced, which means each candidate can be picked at most once.
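As a rough check of the call above, the following sketch draws many items and compares the observed frequencies with the requested probabilities. The candidate names, the probabilities, and the seed are made up for illustration.

```python
import numpy as np

# Hypothetical example data: three candidates and their selection probabilities.
list_of_candidates = ["red", "green", "blue"]
probability_distribution = [0.7, 0.2, 0.1]  # same order as the candidates, sums to 1

np.random.seed(0)  # seeded only so that the run is repeatable

# Draw 10,000 items with replacement and measure how often each one appears.
draw = np.random.choice(list_of_candidates, 10_000, p=probability_distribution)
values, counts = np.unique(draw, return_counts=True)
frequencies = dict(zip(values.tolist(), (counts / draw.size).tolist()))
print(frequencies)
```

The printed frequencies should land close to 0.7, 0.2 and 0.1, though never exactly on them.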
Weighted random sample without replacement in python
You can use np.random.choice with replace=False as follows:
np.random.choice(vec,size,replace=False, p=P)
where vec is your population and P is the weight vector.
For example:
import numpy as np
vec=[1,2,3]
P=[0.5,0.2,0.3]
np.random.choice(vec,size=2,replace=False, p=P)
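A small sketch of what replace=False implies, reusing vec and P from the example: the drawn items are always distinct, and asking for more items than the population contains raises a ValueError. The variable names sample and error_message are only for illustration.

```python
import numpy as np

vec = [1, 2, 3]
P = [0.5, 0.2, 0.3]

# Two draws without replacement: the two values can never be equal.
sample = np.random.choice(vec, size=2, replace=False, p=P)
print(sample)

# Requesting 4 distinct items from a population of 3 is impossible.
error_message = None
try:
    np.random.choice(vec, size=4, replace=False, p=P)
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```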
2D version of numpy random choice with weighting
I don't think it's possible to directly specify a 2D array of probabilities (p has to be 1-dimensional), so raveling should be fine. However, to get the corresponding 2D indices back from the flat index, you can use np.unravel_index:
index= np.unravel_index(xy.item(), x.shape)
# (4, 2)
For multiple indices, you can just stack the result:
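# this works when the values of x are its own flat indices, e.g. x = np.arange(p.size).reshape(p.shape)
xy = np.random.choice(x.flatten(), 3, p=p.flatten())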
indices = np.unravel_index(xy, x.shape)
# (array([4, 4, 5], dtype=int64), array([1, 2, 3], dtype=int64))
np.c_[indices]
# array([[4, 1],
#        [4, 2],
#        [5, 3]], dtype=int64)
where np.c_ stacks the arrays as columns and gives the same result as np.column_stack(indices).
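Putting the pieces together, here is a self-contained sketch that samples flat positions directly from a 2D probability grid and converts them back to (row, column) pairs. The grid p is a made-up example, and sampling from p.size rather than from a separate array x is a choice made here to keep the example short.

```python
import numpy as np

# Hypothetical 2D probability grid; all entries together must sum to 1.
p = np.array([[0.1, 0.0, 0.2],
              [0.3, 0.4, 0.0]])

# Sample flat positions 0..p.size-1, weighted by the raveled probabilities.
flat = np.random.choice(p.size, size=5, p=p.ravel())

# Convert each flat position back to a (row, column) pair.
rows, cols = np.unravel_index(flat, p.shape)
pairs = np.column_stack((rows, cols))
print(pairs)
```

Cells with probability 0 never show up in pairs.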
Python: unique weighted random values
You could take all the samples at once using numpy's random.choice with the replace=False option (assuming the weights are simply renormalized between draws) and store them using multiple assignment, which gets it into one line of code. Note that the values passed as p must sum to 1.
import numpy as np
slot_1, slot_2, slot_3 = np.random.choice(list(Weights.ITEM.keys()), size = 3, replace=False, p=list(Weights.ITEM.values()))
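If the stored weights are raw scores rather than probabilities, they need to be normalized first. The sketch below assumes a plain dict named item_weights as a stand-in for Weights.ITEM; the item names and numbers are invented.

```python
import numpy as np

# Hypothetical stand-in for Weights.ITEM: raw weights that do not sum to 1.
item_weights = {"sword": 5, "shield": 3, "potion": 10, "ring": 2}

items = list(item_weights.keys())
raw = np.array(list(item_weights.values()), dtype=float)
probabilities = raw / raw.sum()  # np.random.choice requires p to sum to 1

# Three distinct items in one call, unpacked by multiple assignment.
slot_1, slot_2, slot_3 = np.random.choice(items, size=3, replace=False, p=probabilities)
print(slot_1, slot_2, slot_3)
```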
More generally, you could have a function that generated arbitrary length subsamples (k is length, n is number of samples):
def a(n, k, values, weights):
    # draw n*k distinct items in one call, then split them into n groups of k
    groups = np.split(np.random.choice(values, size=n*k, replace=False, p=weights), n)
    return [list(sublist) for sublist in groups]
>>> a(3,5, range(100), [.01]*100)
[[39, 34, 27, 91, 88], [19, 98, 62, 55, 38], [37, 22, 54, 11, 84]]
A weighted version of random.randint
You use
random.choices(range(a,b+1), weights= [....], k=1) # or cum_weights
with a k of 1, a population of range(a,b+1), and the weights you want.
See: https://docs.python.org/3/library/random.html#random.choices
You would have to come up with some (arbitrary) weighting yourself, for example:
import random
from collections import defaultdict
a = 8
b = 32
c = 26
# hacked distribution
w = [(i-a)**2 if i <= c else (b-i+a)**2 for i in range(a,b+1)]
d = defaultdict(int)
for i in range(a, b+1):
    d[i] = 0
# test for 10k numbers
for num in random.choices(range(a, b+1), weights=w, k=10000):
    d[num] += 1
print(w)
print(d)
The result is still random; one run gave me:
# hacked distribution
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225,
256, 289, 324, 169, 144, 121, 100, 81, 64]
# test for 10k numbers
{8: 0, 9: 8, 10: 7, 11: 37, 12: 61, 13: 94, 14: 149, 15: 175, 16: 229,
17: 283, 18: 374, 19: 450, 20: 493, 21: 628, 22: 672, 23: 820, 24: 907,
25: 1038, 26: 1183, 27: 564, 28: 537, 29: 435, 30: 325, 31: 293, 32: 238}
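The same idea can be wrapped in a small helper. This is only a sketch: the name weighted_randint and its length check are not part of the standard library, just one way to package random.choices so it reads like random.randint.

```python
import random
from itertools import accumulate


def weighted_randint(a, b, weights):
    """Return one integer N with a <= N <= b, picked according to weights."""
    population = range(a, b + 1)
    if len(weights) != len(population):
        raise ValueError("need exactly one weight per integer in [a, b]")
    # random.choices returns a list, so take the single element out of it
    return random.choices(population, weights=weights, k=1)[0]


# With all the weight on the last value, only 3 can come out.
print(weighted_randint(1, 3, [0, 0, 1]))

# For many draws, cumulative weights can be computed once and reused.
w = [1, 2, 3, 4]
cum_w = list(accumulate(w))  # [1, 3, 6, 10]
many = random.choices(range(1, 5), cum_weights=cum_w, k=1000)
```

A weight of 0 excludes a value completely, which matches the count of 0 for 8 in the run above.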