Pick N Distinct Items at Random from Sequence of Unknown Length, in Only One Iteration

Pick N distinct items at random from sequence of unknown length, in only one iteration

Use reservoir sampling. It's a very simple algorithm that works for any N.

Here is one Python implementation, and here is another.
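For reference, a minimal sketch of the basic reservoir algorithm (often called Algorithm R) might look like the following. The function name `reservoir_sample` and the `rng` parameter are illustrative choices, not taken from either linked implementation.

```python
import random

def reservoir_sample(iterable, n, rng=random):
    """Return up to n items chosen uniformly from iterable in a single pass."""
    reservoir = []
    for count, item in enumerate(iterable, start=1):
        if count <= n:
            # The first n items fill the reservoir directly.
            reservoir.append(item)
        else:
            # Keep the new item with probability n / count by
            # overwriting a uniformly chosen slot.
            j = rng.randrange(count)
            if j < n:
                reservoir[j] = item
    return reservoir

# Works on any iterator, including ones whose length is not known up front.
print(reservoir_sample(iter(range(1000)), 5))
```

If the input has fewer than n items, this sketch simply returns all of them.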

How to choose m random numbers from a range but with no duplicate?

Use random.sample from the standard library:

>>> import random
>>> random.sample(range(10), 5) # take 5 random elements from range(10)
[2, 4, 1, 7, 9]

randomly seek a small sequence of a particular length in a larger sequence in python

Instead of choosing individual characters from X with random.choice, choose a starting index. For a contiguous sequence of length 4, pick an index between 0 and len(X) - 4 (inclusive, which is what random.randint gives you) and take the 4 elements starting at that index. Example:

>>> X = 'ATGCATGCTAGCTAGTAAACGTACGTACGTACGATGCTAATATAGAGGGGCTTCGTACCCCTGA'
>>> import random
>>> i = random.randint(0,len(X)-4)
>>> X[i:i+4]
'TGCA'
>>> i
1
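The same idea can be wrapped in a small helper for any window length. This is only a sketch; the name `random_window` and the choice to raise `ValueError` for an oversized window are assumptions, not part of the original answer.

```python
import random

def random_window(seq, length, rng=random):
    """Return a random contiguous slice of `length` items from seq."""
    if length > len(seq):
        raise ValueError("window is longer than the sequence")
    # Valid start indices run from 0 to len(seq) - length inclusive.
    start = rng.randrange(len(seq) - length + 1)
    return seq[start:start + length]

X = 'ATGCATGCTAGCTAGTAAACGTACGTACGTACGATGCTAATATAGAGGGGCTTCGTACCCCTGA'
print(random_window(X, 4))
```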

Uniformly and randomly choose M elements from N elements - confused

I think your confusion may come from not distinguishing between choosing a sequence and choosing a set.

In your first procedure, just because you only have a 1/N chance of choosing a particular element in the first round doesn't mean you won't choose it in a subsequent round. The element has a 1/N chance of being the first element in the result, but it has an M/N chance of being chosen during some round. So that works. Take M=2, N=4: the chance of an element being picked is 1/4 (picked in round one) + (3/4)*(1/3) (missed in round one, picked in round two) = 2/4.

As for your second procedure, following the shuffle, each element's position within the array is uniformly distributed, so there's an M/N chance that its (1-based) position is less than or equal to M (and is hence chosen). So that works too.
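To check the M/N figure exactly rather than by simulation, one option is to enumerate every ordering of a small input and count how often a given element falls in the first M positions. The first M entries of a uniform ordering are distributed the same way as M draws without replacement, so this covers both procedures. The helper name `inclusion_probability` is made up for this sketch.

```python
from fractions import Fraction
from itertools import permutations

def inclusion_probability(n, m, element=0):
    """Exact chance that `element` is among the first m of a uniform ordering."""
    orders = list(permutations(range(n)))
    hits = sum(1 for order in orders if element in order[:m])
    return Fraction(hits, len(orders))

print(inclusion_probability(4, 2))  # 1/2, matching 2/4 from the example
```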

Select N random elements from a List<T> in C#

Iterate through the list and, for each element, select it with probability (number still needed)/(number left).

So if you had 40 items and needed 5, the first would have a 5/40 chance of being selected. If it is selected, the next has a 4/39 chance; otherwise it has a 5/39 chance. By the time you get to the end you will have your 5 items, and often you'll have all of them before that.

This technique is called selection sampling. It is closely related to reservoir sampling, but unlike reservoir sampling it needs the total number of items up front. It's similar in performance to shuffling the input, but of course allows the sample to be generated without modifying the original data.
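The original question was about C#, but the idea is language-neutral. A rough Python sketch, assuming the input is a sequence whose length is known, could look like this (the name `selection_sample` is illustrative):

```python
import random

def selection_sample(items, k, rng=random):
    """Pick k items from a sequence of known length, keeping input order."""
    chosen = []
    needed = k
    remaining = len(items)
    for item in items:
        # Select with probability (number needed) / (number left).
        if rng.random() * remaining < needed:
            chosen.append(item)
            needed -= 1
        remaining -= 1
        if needed == 0:
            # Everything required has been picked; stop early.
            break
    return chosen

print(selection_sample(list(range(40)), 5))
```

Note that the selected items come out in their original order, so shuffle the result if a random order is also required.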

Python pick 20 random results from list

If you want 20 unique values in random order, use random.sample():

random.sample(coordinates, 20)
random.sample(population, k)

Return a k length list of unique elements chosen from the population sequence or set. Used for random sampling without replacement.

>>> random.sample(coordinates, 20)
[[80, 60], [40, 100], [80, 100], [60, 80], [60, 100], [40, 60], [40, 80], [80, 120], [120, 140], [120, 100], [100, 80], [40, 120], [80, 140], [100, 140], [20, 80], [120, 80], [100, 100], [20, 40], [120, 120], [100, 120]]

You could call random.choice() 20 times, but the results are not guaranteed to be unique: each call picks independently, so elements may be duplicated:

>>> [random.choice(coordinates) for _ in range(20)]
[[80, 80], [40, 140], [80, 140], [60, 60], [120, 100], [20, 120], [100, 80], [120, 100], [20, 60], [100, 120], [100, 40], [80, 80], [100, 80], [80, 120], [20, 40], [100, 80], [60, 80], [80, 140], [40, 40], [120, 40]]

