Select N Random Elements from a List<T> in C#

Select N random elements from a List<T> in C#

Iterate through the list and, for each element, make the probability of selection = (number still needed)/(number of elements left).

So if you had 40 items and needed 5, the first would have a 5/40 chance of being selected. If it is, the next has a 4/39 chance; otherwise it has a 5/39 chance. By the time you get to the end you will have your 5 items, and often you'll have all of them before that.

This technique is called selection sampling, a special case of Reservoir Sampling. It's similar in performance to shuffling the input, but of course allows the sample to be generated without modifying the original data.
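
The approach described above can be sketched roughly as follows. This is only a minimal illustration, not the answer author's code: the class name SelectionSampling and the method name TakeRandom are hypothetical, and the caller is assumed to supply a shared Random instance.

```csharp
using System;
using System.Collections.Generic;

public static class SelectionSampling
{
    // Hypothetical helper: returns `count` items chosen uniformly at random,
    // in their original order, without modifying `source`.
    public static List<T> TakeRandom<T>(IReadOnlyList<T> source, int count, Random random)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (random == null) throw new ArgumentNullException(nameof(random));
        if (count < 0 || count > source.Count)
            throw new ArgumentOutOfRangeException(nameof(count));

        var result = new List<T>(count);
        int needed = count;

        // Stop early once every required item has been picked.
        for (int i = 0; i < source.Count && needed > 0; i++)
        {
            int left = source.Count - i;

            // random.Next(left) is uniform on [0, left), so this branch
            // is taken with probability needed / left.
            if (random.Next(left) < needed)
            {
                result.Add(source[i]);
                needed--;
            }
        }

        return result;
    }
}
```

Calling SelectionSampling.TakeRandom(items, 5, rng) on a 40-element list returns 5 distinct elements. Because the list is scanned front to back, the sample keeps the source order, so shuffle the result if the order must be random too.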

Randomize a List<T>

Shuffle any (I)List with an extension method based on the Fisher-Yates shuffle:

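// Shared instance: created once and reused across calls.
private static Random rng = new Random();

public static void Shuffle<T>(this IList<T> list)
{
    int n = list.Count;
    while (n > 1) {
        n--;
        // Pick a position in [0, n] and swap it with position n.
        int k = rng.Next(n + 1);
        T value = list[k];
        list[k] = list[n];
        list[n] = value;
    }
}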

Usage:

List<Product> products = GetProducts();
products.Shuffle();

The code above uses the much criticised System.Random class to select swap candidates. It's fast but not as random as it should be. If you need a better quality of randomness in your shuffles, use the random number generator in System.Security.Cryptography like so:

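using System;
using System.Collections.Generic;
using System.Security.Cryptography;
...
public static void Shuffle<T>(this IList<T> list)
{
    // RandomNumberGenerator.Create() is used here because
    // RNGCryptoServiceProvider is marked obsolete in recent .NET versions.
    using (RandomNumberGenerator provider = RandomNumberGenerator.Create())
    {
        // Four bytes per draw; a single byte would never terminate
        // for lists longer than 255 elements.
        byte[] box = new byte[4];
        int n = list.Count;
        while (n > 1)
        {
            // Rejection sampling: discard draws that would bias the modulo.
            uint limit = uint.MaxValue - (uint.MaxValue % (uint)n);
            uint r;
            do
            {
                provider.GetBytes(box);
                r = BitConverter.ToUInt32(box, 0);
            } while (r >= limit);
            int k = (int)(r % (uint)n);
            n--;
            T value = list[k];
            list[k] = list[n];
            list[n] = value;
        }
    }
}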

A simple comparison is available at this blog (WayBack Machine).

Edit: Since writing this answer a couple of years back, many people have commented or written to me to point out the big silly flaw in my comparison. They are of course right. There's nothing wrong with System.Random if it's used in the way it was intended. Instantiating the rng variable inside of the Shuffle method, as the original version of my first example did, is asking for trouble if the method is going to be called repeatedly. Below is a fixed, full example based on a really useful comment received today from @weston here on SO.

Program.cs:

using System;
using System.Collections.Generic;
using System.Linq; // needed for Enumerable.Range
using System.Threading;

namespace SimpleLottery
{
    class Program
    {
        private static void Main(string[] args)
        {
            var numbers = new List<int>(Enumerable.Range(1, 75));
            numbers.Shuffle();
            Console.WriteLine("The winning numbers are: {0}", string.Join(", ", numbers.GetRange(0, 5)));
        }
    }

    public static class ThreadSafeRandom
    {
        // One Random per thread, created lazily on first use.
        [ThreadStatic] private static Random Local;

        public static Random ThisThreadsRandom
        {
            get { return Local ?? (Local = new Random(unchecked(Environment.TickCount * 31 + Thread.CurrentThread.ManagedThreadId))); }
        }
    }

    static class MyExtensions
    {
        public static void Shuffle<T>(this IList<T> list)
        {
            int n = list.Count;
            while (n > 1)
            {
                n--;
                int k = ThreadSafeRandom.ThisThreadsRandom.Next(n + 1);
                T value = list[k];
                list[k] = list[n];
                list[n] = value;
            }
        }
    }
}
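
To illustrate why creating a Random inside a frequently called method is risky, here is a small sketch. It assumes the classic .NET Framework behaviour, where the parameterless constructor seeds from the system clock, so instances created in quick succession can end up with the same seed. The explicit seed below stands in for that situation; the class and method names are hypothetical.

```csharp
using System;

public static class SeedDemo
{
    // Returns true when two Random instances built from the same seed
    // produce identical values for the first `draws` calls to Next().
    public static bool SameSeedSameSequence(int seed, int draws)
    {
        var a = new Random(seed);
        var b = new Random(seed);

        for (int i = 0; i < draws; i++)
        {
            if (a.Next() != b.Next())
            {
                return false;
            }
        }

        return true;
    }
}
```

Two generators with the same seed yield the same sequence, so two shuffles driven by them would produce the same permutation. Reusing one instance, or one per thread as in the example above, avoids this.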

Given a list of length n select k random elements using C#

I would suggest simply shuffling elements as if you were writing a modified Fisher-Yates shuffle, but only bother shuffling the first k elements. For example:

public static void PartialShuffle<T>(IList<T> source, int count, Random random)
{
    for (int i = 0; i < count; i++)
    {
        // Pick a random element out of the remaining elements,
        // and swap it into place.
        int index = i + random.Next(source.Count - i);
        T tmp = source[index];
        source[index] = source[i];
        source[i] = tmp;
    }
}

After calling this method, the first count elements will be randomly picked elements from the original list.

Note that I've specified the Random as a parameter, so that you can use the same one repeatedly. Be careful about threading though - see my article on randomness for more information.
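
If the original list must stay untouched, one option is to run the partial shuffle on a copy and return the first count elements. The sketch below assumes that approach. The wrapper name PickRandom and the class name are hypothetical, and PartialShuffle is repeated only so the snippet is self-contained.

```csharp
using System;
using System.Collections.Generic;

public static class PartialShuffleDemo
{
    // Same algorithm as the PartialShuffle method shown earlier.
    public static void PartialShuffle<T>(IList<T> source, int count, Random random)
    {
        for (int i = 0; i < count; i++)
        {
            int index = i + random.Next(source.Count - i);
            T tmp = source[index];
            source[index] = source[i];
            source[i] = tmp;
        }
    }

    // Hypothetical wrapper: copies the input, partially shuffles the copy,
    // and returns the first `count` elements as the random sample.
    public static List<T> PickRandom<T>(IEnumerable<T> source, int count, Random random)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (random == null) throw new ArgumentNullException(nameof(random));

        var copy = new List<T>(source);
        if (count < 0 || count > copy.Count)
            throw new ArgumentOutOfRangeException(nameof(count));

        PartialShuffle(copy, count, random);
        return copy.GetRange(0, count);
    }
}
```

The copy costs O(n) time and memory, so for a very large list and a small count the in-place version may be preferable when mutation is acceptable.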

How to access random item in list?

  1. Create an instance of Random class somewhere. Note that it's pretty important not to create a new instance each time you need a random number. You should reuse the old instance to achieve uniformity in the generated numbers. You can have a static field somewhere (be careful about thread safety issues):

    static Random rnd = new Random();
  2. Ask the Random instance for a random index. The upper bound passed to Next is exclusive, so pass the number of items in the ArrayList:

    int r = rnd.Next(list.Count);
  3. Display the string:

    MessageBox.Show((string)list[r]);
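
Putting the three steps together for a generic list, a minimal sketch might look like this. The class name RandomItemDemo and the method name PickOne are hypothetical, and the single shared Random is assumed to be used from one thread only.

```csharp
using System;
using System.Collections.Generic;

public static class RandomItemDemo
{
    // Step 1: one shared instance, reused for every call (not thread-safe).
    private static readonly Random Rnd = new Random();

    // Steps 2 and 3: draw an index in [0, list.Count) and return that item.
    public static T PickOne<T>(IList<T> list)
    {
        if (list == null) throw new ArgumentNullException(nameof(list));
        if (list.Count == 0)
            throw new InvalidOperationException("Cannot pick from an empty list.");

        return list[Rnd.Next(list.Count)];
    }
}
```

For example, RandomItemDemo.PickOne(new List<string> { "alpha", "beta", "gamma" }) returns one of the three strings. With a typed list there is no need for the cast used with ArrayList.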

