Algorithm to Shuffle an Array Randomly Based on Different Weights

Algorithm to shuffle an Array randomly based on different weights

I can think of two approaches to solve it, though my gut tells me there should be modification for Fisher-Yates to achieve it even better:

O(n*W) solution: (simple to program)

First approach, create duplicates according to the weight (same as your approach), and populate a new list. Now run a standard shuffle (fisher-yates) on this list. Iterate the list and discard all duplicates, and keep only the first occurance of each element. This runs in O(n*W), where n is the number of elements in the list, and W is the average weight (pseudo-polynomial solution).

O(nlogn) solution: (significantly harder to program)

Second approach would be to create a list of sum of weights of the elements:

sum[i] = weight[0] + ... + weight[i]

Now, draw a number, between 0 to sum[n], and chose the first element whose sum is greater/equals this random number.

This will be the next element, discard the element, recreate the list, and repeat.

This runs in O(n^2*logn)

It can be further enhanced by creating a binary tree rather than a list, where each node also stores the value of weights of the entire subtree.

Now, after choosing an element, find the matching element (whose sum up to him is the first one higher than the random selected number), delete the node, and recalculate the weights on the path to the route.

This will take O(n) to create the tree, O(logn) to find the node at each step, and O(logn) to recalculate the sum. Repeat it until the tree is exhausted, and you get O(nlogn) solution.

The idea of this approach is very similar to Order Statistics Trees, but using the sum of weights rather than the number of descendants. The search and balancing after deletion will be done simiarly to order statistics tree.

Explanation to constructing and using the binary tree.

Assume you have elements=[a,b,c,d,e,f,g,h,i,j,k,l,m] with weights=[1,2,3,1,2,3,1,2,3,1,2,3,1]

First construct an almost full binary tree, and populate the elements in it. Note that the tree is NOT Binary search tree, just a regular tree, so order of elements does not matter - and we won't need to maintain it later on.

You will get something like the following tree:

Sample Image

Legend: w - weight of that node, sw - sum of weight for the entire subtree.

Next, calculate sum of weights for each subtree. Start from the leaves, and calculate s.w = w. For every other node calculate s.w = left->s.w + right->s.w, filling the tree from the bottom up (post order traversal).

Sample Image

Building the tree, populating it, and calculating s.w. for each node is done in O(n).

Now, you iteratively need to chose a random number between 0 to sum of weights (the s.w. value of the root, in our case 25). Let that number be r, and find for each such number the matching node.

Finding the matching node is done recursively

if `r< root.left.sw`:
   go to left son, and repeat. 
else if `r<root.left.sw + root.w`:
   the node you are seeking is the root, choose it. 
else:
   go to `root.right` with `r= r-root.left.sw - root.w`

Example, chosing r=10:

Is r<root.left.sw? Yes. Recursively invoke with r=10,root=B (left child)
Is r<root.left.sw No. Is r < root.left.sw + root.w? No. Recursively invoke with r=10-6-2=2, and root=E (right chile)
Is r<root.left.sw? No. Is r < root.left.sw + root.w? Yes. Choose E as next node.

This is done in O(h) = O(logn) per iteration.

Now, you need to delete that node, and reset the weights of the tree.

One approach to deleting that ensures the tree is in logarithmic weight is smilar to binary heap: Replace the chosen node with the bottom rightest node, remove the new rightest bottom node, and rebalance the two branches going from the two involved nodes to the tree.

First switch:

Sample Image

Then recalculate:

Sample Image

Note that recalculation is needed only to two paths, each of depth at most O(logn) (the nodes colored orange in the pic), so deletion and recalculation is also O(logn).

Now, you got yourself a new binary tree, with the modified weights, and you are ready to choose the next candidate, until the tree is exhausted.

Randomly shuffle a weighted array

Here's a most likely inefficient but hopefully effective enough solution:
(Although I make no promises about correctness! Plus the code isn't going to make too many Rubyists happy...).

The essence of the algorithm is as simple as picking an element randomly based on its weight, removing it, and then repeating with the remaining elements.

def shuffle some_hash
   result = []

   numbers = some_hash.keys
   weights = some_hash.values
   total_weight = weights.reduce(:+)

   # choose numbers one by one
   until numbers.empty?
      # weight from total range of weights
      selection = rand() * total_weight

      # find which element this corresponds with
      i = 0
      while selection > 0
         selection -= weights[i]
         i += 1
      end
      i -= 1

      # add number to result and remove corresponding weight
      result << numbers[i]
      numbers.delete_at i
      total_weight -= weights.delete_at(i)
   end

   result
end

Weighted Shuffle of an Array or Arrays?

I was able to solve this problem like so:

function compare($a, $b)
{
  $share_of_a = $a['rank'];
  $share_of_b = $b['rank'];
  return mt_rand(0, ($share_of_a+$share_of_b)) > $share_of_a ? 1 : -1;
}

usort($array, "compare"); // Sort the array using the above compare function when comparing
$array = array_reverse($array);

Weighted random selection from array

Compute the discrete cumulative density function (CDF) of your list -- or in simple terms the array of cumulative sums of the weights. Then generate a random number in the range between 0 and the sum of all weights (might be 1 in your case), do a binary search to find this random number in your discrete CDF array and get the value corresponding to this entry -- this is your weighted random number.

Shuffle elements of an array/n numbers uniformly randomly. Possibley in expected O(n) time

You should look at the Fisher-Yates shuffle.

From the article:

Properly implemented, the Fisher–Yates
shuffle is unbiased, so that every
permutation is equally likely. The
modern version of the algorithm is
also rather efficient, requiring only
time proportional to the number of
items being shuffled and no additional
storage space.

So it meets your requirements. It's pretty easy to implement too.

C++. Weighted std::shuffle

If OP intent is to shuffle a list r of items

such that, given a list of weights w, the element a[i] with weight w[i] should be the first element of the random shuffle r with probability w[i]/sum(w).

As stated in the page linked by Severin Pappadeux:

Weighted random shuffling is the same as weighted random sampling from a list a without replacement. That is, choose with probability w[i]/sum(w) element a[i] from a. Store this element in a list r. Then, remove element a[i] from a and w[i] from w, and select a new element of the modified list a, and so on until a is empty.

I'am not aware of such an algorithm in the Standard Library, but a simple implementation could be:

#include <random>
#include <algorithm>
#include <iterator>

template <class D, class W, class URBG>
void weighted_shuffle
    ( D first, D last
    , W first_weight, W last_weight
    , URBG&& g )
{
    while (first != last and first_weight != last_weight)
    {
        std::discrete_distribution dd(first_weight, last_weight);
        auto i = dd(g);
        if ( i )
        {
            std::iter_swap(first, std::next(first, i));
            std::iter_swap(first_weight, std::next(first_weight, i));
        }
        ++first;
        ++first_weight;
    }
}

Live example HERE.

How to randomize (shuffle) a JavaScript array?

The de-facto unbiased shuffle algorithm is the Fisher-Yates (aka Knuth) Shuffle.

You can see a great visualization here (and the original post linked to this)

function shuffle(array) {
  let currentIndex = array.length,  randomIndex;

  // While there remain elements to shuffle.
  while (currentIndex != 0) {

    // Pick a remaining element.
    randomIndex = Math.floor(Math.random() * currentIndex);
    currentIndex--;

    // And swap it with the current element.
    [array[currentIndex], array[randomIndex]] = [
      array[randomIndex], array[currentIndex]];
  }

  return array;
}

// Used like so
var arr = [2, 11, 37, 42];
shuffle(arr);
console.log(arr);

Weighted Randomized Ordering

This seems to work fine:

// Can do a weighted sort on weighted items.
public interface Weighted {
    int getWeight();
}

/**
 * Weighted sort of an array - orders them at random but the weight of each
 * item makes it more likely to be earlier.
 *
 * @param values
 */
public static void weightedSort(Weighted[] values) {
    // Build a list containing as many of each item to make up the full weight.
    List<Weighted> full = new ArrayList<>();
    for (Weighted v : values) {
        // Add a v weight times.
        for (int i = 0; i < v.getWeight(); i++) {
            full.add(v);
        }
    }
    // Shuffle it.
    Collections.shuffle(full);
    // Roll them out in the order required.
    int i = 0;
    do {
        // Get the first one in the shuffled list.
        Weighted next = full.get(0);
        // Put it back into the array.
        values[i++] = next;
        // Remove all occurrences of that one from the list.
        full.remove(next);
    } while (!full.isEmpty());
}

// A bunch of weighted items.
enum Heavies implements Weighted {

    Rare(1),
    Few(3),
    Common(6);
    final int weight;

    Heavies(int weight) {
        this.weight = weight;
    }

    @Override
    public int getWeight() {
        return weight;
    }
}

public void test() {
    Weighted[] w = Heavies.values();
    for (int i = 0; i < 10; i++) {
        // Sort it weighted.
        weightedSort(w);
        // What did we get.
        System.out.println(Arrays.toString(w));
    }
}

Essentially for each item to be sorted I add it as many times as needed to a new list. I then shuffle the list and pull the top one out and clar all occurrences of it from the remaining.

The last test run produced:

[Rare, Common, Few]
[Common, Rare, Few]
[Few, Common, Rare]
[Common, Few, Rare]
[Common, Rare, Few]
[Few, Rare, Common]

which seems to be about right.

NB - this algorithm will fail under at least the following conditions:

The original array has the same object in it more than once.
The weights of the items are insanely huge.
Zero or negative weights will almost certainly mess with the results.

Added

This implements Rossum's idea - please be sure to give him the credit for the algorithm.

public static void weightedSort2(Weighted[] values) {
    // Calculate the total weight.
    int total = 0;
    for (Weighted v : values) {
        total += v.getWeight();
    }
    // Start with all of them.
    List<Weighted> remaining = new ArrayList(Arrays.asList(values));
    // Take each at random - weighted by it's weight.
    int which = 0;
    do {
        // Pick a random point.
        int random = (int) (Math.random() * total);
        // Pick one from the list.
        Weighted picked = null;
        int pos = 0;
        for (Weighted v : remaining) {
            // Pick this ne?
            if (pos + v.getWeight() > random) {
                picked = v;
                break;
            }
            // Move forward by that much.
            pos += v.getWeight();
        }
        // Removed picked from the remaining.
        remaining.remove(picked);
        // Reduce total.
        total -= picked.getWeight();
        // Record picked.
        values[which++] = picked;
    } while (!remaining.isEmpty());
}

Algorithm to Shuffle an Array Randomly Based on Different Weights