Roulette Wheel Selection

As we saw previously, it’s easy to simulate a random coin flip by generating a random decimal $r$ between $0$ and $1$ and applying the following function:

$\begin{align*} \textrm{outcome}(r) = \begin{cases} \textrm{heads} & \textrm{ if } r < 0.5 \\ \textrm{tails} & \textrm{ otherwise} \end{cases} \end{align*}$


This is a special case of a more general idea: sampling from a discrete probability distribution. Flipping a fair coin is tantamount to sampling from the distribution [0.5, 0.5], i.e. $0.5$ probability heads and $0.5$ probability tails.

More complicated contexts may require sampling from longer distributions that may or may not be uniform. For example, if we wish to simulate the outcome of rolling a die with two red faces, one blue face, one green face, and one yellow face, then we need to sample from the distribution [0.4, 0.2, 0.2, 0.2].

Note that when we sample from the distribution, we need only sample an index from the distribution, and then use the index to look up the desired value. For example, when we sample an index from the distribution [0.4, 0.2, 0.2, 0.2], we have probabilities $0.4,$ $0.2,$ $0.2,$ and $0.2$ of getting indices $0,$ $1,$ $2,$ and $3$ respectively. Then, all we need to do is look up that index in the following array: [red, blue, green, yellow].

Roulette Wheel Selection

Roulette wheel selection is an elegant technique for sampling an index from an arbitrary discrete probability distribution. It works as follows:

  1. turn the distribution into a cumulative distribution,
  2. sample a random number $r$ between $0$ and $1,$ and then
  3. find the index of the last value in the cumulative distribution that is less than $r.$

To illustrate, let’s sample an index from the distribution [0.4, 0.2, 0.2, 0.2] that was mentioned above in the context of a die roll.

  1. First, we construct the cumulative distribution: [0.4, 0.4 + 0.2, 0.4 + 0.2 + 0.2, 0.4 + 0.2 + 0.2 + 0.2], or more simply, [0.4, 0.6, 0.8, 1.0].
  2. Then, we sample a random number $r$ between $0$ and $1.$ Suppose we get $r = 0.93.$
  3. Finally, we find the index of the last value in the cumulative distribution that is less than $r = 0.93.$ In our case, this is the value $0.8$ at index $2,$ because the next value ($1.0$ at index $3$) is greater than $r = 0.93.$

So, we have sampled the index $2.$

Exercise

Write a function random_draw(distribution) that samples a random number from the given distribution, an array where distribution[i] represents the probability of sampling index i.

To test your function on a particular distribution, sample many indices from the distribution and ensure that the proportion of times each index gets sampled matches the corresponding probability in the distribution. Do this for a handful of different distributions.

Leave a Comment