# Roulette Wheel Selection

*How to sample from a discrete probability distribution.*

*This post is part of the book Introduction to Algorithms and Machine Learning: from Sorting to Strategic Agents.* *Suggested citation:* Skycak, J. (2021). Roulette Wheel Selection. In *Introduction to Algorithms and Machine Learning: from Sorting to Strategic Agents.* https://justinmath.com/roulette-wheel-selection/

As we saw previously, it’s easy to simulate a random coin flip by generating a random decimal $r$ between $0$ and $1$ and applying the following function:

This is a special case of a more general idea: sampling from a discrete probability distribution. Flipping a fair coin is tantamount to sampling from the distribution `[0.5, 0.5]`

, i.e. $0.5$ probability heads and $0.5$ probability tails.

More complicated contexts may require sampling from longer distributions that may or may not be uniform. For example, if we wish to simulate the outcome of rolling a die with two red faces, one blue face, one green face, and one yellow face, then we need to sample from the distribution `[0.4, 0.2, 0.2, 0.2]`

.

Note that when we sample from the distribution, we need only sample an *index* from the distribution, and then use the index to look up the desired value. For example, when we sample an index from the distribution `[0.4, 0.2, 0.2, 0.2]`

, we have probabilities $0.4,$ $0.2,$ $0.2,$ and $0.2$ of getting indices $0,$ $1,$ $2,$ and $3$ respectively. Then, all we need to do is look up that index in the following array: `[red, blue, green, yellow]`

.

## Roulette Wheel Selection

**Roulette wheel selection** is an elegant technique for sampling an index from an arbitrary discrete probability distribution. It works as follows:

- turn the distribution into a cumulative distribution,
- sample a random number $r$ between $0$ and $1,$ and then
- find the index of the first value in the cumulative distribution that is greater than or equal to $r.$

To illustrate, let’s sample an index from the distribution `[0.4, 0.2, 0.2, 0.2]`

that was mentioned above in the context of a die roll.

- First, we construct the cumulative distribution:
`[0.4, 0.4 + 0.2, 0.4 + 0.2 + 0.2, 0.4 + 0.2 + 0.2 + 0.2]`

, or more simply,`[0.4, 0.6, 0.8, 1.0]`

. - Then, we sample a random number $r$ between $0$ and $1.$ Suppose we get $r = 0.63.$
- Finally, we find the index of the first value in the cumulative distribution that is greater than or equal to $r = 0.63.$ In our case, this is the value $0.8$ at index $2,$ because the next value ($1.0$ at index $3$) is greater than $r = 0.63.$

So, we have sampled the index $2.$

## Exercise

Write a function `random_draw(distribution)`

that samples a random number such that `distribution[i]`

is the probability of sampling index `i`

.

To test your function on a particular distribution, sample many indices from the distribution and ensure that the proportion of times each index gets sampled matches the corresponding probability in the distribution. Do this for a handful of different distributions.

*This post is part of the book Introduction to Algorithms and Machine Learning: from Sorting to Strategic Agents.* *Suggested citation:* Skycak, J. (2021). Roulette Wheel Selection. In *Introduction to Algorithms and Machine Learning: from Sorting to Strategic Agents.* https://justinmath.com/roulette-wheel-selection/