# Probability

Something a little lighter today: a probability puzzle. Consider this snippet of code:

``````def roll(a,b):
# Assume that a and b are both integers greater than 0
return random.randint(0,a-1) > random.randint(0,b-1)``````

What is the probability that `roll()` returns `True` for any given `a` and `b`? The answer to this turns out to be a piecewise function, which we derive below.

### Analysis

What, exactly, is the `roll()` function doing? Well, it’s generating a number (let’s call it X) in the range [0, a) and comparing it to a second generated number (let’s call it Y) in the range [0, b). The `roll()` function returns `True` iff X is greater than Y.

### Decomposition

Let’s begin by finding the probability that the function will return `True` for any particular X; let’s call this probability P(X). P(X) is equal to the number of possible Ys less than X divided by the total number of possible Ys. The denominator of this expression is simple enough: There are `b` possible Ys. The numerator is a little more complicated: It is the lesser of X and `b`. Therefore:

``P(X) = min(X, b)/b``

The logic behind this expression’s numerator is that while there are X integers in the range [0, X), not all of them are necessarily in the range [0, b); `b` may be less than X.

### Summation

The probability that `roll()` returns `True` is equal to the sum, over every possible X, of the product of the probability of generating that X, and P(X). The probability of generating any particular X is just 1/a, so the overall probability that `roll()` returns `True` is the sum of P(X)/a for all X in the range [0, a).

### Splitting the Sum

Since P(X) has that awkward min in it, it makes sense, for certain `a` and `b`, to split this summation into two parts: a sum of those P(X)/a in the range [0, b), for which P(X) equals X/b, and a sum of those P(X)/a in the range [b, a), for which P(X) equals 1. Note that this split only makes sense when `a` is greater than `b`.

The terms in the first part of the split sum are of the form X/ab; this makes the sum equal to 1/ab multiplied by the sum of the X in the range [0, b), which works out to (b-1)(b)/2. The overall value of the first part of the split sum is therefore (b-1)/2a.

The terms in the second part of the split sum are of the form 1/a; this makes the sum equal to 1/a multiplied by the number of terms in the range [b, a), which works out to (a-b). The overall value of the second part of the split sum is therefore (a-b)/a.

Adding the first and second parts of the split sum together yields the total (2a-b-1)/2a.

### Piecewise

As mentioned above, it doesn’t make sense to split the sum when `a` is less than or equal to `b`; in that case, the second sum contains no terms, and so we must evaluate the sum of the P(X)/a terms, where P(X) equals X/b, over the range [0, a).

The terms in this sum are of the form X/ab; this makes the sum equal to 1/ab multiplied by the sum of the X in the range [0, a), which works out to (a-1)(a)/2. The overall value of the sum is therefore (a-1)/2b.

The split and non-split forms of the sum are not (in general) equal; this means that the overall probability that `roll()` will return `True` for a given `a` and `b` is represented by a piecewise 2-argument function:

``````P(a,b):
if (a > b):	(2a-b-1)/2a
o/w:		(a-1)/2b``````
This entry was posted in Python. Bookmark the permalink.