Something a little lighter today: a probability puzzle. Consider this snippet of code:

```
def roll(a,b):
# Assume that a and b are both integers greater than 0
return random.randint(0,a-1) > random.randint(0,b-1)
```

What is the probability that `roll()`

returns `True`

for any given `a`

and `b`

? The answer to this turns out to be a piecewise function, which we derive below.

### Analysis

What, exactly, is the `roll()`

function doing? Well, it’s generating a number (let’s call it X) in the range [0, a) and comparing it to a second generated number (let’s call it Y) in the range [0, b). The `roll()`

function returns `True`

iff X is greater than Y.

### Decomposition

Let’s begin by finding the probability that the function will return `True`

for any particular X; let’s call this probability P(X). P(X) is equal to the number of possible Ys less than X divided by the total number of possible Ys. The denominator of this expression is simple enough: There are `b`

possible Ys. The numerator is a little more complicated: It is the *lesser* of X and `b`

. Therefore:

`P(X) = min(X, b)/b`

The logic behind this expression’s numerator is that while there are X integers in the range [0, X), not all of them are necessarily in the range [0, b); `b`

may be less than X.

### Summation

The probability that `roll()`

returns `True`

is equal to the sum, over every possible X, of the product of the probability of generating that X, and P(X). The probability of generating any particular X is just 1/a, so the overall probability that `roll()`

returns `True`

is the sum of P(X)/a for all X in the range [0, a).

### Splitting the Sum

Since P(X) has that awkward min in it, it makes sense, for certain `a`

and `b`

, to split this summation into two parts: a sum of those P(X)/a in the range [0, b), for which P(X) equals X/b, and a sum of those P(X)/a in the range [b, a), for which P(X) equals 1. Note that this split only makes sense when `a`

is greater than `b`

.

The terms in the first part of the split sum are of the form X/ab; this makes the sum equal to 1/ab multiplied by the sum of the X in the range [0, b), which works out to (b-1)(b)/2. The overall value of the first part of the split sum is therefore (b-1)/2a.

The terms in the second part of the split sum are of the form 1/a; this makes the sum equal to 1/a multiplied by the number of terms in the range [b, a), which works out to (a-b). The overall value of the second part of the split sum is therefore (a-b)/a.

Adding the first and second parts of the split sum together yields the total (2a-b-1)/2a.

### Piecewise

As mentioned above, it doesn’t make sense to split the sum when `a`

is less than or equal to `b`

; in that case, the second sum contains no terms, and so we must evaluate the sum of the P(X)/a terms, where P(X) equals X/b, over the range [0, a).

The terms in this sum are of the form X/ab; this makes the sum equal to 1/ab multiplied by the sum of the X in the range [0, a), which works out to (a-1)(a)/2. The overall value of the sum is therefore (a-1)/2b.

The split and non-split forms of the sum are not (in general) equal; this means that the overall probability that `roll()`

will return `True`

for a given `a`

and `b`

is represented by a piecewise 2-argument function:

```
P(a,b):
if (a > b): (2a-b-1)/2a
o/w: (a-1)/2b
```

Pingback: Things that were not immediately obvious to me » Blog Archive » Vacation