Synchronization Efficiency Proof

Last Friday I described an algorithm that would generate lists of insert and delete operations which would transform a subject list s.t. it would match a target list. I wanted the algorithm to be efficient, both in terms of its own runtime, and in terms of the operations it produced.

I concluded that post by writing:

I’m not going to present an argument that the operation set produced by this algorithm is really minimal, because that doesn’t seem like it would be fun.

Well, I can’t think of a good topic for today, so I’m going to present a stab at the proof. (For the curious: No, it wasn’t fun.)

Algorithm

To review, here’s the algorithm. It returns a list of inserts `i`, and a list of deletes `d`, which when applied to the `a` argument will transform it s.t. it is equal to the `b` argument.

``````def sync_ordered_list(a, b):
x = 0; y = 0; i = []; d = []
while (x < len(a)) or (y < len(b)):
if y >= len(b): d.append(x); x += 1
elif x >= len(a): i.append((y, b[y])); y += 1
elif a[x] < b[y]: d.append(x); x += 1
elif a[x] > b[y]: i.append((y, b[y])); y += 1
else: x += 1; y += 1
return (i,d)``````

Lemma 1

Lemma: Let the value at `a[x]` be known as `A`. If `x` is added to `d` on one iteration of the loop, then it is impossible for an insert with value `A` to be added to `i` on a subsequent iteration.

Proof: Assume that `x` is added to `d` when `x` is `X` and `y` is `Y`. This implies that either `Y >= len(b)` or `A < b[Y]`. Any subsequent insert added to `i` would have to have its value taken from `b[z]`, where `z >= Y`. Therefore, if a subsequent insert had the value `A`, there would have to be a `z` s.t.:

• `z >= Y`, and
• `z < len(b)`, and
• `b[z] == A`

We know that at least one of these conditions must be false, since either `Y >= len(b)` or `A < b[Y]`. If `Y >= len(b)`, then `z >= Y` implies that `z >= len(b)`, and the second condition is therefore false. If `A < b[Y]`, then `z >= Y` implies that `A < b[z]`, since `b` is ordered, and the third condition is therefore false. Since no `z` exists that meets all three conditions, there can be no insert with value `A` added to `i` on a subsequent iteration.

Lemma 2

Lemma: Let the value at `b[y]` be known as `B`. If the insert `(y, B)` is added to `i` on one iteration of the loop, then it is impossible for a delete with value `z`, where `a[z] == B`, to be added to `d` on a subsequent iteration.

Proof: Assume that `(y, B)` is added to `i` when `x` is `X` and `y` is `Y`. This implies that either `X >= len(a)` or `a[X] > B`. Any subsequent delete added to `d` would have to be at position z, where `z >= X`. Therefore, if a subsequent delete were of value `B`, there would have to be a `z` s.t.:

• `z >= X`, and
• `z < len(a)`, and
• `a[z] == B`

We know that at least one of these conditions must be false, since either `X >= len(a)` or `a[X] > B`. If `X >= len(a)`, then `z >= X` implies that `z >= len(a)`, and the second condition is therefore false. If `a[X] > B`, then `z >= X` implies that `a[z] > B`, since `a` is ordered, and the third condition is therefore false. Since no `z` exists that meets all three conditions, there can be no delete with value `z`, where `a[z] == B`, added to `d` on a subsequent iteration.

Lemma 3

Lemma: It follows from Lemma 1 and Lemma 2 that no single value may be both deleted from the initial list and inserted into the final list.

Proof: A value `V` could be both inserted and deleted only in one of the following two cases. (For both cases, assume that `a[X] == b[Y] == V` for some `X`, `Y`, and `V`.)

• On one iteration of the loop `X` is added to `d`, and on a subsequent iteration of the loop an insert with value `V` is added to `i`. Note that this is prohibited by Lemma 1.
• On one iteration of the loop `(Y, V)` is added to `i`, and on a subsequent iteration of the loop a delete with value `X` is added to `d`. Note that this is prohibited by Lemma 2.

Since there’s no possible way for any value `V` to be both inserted and deleted, it follows that any given value might be inserted, or deleted, but not both.

Proof of Minimalism

Each insert or delete operation increases or decreases the count of some value in the list on which it is performed. (For instance, if you delete the value at position 2 in the list `[5, 6, 6, 7]`, then you’ve reduced the count of 6’s by 1.) Since we know from Lemma 3 that any given value is (at most) only inserted or deleted by our algorithm, we know that the number of operations on a particular value is equal to the absolute value of the difference between the count of that value in `a`, and the count of that value in `b`.

We also know that the minimum number of operations on a particular value is equal to the absolute value of the difference between the count of that value in `a`, and the count of that value in `b`. So, our algorithm generates the minimum number of operations on each value that occurs in `a` or `b`. Therefore, it also generates the minimum total number of operations. QED.

Share and Enjoy:
This entry was posted in Python. Bookmark the permalink.