kuniga.me > NP-Incompleteness > Eliminating Amortization

09 Jul 2017

In this chapter Okasaki presents techniques to convert persistent data structures with amortized bounds to worst-case bounds. A few reasons to want worst-case guarantees include:

- Real-time systems, where an expensive operation can cause the system to miss a hard deadline, even if it’s efficient when amortized over time.
- Parallel systems, in which we want avoid one processor to take more time than the other if it happens to execute the expensive operation.

The intuition is to make the theoretical amortized analysis part of the actual implementation. In the amortized analysis, we saw either through the *Banker’s method* or the *Physicist method* the idea of paying debt ahead of time so by the time an expensive operation takes place, we could show it’s cost had already been distributed throughout previous operations.

To eliminate the amortization, we can literally pay off these costs when running the algorithm. We do it through a structure called **schedule**, which contains the unevaluated operations on the data structure. Whenever we perform an operation, we evaluate one or more items of the schedule. Due to memoization, by the time we actually need the evaluated structure, it will be evaluated.

Often the amortized analysis can dictate how the execution of the schedule is performed. For example, if the analysis tells us to pay off 2 units of debt on every action, that can be translated to executing 2 items from the schedule on every action.

The main challenge in this conversion is to modify the data structure in such a way it can be partially executed.

We’ll discuss an example using queues and then binomial heaps.

For the queue structure, the costly operation is the rotation that takes place when we need to combine the rear with the front. It’s a monolithic operation, so we need to change that.

Let’s start by defining the structure. It’s similar to the stream queue, except that we don’t need the rear to be a stream and we have a `schedule`

field, which is also a stream. The idea is that whenever a rotation happens, it will be “scheduled” to be executed little by little.

The `rotation()`

function is the most complicated part of the structure. The invariant is that we only call this function when `|rear| = |front| + 1`

.

We construct a stream with the elements of rear reversed, `newSchedule`

and on the return step of the recursion we append the elements of `front`

to that stream.

Note that this is performed lazily, so a call to rotate only executes one step.

Now we have the execution of the schedule. It serves two purposes. One is to execute an item from the schedule and the other is to trigger a rotation when the is schedule empty.

The schedule being empty is a proxy to when `|rear| = |front| + 1`

. Why? Because when the schedule is populated, it has the same size of front (they’re both assigned the same stream), and rear is empty. Whenever we insert an element, increasing the size of rear by one, or remove an element, reducing the size of front by one, we decrease the difference between `|front| - |rear|`

by 1, and so the size of the schedule is decreased by 1.

Implementation-wise, maybe it would be more clear if we checked for the empty schedule outside and only called `exec()`

is it was not empty.

With these in place, the usual operations for queue are straightforward. The ones that mutate the queue, `push()`

and `pop()`

, need to make sure to call `exec()`

.

We discussed Binomial Heaps in a previous post. The idea is that a binomial heap is a list of binomial trees, an a binomial tree is defined recursively based on it’s rank.

The heap is represented by a binary number (as a list of digits). If the k-th digit of the number is 1, it indicates the existence of a binomial tree of rank k (containing 2^(k-1)). A heap with `n`

elements, has a unique representation, which is the binary representation of `n`

.

For example, if `n = 10`

, then the binary representation of the heap is `1010`

, meaning it contains a binomial tree of rank 2 (2 elements), and one with rank 4 (8 elements), holding 10 elements in total.

Inserting an element into the heap is analogous to incrementing a binary number by 1. We first create a binomial tree with rank 0 containing that element, and try to insert it into the beginning of the digits list. If the position we want to insert is occupied, we need to merge the trees, to generate a tree with a higher rank, which is analogous to the carry over operation when adding two binary numbers.

The schedule is a list of all the numbers generated when any operation is performed.

The structure for the heap is then the following:

As we discussed above, the insertion is analogous to incrementing the binary number. But because the digits are a stream, we need to deal with lazy evaluation:

Executing the schedule consists in evaluating one digit from the first number on the schedule. The key is to avoid evaluating the part of the number that already has been evaluated. It’s possible to show this happens when we find the first digit one. The intuition is that the trailing zeros from the current number were 1’s before the increment, so there was a mutation (linking) while the remaining digits were not modified by that operation.

For example, if we have the number `101011`

, an increment causes it to become `101100`

. The two trailing zeros in `101100`

correspond to a linking of the binomial tree.

Finally, inserting a new element consists in creating a binomial tree of ranking 0, insert it, and then execute the schedule.

The full code is available on github.

Because we now have several different implementations of queues, I wanted to implement tests that were applicable to any queue implementing an “interface”.

One way to do that is to put the interface in a separate module, IQueue.ml and have each implementation require it as their module type:

To make the test work with any module implementing the `IQueue`

interface, we can use a functor (module transformation) and

Other things I’ve learned were matching lazy patterns. Matching a lazily evaluated variable with the keyword `lazy`

forces the evaluation, so we can use a cleaner syntax, for example:

Another syntactic construct that helps with legibility is the record. The examples in Okazaki’s book use tuples for most of composed structs, but I prefer the more verbose and explicit records.

Finally, another lesson learned is that adding an insertion operation to the `Stream`

API is likely wrong. The idea of the stream is that is avoids evaluation of its tail, so the whole block has to lazily evaluated.

For example, in the queue implementation, in the first block, we will evaluate all the rotation and then make the result lazy, which is not what we want.

Eliminating evaluation is a very interesting technique, but it has limited application is practice. It complicates the implementation and, as I learned, it can be tricky to spot bugs (for example, the one in which we’re evaluating the `rotate()`

function) that would only show up if we profile the data structure.