Amortization
Implementations with good amortized bounds are often simpler and faster than implementations with comparable worst-case bounds. Given a sequence of operations, we may wish to know the running time of the entire sequence, but not care about the running time of any individual operation.
For instance, given a sequence of n operations, we may wish to bound the total running time of the sequence by O(n) without insisting that every individual operation run in O(1) time. We might be satisfied if a few operations run in O(log n) or even O(n) time, provided the total cost of the sequence is only O(n). This freedom opens up a wide design space of possible solutions, and often yields new solutions that are simpler and faster than worst-case solutions with equivalent bounds.
To provide an amortized bound, one defines the amortized cost of each operation and then proves that, for any sequence of operations, the total amortized cost of the operations is an upper bound on the total actual cost, $$ \sum_{i=1}^m a_i \geq \sum_{i=1}^m t_i $$ where $ a_i $ is the amortized cost of operation i, $t_i$ is the actual cost of operation i, and m is the total number of operations. Usually, in fact, one proves a slightly stronger result: that at any intermediate stage in a sequence of operations, the accumulated amortized cost is an upper bound on the accumulated actual cost, $$ \sum_{i=1}^j a_i \geq \sum_{i=1}^j t_i $$ for any j. The difference between the accumulated amortized costs and the accumulated actual costs is called the accumulated savings. Thus, the accumulated amortized costs are an upper bound on the accumulated actual costs whenever the accumulated savings are non-negative.
Amortization allows occasional operations to have actual costs that exceed their amortized costs; such operations are called expensive, while operations whose actual costs are less than their amortized costs are called cheap. It is easy to see that expensive operations decrease the accumulated savings and cheap operations increase them. The key to proving amortized bounds is to show that expensive operations occur only when the accumulated savings are sufficient to cover the remaining cost.
There are two techniques for analyzing amortized data structures: the banker’s method and the physicist’s method. In the banker’s method, the accumulated savings are represented as credits that are associated with individual locations in the data structure. These credits are used to pay for future accesses to those locations. The amortized cost of any operation is defined to be the actual cost of the operation plus the credits allocated by the operation minus the credits spent by the operation: $$ a_i = t_i + c_i - \bar{c_i} $$ where $ c_i $ is the number of credits allocated by operation i and $\bar{c_i}$ is the number of credits spent by operation i. Every credit must be allocated before it is spent, and no credit may be spent more than once. Therefore, $\sum c_i \geq \sum \bar{c_i}$, which in turn guarantees that $\sum a_i \geq \sum t_i$, as desired. Proofs using the banker’s method typically define a credit invariant that regulates the distribution of credits in such a way that, whenever an expensive operation might occur, sufficient credits have been allocated in the right locations to cover its cost.
In the physicist’s method, one describes a function $\Phi$ that maps each object d to a real number called the potential of d. The function $\Phi$ is typically chosen so that the potential is initially zero and is always non-negative. Then the potential represents a lower bound on the accumulated savings.
Let $d_i$ be the output of operation i and the input of operation i + 1. Then the amortized cost of operation i is defined to be the actual cost plus the change in potential between $d_{i-1}$ and $d_i$: $$ a_i = t_i + \Phi(d_i) - \Phi(d_{i-1}) $$ so the accumulated actual costs of a sequence of operations are $$ \sum_{i=1}^j t_i = \sum_{i=1}^j a_i + \Phi(d_0) - \Phi(d_j) $$ Provided $\Phi$ is chosen in such a way that $\Phi(d_0)$ is zero and $\Phi(d_j)$ is non-negative, we conclude that the accumulated amortized costs are an upper bound on the accumulated actual costs.
Queue
We illustrate the banker’s and physicist’s methods by analyzing a simple functional implementation of the FIFO queue abstraction:
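A minimal sketch of the queue signature in OCaml (the operation names empty, is_empty, snoc, head, and tail are assumptions):

module type QUEUE = sig
  type 'a t
  val empty : 'a t
  val is_empty : 'a t -> bool
  val snoc : 'a t -> 'a -> 'a t   (* add an element at the rear *)
  val head : 'a t -> 'a           (* first element *)
  val tail : 'a t -> 'a t         (* remove the first element *)
end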
The most common implementation of queues in a purely functional setting is a pair of lists, front and rear, where front contains the front elements of the queue in the correct order and rear contains the rear elements of the queue in reverse order:
type 'a t = 'a list * 'a list
In this representation, the head of the queue is the first element of front, and the last element of the queue is the first element of rear, so the remaining operations can be implemented straightforwardly:
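A sketch of the paired-list implementation, assuming the representation above (errors are signalled with failwith):

let empty = ([], [])
let is_empty (f, _) = f = []

(* Restore the invariant: front may be empty only if rear is empty too. *)
let checkf = function
  | ([], r) -> (List.rev r, [])
  | q -> q

let snoc (f, r) x = checkf (f, x :: r)

let head = function
  | ([], _) -> failwith "empty"
  | (x :: _, _) -> x

let tail = function
  | ([], _) -> failwith "empty"
  | (_ :: f, r) -> checkf (f, r)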
Here we use the auxiliary function checkf, which checks whether front is empty and, if so, reverses rear to serve as the new front. Notice that if front were empty when rear was not, then the first element of the queue would be the last element of rear, which would take O(n) time to access. By maintaining the invariant that front is empty only if rear is also empty, we guarantee that head can always find the first element in O(1) time.
Now we show that snoc and tail both take O(1) amortized time using either the banker’s method or the physicist’s method, even though tail takes O(n) time in the worst case. Using the banker’s method, we maintain a credit invariant that every element in the rear list is associated with a single credit. Every snoc into a non-empty queue takes one actual step and allocates a credit to the new element of the rear list, for an amortized cost of two. Every tail that does not reverse the rear list takes one actual step and neither allocates nor spends any credits, for an amortized cost of one. Finally, every tail that does reverse the rear list takes $ m+1 $ actual steps, where m is the length of the rear list, and spends the m credits contained by that list, for an amortized cost of $ m + 1 - m = 1 $.
Using the physicist’s method, we define the potential function $\Phi$ to be the length of the rear list. Then every snoc into a non-empty queue takes one actual step and increases the potential by one, for an amortized cost of two. Every tail that does not reverse the rear list takes one actual step and leaves the potential unchanged, for an amortized cost of one. Finally, every tail that does reverse the rear list takes $m + 1$ actual steps and sets the new rear list to [], decreasing the potential by m, for an amortized cost of $ m+1-m = 1 $.
Splay heaps
Splay trees are perhaps the most famous and successful of all amortized data structures. Splay trees are a close relative of balanced binary search trees, but they maintain no explicit balance information. Instead, every operation blindly restructures the tree using some simple transformations that tend to increase balance. Although any individual operation can take as much as O(n) time, every operation runs in O(log n) amortized time. A major difference between splay trees and balanced binary search trees such as red-black trees is that splay trees are restructured even during queries instead of only during updates. This property makes it awkward to use splay trees to implement abstractions such as sets or finite maps in a purely functional setting, because the query would have to return the new tree along with the answer. For some abstractions, however, the queries are limited enough to avoid these problems. A good example is the heap abstraction, where the only interesting query is findMin. In fact, splay trees make an excellent implementation of heaps.
The representation of splay trees is identical to that of unbalanced binary search trees:
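A sketch over integer elements (the element type is an assumption; a full implementation would be a functor over an ordered element type):

type tree = E | T of tree * int * tree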
Unlike unbalanced binary search trees, we allow duplicate elements within a single tree. And unlike insertion into ordinary binary search trees, insertion here first partitions the existing tree into two subtrees, one containing all the elements smaller than or equal to the new element and one containing all the elements larger than it, and then constructs a new tree with the new element at the root and the two subtrees as its children:
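A sketch of insert in terms of the helpers smaller and bigger, which are shown below:

let insert x t = T (smaller x t, x, bigger x t)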
and bigger is implemented as below (smaller is implemented the same way, keeping the elements no larger than the pivot):
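A sketch of bigger, which keeps every element larger than the pivot and rotates whenever it follows two left branches in a row:

let rec bigger pivot t = match t with
  | E -> E
  | T (a, x, b) ->
      if x <= pivot then bigger pivot b
      else (match a with
            | E -> T (E, x, b)
            | T (a1, y, a2) ->
                if y <= pivot then T (bigger pivot a2, x, b)
                else T (bigger pivot a1, y, T (a2, x, b)))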
Notice that we restructure the tree to make it more balanced: every time we follow two left branches in a row, we rotate those two nodes. Although the resulting tree may still not be balanced in the usual sense, it will be much more balanced than the original tree. In fact, this is the guiding principle of splay trees: search paths should be restructured to reduce the depth of every node in the path by about half.
We can also combine bigger and smaller into a single function partition, which returns a pair:
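A sketch of partition, performing the work of smaller and bigger in a single pass:

let rec partition pivot t = match t with
  | E -> (E, E)
  | T (a, x, b) ->
      if x <= pivot then
        (match b with
         | E -> (t, E)
         | T (b1, y, b2) ->
             if y <= pivot then
               let small, big = partition pivot b2 in
               (T (T (a, x, b1), y, small), big)
             else
               let small, big = partition pivot b1 in
               (T (a, x, small), T (big, y, b2)))
      else
        (match a with
         | E -> (E, t)
         | T (a1, y, a2) ->
             if y <= pivot then
               let small, big = partition pivot a2 in
               (T (a1, y, small), T (big, x, b))
             else
               let small, big = partition pivot a1 in
               (small, T (big, y, T (a2, x, b))))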
The remaining operations are straightforward:
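A sketch: findMin follows the left spine, and deleteMin removes the minimum while restructuring the left spine with the same rotation used by bigger:

let rec find_min = function
  | E -> failwith "empty"
  | T (E, x, _) -> x
  | T (a, _, _) -> find_min a

let rec delete_min = function
  | E -> failwith "empty"
  | T (E, _, b) -> b
  | T (T (E, _, b), y, c) -> T (b, y, c)
  | T (T (a, x, b), y, c) -> T (delete_min a, x, T (b, y, c))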
Both insert and deleteMin run in O(log n) amortized time; the proofs are omitted.
A particularly pleasant feature of splay trees is that they naturally adapt to any order that happens to be present in the input data. For example, using splay heaps to sort an already sorted list takes only O(n) time rather than O(n log n) time. Leftist heaps also share this property, but only for decreasing sequences. Splay heaps excel on both increasing and decreasing sequences, as well as on sequences that are only partially sorted.
Pairing heaps
Pairing heaps are simple to implement and perform extremely well in practice, but they have resisted analysis for over ten years. Pairing heaps are heap-ordered multiway trees, defined as:
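A sketch over integer elements (again, a full implementation would abstract over an ordered element type):

type heap = Leaf | Node of int * heap list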
We allow only well-formed trees, in which Leaf never occurs in the child list of a Node. Pairing heaps get their name from the deleteMin operation. deleteMin discards the root and then merges the children in two passes: the first pass merges the children in pairs from left to right, and the second pass merges the resulting trees from right to left:
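A sketch of the two passes; merge (discussed with the other operations below) makes the heap with the larger root the leftmost child of the heap with the smaller root:

let merge h1 h2 = match h1, h2 with
  | Leaf, h | h, Leaf -> h
  | Node (x, hs1), Node (y, hs2) ->
      if x <= y then Node (x, h2 :: hs1) else Node (y, h1 :: hs2)

(* First pass: merge the children in pairs, left to right.
   Second pass: the recursion unwinds right to left. *)
let rec merge_pairs = function
  | [] -> Leaf
  | [h] -> h
  | h1 :: h2 :: hs -> merge (merge h1 h2) (merge_pairs hs)

let delete_min = function
  | Leaf -> failwith "empty"
  | Node (_, hs) -> merge_pairs hs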
and the others are straightforward:
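A sketch, reusing the merge shown above:

let find_min = function
  | Leaf -> failwith "empty"
  | Node (x, _) -> x

let insert x h = merge (Node (x, [])) h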
Notice that findMin, insert, and merge all run in O(1) worst-case time, while deleteMin can take up to O(n) time in the worst case; however, deleteMin runs in O(log n) amortized time.
All of the amortized data structures we have discussed are tremendously effective in practice. Unfortunately, they perform badly under persistence. Let q be the result of inserting n elements into an initially empty queue, so that the front list of q contains a single element and the rear list contains n - 1 elements. If we use q persistently by taking its tail n times, each call takes n actual steps, and the total actual cost, including the time to build q, is $ n^2 + n $. Thus the operations cannot all take O(1) amortized time under persistent usage; this problem can be solved via lazy evaluation.
Numerical Representations
Consider the usual representations of lists and natural numbers, along with several typical functions on each type:
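A sketch of the two types side by side (names are assumptions; nat is a Peano-style unary representation):

type 'a mylist = Nil | Cons of 'a * 'a mylist
type nat = Zero | Succ of nat

let cons x xs = Cons (x, xs)            (* analogous to Succ *)

let tail = function                     (* analogous to pred *)
  | Nil -> failwith "empty"
  | Cons (_, xs) -> xs

let pred = function
  | Zero -> failwith "zero"
  | Succ n -> n

let rec append xs ys = match xs with    (* analogous to plus *)
  | Nil -> ys
  | Cons (x, xs') -> Cons (x, append xs' ys)

let rec plus m n = match m with
  | Zero -> n
  | Succ m' -> Succ (plus m' n)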
Other than the fact that lists contain elements and natural numbers do not, these implementations are virtually identical. Binomial heaps exhibit a similar relationship with binary numbers. These examples suggest a strong analogy between representations of the number n and representations of container objects of size n: functions on the container strongly resemble arithmetic functions on the number. This analogy can be exploited to design new implementations of container abstractions – simply choose a representation of natural numbers with certain desired properties and define the functions on the container objects accordingly. Data structures designed in this fashion are called numerical representations.
Given a positional number system, we can implement a numerical representation based on that number system as a sequence of trees. For example, the binary representation of 73 is 1001001, so a collection of size 73 in a binary numerical representation would contain three trees, of sizes 1, 8, and 64. Trees in numerical representations typically exhibit a very regular structure; for example, in binary numerical representations, all trees have sizes that are powers of 2. Three common kinds of trees that exhibit this structure are complete binary leaf trees, binomial trees, and pennants; in each case, a tree of rank r has $ 2^r $ elements.
Binary random-access lists
A random-access list, also called a one-sided flexible array, is a data structure that supports array-like lookup and update functions:
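A sketch of such a signature (names are assumptions):

module type RLIST = sig
  type 'a t
  val empty : 'a t
  val is_empty : 'a t -> bool
  val cons : 'a -> 'a t -> 'a t
  val head : 'a t -> 'a
  val tail : 'a t -> 'a t
  val lookup : int -> 'a t -> 'a
  val update : int -> 'a -> 'a t -> 'a t
end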
We implement random-access lists using a binary numerical representation. A binary random-access list of size n contains a tree for each one in the binary representation of n. We choose the simplest combination of features: complete binary leaf trees and a dense representation:
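A sketch of the representation:

type 'a tree = Leaf of 'a | Node of int * 'a tree * 'a tree
type 'a digit = Zero | One of 'a tree
type 'a rlist = 'a digit list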
The integer in each Node is the size of the tree. This number is redundant, since the size of every tree is determined by its position in the list of digits, but it is convenient to maintain. Trees are stored in increasing order of size, and the order of elements is left-to-right, both within and between trees. The maximum number of trees in a list of size n is log(n+1) (achieved when every digit is a one), and the maximum depth of any tree is log n. Inserting an element into a binary random-access list is analogous to incrementing a binary number. The increment function on dense binary numbers is:
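A sketch, with the least-significant digit first:

type digit = Zero | One
type nat = digit list

let rec inc = function
  | [] -> [One]
  | Zero :: ds -> One :: ds
  | One :: ds -> Zero :: inc ds   (* carry *)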
Similarly, we first convert the element into a leaf, and then insert the leaf into the list, following the rule of inc:
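A sketch of cons, where a carry corresponds to linking two trees of equal size:

let size = function Leaf _ -> 1 | Node (w, _, _) -> w
let link t1 t2 = Node (size t1 + size t2, t1, t2)

let rec cons_tree t = function
  | [] -> [One t]
  | Zero :: ts -> One t :: ts
  | One t' :: ts -> Zero :: cons_tree (link t t') ts   (* carry *)

let cons x ts = cons_tree (Leaf x) ts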
Deleting an element from a binary random-access list is analogous to decrementing a binary number:
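A sketch of dec on dense binary numbers:

let rec dec = function
  | [] -> failwith "zero"
  | [One] -> []
  | One :: ds -> Zero :: ds
  | Zero :: ds -> One :: dec ds   (* borrow *)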
So the implementation also follows the dec rule:
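A sketch of unconsTree and the operations built on it; a borrow corresponds to unlinking a tree into its two halves:

let rec uncons_tree = function
  | [] -> failwith "empty"
  | [One t] -> (t, [])
  | One t :: ts -> (t, Zero :: ts)
  | Zero :: ts ->
      (match uncons_tree ts with
       | (Node (_, t1, t2), ts') -> (t1, One t2 :: ts')   (* borrow *)
       | _ -> failwith "ill-formed")

let head ts = match uncons_tree ts with
  | (Leaf x, _) -> x
  | _ -> failwith "ill-formed"

let tail ts = snd (uncons_tree ts)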
The lookup and update functions do not have analogous arithmetic operations. Rather, they take advantage of the organization of binary random-access lists as logarithmic-length lists of logarithmic-depth trees. Looking up or updating an element is a two-stage process: we first search the list for the correct tree, and then search the tree for the correct element. To sum up, we have:
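A sketch of lookup and update, reusing the size function from above (update_tree rebuilds the path down to the changed leaf):

let rec lookup_tree i t = match t with
  | Leaf x -> if i = 0 then x else failwith "index"
  | Node (w, t1, t2) ->
      if i < w / 2 then lookup_tree i t1
      else lookup_tree (i - w / 2) t2

let rec lookup i = function
  | [] -> failwith "index"
  | Zero :: ts -> lookup i ts
  | One t :: ts ->
      if i < size t then lookup_tree i t
      else lookup (i - size t) ts

let rec update_tree i y t = match t with
  | Leaf _ -> if i = 0 then Leaf y else failwith "index"
  | Node (w, t1, t2) ->
      if i < w / 2 then Node (w, update_tree i y t1, t2)
      else Node (w, t1, update_tree (i - w / 2) y t2)

let rec update i y = function
  | [] -> failwith "index"
  | Zero :: ts -> Zero :: update i y ts
  | One t :: ts ->
      if i < size t then One (update_tree i y t) :: ts
      else One t :: update (i - size t) y ts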
All operations run in O(log n) worst-case time. One disappointing aspect is that cons, head, and tail run in O(log n) time rather than O(1).
Zeroless representation
Currently, head is implemented via a call to unconsTree. This approach yields compact code, because unconsTree supports both head and tail, but it wastes time building lists that are immediately discarded by head. For greater efficiency, we should implement head directly. As a special case, head can easily be made to run in O(1) time whenever the first digit is non-zero. So we seek to arrange that the first digit is always non-zero. A more principled solution is to use a zeroless representation, in which binary numbers are constructed from ones and twos instead of zeros and ones:
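A sketch of zeroless dense binary numbers:

type digit = One | Two
type nat = digit list

let rec inc = function
  | [] -> [One]
  | One :: ds -> Two :: ds
  | Two :: ds -> One :: inc ds   (* 2 + 1 = 1 plus a carry *)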
And the implementation of random-access lists becomes:
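A sketch: every digit now carries at least one tree, so the first element always sits in the first digit:

type 'a tree = Leaf of 'a | Node of int * 'a tree * 'a tree
type 'a digit = One of 'a tree | Two of 'a tree * 'a tree
type 'a rlist = 'a digit list

let size = function Leaf _ -> 1 | Node (w, _, _) -> w
let link t1 t2 = Node (size t1 + size t2, t1, t2)

let rec cons_tree t = function
  | [] -> [One t]
  | One t1 :: ts -> Two (t, t1) :: ts
  | Two (t1, t2) :: ts -> One t :: cons_tree (link t1 t2) ts

let cons x ts = cons_tree (Leaf x) ts

(* head inspects only the first digit. *)
let head = function
  | One (Leaf x) :: _ | Two (Leaf x, _) :: _ -> x
  | _ -> failwith "empty"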
Now head runs in O(1) worst-case time; tail, which may still trigger a cascade of borrows, remains O(log n) in the worst case.
Data-structural bootstrapping
The term “bootstrapping” refers to solving problems by solving simpler instances of the same problem. There are three algorithmic design techniques based on data-structural bootstrapping. The first, structural decomposition, involves bootstrapping complete data structures from incomplete data structures. The second, structural abstraction, involves bootstrapping efficient data structures from inefficient data structures. The third involves bootstrapping data structures with aggregate elements from data structures with atomic elements.
Structural decomposition
Typically, structural decomposition involves taking an implementation that can handle objects only up to some bounded size and extending it to handle objects of unbounded size. Consider typical recursive datatypes such as lists and binary leaf trees:
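As a sketch:

type 'a mylist = Nil | Cons of 'a * 'a mylist
type 'a tree = Leaf of 'a | Node of 'a tree * 'a tree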
In some ways, these can be regarded as instances of bounded size (zero for lists and one for trees), plus a rule for recursively decomposing larger objects into smaller objects until eventually each object is small enough to be handled by the bounded case.
However, both of these definitions are particularly simple in that the recursive component in each definition is identical to the type being defined; such types are called uniformly recursive. In general, we reserve the term structural decomposition for recursive data structures that are non-uniform, where the recursive component differs from the type being defined, e.g. type 'a seq = Nil | Cons of 'a * ('a * 'a) seq. However, structural decomposition cannot be implemented directly in ML: although the language allows the definition of non-uniform recursive datatypes, the type system disallows functions on such types, such as the size function below:
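A sketch (the recursive call to size is at type ('a * 'a) seq, not 'a seq, so the definition is rejected unless OCaml is given an explicit polymorphic-recursion annotation such as let rec size : 'a. 'a seq -> int = ...):

type 'a seq = Nil | Cons of 'a * ('a * 'a) seq

let rec size = function
  | Nil -> 0
  | Cons (_, ps) -> 1 + 2 * size ps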
Fortunately, it is always possible to convert a non-uniform type into a uniform type by introducing a new datatype to collapse the different instances into a single type. For example, we can rewrite seq as:
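A sketch of the uniform version:

type 'a ep = Elem of 'a | Pair of 'a ep * 'a ep
type 'a seq = Nil | Cons of 'a ep * 'a seq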
Notice that the 'a ep type is isomorphic to binary leaf trees, so this version of 'a seq is equivalent to 'a tree list, though we would tend to think of a list of trees differently than we would think of a sequence of pairs – some algorithms will seem simpler or more natural for one of the representations, and some for the other.
To use this sequence type as a positional number system, we need to represent zero; this is easily done by adding another constructor to the type:
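A sketch:

type 'a seq = Nil
            | Zero of ('a * 'a) seq
            | One of 'a * ('a * 'a) seq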
Now we can represent the sequence 0…10 as One(0, One((1, 2), Zero(One((((3, 4), (5, 6)), ((7, 8), (9, 10))), Nil)))), which has 11 elements – the digits of the representation read 1101 with the least-significant digit first, i.e. 1011 in ordinary binary notation. The pairs in this type are always balanced. In fact, another way to think of pairs of elements is as complete binary leaf trees. We can then reimplement binary random-access lists using this type.
Tries
Binary search trees work well when comparisons on the key or element type are cheap. This is true for simple types like integers or characters, but may not be true for aggregate types like strings. A better solution for aggregate types such as strings is to choose a representation that takes advantage of the structure of that type. One such representation is the trie, also known as a digital search tree.
A trie is a multiway tree where each edge is labelled with a character. Edges leaving the root of a trie represent the first character of a string, edges leaving the root's children represent the second character, and so on. If a node is valid, meaning it contains a value, we mark it as Some x; otherwise it is None. The critical remaining question is how to represent the edges leaving a node. We could represent the edges as a vector, an association list, a binary search tree, or even another trie, but all of these are just finite maps from edge labels to tries. So we assume a given structure Map implementing finite maps over the base type:
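A sketch of the finite-map signature (names are assumptions; lookup is taken to raise Not_found for missing keys):

module type MAP = sig
  type key
  type 'a t
  val empty : 'a t
  val bind : key -> 'a -> 'a t -> 'a t
  val lookup : key -> 'a t -> 'a
end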
Thus we can implement Trie as a functor from finite maps to finite maps:
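A sketch of the functor, where the keys of the resulting map are lists of base keys:

module Trie (M : MAP) = struct
  type key = M.key list
  type 'a t = Trie of 'a option * 'a t M.t

  let empty = Trie (None, M.empty)

  let rec lookup ks (Trie (v, m)) = match ks with
    | [] -> (match v with None -> raise Not_found | Some x -> x)
    | k :: ks' -> lookup ks' (M.lookup k m)

  let rec bind ks x (Trie (v, m)) = match ks with
    | [] -> Trie (Some x, m)
    | k :: ks' ->
        let t = try M.lookup k m with Not_found -> empty in
        Trie (v, M.bind k (bind ks' x t) m)
end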