Tags: algorithms, teaching
Amortized analysis of resizing-array stacks
A rigorous proof that a stack implemented with doubling arrays supports all operations in constant amortized time; written up here since it does not seem to appear in any of the standard algorithms books.
A well-known, fundamental data structure is the implementation of a stack using resizing arrays (a.k.a. doubling arrays), where we maintain an array of capacity $C$ for the $n$ elements of a stack; whenever the array becomes full, we double its size, and whenever the array becomes less than one quarter full, we halve its size. This maintains the invariant that $\frac14 C \le n \le C$.
A folklore analysis shows that this achieves constant amortized cost for all stack operations, despite the occasional expensive resizing operations.
This analysis is not a particularly hard or surprising proof by any means, but it makes a great first nontrivial example of amortized analysis, and hence I wanted to show it in my Efficient Algorithms (COMP526) lectures; see Unit 2 – Fundamental Data Structures for the full context.
The goal is to show that while any individual push/pop in a resizing-array-based stack might be expensive ($\Omega(n)$ cost), any sequence of operations is necessarily much cheaper, namely $O(1)$ time per operation on average. As the dominant operation, we count array accesses, i.e., any read or write access to an array.
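To make this cost model concrete, here is a minimal sketch of such a stack, instrumented to count every array read and write (the class and attribute names are mine, and the initial capacity of $1$ is an arbitrary choice):

```python
class ResizingArrayStack:
    """Stack on a resizing array that counts every array access."""

    def __init__(self):
        self.array = [None]   # backing array; capacity C = len(self.array)
        self.n = 0            # number of elements on the stack
        self.accesses = 0     # total array reads + writes so far

    def _resize(self, capacity):
        new_array = [None] * capacity
        for i in range(self.n):
            new_array[i] = self.array[i]   # one read + one write per element
            self.accesses += 2
        self.array = new_array

    def push(self, x):
        if self.n == len(self.array):      # full: double the capacity
            self._resize(2 * self.n)
        self.array[self.n] = x             # one write
        self.accesses += 1
        self.n += 1

    def pop(self):                         # assumes a non-empty stack
        if 4 * self.n == len(self.array):  # one quarter full: halve the capacity
            self._resize(2 * self.n)
        self.n -= 1
        self.accesses += 1                 # one read
        return self.array[self.n]


stack = ResizingArrayStack()
for x in range(1000):
    stack.push(x)
while stack.n:
    stack.pop()
assert stack.accesses <= 5 * 2000   # at most 5 accesses per operation on average
```

On this particular sequence of $2000$ operations, the average comes out well below the amortized bound of $5$ accesses per operation derived below.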
Part 1: Amortized costs for all operations
Each operation has two types of costs for the amortized analysis: actual costs (the number of array accesses) and a change in potential/credits. We define the potential $\Phi = \min\lbrace n-\frac14C,\;C-n\rbrace$, and the amortized cost $a_i$ of an operation is the actual cost minus $4$ times the change in potential, i.e., $a_i = c_i - 4(\Phi_i - \Phi_{i-1})$. The intuition behind $\Phi$ is to measure the distance of the current filling mark $n$ from the “expensive boundaries” $\frac14C$ resp. $C$.
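In code (a tiny sketch; the helper name is mine), $\Phi$ vanishes exactly at the two expensive boundaries and peaks in between, at $n = \frac58 C$:

```python
def phi(n, C):
    """Potential: distance of n from the expensive boundaries C/4 and C."""
    return min(n - C / 4, C - n)


C = 40
assert phi(C // 4, C) == 0           # about to shrink: no credits left
assert phi(C, C) == 0                # about to grow: no credits left
assert phi(5 * C // 8, C) == 3 * C // 8   # maximum, where n - C/4 = C - n
```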
We have to analyze both costs separately.
Actual costs:
- cheap push/pop: exactly 1 array access to write/read the topmost element.
- copying push: currently there are $n$ elements on the stack; these have to be read from the old array ($n$ accesses) and written to the new array ($n$ accesses); also one more element has to be added (as in a cheap push). In total, that is $2n+1$ actual cost.
- copying pop: actually exactly the same: there are $n$ elements on the stack; these have to be read from the old array ($n$ accesses) and written to the new array ($n$ accesses); also one element has to be read to be returned. In total, that is $2n+1$ actual cost.
(One could avoid this very last extra read by not copying the element that we pop right afterwards anyway; but typical implementations do not do this, for convenience. It would clearly not save much either way.)
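The $2n+1$ count for a copying operation can be replayed in isolation (a throwaway sketch; the function name and setup are mine):

```python
def copying_push_cost(n):
    """Array accesses for a push when the array of capacity C = n is full."""
    accesses = 0
    old = list(range(n))        # a full array: C = n elements
    new = [None] * (2 * n)      # the doubled array
    for i in range(n):
        new[i] = old[i]         # one read + one write per copied element
        accesses += 2
    new[n] = "pushed"           # write the newly added element
    accesses += 1
    return accesses


assert copying_push_cost(8) == 2 * 8 + 1
```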
Credits / Potential change
The credits are given by the change in potential $\Phi = \min\lbrace n-\frac14C,\;C-n\rbrace$.
- cheap push: $n$ increases by one, but $C$ is unchanged. If $C-n < n-\frac14 C$, then $\Phi$ drops by one (“we lose one credit”).
- cheap pop: $n$ decreases by one, but $C$ is unchanged. If $n-\frac14 C < C-n$, then $\Phi$ drops by one (“we lose one credit”).
- copying push: We must have had $n=C$ (i.e., $\Phi_{i-1}=0$) before this push, and we will now set $C=2n$ before the push. Then, the push increments $n$. That means the new potential is $\Phi_i=(n+1)-\frac14\cdot2n=\frac12n+1$ (for $n\ge4$, the first term of the min is the smaller one). We have earned $\frac12n+1$ credits.
- copying pop: We must have had $n=\frac14C$ (i.e., $\Phi_{i-1}=0$) before this pop, and we will now set $C=2n$ before the pop; the pop itself then decrements $n$. So $\Phi_i=(n-1)-\frac14\cdot 2n = \frac12n-1$, and we have earned $\frac12n-1$ credits.
Adding up
Adding up the actual cost and $-4(\Phi_i-\Phi_{i-1})$ shows that in each case the amortized cost is at most $5$: a cheap push/pop costs at most $1 + 4\cdot 1 = 5$; a copying push costs $2n+1-4(\frac12n+1)=-3$; and a copying pop costs $2n+1-4(\frac12n-1)=5$.
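The four cases can also be checked numerically (a quick sketch; the concrete values of $n$ and $C$ are arbitrary choices that hit the respective boundaries):

```python
def phi(n, C):
    """Potential from the analysis: min(n - C/4, C - n)."""
    return min(n - C / 4, C - n)


n = 64  # stack size; a power of two keeps all boundary values exact

# (actual cost, potential change) for each of the four cases:
cases = {
    # cheap push near the full boundary (C = 80): Phi drops by 1
    "cheap push": (1, phi(n + 1, 80) - phi(n, 80)),
    # cheap pop near the quarter-full boundary (C = 240): Phi drops by 1
    "cheap pop": (1, phi(n - 1, 240) - phi(n, 240)),
    # copying push at n = C: set C = 2n, then increment n
    "copying push": (2 * n + 1, phi(n + 1, 2 * n) - phi(n, n)),
    # copying pop at n = C/4: set C = 2n, then decrement n
    "copying pop": (2 * n + 1, phi(n - 1, 2 * n) - phi(n, 4 * n)),
}
for name, (c, dphi) in cases.items():
    a = c - 4 * dphi    # amortized cost a_i = c_i - 4 (Phi_i - Phi_{i-1})
    assert a <= 5, name
```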
Part 2: From amortized to total actual costs
The second part is essentially the same for all amortized analyses: the total actual cost over a sequence of $m$ operations is bounded by the sum of their amortized costs, plus initial/final potential terms; this is shown using a telescoping-sum argument:
\[5m \ge \sum_{i=1}^m a_i = \sum_{i=1}^m c_i - 4 \underbrace{\sum_{i=1}^m(\Phi_i - \Phi_{i-1})}_{=\Phi_m - \Phi_0}\]Rearranging gives
\[\sum_{i=1}^m c_i \le 5m + 4\Phi_m - 4\Phi_0\]Now, we can also show, using the invariant $\frac14 C \le n \le C$, i.e., $n\le C \le 4n$, that $0\le \Phi\le \frac35n$: Since $\Phi$ is piecewise linear, it suffices to consider the endpoints of the linear segments, i.e., $C = 4n$, $C = n$, and $n-\frac14 C = C-n$, i.e., $C = \frac85 n$; at these points $\Phi$ has the values $0$, $0$, and $\frac35 n$, respectively.
Hence $\displaystyle\sum_{i=1}^m c_i \le 5m + 4\Phi_m - 4\Phi_0 \le 5m + 2.4n \in \Theta(m+n)$.
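The bounds $0\le\Phi\le\frac35n$ can also be brute-forced over all valid pairs $(n, C)$ (a quick sketch; the range of capacities tested is an arbitrary choice):

```python
def phi(n, C):
    """Potential from the analysis: min(n - C/4, C - n)."""
    return min(n - C / 4, C - n)


# check 0 <= Phi <= (3/5) n for every fill level n with C/4 <= n <= C
for C in range(4, 401, 4):
    for n in range(C // 4, C + 1):
        assert 0 <= phi(n, C) <= 3 * n / 5
```

Equality in the upper bound is attained exactly at $C = \frac85 n$, matching the endpoint computation above.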