Hypersuccinct Trees – New universal tree source codes for optimal compressed tree data structures and range minima
In this paper, we continue the study of distribution-succinct data structures, i.e., data structures that have (asymptotically) optimal expected space usage (as opposed to only optimal worst case) for a certain data-structuring task, and give a single tree data structure that serves as a universal source code for binary and ordinal trees.
Entropy trees were a precursor to this work, in particular we therein introduced the notion of hypersuccinct trees (a tree-covering data structure with a Huffman code for micro-tree types) and noted that it achieves the optimal expected space of $1.736n + o(n)$ bits to represent a random binary search tree or (equivalently) a range-min data structures for a random permutation.
Here, we show that this same data structure actually achieves optimal compression up to lower order terms for a whole range of tree sources, and in particular all of following distributions of tree shapes:
- binary search trees (BSTs) built from random insertions,
- Cartesian trees of random arrays,
- random fringe-balanced BSTs,
- binary trees with a given number of binary/unary/leaf nodes,
- random binary tries generated from memoryless sources,
- (uniformly chosen) full binary trees (a.k.a. extended binary trees),
- (uniformly chosen) unary paths,
- (uniformly chosen) weight-balanced BSTs,
- (uniformly chosen) AVL trees, and
- (uniformly chosen) left-leaning red-black trees.
Hypersuccinct trees combine the compression achieved by the best known universal codes for binary trees with the operation support of the best succinct tree data structures, and they do so “automatically” (without tailoring to the tree family at hand).