• Contents
  • Preface
  • Curriculum Vitae
  • 0 Appetizer
  • 1 Introduction
    • 1.1 History
    • 1.2 Little Glossary of Quicksort Terms
    • 1.3 Recent Developments
    • 1.4 The Memory Wall
    • 1.5 Multiway Quicksort
    • 1.6 Aim and Scope of this Dissertation
    • Outline
    • 1.7 Related Work
      • 1.7.1 Towards Classic Quicksort
      • 1.7.2 Sedgewick's Work
      • 1.7.3 Practical Tweaks in the Nineties
      • 1.7.4 Off-the-Shelf Theorems for Solving Recurrences
      • 1.7.5 Analysis of Pivot Sampling
      • 1.7.6 Quicksort and Search Trees
      • 1.7.7 Limit Distributions
      • 1.7.8 Multiway Quicksort
      • 1.7.9 Equal Keys
      • 1.7.10 Bit Complexity and Symbol Comparisons
      • 1.7.11 Memory Hierarchy
      • 1.7.12 Branch Misses
      • 1.7.13 Academic Quicksort Variants
      • 1.7.14 My Previous Work
  • 2 Mathematical Tools
    • 2.1 Continuous Basics
      • 2.1.1 Asymptotic Notation
      • 2.1.2 Special Functions
      • 2.1.3 Convexity
      • 2.1.4 Lipschitz- and Hölder-Continuity
    • 2.2 Discrete Basics
      • 2.2.1 Binomial Coefficients
      • 2.2.2 Harmonic Numbers
    • 2.3 Generating Functions
      • 2.3.1 Singularity Analysis
      • 2.3.2 Euler Differential Equations and the Operator Method
    • 2.4 Random Basics
      • 2.4.1 Categorical Distribution
      • 2.4.2 Exponential Distribution
      • 2.4.3 Gamma Distribution
      • 2.4.4 Beta and Dirichlet Distributions
      • 2.4.5 Dirichlet-Calculus
      • 2.4.6 Binomial and Multinomial Distributions
      • 2.4.7 Beta-Binomial and Dirichlet-Multinomial Distributions
      • 2.4.8 Shannon Entropy
      • 2.4.9 Entropy and Binary Search Trees
    • 2.5 Discrete Entropy
      • 2.5.1 Entropy of Dirichlet Vectors
      • 2.5.2 Properties of the Harmonic-Difference Function
      • 2.5.3 Properties of the Discrete Entropy
    • 2.6 A Master Theorem for Distributional Recurrences
      • 2.6.1 The Continuous Master Theorem
      • 2.6.2 Distributional Recurrences
      • 2.6.3 A Simple Distributional Master Theorem
      • 2.6.4 Self-Contained Proof of an Educational DMT
      • 2.6.5 The Distributional Master Theorem
  • 3 Models and Assumptions
    • 3.1 Models of Input Distributions
      • 3.1.1 The Random-Permutation Model
      • 3.1.2 The Expected-Profile Model
      • 3.1.3 The Exact-Profile Model
      • 3.1.4 Enforcing Random Order
    • 3.2 Cost Models
      • 3.2.1 Information Costs: The Comparison Model
      • 3.2.2 Rearrangement Costs: Swaps and Write Accesses
      • 3.2.3 Primitive Instruction Counts
      • 3.2.4 The Memory Hierarchy: Scanned Elements and Cache Misses
      • 3.2.5 Pipelined Execution: Branch Mispredictions
  • 4 Quicksort
    • 4.1 From Partitioning to Sorting
    • 4.2 Famous Quicksort Variants
      • 4.2.1 Classic Quicksort
      • 4.2.2 Lomuto's Partitioning
      • 4.2.3 Yaroslavskiy-Bentley-Bloch Quicksort
      • 4.2.4 Sedgewick's Dual-Pivot Method and Kciwegdes Partitioning
      • 4.2.5 Waterloo Quicksort
      • 4.2.6 Algorithmic Improvements
    • 4.3 Generic One-Pass Partitioning
      • 4.3.1 Master-Pivot Schemes
      • Algorithm 9: Partition
      • 4.3.2 Master-Segment Schemes
      • 4.3.3 Known Methods as Generic One-Pass Instances
    • 4.4 Choosing Pivots From a Sample
      • 4.4.1 Generalized Pivot Sampling
      • 4.4.2 Random-Parameter Pivot Sampling
    • 4.5 Randomness Preservation in the Presence of Pivot Sampling
    • 4.6 Other Partitioning Methods
      • 4.6.1 Removing Duplicates
      • 4.6.2 Linear Memory Or Several Passes
      • 4.6.3 Hennequin's Generalized Quicksort
      • 4.6.4 Relation to Concurrent Work of Aumüller et al.
  • 5 Analysis of Generic Partitioning
    • 5.1 Toll-Function Notation
    • 5.2 Stochastic Description of Partitioning Under Pivot Sampling
      • 5.2.1 Class Probabilities
      • 5.2.2 Segment Sizes and Subproblem Sizes
      • 5.2.3 Distribution of Pivot Values
      • 5.2.4 Random-Parameter Pivot Sampling
    • 5.3 Generic Model: Element-wise Charging Schemes
      • 5.3.1 General Charging Schemes
      • 5.3.2 Relation to Classification Strategies
      • 5.3.3 Separating State Frequencies From Element Costs
      • 5.3.4 FSM-based Element-Wise Charging Schemes
    • 5.4 Charging Schemes for Our Cost Measures
      • 5.4.1 The State Automaton For One-Pass Partitioning
      • 5.4.2 Key Comparisons
      • 5.4.3 Scanned Elements
      • 5.4.4 Write Accesses
      • 5.4.5 Swaps resp. Cyclic Shifts
    • 5.5 Execution Frequencies of Inner Loops
      • 5.5.1 Master-Pivot Algorithm
      • 5.5.2 Master-Segment Algorithms
      • 5.5.3 A Note on Limit Distributions
    • 5.6 Class-State Co-Occurrences
    • 5.7 Comparison-Optimal Partitioning
    • 5.8 Branch Misses
      • 5.8.1 Conditional Independence
      • 5.8.2 A Generic Comparison-Branch Location
  • 6 The Quicksort Recurrence
    • 6.1 Recursive Description of Costs
      • 6.1.1 Nitpicks on Generalized Pivot Sampling
      • 6.1.2 Recurrence for Expected Costs
      • 6.1.3 Relation to Combinatorial Form for Subproblem Size Probabilities
    • 6.2 Back-of-the-Envelope Approaches
      • 6.2.1 Van Emden's Entropy-Reduction Argument
      • 6.2.2 A Continuous Recurrence
      • 6.2.3 Explicit Solution of Continuous Recurrence
      • 6.2.4 A Master-Theorem-Style Argument
      • 6.2.5 Leading Term by Ansatz
    • 6.3 Solution with the Distributional Master Theorem
    • 6.4 Random-Parameter Pivot Sampling
    • 6.5 Solution with Generating Functions
      • 6.5.1 A Differential Equation for the Generating Function
      • 6.5.2 Differential-Operator Transformation
      • 6.5.3 A General Solution to the Differential Equation
      • 6.5.4 Coefficient Asymptotics
  • 7 Results and Discussion
    • 7.1 Average Costs
    • 7.2 Disclaimer
    • 7.3 Simulating Multiway Partitioning by Binary Partitioning
      • 7.3.1 Waterloo vs. Classic
      • 7.3.2 Entropy-Equivalent Sampling
      • 7.3.3 s-Way vs. Binary Partitioning
    • 7.4 The Optimal Meeting Point for Indices
      • 7.4.1 Comparisons
      • 7.4.2 Scanned Elements
      • 7.4.3 Summary
    • 7.5 Optimal Pivot Sampling
      • 7.5.1 Optimal Sample Sizes?
      • 7.5.2 Entropy-Tight Pairs
      • 7.5.3 Linear Costs With Exact-Quantile Pivots
      • 7.5.4 Scanned Elements
      • 7.5.5 An Optimality Criterion for Finite Sample Sizes and Gibb's Gap
      • 7.5.6 Two Comparisons Trees: Non-Linear Costs
      • 7.5.7 Optimal t: Heuristics and Tables
    • 7.6 The Optimal Number of Pivots
      • 7.6.1 Key Comparisons
      • 7.6.2 Sampling with a Limit Density
      • 7.6.3 Scanned Elements
      • 7.6.4 Exponential-Decay Sampling
      • 7.6.5 Optimal s for fixed k
    • 7.7 Optimal Comparison Trees
      • 7.7.1 The Single-Tree Case
      • 7.7.2 Two Comparison Trees
      • 7.7.3 The Optimal Alphabetic Tree Problem
      • 7.7.4 Entropy-Bounds for Comparison Trees
      • 7.7.5 Optimal Choices for A Single Tree
      • 7.7.6 The Benefit of Two Trees
      • 7.7.7 Concluding Remarks on Optimal Comparison Trees
    • 7.8 Interactions between Parameters: The Jellyfish Paradox
  • 8 Equal Keys
    • 8.1 Introduction
      • 8.1.1 Ignoring Equals
    • 8.2 Input Models with Equal Keys
    • 8.3 Related Work
    • 8.4 Fat-Pivot Partitioning
      • 8.4.1 Existing Methods
      • 8.4.2 New Concepts and Notations
      • 8.4.3 Generic Multiway Fat-Pivot Partitioning
      • 8.4.4 Fat-Separator Methods
      • 8.4.5 Fat-Outsides Methods
    • 8.5 Partitioning Costs
      • 8.5.1 Stateless Cost Measures
      • 8.5.2 Stochastic Model
      • 8.5.3 (Ternary) Comparisons
      • 8.5.4 Scanned Elements (Partitioning)
      • 8.5.5 Scanned Elements (Cleanup)
    • 8.6 Recurrence for Expected-Profile Inputs
      • 8.6.1 Distributional Recurrence
    • 8.7 Quicksort Recursion Trees
      • 8.7.1 Recursion Trees
      • 8.7.2 t-Fringe-Balanced s-Ary Search Trees with Leaf-Buffer Size w
      • 8.7.3 Recursion Trees and stwl-Trees
      • 8.7.4 Generic Search Costs
      • 8.7.5 Search Costs in Expected-Profile Models
      • 8.7.6 Distributional Recurrence for Weighted Path Length
      • 8.7.7 Expected Weighted Path Length for Random U-ary Words
    • 8.8 Quicksort Under Stateless Cost Measures
      • 8.8.1 Random u-Ary Words
      • 8.8.2 Quicksort Costs Value-Wise
      • 8.8.3 Many Duplicates and Degeneracy
      • 8.8.4 Proof of Theorem 8.16
    • 8.9 Discussion
  • 9 Conclusion
    • 9.1 The Optimal Quicksort?
    • 9.2 Open Problems and Future Work
  • A Index of Notation
    • A.1 Generic Mathematical Notation
    • A.2 Stochastics-related Notation
    • A.3 Input to the Algorithm
    • A.4 Notation for the Algorithm
    • A.5 Notation Specific to the Analysis (Uniform Model)
    • A.6 Notation for the Analysis with Equal Keys
  • B Bibliography