Research

Quick Links: all publications ⋅ DBLP ⋅ Google Scholar ⋅ ORCID ⋅ CV (pdf)

My current research revolves around two themes:

Space-efficient data structures, computing over compressed data
Representing data with minimal storage requirements has a long tradition for communication, but in recent years it has gained importance for maximizing the size of data sets that can be kept in fast memory. Unlike for traditional compression, this requires algorithms that can run directly on the compressed representation to make use of the data. I develop such methods for various problems whose space requirements adapt to the data at hand.
Analysis of algorithms, algorithm science, adaptive algorithms
I work on the design of efficient algorithms and data structures, and their mathematical analysis. By determining exact constant factors and analyzing how their performance depends on characteristics of the input (adaptive analysis), I develop improvements for fundamental building blocks of computer science like sorting, selecting, and searching in dictionaries.

Below is more detailed information for individual projects.

Powersort

We found a practical drop-in replacement for Timsort’s merge rules that is provably better, can be implemented efficiently, and gives a principled solution to the underlying problem. Powersort has replaced Timsort’s merge policy in the list sorting methods of CPython and PyPy (covering the vast majority of Python installations).

Timsort is a widely used adaptive sorting method that exploits existing runs in the input. A correct, efficient implementation is not trivial, as witnessed by the series of bugs in the Python and Java libraries.
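To make the idea of a run-adaptive merge policy concrete, here is a minimal Python sketch in the spirit of Powersort. It is an illustration under simplifying assumptions, not the CPython or PyPy code: the helper names (node_power, next_run_end, merge, powersort) are made up for this example, merging copies slices instead of reusing a buffer, and there is no minimum run length or galloping. The "power" of a boundary between two adjacent runs is taken here to be the smallest l for which the normalized midpoints of the two runs fall into different cells of a grid with spacing 2^-l.

```python
def node_power(s1, n1, n2, n):
    """Power of the boundary between runs a[s1:s1+n1] and a[s1+n1:s1+n1+n2]."""
    # Doubled midpoints, so everything stays in integers (units of 1/(2n)).
    a = 2 * s1 + n1
    b = 2 * (s1 + n1) + n2
    power = 0
    # Smallest l with floor(a * 2^l / (2n)) < floor(b * 2^l / (2n)).
    while (a << power) // (2 * n) == (b << power) // (2 * n):
        power += 1
    return power


def next_run_end(a, start):
    """Return the exclusive end of the maximal run starting at `start`.

    Strictly decreasing runs are reversed in place, so every run
    ends up weakly increasing.
    """
    n = len(a)
    end = start + 1
    if end == n:
        return end
    if a[end] < a[start]:
        while end < n and a[end] < a[end - 1]:
            end += 1
        a[start:end] = a[start:end][::-1]
    else:
        while end < n and not (a[end] < a[end - 1]):
            end += 1
    return end


def merge(left, right):
    """Stable merge of two sorted lists (ties are taken from `left`)."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if right[j] < left[i]:
            out.append(right[j])
            j += 1
        else:
            out.append(left[i])
            i += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def powersort(items):
    """Return a sorted copy of `items` (simplified sketch)."""
    a = list(items)
    n = len(a)
    if n < 2:
        return a
    stack = []  # (run start, power); each run ends where the next one starts
    s1, e1 = 0, next_run_end(a, 0)
    while e1 < n:
        s2, e2 = e1, next_run_end(a, e1)
        p = node_power(s1, e1 - s1, e2 - s2, n)
        # Merge earlier runs whose boundary has a higher power than the new one.
        while stack and stack[-1][1] > p:
            start, _ = stack.pop()
            a[start:e1] = merge(a[start:s1], a[s1:e1])
            s1 = start
        stack.append((s1, p))
        s1, e1 = s2, e2
    # Input exhausted: merge the remaining runs from the top of the stack down.
    while stack:
        start, _ = stack.pop()
        a[start:e1] = merge(a[start:s1], a[s1:e1])
        s1 = start
    return a
```

For example, powersort([5, 4, 3, 1, 2, 2, 9, 0]) detects the runs [5, 4, 3, 1] (reversed), [2, 2, 9], and [0], and merges the last two before merging the result with the first.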

I presented Powersort and its story at PyCon US 2023, the largest Python community conference (blog post).

Quicksort

Java nowadays uses a dual-pivot Quicksort, the Yaroslavskiy-Bentley-Bloch (YBB) Quicksort. I did the first average-case analysis of this algorithm, and an extension of the analysis of YBB Quicksort became my PhD project.
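For readers who have not seen dual-pivot partitioning, the following is a minimal Python sketch of the idea: two pivots p ≤ q split the array into elements smaller than p, elements between p and q, and elements larger than q. It follows the commonly published form of the partitioning scheme; the function name dual_pivot_quicksort is chosen for this example, and the sketch omits what a production library would add (pivot sampling, a cutoff to Insertionsort for small subarrays), so it should not be read as the JDK code.

```python
def dual_pivot_quicksort(a, lo=0, hi=None):
    """Sort a[lo..hi] in place with two pivots (teaching sketch)."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    # Use the outermost elements as pivots, with p <= q.
    if a[lo] > a[hi]:
        a[lo], a[hi] = a[hi], a[lo]
    p, q = a[lo], a[hi]
    l = lo + 1  # a[lo+1 .. l-1] holds elements < p
    g = hi - 1  # a[g+1 .. hi-1] holds elements > q
    k = l       # a[l .. k-1] holds elements in [p, q]
    while k <= g:
        if a[k] < p:
            a[k], a[l] = a[l], a[k]
            l += 1
        elif a[k] > q:
            # Skip over elements that are already in the "large" region.
            while a[g] > q and k < g:
                g -= 1
            a[k], a[g] = a[g], a[k]
            g -= 1
            # The element swapped in may belong to the "small" region.
            if a[k] < p:
                a[k], a[l] = a[l], a[k]
                l += 1
        k += 1
    # Move the pivots to their final positions.
    l -= 1
    g += 1
    a[lo], a[l] = a[l], a[lo]
    a[hi], a[g] = a[g], a[hi]
    # Recurse on the three segments.
    dual_pivot_quicksort(a, lo, l - 1)
    dual_pivot_quicksort(a, l + 1, g - 1)
    dual_pivot_quicksort(a, g + 1, hi)
```

The function sorts its argument in place and returns nothing; call it as dual_pivot_quicksort(xs) on a list xs.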

To see dual-pivot Quicksort in action, check out the animated visualization of the algorithm (thanks to Brad Lyon)!

My talk “Quicksorts of the 21st century” was an invited contribution to a workshop commemorating the 60th anniversary of Tony Hoare’s invention of Quicksort.
In the talk, I describe my work on the analysis of Dual-Pivot Quicksort and also mention QuickXSort.

Apart from the analysis of the new Quicksort variant, I was also able to fill some remaining gaps in our understanding of standard Quicksort:

We found the first precise and rigorous average-case analysis of branch mispredictions in Quicksort.
I found a way to analyze median-of-k Quicksort on equal keys. This extension of Robert Sedgewick’s analysis had been open since 1977. Even though my analysis applies only when there are many duplicates present, this is one of the results I am most proud of.

Rectification of Names. Note that I referred to YBB partitioning simply as Yaroslavskiy’s algorithm in my publications finished before 2016. Besides Vladimir Yaroslavskiy, however, Jon Bentley and Joshua Bloch were also involved in the development of the algorithm early on, so it is more appropriate to call their algorithm YBB Quicksort.

Apportionment and Stick Cutting

Two seemingly unrelated problems can be solved by the same algorithmic idea: the stick-cutting problem and proportional apportionment.
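As background on what proportional apportionment asks for, here is a small Python sketch of a highest-averages (divisor) method using a priority queue. This is a simple baseline for illustration only, not the algorithm from the project above; the function name apportion and its divisor parameter are assumptions made for this example. With divisor(k) = k + 1 it corresponds to the D'Hondt (Jefferson) rule, and with divisor(k) = 2k + 1 to the Sainte-Laguë (Webster) rule.

```python
import heapq
from fractions import Fraction


def apportion(votes, seats, divisor=lambda k: k + 1):
    """Distribute `seats` among parties by a highest-averages method.

    Each seat goes to the party with the largest quotient
    votes[i] / divisor(seats already given to party i).
    Ties are broken in favor of the lower index.
    """
    alloc = [0] * len(votes)
    # Max-heap via negated quotients; Fraction keeps comparisons exact.
    heap = [(-Fraction(v, divisor(0)), i) for i, v in enumerate(votes)]
    heapq.heapify(heap)
    for _ in range(seats):
        _, i = heapq.heappop(heap)
        alloc[i] += 1
        heapq.heappush(heap, (-Fraction(votes[i], divisor(alloc[i])), i))
    return alloc
```

For instance, apportion([100000, 80000, 30000, 20000], 8) returns [4, 3, 1, 0] under the default D'Hondt divisors. Each seat costs one heap operation here, so the running time grows with the number of seats.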

Game Theory

With a group of peers I’m exploring models for the evolution of cooperation and its interplay with polarization.