main page  —  COMP 526 Applied Algorithmics

Unit 7: Compression

This is an archived version of this module from Fall 2022.
Click here for the current iteration.

This unit covers classic compression techniques; we discuss

  • character encodings (ASCII, UTF-8)
  • Huffman codes
  • run-length encoding (RLE)
  • Elias gamma codes
  • Lempel-Ziv-Welch (LZW)
  • move-to-front (MTF) text transform
  • Burrows-Wheeler transform (BWT)

including their properties and what types of redundancies in data they can exploit.

Learning outcomes

  1. Understand the necessity for encodings and know ASCII and UTF-8 character encodings.
  2. Understand (qualitatively) the limits of compressibility.
  3. Know and understand the algorithms (encoding and decoding) for Huffman codes, RLE, Elias codes, LZW, MTF, and BWT, including their properties like running time complexity.
  4. Select and adapt (slightly) a compression pipeline for specific type of data.

Material

  • slides
  • lecture notes

  • Video 7-1 (2022-11-28): §7.1 Compression context: terminology, Kolmogorov complexity

  • Video 7-2 (2022-11-28): §7.2 Character Encodings

  • Video 7-3 (2022-11-30): §7.2 Character Encodings

  • Video 7-4 (2022-11-30): §7.3 Huffman codes

  • Video 7-5 (2022-11-30): §7.4 Entropy and Huffman codes

  • Video 7-6 (2022-11-30): §7.5 Run-length encoding

  • Video 7-7 (2022-11-30): §7.6 Lempel-Ziv-Welch encoding

  • Video 7-8 (2022-12-05): §7.7 Lempel Ziv Welch decoding

  • Video 7-9 (2022-12-05): §7.8 Move-to-front transform

  • Video 7-9 (2022-12-07): §7.9 Burrows-Wheeler transform
    This is an archive video covering the same material; audio in recording was broken.

  • Video 7-10 (2022-12-07): §7.10 Inverse BWT
    This is an archive video covering the same material; audio in recording was broken.


Further reading and sources

There is no textbook that covers all topics, but parts can be found in the following sources, all of which cover much more than the topics I selected.

Character encodings, Huffman codes, LZW:

The exposition of BWT is my own.

The presentation of the unit is inspired by material prepared for the module


Unit 6  ⋅  Syllabus  ⋅  Unit 8