main page  —  COMP 526 Applied Algorithmics

Unit 4: String Matching – What’s behind Ctrl+F?

This is an archived version of this module from Spring 2022.
Click here for the current iteration.

This unit covers algorithms for exact string matching:

  • terminology for strings
  • string matching with determinstic finite automata
  • Knuth-Morris-Pratt (KMP) algorithm
  • Boyer-Moore (BM) algorithm
  • Rabin-Karp (RK) algorithm

Learning outcomes

  1. Know and use typical notions for strings (substring, prefix, suffix, etc.).
  2. Understand principles and implementation of the KMP, BM, and RK algorithms.
  3. Know the performance characteristics of the KMP, BM, and RK algorithms.
  4. Be able to solve simple stringology problems using the KMP failure function.

Material

  • slides
  • lecture notes

  • Video 4-1 (2022-02-28): §4.1 Introduction

  • Video 4-2 (2022-02-28): §4.2 Brute force string matching

  • Video 4-3 (2022-02-28): §4.3 String matching with finite automata

  • Video 4-4 (2022-02-28): §4.4 Constructing string matching DFAs

  • Video 4-5 (2022-03-03): §4.5 The Knuth-Morris-Pratt algorithm

  • Video 4-6 (2022-03-03): §4.6 The Boyer-Moore algorithm

  • Video 4-7 (2022-03-03): §4.7 The Rabin-Karp algorithm


Further reading and sources

The naive string matching algorithm, string matching with automata and the Knuth-Morris-Pratt are discussed, e.g., in

The description of the Boyer-Moore algorithm follows our own algorithms textbook (in German):

The presentation of Rabin-Karp and the failure-link automaton is inspired by material prepared for the module


Unit 3  ⋅  Syllabus  ⋅  Unit 5