main page — CS 566 Efficient Algorithms
Unit 6: String Matching – What’s behind Ctrl+F?
This is an archived version of this module from Winter 2024-25.
This unit covers algorithms for exact string matching:
- terminology for strings
- string matching with determinstic finite automata
- Knuth-Morris-Pratt (KMP) algorithm
- Boyer-Moore (BM) algorithm
- Rabin-Karp (RK) algorithm
Learning outcomes
- Know and use typical notions for strings (substring, prefix, suffix, etc.).
- Understand principles and implementation of the KMP, BM, and RK algorithms.
- Know the performance characteristics of the KMP, BM, and RK algorithms.
- Be able to solve simple stringology problems using the KMP failure function.
Material
- slides
- Video 6-1 (2024-11-18):
§6.1 String definitions
- Video 6-2 (2024-11-18):
§6.2 Brute force string matching
- Video 6-3 (2024-11-18):
§6.3 String matching with finite automata
- Video 6-4 (2024-11-19):
§6.4 Constructing string matching DFAs
- Video 6-5 (2024-11-19):
§6.5 The Knuth-Morris-Pratt algorithm
- Video 6-6 (2024-11-19):
§6.6 The Boyer-Moore algorithm
- Video 6-7 (2024-11-25):
§6.7 The Rabin-Karp algorithm
Further reading and sources
The naive string matching algorithm, string matching with automata and the Knuth-Morris-Pratt are discussed, e.g., in
The description of the Boyer-Moore algorithm follows our own algorithms textbook (in German):
The presentation of Rabin-Karp and the failure-link automaton is inspired by material prepared for the module