main page — COMP 526 Applied Algorithmics
Unit 4: String Matching – What’s behind Ctrl+F?
This is an archived version of this module from Spring 2020.
Click here for the current iteration.
This unit covers algorithms for exact string matching:
- terminology for strings
- string matching with determinstic finite automata
- Knuth-Morris-Pratt (KMP) algorithm
- Boyer-Moore (BM) algorithm
- Rabin-Karp (RK) algorithm
Learning outcomes
- Know and use typical notions for strings (substring, prefix, suffix, etc.).
- Understand principles and implementation of the KMP, BM, and RK algorithms.
- Know the performance characteristics of the KMP, BM, and RK algorithms.
- Be able to solve simple stringology problems using the KMP failure function.
Material
Lecture notes and video will be added once available.
- lecture notes
- slides
- tutorial problems
- Lecture 2020-02-25: Intro & Brute Force
- Lecture 2020-02-25: String Matching with Automata
- Lecture 2020-03-02: String Matching with Automata (part 2)
- Lecture 2020-03-02: Knuth-Morris-Pratt
- Lecture 2020-03-02: Knuth-Morris-Pratt prefix function
- Lecture 2020-03-03: Boyer-Moore
- Lecture 2020-03-03: Rabin-Karp
Further reading and sources
The naive string matching algorithm, string matching with automata and the Knuth-Morris-Pratt are discussed, e.g., in
The description of the Boyer-Moore algorithm follows my algorithms textbook (in German):
The presentation of Rabin-Karp and the failure-link automaton is inspired by material prepared for the module