Andrii Raikov

How Regex Works: The Secret Sauce Behind Pattern Matching

What if you could build a regex engine from scratch? Learn how to translate any pattern into a simple state machine for lightning-fast matching.

How Regex Works: The Secret Sauce Behind Pattern Matching
#1about 4 minutes

Solving a LeetCode regex matching challenge

A LeetCode problem requiring a custom regex matcher for the dot and asterisk characters serves as the motivation for exploring how regex engines work.

#2about 2 minutes

Understanding the scalability limits of a naive approach

The initial linked-list implementation is not extensible enough to handle complex patterns like a real-world email validation regex.

#3about 3 minutes

How regex works using finite automata

Regular expressions can be modeled as finite automata, or state machines, which describe a set of character strings through states and transitions.

#4about 2 minutes

Comparing deterministic and non-deterministic finite automata

Non-deterministic Finite Automata (NFA) are more flexible for representing regex because a single input can lead to multiple possible next states.

#5about 5 minutes

Building an NFA from regular expression components

A complex regular expression is converted into an NFA by combining smaller NFA components for concatenation, alternation, and repetition operators.

#6about 2 minutes

The performance pitfalls of the backtracking algorithm

The backtracking approach, where the engine guesses a path and reverts on failure, can lead to exponential time complexity for complex patterns.

#7about 1 minute

Simulating an NFA by tracking all possible states

A more efficient simulation method avoids backtracking by simultaneously tracking the entire set of possible states the machine could be in at any point.

#8about 5 minutes

Implementing a simple NFA regex matcher in Go

A practical implementation involves compiling the regex pattern into an NFA data structure and then iterating through the input string while tracking the current set of active states.

#9about 2 minutes

Beyond one algorithm for real-world regex engines

Production-grade regex engines often use a hybrid approach, selecting from different algorithms like NFA simulation or backtracking based on the specific pattern.

Related jobs
Jobs that call for the skills explored in this talk.
tree-IT GmbH

tree-IT GmbH
Bad Neustadt an der Saale, Germany

54-80K
Intermediate
Senior
Java
TypeScript
+1

Featured Partners

Related Articles

View all articles
DC
Daniel Cranney
Dev Digest 213: Petrol Prices, Agentic Workflows, AI Skills and CODE100!
Inside last week’s Dev Digest 213 . 🤫 Don’t tell your LLM that it is an expert 👻 AI generated code is invisible 🔄 Learn about agentic workflows 🛡️ Linux Foundation sponsors fight against AI slop 🦠 1M users infected by Chrome extension 🫃 The why of J...
 Dev Digest 213: Petrol Prices, Agentic Workflows, AI Skills and CODE100!

From learning to earning

Jobs that call for the skills explored in this talk.