There are few rites of programmer passage as iconic as writing your first parser. You might want to interpret or compile a scripting language, or you might want to accept natural-language-like commands. You need a parser. [Varunramesh] wants to show you parser combinators, a technique used to make practical parsers. But the demonstration using interactive code cells in the web page is nearly as interesting as the technique.

Historically, you parse tokens, and this technique can do that too, but it can also operate directly on character streams if you prefer. The idea is related to recursive descent parsing, where you attempt to parse certain things, and if those things fail, you try again.

There are ways to match in a fuzzy way using Levenshtein distance. That way, if the user enters a typo, you can often recover from it. You could probably implement other schemes for this, too, like soundex, if you were parsing names. These types of parsers do have some limitations, but they are much easier to create, maintain, and debug than traditional parsers.

This is the first part in a series on creating parsers with combinators. Future installments promise to cover abstract syntax trees, error reporting, infix operations, and limiting backtracking. We’ll be interested to read those, too.

If you want some different parser tutorials, we got you. The usual tough problem is algebraic expressions, although you can always try RPN.