Nikita Popov's Blog: The true power of regular expressions

Note: This article was originally published at PHPDeveloper on 15 June 2012.

Nikita Popov has a new, language-agnostic post on his blog today about regular expressions: one of the most powerful tools you can use in your development, and one whose true power a lot of developers don't understand.

As someone who frequents the PHP tag on StackOverflow I pretty often see questions about how to parse some particular aspect of HTML using regular expressions. A common reply to such a question is: "You cannot parse HTML with regular expressions, because HTML isn't regular. Use an XML parser instead." This statement - in the context of the question - is somewhere between very misleading and outright wrong. What I'll try to demonstrate in this article is how powerful modern regular expressions really are.

He starts with the basics, defining the "regular" part of "regular expression" (hint: it refers to regular languages in formal language theory) and the grammar of the expressions. He then covers the Chomsky hierarchy and where regular languages sit in it, along with a more complex mapping of expressions to language rules. From there he moves on to matching context-free and context-sensitive languages, and finally unrestricted grammars.
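
As a small illustration of the idea that modern regular expressions reach beyond regular languages, here is a minimal sketch in Python (the post itself is language agnostic, and the `is_square` helper name is made up for this example). A backreference lets a pattern accept exactly the strings made of some non-empty string written twice. Over an alphabet of two or more letters, that language is neither regular nor context-free.

```python
import re

# \1 must repeat exactly what group 1 captured, so the whole
# pattern accepts strings of the form ww (for example "abab").
SQUARE = re.compile(r"(.+)\1")


def is_square(s: str) -> bool:
    """Return True if s is some non-empty string written twice."""
    return SQUARE.fullmatch(s) is not None


if __name__ == "__main__":
    for candidate in ["abab", "aa", "abcabc", "aba", "abba"]:
        print(candidate, is_square(candidate))
```

Python's built-in `re` module supports backreferences but not recursive patterns, so this sketch only shows the backreference side. Matching nested structures needs an engine with recursion support, such as PCRE.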