Literate programming with PHP

Note: This article was originally published at Planet PHP on 16 January 2011.

noweb.php is a PHP implementation of the tool needed for literate programming. Wikipedia says the following about literate programming:

The literate programming paradigm, as conceived by Knuth, represents a move away from writing programs in the manner and order imposed by the computer, and instead enables programmers to develop programs in the order demanded by the logic and flow of their thoughts. Literate programs are written as an uninterrupted exposition of logic in an ordinary human language, much like the text of an essay, in which macros which hide abstractions and traditional source code are included. Literate programming tools are used to obtain two representations from a literate source file: one suitable for further compilation or execution by a computer, the "tangled" code, and another for viewing as formatted documentation, which is said to be "woven" from the literate source. While the first generation of literate programming tools were computer language-specific, the later ones are language-agnostic and exist above the programming languages.

noweb.php supports this way of working by extracting program code from text documents that describe how the program ought to work. This document is itself such a description, and the PHP code of noweb.php can be generated from it.

The inspiration for creating noweb.php comes from Jonathan Aquino's work on implementing the same with Python. See his world's first executable blog post and the actual project on GitHub.


If you're interested in doing literate programming with PHP, grab the software produced by this document from GitHub:

noweb.php requires PHP 5.3 or newer.


noweb.php is a PHP tool that reads a text file containing noweb-style annotated code macros, parses it, and writes the files defined in the document to the file system.

$ noweb.php tangle README.txt

The resulting code files will be written into the same directory where the document resides.
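As a rough illustration of that rule, the output path for a file defined in the document could be derived as below. This is only a sketch: the helper name output_path_sketch is hypothetical and is not part of noweb.php, and it assumes the chunk name is used directly as the file name.

```php
<?php
// Hypothetical helper (not part of noweb.php): build the path of a tangled
// file so that it lands next to the literate document it came from.
function output_path_sketch(string $readfile, string $chunk_name): string
{
    // dirname() returns "." when the document path has no directory part.
    return dirname($readfile) . '/' . $chunk_name;
}

echo output_path_sketch('docs/README.txt', 'noweb.php'), "\n";
```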

If you just want to see what code files the document defines, you can also run:

$ noweb.php list README.txt

Getting a file

When noweb.php starts, we check for the command line arguments to get a file. If no argument is found, we abort and give users instructions:

if (basename($_SERVER['argv'][0]) == 'php') {
    // The script was run via $ php noweb.php, tune arguments
    array_shift($_SERVER['argv']);
}
if (count($_SERVER['argv']) != 3) {
    die("Usage: noweb.php tangle \n");
}

We get the command and filename from the arguments:

$command = $_SERVER['argv'][1];
$readfile = $_SERVER['argv'][2];
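The same argument handling can be sketched as a pure function, which is easier to try out in isolation. The function name parse_args_sketch and the returned array keys are hypothetical. Unlike the code above, it returns null instead of calling die() when the argument count is wrong.

```php
<?php
// Hypothetical, testable variant of the argument handling shown above.
// Returns the command and file name, or null if the arguments are unusable.
function parse_args_sketch(array $argv): ?array
{
    // Drop a leading "php" entry, mirroring the check in the article.
    if (basename($argv[0] ?? '') === 'php') {
        array_shift($argv);
    }
    // Expect exactly: script name, command, file name.
    if (count($argv) !== 3) {
        return null;
    }
    return ['command' => $argv[1], 'file' => $argv[2]];
}

var_dump(parse_args_sketch(['noweb.php', 'tangle', 'README.txt']));
```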

Then we check that the given file actually exists:

if (!file_exists($readfile)) {
    die("File {$readfile} not found\n");
}

Then we parse the file for any literate programming code and extract it. How this works is explained in detail later.

$noweb = new noweb();
$noweb->read_chunks($readfile);

We check the command given by the user and perform it:

switch ($command) {
    case 'list':
        $noweb->list_files();
        break;
    case 'tangle':
        $noweb->tangle_files(dirname($readfile));
        break;
    case 'weave':
        $noweb->weave($readfile);
        break;
    default:
        die("Unknown command {$command}. Try 'tangle'\n");
}

Reading the file

In a literate program the actual software code is stored in chunks inside the document. A chunk is defined by a name inside << and >>=, and it ends on a line containing just @.

For parsing the chunks we will need two regular expressions:

private $chunk_start_regexp = '/^<<([^>]+)>>=/';
private $chunk_end_regexp = '/^@$/';
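To see how two such patterns separate chunk markers from ordinary text, here is a small sketch. It assumes the start pattern is meant to match the usual noweb form <<name>>= at the beginning of a line. The function classify_line and its return values are hypothetical and exist only for this illustration.

```php
<?php
// Assumed patterns: <<name>>= opens a chunk, a line with only @ closes it.
$chunk_start_regexp = '/^<<([^>]+)>>=/';
$chunk_end_regexp = '/^@$/';

// Hypothetical helper: report whether a line starts a chunk, ends one,
// or is plain text (documentation or code, depending on context).
function classify_line(string $line, string $start, string $end): array
{
    $line = rtrim($line, "\r\n"); // file() keeps line endings, so strip them
    if (preg_match($start, $line, $matches)) {
        return ['start', $matches[1]]; // captured chunk name
    }
    if (preg_match($end, $line)) {
        return ['end', null];
    }
    return ['text', null];
}

print_r(classify_line("<<noweb.php>>=\n", $chunk_start_regexp, $chunk_end_regexp));
```

Note that a line only counts as a chunk end when @ is the sole character on it, so an @ inside a sentence or an email address is left alone.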

When reading a literate programming file, we handle it on a line-by-line basis, keeping track of whether we're inside a chunk or documentation:

$lines = file($filename);
$chunk = null;
foreach ($lines as $line) {
}
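One way such a loop could be filled in is sketched below. This is not the article's own implementation, which is cut off here. The function name read_chunks_sketch is hypothetical, the start pattern is assumed to be the noweb form <<name>>=, and chunks that share a name are assumed to be appended to each other, as noweb itself does.

```php
<?php
// Hypothetical sketch of a line-by-line chunk reader.
// Returns an array mapping chunk names to their code lines.
function read_chunks_sketch(array $lines): array
{
    $chunks = [];
    $chunk = null; // name of the chunk being read, null while in documentation
    foreach ($lines as $line) {
        $line = rtrim($line, "\r\n");
        if ($chunk === null) {
            // In documentation: only a chunk start is interesting.
            if (preg_match('/^<<([^>]+)>>=/', $line, $matches)) {
                $chunk = $matches[1];
                $chunks[$chunk] = $chunks[$chunk] ?? [];
            }
            continue;
        }
        if (preg_match('/^@$/', $line)) {
            // A lone @ closes the chunk and returns us to documentation.
            $chunk = null;
            continue;
        }
        $chunks[$chunk][] = $line; // ordinary code line inside a chunk
    }
    return $chunks;
}

$doc = ["Intro\n", "<<hello.php>>=\n", "echo 'hi';\n", "@\n", "Outro\n"];
print_r(read_chunks_sketch($doc));
```

Passing in an array of lines instead of a file name keeps the sketch free of file access; the result of file($filename) can be handed to it directly.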

With each line of the

Truncated by Planet PHP, read more at the original (another 7732 bytes)