Literate Programming -- Propaganda and Tools



Overview

A Rationale for literate programming

Literate programming is an approach to programming which emphasises that programs should be written to be read by people as well as compilers. From a purist standpoint, a program could be considered a publishable-quality document that argues mathematically for its own correctness. A different view is that a program could be a document that teaches programming to the reader through its own example. A more casual view is that a program should be documented at least well enough that someone could maintain the code properly and make informed changes in a reasonable amount of time without direct help from the author. At the most casual level, a literate program should make its own workings plain enough that its author, at least, can easily maintain the code over its lifetime.

The problem

Why is this a Good Thing? I suppose it depends on how you feel about programming. In some sense, if one is up against a deadline for getting code finished and working, trying to make a literate program instead of a working program might seem like a very bad idea. However, in a long-term project (code you don't plan to throw away in the near future), literate programming actually seems to pay off (although I have never seen a study to this effect). Why can this style of program development be beneficial?

One element of programming that leads to the problems listed above is that programming languages are designed more to encourage people to write code for a compiler to understand than for other people to understand. This is particularly true for those of us who write in the C language, which is very much a low-level language and is often considered to be a portable and easy-to-use assembly language with some nice standard libraries. One nice aspect of C (besides pointers, etc.) is that it is very terse, so it requires very few keystrokes to implement very powerful ideas. The price of this terseness is that it often makes code more difficult for people to read. As the problems above indicate to me, this is probably a case where the cost often outweighs the benefits over the course of a long-term development effort (see the April Fools' article from Computerworld). Of course, some people do write readable C programs, but it is definitely a hard-learned skill rather than a widespread natural ability.

One Solution

Literate programming is defined as the combination of documentation and source together in a fashion suited for reading by human beings.

This very general definition suggests to me that literate programming is one way to begin to address the difficulties with programming that I described in the previous section. I discovered literate programming about two years ago, and have not found many other people (at least that I know of) who use this `programming methodology', but I suspect that its use is not too uncommon in general. The Getting Start(l)ed document in the Literate Programming Library gives an introduction to the thinking behind literate programming.

Donald Knuth (interview on CWEB, "Why I Must Write Readable Programs") coined the term literate programming and created the original literate programming tool/language, WEB, which he used to write TeX and MetaFont. The literate-programming FAQ quotes Knuth as saying:

The philosophy behind WEB is that an experienced system programmer, who wants to provide the best possible documentation of his or her software products, needs two things simultaneously: a language like TeX for formatting, and a language like C for programming. Neither type of language can provide the best documentation by itself; but when both are appropriately combined, we obtain a system that is much more useful than either language separately.

The structure of a software program may be thought of as a web that is made up of many interconnected pieces. To document such a program we want to explain each individual part of the web and how it relates to its neighbours. The typographic tools provided by TeX give us an opportunity to explain the local structure of each part by making that structure visible, and the programming tools provided by languages such as C or Fortran make it possible for us to specify the algorithms formally and unambiguously. By combining the two, we can develop a style of programming that maximizes our ability to perceive the structure of a complex piece of software, and at the same time the documented programs can be mechanically translated into a working software system that matches the documentation.
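
To make the "mechanically translated" step a little more concrete, here is a minimal sketch of a tangle-like pass, written in Python. It is only an illustration: the chunk notation (`<<name>>=` opens a named chunk, a lone `@` closes it, and `<<name>>` inside a chunk refers to another chunk) is an assumption loosely modelled on noweb and is not the syntax of WEB or CWEB, and the names `parse_chunks`, `expand`, `tangle`, and `LITERATE_SOURCE` are hypothetical.

```python
import re

# Assumed notation, loosely modelled on noweb; NOT the real WEB/CWEB syntax.
CHUNK_DEF = re.compile(r"^<<(.+)>>=\s*$")      # "<<name>>=" opens a code chunk
CHUNK_REF = re.compile(r"^(\s*)<<(.+)>>\s*$")  # "<<name>>" refers to a chunk
CHUNK_END = "@"                                # a lone "@" returns to prose


def parse_chunks(source):
    """Collect the code chunks of a literate source, keyed by chunk name."""
    chunks = {}
    current = None  # name of the chunk being read, or None while in prose
    for line in source.splitlines():
        definition = CHUNK_DEF.match(line)
        if definition:
            current = definition.group(1)
            chunks.setdefault(current, [])
        elif line.strip() == CHUNK_END:
            current = None
        elif current is not None:
            chunks[current].append(line)
    return chunks


def expand(name, chunks, indent=""):
    """Replace chunk references with the chunks they name, recursively.

    Cyclic references are not detected in this sketch.
    """
    lines = []
    for line in chunks[name]:
        reference = CHUNK_REF.match(line)
        if reference:
            extra_indent, target = reference.groups()
            lines.extend(expand(target, chunks, indent + extra_indent))
        else:
            # Leave blank lines blank rather than padding them with indent.
            lines.append(indent + line if line else line)
    return lines


def tangle(source, root="*"):
    """Return the compilable program hidden inside a literate source."""
    return "\n".join(expand(root, parse_chunks(source)))


# The prose explains the program in the order that suits a reader;
# tangle() reassembles the code in the order that suits the interpreter.
LITERATE_SOURCE = """\
The program greets the reader. Its overall shape is:

<<*>>=
def main():
    <<print the greeting>>

main()
@

The greeting itself is the traditional one.

<<print the greeting>>=
print("hello, world")
@
"""

if __name__ == "__main__":
    print(tangle(LITERATE_SOURCE))
```

Real tools also have a second pass (called weave in WEB) that typesets the same source as documentation; that half is not sketched here.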

The source code for TeX and MetaFont is available in book format, as printed from the TeX output of the WEB sources for these programs. What I found impressive about these listings (although I did not look at them in too much depth) is that Knuth stated in the introduction to the TeX code that he believed that TeX was finished, and that he believed the last bug had been found and corrected in the code. The fact that I had a hard time conceiving of a truly `finished' program and a truly bug-free program indicated to me (a) that (of course) Dr. Knuth is a far better programmer than I, and (b) that his programming methodology was probably far more reliable than the ones I had previously seen and used.

Tools

I have used three WEB systems, and have heard a little about two others that I have not used. If anyone has experience with other systems and comments about them, I would like to add that information too. See the LP FAQ for more information on each of these tools (and many other tools). I will try to install some of these tools if people want to try them out.

Examples

Cool Stuff

Some ideas:

References

See Also


VASC Contact

Chris Lee

History

10/17/94 Created by Christopher Lee
Christopher Lee | chrislee@ri.cmu.edu
Last modified: Thu Mar 16 11:34:47 2000