In much the same way that General Francisco Franco is still dead, volume
4 (Combinatorial Algorithms) of Donald Knuth's projected seven-volume
Art of Computer Programming is still not out. Neither are volumes 5 (Syntactical
Algorithms), 6 (Theory of Languages), or 7 (Compilers).
DDJ readers are probably familiar with the reason why: Knuth has been on
a ten-year detour from the Art of Computer Programming, working in
the field of computer typesetting (TeX) and typography (METAFONT). In addition
to producing the TeX and METAFONT software itself, Knuth has used this software
to produce an attractive five-volume series, Computers and Typesetting,
that includes not only the documentation for TeX and METAFONT, but also
their source code.
What DDJ readers may not be familiar with is that this source code is written
in a programming language called WEB. But when I say "written,"
I really do mean written: the source code, and the written description of
it, are one and the same. WEB is a language, quite similar to Pascal (there's
also CWEB, which is quite similar to C), which makes it possible to merge
the executable source for a system with its description. More importantly,
it allows you to construct and present the source in an order which makes
"psychological" sense. Using such a system, software "authors"
really do become authors, concerned with writing and presenting code in
a way that makes sense, not so much to the compiler, but to the reader.
Even if you're not interested in typesetting or typography take a look some
time at Knuth's TeX: The Program (Volume B of Computers and Typesetting)
and METAFONT: The Program (Volume D). You'll not find anywhere else
such detailed presentations of the entire source code--warts and all--for
a large program. It's instructive to consider what life would be like if
the source code for your favorite system were available as a WEB.
What inspired Knuth to issue these 600-page hardcover books of source code?
Literate Programming, a recently issued collection of essays by Knuth
and others from 1974 to 1989, contains a fascinating answer:
Tony Hoare provided a special impetus for WEB when he suggested in 1978 that I should publish my program for TeX. Since very few large-scale software systems were documented in the literature, he had been trying to promote the publication of well-written programs. Hoare's suggestion was actually rather terrifying to me, and I'm sure he knew that he was posing quite a challenge. As a professor of computer science, I was quite comfortable publishing papers about toy problems that could be polished up nicely and presented in an elegant manner; but I had no idea how to take a piece of real software, with all the compromises necessary to make it useful to a large class of people on a wide variety of systems, and to open it up to public scrutiny. How could a supposedly respectable academic, like me, reveal the way he actually writes large programs?Knuth tackles this problem with the idea of programs as webs:
When I first began to work with the ideas that eventually became the WEB system, I thought that I would be designing a language for "top-down" programming, where a top-level description is given first and successively refined. On the other hand I knew that I often created major parts of programs in a "bottom-up" fashion, starting with the definitions of basic procedures and data structures and gradually building more and more powerful subroutines. I had the feeling the top-down and bottom-up were opposing methodologies: one more suitable for program exposition and the other more suitable for program creation.This jumping around from top-down to bottom-up and back to top-down isn't just a matter of how source code gets presented, either. It cuts to the root of programming itself. As Knuth notes in an amazing essay from 1974 in this collection ("Structured Programming with go to Statements"):
But after gaining experience with WEB, I have come to realize that there is no need to choose once and for all between top-down and bottom-up because a program is best thought of as a web instead of as a tree.... When I'm writing a longish program ... I invariably have strong feelings about what part of the whole should be tackled next. For example, I'll come to a point where I need to define a major data structure and its conventions, before I'll feel happy about going further. My experiences have led me to believe that a person reading a program is, likewise, ready to comprehend it by Learning its various parts in approximately the order in which it was written... Sometimes the "correct" order is top-down, sometimes it is bottom-up, and sometimes it's a mixture; but always it's an order that makes sense on expository grounds.
Thus the WEB language allows a person to express programs in a "stream of consciousness" order.... the fact that there's no need to be hung up on the question of top-down versus bottom-up -- since a programmer can now view a large program as a web, to be explored in a psychologically correct order -- is perhaps the greatest lesson I have learned from my recent experiences. In other words, there's a way to present genuine, large programs, give the reader an understanding of how the entire system fits together, and still not brush aside messy issues like error recovery, special cases, system dependencies, tweaks for performance, hacks, kludges, and all the other seemingly nonalgorithmic issues that make up the bulk of a genuine program. Such details are crucial to one's understanding. They can't be "black boxed."
I have felt for a long time that a talent for programming consists largely of the ability to switch readily from microscopic to macroscopic views of things, i.e., to change levels of abstraction fluently.The name WEB of course comes straight from this notion of a program as a tangle of high-level and low-level issues.
... if you ask me whether keeping such a log has helped me learn how to reduce the number of future errors, my answer has to be No. I kept a similar log for errors in METAFONT, and there was no perceivable reduction. I continue to make the same kinds of mistakes.Oh well.