Subatomic Programming
Review by Andrew Schulman
Copyright (C) Dr. Dobb's Journal, March, 1991
Most programs are written in a high-level language, not assembly language,
but the authors of these programs are generally at least dimly aware that,
below the surface, their high-level-language statements such as p = *x "turn
into" assembly language statements such as MOV AX, [BX]. Even introductory
books on computing always seem to include a picture of a funnel, with LETs
and GOTOs flowing in the top, and MOVs and JMPs dropping out the bottom.
It is probably a sign of progress in computing that most of us view these
MOVs and JMPs as atomic operations. That is, they don't "turn into"
anything, except perhaps the "0s and 1s" to which introductory
computer books like to vaguely refer. For the majority of programmers, the
actually enormous complexity underneath the surface of an "atomic"
assembly language statement like MOV AX, [BX] can remain a total mystery.
Nonetheless, it is worth having an appreciation for what makes up these
supposedly simple operations. In addition to the pure enjoyment of knowing
a little more about the machine, an appreciation for its subatomic particles
-- things like bus cycles, memory access time, instruction prefetches, pipelining,
DRAM refresh, timing issues, cache management, wait states, DMA, and bus
arbitration -- may become more important as microprocessors become faster
and more compact. As Intel's new 386SL chipset shows, even a seemingly lowly
issue like power management can take on great importance when computers
get small enough.
This month, we will examine three books that take us beneath the valley
of assembly language.
Zen of Assembly Language
Michael Abrash's oddly titled Zen of Assembly Language is a good
place to start. Chapters 3, 4, and 5 in particular deal with what he calls
"the raw stuff of performance, which lies beneath the programming interface,
in the dimly seen realm populated by instruction prefetching, dynamic RAM
refresh, and wait states, where software meets hardware" (p. 75).
Interestingly, Abrash's goal is actually to show that we can't totally understand
this level. "The exact performance of assembler code over time is such
a complex problem that it might as well be unsolvable" (p. 114). He
shows all instruction timings are relative. In one example code sequence,
the SHR instruction takes eight-plus cycles to execute, and in another it
takes only two. Thus, "the only true execution time for an instruction
is a time measured in a certain context, and that time is meaningful only
in that context" (p. 91).
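The effect is easy to reproduce in a toy model. The C sketch below is not Abrash's measurement or code; it assumes only the 8088's 4-byte prefetch queue and its 4-clock, one-byte bus cycle, treats SHR as a 2-byte instruction needing 2 clocks in the execution unit, and ignores DRAM refresh, wait states, and data accesses entirely. The names (Cpu, tick, run_instruction) and the 20-clock "slow" instruction are hypothetical, chosen for illustration.

```c
#include <assert.h>

enum {
    QUEUE_SIZE = 4,  /* assumed: the 8088 prefetch queue holds 4 bytes   */
    BUS_CYCLE  = 4   /* assumed: one byte is fetched per 4-clock bus cycle */
};

typedef struct {
    int  queued;       /* instruction bytes waiting in the prefetch queue */
    int  fetch_phase;  /* clocks already spent on the byte being fetched  */
    long clock;        /* total clocks elapsed                            */
} Cpu;

/* Advance one clock. The bus unit keeps fetching while the queue has room. */
void tick(Cpu *cpu)
{
    cpu->clock++;
    if (cpu->queued < QUEUE_SIZE && ++cpu->fetch_phase == BUS_CYCLE) {
        cpu->fetch_phase = 0;
        cpu->queued++;
    }
}

/* Run one instruction that is `length` bytes long and needs `eu_clocks`
 * in the execution unit. Returns how many clocks it took in this context. */
long run_instruction(Cpu *cpu, int length, int eu_clocks)
{
    long start = cpu->clock;
    int needed = length;

    assert(length > 0 && eu_clocks >= 0);

    /* The execution unit stalls until its instruction bytes have arrived. */
    while (needed > 0) {
        if (cpu->queued > 0) {
            cpu->queued--;
            needed--;
        } else {
            tick(cpu);
        }
    }
    /* While the instruction executes, prefetching continues in parallel. */
    for (int i = 0; i < eu_clocks; i++)
        tick(cpu);

    return cpu->clock - start;
}

/* Time of an SHR inside a long run of back-to-back SHRs. */
long shr_in_stream(void)
{
    Cpu cpu = {0, 0, 0};
    long last = 0;
    for (int i = 0; i < 8; i++)
        last = run_instruction(&cpu, 2, 2);
    return last;
}

/* Time of an SHR that follows a slow instruction (hypothetical 20 clocks). */
long shr_after_slow_instruction(void)
{
    Cpu cpu = {0, 0, 0};
    run_instruction(&cpu, 2, 20);       /* the queue fills while this runs */
    return run_instruction(&cpu, 2, 2);
}
```

In this model the SHR in a stream of SHRs is limited by instruction fetching and takes 8 clocks, while the SHR that follows a slow instruction finds its bytes already queued and takes 2. The real chip is messier, which is exactly Abrash's point.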
In other words, "there's no way to be sure what code is the fastest
for a particular purpose"; one must "write code by feel as much
as by prescription." Apparently such thoughts are what inspired the
"Zen" book title. "How can it not be possible to come up
with a purely rational solution to a problem that involves that most rational
of man's creations, the computer?" he asks (p. 113), yet the answer is
that, at this subatomic level, the order and duration of events are unknown.
In a particularly nice demonstration, Abrash hooks a logic analyzer up to
the 8088 and PC bus, and examines the following simple instruction sequence:
    i       db      1               ; source byte
    j       db      0               ; destination byte
            mov     ah, ds:[i]      ; load the byte at i into AH
            mov     ds:[j], ah      ; store AH into the byte at j
The result is a timeline of "170 Cycles in the Life of a PC" (pp.
119-121), in which we see the 8088's execution unit load up opcodes from
the instruction prefetch queue, the bus-interface unit reload the instruction
queue from memory, the occurrence of DRAM refresh reads, wait states, and
so on. And we see even these simple instructions behave differently (execute
at different speeds) at different times.
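To get a feel for how much of such a timeline DRAM refresh alone can claim, here is a back-of-the-envelope calculation in C. The figures are assumptions about the original IBM PC rather than numbers taken from this review: a refresh is requested about once every 72 processor clocks and holds the bus for about 4 of them. The function names are hypothetical.

```c
enum {
    REFRESH_PERIOD = 72,  /* assumed: clocks between refresh requests */
    REFRESH_COST   = 4    /* assumed: clocks the bus is held per refresh */
};

/* Whole refreshes that fit in a window starting just after a refresh. */
long refresh_count(long clocks)
{
    return clocks / REFRESH_PERIOD;
}

/* Clocks in that window during which the bus belongs to refresh. */
long clocks_lost_to_refresh(long clocks)
{
    return refresh_count(clocks) * REFRESH_COST;
}

/* Long-run share of bus time spent on refresh, in hundredths of a percent
 * (integer division, so the result is rounded down). */
long refresh_overhead_basis_points(void)
{
    return 10000L * REFRESH_COST / REFRESH_PERIOD;
}
```

Under these assumptions a 170-clock window contains at least two refreshes, and refresh takes roughly 5.5 percent of all bus time. Whether a given instruction pays that cost depends on where in the cycle it happens to land.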
Abrash's own conclusion is "code execution isn't all that exciting
... it's awfully tedious, even by assembler standards. During the entire
course of the figure only seven instructions are executed -- not much to
show for all the events listed." Abrash's point is that such a "microanalysis
... is not only expensive and time consuming, but also pointless."
Yet, for most readers this is the most fascinating part of the book! Abrash's
book can be used, not only as a guide to assembly language performance issues,
but also as a fine explanation of what really happens "inside"
a MOV instruction.
Of course, the title is slightly misleading, in that he is talking about
Intel assembly language, not assembly language in general. Furthermore,
the focus is far more on the 8088 than seems appropriate now that the
baseline PC machine is 80286-based. Abrash takes a perverse pleasure in
the poor quality of the 8088, because clearly the worse the chip, the more
one needs assembly language optimizations! However, he does devote an entire
chapter to "Other Processors" (the 80286 and 80386); this chapter
alone is worth the price of the book.
And perhaps the book's 8088 focus may not be so off base, after all. Abrash
points out, "If you're going to go to the trouble of using 80386-specific
features, thereby eliminating any chance of running on PCs and ATs, you
might as well go all the way and write 80386 protected-mode code" (p.
716). In a book on real-mode programming, then, perhaps there isn't much
to say on the 80286 and 80386. "The protected-mode 80386 is a wonderful
processor to program, and a good topic -- a terrific topic -- for some book
to cover in detail, but this is not that book" (p. 717).
Even Abrash's entire chapter on the 8080 (!) is not so out of place for
the 1990s. "You no doubt think you've seen the last of the venerable
but not particularly powerful 8080. Not a chance. The 8080 lingers on in
the instruction set and architecture ... Although it may seem strange that
the design of an advanced processor would be influenced by the architecture
of a less capable one, that practice is actually quite common" (p.
266). As a result, even the spiffiest 486 has many features in common with
the 8080, a glorified calculator chip. This chapter of the book ("Strange
Fruit of the 8080") makes particularly enjoyable reading, because it
shows how minor engineering decisions live on for many years. A frightening
thought.
Structured Computer Organization
Our next book, Tanenbaum's Structured Computer Organization, may
not at first seem relevant. What does this venerable (now in its third edition)
computer architecture textbook have to do with the bizarre world Abrash
describes? Tanenbaum takes us to some of the levels below the already odd
level of bus cycles and instruction prefetches. Chapter 3, "The Digital
Logic Level," is a superb examination of everything from NAND gates
to the construction of latches, flip-flops, and registers, up to memory
and buses. Furthermore, this is no abstract discussion of a hypothetical
machine. Throughout the book, Tanenbaum uses the Intel 80x86 and Motorola
680x0 families as his running examples. For example, this chapter contains
a discussion of the IBM PC and AT buses. The internal workings of "a
typical IBM PC clone" are described at the chip level, and a circuit
diagram is given and discussed at length.
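For readers who have never seen how memory can be built from gates, the step from NAND gates to a latch can be sketched in a few lines of C. This is the standard cross-coupled NAND latch with active-low inputs, not an example taken from Tanenbaum's text; the names (SrLatch, sr_latch_update) are hypothetical, and the fixed iteration count simply stands in for the gates settling.

```c
#include <stdbool.h>

/* The only primitive: a two-input NAND gate. */
bool nand(bool a, bool b)
{
    return !(a && b);
}

/* One bit of storage: two NAND gates, each feeding the other. */
typedef struct {
    bool q;      /* the stored bit */
    bool q_bar;  /* its complement */
} SrLatch;

/* Apply the active-low set and reset inputs and let the gates settle.
 * set_n = 0 stores a 1, reset_n = 0 stores a 0, and both = 1 holds the
 * previous value. Driving both low at once is the forbidden input
 * combination and is not handled here. */
void sr_latch_update(SrLatch *latch, bool set_n, bool reset_n)
{
    for (int i = 0; i < 4; i++) {  /* a few passes are enough to stabilize */
        bool q     = nand(set_n, latch->q_bar);
        bool q_bar = nand(reset_n, q);
        latch->q     = q;
        latch->q_bar = q_bar;
    }
}
```

Starting from q = 0 and q_bar = 1, pulsing set_n low stores a 1 that survives after the input returns high; pulsing reset_n low clears it again. The feedback between the two gates is the whole trick, and registers and memory are this idea repeated.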
Tanenbaum's book is based on "the idea that a computer can be regarded
as a hierarchy of levels" (p. xv). Furthermore, each level, even the
lowest "device level," corresponds to a language. "A central
theme of this book that will occur over and over again is: Hardware and
software are logically equivalent" (p. 11).
Chapter 4, "The Microprogramming Level," includes brief but useful
studies of the microarchitecture of the Intel and Motorola chips. Many PC
programmers will want to at least read the discussion (pp. 215-220) of the
Intel 8088 microcode. I've never seen this discussed anywhere else. Tanenbaum
also has brief but useful coverage of instruction pipelining, memory
interface, and cache memory. I found myself wanting more on these
increasingly important topics; one good source is High-Performance Computer
Architecture, Second Edition, by Harold S. Stone (Addison-Wesley, 1990).
One aspect of Tanenbaum's text that seems odd, at least with the benefit
of hindsight, is his choice of OS/2, rather than MS-DOS, as the archetypal
Intel operating system. True, "OS/2 has a surprisingly large number
of features that are not present in UNIX and are well worth examining."
But it makes no sense to write off MS-DOS with the comment that it is "an
obsolete, primitive, and not very interesting system, despite its widespread
use" (p. 372). Its widespread use is precisely what makes DOS intrinsically
interesting. To say that something is "of great commercial importance"
but "of little interest to us" (p. 373) seems like a bad way to
educate engineers! Since "the OS/2 designers were not permitted to
simply treat MS-DOS as a bad dream and start all over," it's not clear
why anyone else should pretend they have such a luxury. Oh, well.
But, like Tanenbaum's other books, Computer Networks and Operating Systems,
this one is nearly perfect.
80x86 Architecture and Programming
Finally, we come to Volume II of Rakesh Agarwal's 80x86 Architecture
and Programming. In an odd reversal of the natural order, Volume I apparently
won't be available for almost a year. Nonetheless, Volume II stands on its
own as an indispensable guide to the 80286, 80386, and 486 microprocessors.
In particular, Agarwal presents such a clear picture of the processors'
operation in protected mode that one could probably use his extensive C
code and diagrams to clone an Intel chip.
Agarwal presents extremely detailed C (and pseudo-C) code for each Intel
instruction. These in turn use a library of functions such as LA_rdChk()
(linear-address read), LA_wrChk() (linear-address write), priv_lev_switch_CALL()
(privilege-level switch), enter_new_task(), and the sickeningly complex
read_descr() (read-descriptor).
The book also contains up-to-the-minute information on the 486 cache, hard-to-find
details on the floating-point exception/NMI interface (and how it had to
be faked on the 486!), a complete discussion of the undocumented LOADALL
instruction, and similar goodies. Unfortunately, the book came out too
soon to include the deranged eight (count 'em) new address spaces
added on the 386SL Super-Set chips.
If you've ever asked what really happens when you MOV ES, AX in protected
mode, or how Windows 3.0 enhanced mode traps IN and OUT instructions using
Virtual 8086 mode, this is the book to get. When you're finished, you may
be sorry you asked, but that's a different story.
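To give a taste of what "really happens," here is a much-simplified C sketch of the checks behind a protected-mode load of a data segment register such as ES. It is written in the spirit of the book's approach but is not Agarwal's code: the types and the function load_data_segreg() are hypothetical, the descriptor is shown with its access bits already decoded, the caller is assumed to pass whichever table (GDT or LDT) the selector's TI bit names, and details such as expand-down limits and the accessed bit are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { LOAD_OK, FAULT_GP, FAULT_NP } LoadResult;

/* A descriptor with its access-rights bits decoded (hypothetical layout). */
typedef struct {
    uint32_t base;
    uint32_t limit;
    unsigned dpl;           /* descriptor privilege level, 0..3          */
    bool     present;       /* P bit                                     */
    bool     code_or_data;  /* S bit; false means a system descriptor    */
    bool     is_code;       /* executable segment                        */
    bool     readable;      /* meaningful for code segments only         */
    bool     conforming;    /* meaningful for code segments only         */
} Descriptor;

/* A segment register together with its hidden descriptor cache. */
typedef struct {
    uint16_t selector;
    uint32_t base;
    uint32_t limit;
    bool     valid;
} SegReg;

/* Load DS, ES, FS, or GS with `selector` while running at privilege `cpl`.
 * On a fault the register is left unchanged. */
LoadResult load_data_segreg(const Descriptor *table, size_t entries,
                            uint16_t selector, unsigned cpl, SegReg *reg)
{
    unsigned index = selector >> 3;   /* bits 15..3: table index      */
    unsigned rpl   = selector & 3u;   /* bits 1..0: requested privilege */

    /* A null selector may be loaded; the fault comes later, on use. */
    if ((selector & 0xFFFCu) == 0) {
        reg->selector = selector;
        reg->base = 0;
        reg->limit = 0;
        reg->valid = false;
        return LOAD_OK;
    }

    if (index >= entries)
        return FAULT_GP;              /* selector lies outside the table */

    const Descriptor *d = &table[index];

    if (!d->code_or_data)
        return FAULT_GP;              /* system descriptors not allowed  */
    if (d->is_code && !d->readable)
        return FAULT_GP;              /* execute-only code segment       */

    /* Privilege check, skipped only for conforming code segments. */
    if (!(d->is_code && d->conforming) && (rpl > d->dpl || cpl > d->dpl))
        return FAULT_GP;

    if (!d->present)
        return FAULT_NP;              /* segment-not-present fault       */

    reg->selector = selector;
    reg->base = d->base;
    reg->limit = d->limit;
    reg->valid = true;
    return LOAD_OK;
}
```

Even this stripped-down version involves a table lookup and four separate ways to fault, all for what looks like a two-byte register move.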
Zen of Assembly Language, Volume I: Knowledge
Michael Abrash
Glenview, Illinois: Scott, Foresman, 1990
849 pages, $29.95
ISBN 0-673-38602-3
Structured Computer Organization, Third Edition
Andrew S. Tanenbaum
Englewood Cliffs, NJ: Prentice-Hall, 1990
587 pages, $59.00
ISBN 0-13-854662-2
80x86 Architecture and Programming, Volume II: Architecture Reference
Rakesh K. Agarwal
Englewood Cliffs, NJ: Prentice-Hall, 1991
627 pages, $40.00
ISBN 0-13-245432-7
Electronic Review of Computer Books
Created 5/1/96 / Last modified 6/11/96 / webmaster@ercb.com