Journal:   Dr. Dobb's Journal  Oct 1992 v17 n10 p151(7)
-----------------------------------------------------------------------------
Title:     How to shear a sheep, and other texture-mapping niceties. (the
           importance of experience in graphics programming is shown through
           the development of a real-time three-dimensional animation
           program) (Graphics Programming)(Column) (Tutorial)
Author:    Abrash, Michael.
AttFile:    Program:  GP-OCT92.ASC  Source code listing.
            Program:  XSHARP21.ZIP  X-Sharp library for 3D graphics.

Abstract:  Experience is important in graphics programming because
           performance is so important, and performance programming is
           largely based on experience.  A real-time three-dimensional
           animation program is developed; animation depends on adequate
           speed to be convincing, but too much speed can destroy the
           illusion.  The shifting of images must be tuned so that the images
           are in the 'sweet spot' of apparent motion, in which the eye
           ignores the jumping and aliasing.  The importance of experience in
           programming is shown in the refinements made over a period of time
           to the X-Sharp 3-D animation package, to which texture mapping was
           added.  Enhancements to the texture-mapping code presented in a
           previous column are examined, and the code is listed; how to get
           X-Sharp is described.
-----------------------------------------------------------------------------
Descriptors..
Topic:     Animation
           Three-Dimensional Graphics
           Graphics Software
           Programming Instruction
           Program Development Techniques
           Tutorial
           Optimization
           Software Design
           Texture.
Feature:   illustration
           chart
           program.

-----------------------------------------------------------------------------
Full Text:

I recently spent an hour or so learning how to shear a sheep.  Among other
things, I learned--in great detail--about the importance of selecting the
proper comb for your shears, heard about the man who holds the world's record
for sheep sheared in a day (more than 600, if memory serves), and discovered,
Lord help me, the many and varied ways in which the New Zealand Sheep
Shearing Board improves the approved sheep-shearing method every year.  The
fellow giving the presentation did his best, but let's face it, sheep just
aren't very interesting.  If you have children, you'll know why I was there;
if you don't, there's no use explaining.

The chap doing the shearing did say one thing that stuck with me, although it
may not sound particularly profound.  (Actually, it sounds pretty silly, but
bear with me.) He said, "You don't get really good at sheep shearing for ten
years, or 10,000 sheep." I'll buy that.  In fact, to extend that morsel of
wisdom to the greater, non-ovine-centric universe, it actually takes a good
chunk of experience before you get good at anything worthwhile--especially
graphics, for a couple of reasons.  First, performance matters a lot in
graphics, and performance programming is largely a matter of experience.  You
can't speed up PC graphics simply by looking in a book for a better
algorithm; you have to understand the code C compilers generate, assembly
language optimization, VGA hardware, and the performance implications of
various graphics-programming approaches and algorithms.  Second, computer
graphics is a matter of illusion, of convincing the eye to see what you want
it to see, and that's very much a black art based on experience.

This month, experience figures into our current subject, real-time 3-D
animation, in several ways.  Stay tuned.

Visual Quality: A Black Hole ... Er, Art

Pleasing the eye with real-time computer animation is something less than a
science, at least at the PC level, where there's no time for antialiasing and
a limited color palette; in fact, sometimes it can be more than a little
frustrating.  For example, last month I implemented texture mapping in
X-Sharp, the 3-D animation package that's an ongoing project in this
column.  My first implementation was disappointing; the texture maps shimmied
and sheared badly, like a loosely affiliated flock of pixels, each marching
to its own drummer.  Then I added a control key to speed up the rotation; what
a difference! The aliasing problems were still there, but with the faster
rotation, the pixels moved too quickly for the eye to pick up on the
aliasing; the rotating texture maps, and the rotating ball as a whole,
crossed the threshold into being accepted by the eye as a viewed object,
rather than a collection of pixels.

The obvious lesson here is that adequate speed is important to convincing
animation.  There's another, less obvious side to this lesson though.  I'd
been running the texture-mapping demo on a 20-MHz 386 with a slow VGA when I
discovered the beneficial effects of greater speed.  When, some time later, I
ran the demo on a 33-MHz 486 with a fast VGA, I found that the faster
rotation was too fast! The ball spun so rapidly that the eye couldn't blend
successive images together into continuous motion, much like watching a badly
flickering movie.

So the second lesson is that either too little or too much speed can destroy
the illusion.  Unless you're antialiasing, you need to tune the shifting of
your images so that they're in the "sweet spot" of apparent motion, in which
the eye is willing to ignore the jumping and aliasing, and blend the images
together into continuous motion.  Only experience can give you a feel for
that sweet spot.

Fixed-point Arithmetic, Redux

Last month, I added texture mapping to X-Sharp, but lacked space to explain
some of the finer points.  This month, I'll cover some of those points, and
discuss the visual and performance enhancements I've added since last month.

In the very first installment of this column, I spent a good bit of time
explaining exactly which pixels were inside a polygon and which were outside,
and how to draw those pixels accordingly.  This was important, I said,
because only with a precise, consistent way of defining inside and outside
would it be possible to draw adjacent polygons without either overlap or gaps
between them.

As a corollary, I added that only an all-integer, edge-stepping approach
would do for polygon filling.  Fixed-point arithmetic, although alluring for
speed and ease of use, would be unacceptable because round-off error would
result in imprecise pixel placement.

More than a year then passed, during which time my long-term memory
apparently suffered at least partial failure.  When I went to implement
texture mapping last month, I decided that since transformed destination
vertices can fall at fractional pixel locations, the cleanest way to do the
texture mapping would be to use fixed-point coordinates for both the source
texture and the destination screen polygon.  That way, there would be a minimum
of distortion as the polygon rotated and moved.  Theoretically, that made
sense; but there was one small problem: gaps between polygons.

Yes, folks, I had ignored the voice of experience (my own voice, at that) at
my own peril.  You can be assured I will not forget this particular lesson
again: Fixed-point arithmetic is not precise.  That's not to say that it's
impossible to use fixed-point for drawing polygons; if all adjacent edges
share common start and end vertices and common edges are always stepped in
the same direction, then all polygons should share the same fixed-point
imprecision, and edges should fit properly (although polygons may not include
exactly the right pixels).  What you absolutely cannot do is mix fixed-point
and all-integer polygon-filling approaches when drawing, as shown in Figure
1.  Consequently, I ended up using an all-integer approach in X-Sharp for
stepping through the destination polygon.  However, I kept the fixed-point
approach, which is faster and much simpler, for stepping through the source.
Why was it all right to mix approaches in this case? Precise pixel placement
only matters when drawing, because otherwise we can get gaps, which are very
visible.  When selecting a pixel to copy from the source texture, however,
the worst that happens is that we pick the source pixel next to the one we
really want, causing the mapped texture to appear to have shifted by one
pixel at the corresponding destination pixel; given all the aliasing and
shearing already going on in the texture-mapping process, a one-pixel mapping
error is insignificant.

Experience again: knowing which flaws (like small texture shifts) can
reasonably be ignored, and which (like those that produce gaps between
polygons) must be avoided at all costs.
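
To make the round-off problem concrete, here is a minimal C sketch. It is
not X-Sharp code: the function names, the 16.16 fixed-point format, and the
restriction to non-negative, left-to-right edges are all assumptions made
for illustration. It steps the same edge two ways, once with a truncated
fixed-point increment and once with a whole step plus an integer error term.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_t;        /* assumed 16.16 fixed-point format */
#define FIXED_ONE 65536

/* X reached after `steps` scan lines when the edge from x0 to x1, spanning
   `height` scan lines, is stepped with a fixed-point increment.  The
   increment is truncated, so a little error is added on every step. */
int fixed_edge_x(int x0, int x1, int height, int steps)
{
    assert(height > 0 && x0 >= 0 && x1 >= x0);
    fixed_t x = (fixed_t)x0 * FIXED_ONE;
    fixed_t dx = (fixed_t)(x1 - x0) * FIXED_ONE / height;
    for (int i = 0; i < steps; i++)
        x += dx;
    return (int)(x / FIXED_ONE);
}

/* The same edge stepped with a whole-pixel advance plus an integer error
   term; nothing is rounded, so the end point is hit exactly. */
int integer_edge_x(int x0, int x1, int height, int steps)
{
    assert(height > 0 && x0 >= 0 && x1 >= x0);
    int x = x0;
    int whole = (x1 - x0) / height;     /* whole pixels per scan line */
    int rem = (x1 - x0) % height;       /* leftover, in 1/height units */
    int err = 0;
    for (int i = 0; i < steps; i++) {
        x += whole;
        err += rem;
        if (err >= height) {            /* leftover adds up to a pixel */
            x++;
            err -= height;
        }
    }
    return x;
}
```

For an edge running from x=0 to x=1 over three scan lines, the fixed-point
version lands on 0 after three steps while the integer version lands on 1.
Two polygons that share such an edge but step it differently can disagree
about who owns a pixel, which is one way a gap could open up.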

Texture Mapping: Orientation Independence

Last month's double-DDA texture-mapping code worked adequately, but there
were two things about it that left me less than satisfied.  One flaw was
performance; that's addressed shortly.  The other flaw was the way textures
shifted noticeably as the orientations of the polygons they were mapped onto
changed.

Last month's code followed the standard polygon inside/outside rule for
determining which pixels in the source texture map were to be mapped: Pixels
that mapped exactly to the left and top destination edges were considered to
be inside, and pixels that mapped exactly to the right and bottom destination
edges were considered to be outside.  That's fine for filling polygons, but
when copying texture maps, it causes different edges of the texture map to be
omitted, depending on the destination orientation, because different edges of
the texture map correspond to the right and bottom destination edges,
depending on the current rotation.  Also, last month's code truncated to get
integer source coordinates.  This, together with the orientation problem,
meant that when a texture turned upside down, it gained one extra row and one
extra column of pixels from the next row and column of the texture map.  This
asymmetry was quite visible, and not at all the desired effect.

Listing One (page 164) is one solution to these problems.  This code, which
replaces the equivalently named function from last month, makes no attempt to
follow the standard polygon inside/outside rules when mapping the source.
Instead, it advances a half-step into the texture map before drawing the
first pixel, so pixels along all edges are half included.  Rounding rather
than truncation to texture-map coordinates is also performed.  The result is
that the texture map stays pretty much centered within the destination
polygon as the destination rotates, with a much-reduced level of
orientation-dependent asymmetry.
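
The effect of the half step can be sketched in one dimension. The C below
is an illustrative model only, not Listing One: the function names, the
16.16 format, and the convention that a span maps source coordinates
src_start..src_end onto dest_len destination pixels are assumptions. It
returns the texel index chosen for destination pixel i, first by
truncating from the edge, then by starting half a step in and rounding.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_t;        /* assumed 16.16 fixed-point format */
#define FIXED_ONE 65536

/* Texel index for destination pixel i when stepping starts exactly on the
   source edge and the coordinate is truncated to an integer. */
int texel_truncated(int src_start, int src_end, int dest_len, int i)
{
    assert(dest_len > 0 && i >= 0 && i < dest_len);
    fixed_t step = (fixed_t)(src_end - src_start) * FIXED_ONE / dest_len;
    fixed_t pos = (fixed_t)src_start * FIXED_ONE + step * i;
    assert(pos >= 0);
    return (int)(pos / FIXED_ONE);
}

/* Texel index for destination pixel i when stepping starts half a step
   into the texture and the coordinate is rounded to the nearest integer. */
int texel_half_step(int src_start, int src_end, int dest_len, int i)
{
    assert(dest_len > 0 && i >= 0 && i < dest_len);
    fixed_t step = (fixed_t)(src_end - src_start) * FIXED_ONE / dest_len;
    fixed_t pos = (fixed_t)src_start * FIXED_ONE + step / 2 + step * i;
    assert(pos >= 0);
    return (int)((pos + FIXED_ONE / 2) / FIXED_ONE);
}
```

Mapping source 0..8 onto four pixels, truncation picks texels 0, 2, 4, 6
when stepped left to right but 8, 6, 4, 2 when the same span is stepped
right to left, so the flipped case reaches into the next column. With the
half step, the two directions pick 1, 3, 5, 7 and 7, 5, 3, 1, which are
mirror images of each other.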

Mapping Textures Across Multiple Polygons

One of the truly nifty things about
double-DDA texture mapping is that it is not limited to mapping a texture
onto a single polygon.  A single texture can be mapped across any number of
adjacent polygons simply by having polygons that share vertices in 3-space
also share vertices in the texture map.  In fact, the demonstration program
DEMO1 in the X-Sharp archive maps a single texture across two polygons; this
is the blue-on-green pattern that stretches across two panels of the spinning
ball.  This capability makes it easy to produce polygon-based objects with
complex surfaces (such as banding and insignia on a spaceship).  Just map the
desired texture onto the underlying polygonal framework of an object, and let
double-DDA texture mapping do the rest.
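
As a rough picture of what "sharing vertices in the texture map" could look
like in data, here is a small C sketch. The structure names, the fields,
and the 32x16 texture are hypothetical and are not X-Sharp's actual data
structures; the point is only that two adjacent faces index the same
entries in both the 3-D vertex list and the texture-vertex list.

```c
#include <assert.h>

typedef struct { int x, y, z; } Point3;     /* vertex in 3-space */
typedef struct { int u, v; } TexPoint;      /* position in the texture map */

typedef struct {
    int vert[4];    /* indices into the shared 3-D vertex list */
    int tex[4];     /* indices into the shared texture-vertex list */
} TexturedQuad;

/* Six vertices forming two side-by-side panels; vertices 1 and 4 lie on
   the edge the panels have in common. */
static const Point3 verts[6] = {
    {-20,  10, 0}, {0,  10, 0}, {20,  10, 0},
    {-20, -10, 0}, {0, -10, 0}, {20, -10, 0}
};

/* Matching points in a hypothetical 32x16 texture: the left half lands on
   the first panel, the right half on the second. */
static const TexPoint tex_verts[6] = {
    {0,  0}, {16,  0}, {32,  0},
    {0, 16}, {16, 16}, {32, 16}
};

static const TexturedQuad quads[2] = {
    { {0, 1, 4, 3}, {0, 1, 4, 3} },
    { {1, 2, 5, 4}, {1, 2, 5, 4} }
};

/* Returns 1 if every 3-D vertex used by both quads is paired with the same
   texture vertex in both, so the texture runs unbroken across the seam. */
int shared_vertices_share_texture(const TexturedQuad *a,
                                  const TexturedQuad *b)
{
    assert(a && b);
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            if (a->vert[i] == b->vert[j] && a->tex[i] != b->tex[j])
                return 0;
    return 1;
}
```

Because both panels take the texture coordinates for the shared edge from
the same two entries, the texture mapper needs no special case for the
seam; each polygon is simply mapped on its own.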

Fast Texture Mapping

Of course, there's a problem with mapping a texture across many polygons:
Texture mapping is slow.  If you run DEMO1 and move the ball up close to the
screen, you'll see that the ball slows considerably whenever a texture swings
around into view.  To some extent that can't be helped, because each pixel of
a texture-mapped polygon has to be calculated and drawn independently.
Nonetheless, we can certainly improve the performance of texture mapping a
good deal over last month.

By and large, there are two keys to improving PC graphics performance.  The
first--no surprise--is assembly language.  The second, without which assembly
language is far less effective, is understanding exactly where the cycles go
in inner loops; in our case, that means understanding where the bottlenecks
are in Listing One.

Listing Two (page 164) is a high-performance assembly language implementation
of Listing One.  Apart from the conversion to assembly language, this
implementation improves performance by focusing on reducing inner-loop
bottlenecks.  In fact, the whole of Listing Two is nothing more than the
inner loop for texture-mapped polygon drawing; Listing Two is only the code
to draw a single scan line.  Most of the work in drawing a texture-mapped
polygon comes in scanning out individual lines, though, so this is the
appropriate place to optimize.

Within Listing Two, all the important optimization is in the loop that draws
across each destination scan line, near the end of the listing.  One
optimization is elimination of the call to the set-pixel routine used to draw
each pixel in Listing One.  Function calls are expensive operations, to be
avoided when performance matters.  Also, although mode X (the undocumented
320x240 256-color VGA mode X-Sharp runs in) doesn't lend itself well to
pixel-oriented operations like line drawing or texture mapping, the inner
loop has been set up to minimize mode X's overhead.  A rotating plane mask is
maintained in AL, with DX pointing to the Map Mask register; thus, only a
rotate and an OUT are required to select the plane to which to write, cycling
from plane 0 through plane 3 and wrapping back to 0.  Better yet, because we
know that we're simply stepping horizontally across the destination scan
line, we can use a clever optimization to both step the destination and
reduce the overhead of maintaining the mask.  Two copies of the current plane
mask are maintained, one in each nibble of AL.  (The Map Mask register pays
attention only to the lower nibble.) Then, when one copy rotates out of the
lower nibble, the other copy rotates into the lower nibble and is ready to be
used.  This approach eliminates the need to test for the mask wrapping from
plane 3 to plane 0, all the more so because a carry is generated when
wrapping occurs, and that carry can be added to DI to advance the screen
pointer.
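
The doubled-mask trick is easier to see in a small C model than in
assembly. The sketch below is a simulation under assumptions, not Listing
Two: the struct and function names are invented, the OUT to the Map Mask
register is left out, and a 320-pixel-wide mode X screen (80 bytes per row)
is assumed. The offset field plays the role of DI and the mask field the
role of AL.

```c
#include <assert.h>
#include <stdint.h>

#define BYTES_PER_ROW 80    /* 320 pixels across, 4 pixels per byte address */

typedef struct {
    uint16_t offset;    /* stands in for DI: byte offset in display memory */
    uint8_t  mask;      /* stands in for AL: plane mask in both nibbles */
} PixelCursor;

/* From-scratch calculation, roughly what a set-pixel routine has to redo
   for every single pixel. */
PixelCursor cursor_at(int x, int y)
{
    assert(x >= 0 && x < 320 && y >= 0 && y < 240);
    PixelCursor c;
    c.offset = (uint16_t)(y * BYTES_PER_ROW + (x >> 2));
    c.mask = (uint8_t)(0x11 << (x & 3));    /* same bit in each nibble */
    return c;
}

/* Incremental step one pixel to the right: an 8-bit rotate left, then add
   the bit that rotated out (the carry) to the offset.  No test for the
   wrap from plane 3 to plane 0 is needed. */
void cursor_step_right(PixelCursor *c)
{
    unsigned carry = c->mask >> 7;
    c->mask = (uint8_t)((c->mask << 1) | carry);
    c->offset = (uint16_t)(c->offset + carry);
}

/* Returns 1 if stepping `count` pixels right from (x, y) always agrees
   with the from-scratch calculation. */
int stepping_matches_from_scratch(int x, int y, int count)
{
    PixelCursor c = cursor_at(x, y);
    for (int i = 1; i <= count; i++) {
        cursor_step_right(&c);
        PixelCursor want = cursor_at(x + i, y);
        if (c.offset != want.offset || c.mask != want.mask)
            return 0;
    }
    return 1;
}
```

Starting at plane 3 the mask is 88h; one rotate turns it into 11h and
produces a carry, which is exactly the moment the screen pointer must move
on to the next byte.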

In all, the overhead of drawing each pixel is reduced from a call to the
set-pixel routine and full calculation of the screen address and plane mask to
five instructions and no branches.  This is an excellent example of
converting full, from-scratch calculations to incremental processing, whereby
only information that has changed since the last operation (the plane mask
moving one pixel, for example) is recalculated.

Incremental processing and knowing where the cycles go are both important in
the final optimization in Listing Two, speeding up the retrieval of pixels
from the texture map.  This operation looks very efficient in Listing One,
consisting of only two adds and the macro GET_IMAGE_PIXEL.  However, those
adds are fixed-point adds, so they take four instructions apiece, and the
macro hides not only conversion from fixed-point to integer, but also a
time-consuming multiplication.  Incremental approaches are excellent at
avoiding multiplication, because cumulative additions can often replace
multiplication.  That's the case with stepping through the source texture in
Listing Two; ten instructions, with a maximum of two branches, replace all
the texture calculations of Listing One.  Listing Two simply detects when the
fractional part of the source x or y coordinate turns over and advances the
source texture pointer accordingly.
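
In C, the same idea might look like the sketch below. It is a simplified
model rather than a translation of Listing Two: the function names and the
16.16 format are assumptions, only non-negative coordinates and steps are
handled, and an index into the texture stands in for the source pointer.
The first function recomputes the texel address with a multiply for every
pixel; the second keeps a running index and adjusts it only when a
fractional part turns over.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int32_t fixed_t;    /* assumed 16.16; non-negative values only */

/* Reference version: full address calculation, multiply included, for
   every destination pixel. */
void map_span_from_scratch(const uint8_t *texture, int tex_width,
                           fixed_t src_x, fixed_t src_y,
                           fixed_t step_x, fixed_t step_y,
                           uint8_t *dest, int count)
{
    for (int i = 0; i < count; i++) {
        dest[i] = texture[(src_y >> 16) * tex_width + (src_x >> 16)];
        src_x += step_x;
        src_y += step_y;
    }
}

/* Incremental version: one multiply up front, then only adds and two
   turnover tests per pixel. */
void map_span_incremental(const uint8_t *texture, int tex_width,
                          fixed_t src_x, fixed_t src_y,
                          fixed_t step_x, fixed_t step_y,
                          uint8_t *dest, int count)
{
    assert(src_x >= 0 && src_y >= 0 && step_x >= 0 && step_y >= 0);
    int offset = (src_y >> 16) * tex_width + (src_x >> 16);
    int whole_advance = (step_y >> 16) * tex_width + (step_x >> 16);
    uint32_t frac_x = (uint32_t)src_x & 0xFFFF;
    uint32_t frac_y = (uint32_t)src_y & 0xFFFF;
    uint32_t fstep_x = (uint32_t)step_x & 0xFFFF;
    uint32_t fstep_y = (uint32_t)step_y & 0xFFFF;

    for (int i = 0; i < count; i++) {
        dest[i] = texture[offset];
        offset += whole_advance;
        frac_x += fstep_x;
        if (frac_x > 0xFFFF) {      /* x fraction turned over */
            frac_x &= 0xFFFF;
            offset += 1;            /* next texel in the row */
        }
        frac_y += fstep_y;
        if (frac_y > 0xFFFF) {      /* y fraction turned over */
            frac_y &= 0xFFFF;
            offset += tex_width;    /* next row of the texture */
        }
    }
}

/* Returns 1 if both versions copy the same texels from an 8x8 texture. */
int span_methods_agree(void)
{
    uint8_t texture[64], a[8], b[8];
    for (int i = 0; i < 64; i++)
        texture[i] = (uint8_t)i;
    map_span_from_scratch(texture, 8, 16384, 32768, 49152, 19660, a, 8);
    map_span_incremental(texture, 8, 16384, 32768, 49152, 19660, b, 8);
    return memcmp(a, b, sizeof a) == 0;
}
```

Both versions track the same 16.16 position, so they pick identical texels;
the incremental one simply never recomputes what has not changed.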

As you might expect, all this optimization is pretty hard to implement, and
makes Listing Two much more complicated than Listing One.  Is it worth the
trouble? Indeed it is.  Listing Two is more than twice as fast as Listing
One, and the difference is very noticeable when large, texture-mapped areas
are animated.  Whether more than doubling performance is significant is a
matter of opinion, I suppose, but imagine that you're in William Gibson's
Neuromancer, trying to crack a corporate database.  Which texture-mapping
routine would you rather have interfacing you to Cyberspace?

Where to Get X-Sharp

The full source for X-Sharp is available in the file XSHRPn.ZIP in the DDJ
Forum on CompuServe, and as XSHARPn.ZIP in the programming/graphics
conference on M&T Online and the graphic.disp conference on Bix.  (XSHARP21
is the first version that includes fast, assembly language texture mapping.)
Alternatively, you can send me a 360K or 720K formatted diskette and an
addressed, stamped diskette mailer, care of DDJ, 411 Borel Ave., San Mateo,
CA 94402; and I'll send you the latest copy of X-Sharp.  There's no charge,
but it'd be very much appreciated if you'd slip in a dollar or so to help out
the folks at the Vermont Association for the Blind and Visually Impaired.

I'm available on a daily basis to discuss X-Sharp on M&T Online and Bix; my
user name is mabrash in both cases.  There is no truth to the rumor that I
can be reached under the alias "sheep-shearer," at least not for another
9999 sheep.