Journal: Dr. Dobb's Journal June 1992 v17 n6 p139(7) ----------------------------------------------------------------------------- Title: Fast antialiasing. (Column) Author: Abrash, Michael. AttFile: Program: GP-JUN92.ASC Source code listing. Abstract: Advanced computer graphics techniques require powerful computers and dedicated systems to handle the output, but utility microcomputers based on Intel's 80386 and 80486 microprocessors become burdened when trying to compute such graphics features as anti aliasing. Anti aliasing involves smoothing lines and edges to remove the jagged appearance. Anti aliasing makes computer images more attractive and aids in making technical drawings more accurate. An algorithm developed by Xiaolin Wu describing line anti aliasing is described. The algorithm brackets either side of a line's pixel with pixels of differing values of color. The sum intensity of the bordering pixels equals the intensity of the line's pixel. ----------------------------------------------------------------------------- Descriptors.. Topic: Tutorial Computer Graphics Programming Program Development Techniques Rendering Dithering Algorithms. Feature: illustration chart program. ----------------------------------------------------------------------------- Full Text: The thought first popped into my head as I unenthusiastically picked through the salad bar at a local "family" restaurant, trying to decide whether the meatballs, the fried clams, or the lasagna was likely to shorten my life the least. I decided on the chicken in mystery sauce. The thought recurred when my daughter asked, "Dad, is that fried chicken?" "I don't think so," I said. "I think it's stewed chicken." "It looks like fried chicken." "Maybe it's fried, stewed chicken," my wife volunteered hopefully. I took a bite. It was, indeed, fried, stewed chicken. I can now, unhesitatingly and without reservation, recommend that you avoid fried, stewed chicken at all costs. The thought I had was as follows: This is not good food. Not a profound thought, but it raises an interesting question: Why was I eating in this restaurant? The answer, to borrow a phrase from E.F. Schumacher, is appropriate technology. For a family on a budget, with a small child, tired of starting at each other over the kitchen table, this was a perfect place to eat. It was cheap, it had greasy food and ice cream, no one cared if children dropped things or talked loudly or walked around, and most important of all, it wasn't home. So what if the food was lousy? Good food was a luxury, a bonus; everything on the above list was necessary. A family restaurant was the appropriate dining-out technology, given the parameters within which we had to work. When I read through SIGGRAPH proceedings and other state-of-the-art computer-graphics material, all too often I feel like I'm dining at a four-star restaurant with two-year-old triplets and an empty wallet. We're talking incredibly inappropriate technology for PC graphics here. Sure, I say to myself as I read about an antialiasing technique, that sounds wonderful--if I had 24-bpp color, and dedicated hardware to do the processing, and all day to wait to generate one image. Yes, I think, that is a good way to do hidden surface removal--in a system with hardware z-buffering. Most of the stuff in Computer Graphics is riveting, but, alas, pretty much useless on PCs. When an 80x86 has to do all the work, speed becomes the overriding parameter, especially for real-time graphics. Literature that's applicable to fast PC graphics is hard enough to find, but what we'd really like is above-average image quality combined with terrific speed, and there's almost no literature of that sort around. There is some, though, and you folks are right on top of it. For example, alert reader Michael Chaplin, of San Diego, wrote to suggest that I might enjoy the line-antialiasing algorithm presented in Xiaolin Wu's article, "An Efficient Antialiasing Technique," in the July 1991 issue of Computer Graphics. Michael was dead-on right. This is a great algorithm, combining excellent antialiased line quality with speed that's close to that of non-antialiased Bresenham's line drawing. This is the sort of algorithm that makes you want to go out and write a wireframe animation program, just so you can see how good those smooth lines look in motion. Wu antialiasing is a wonderful example of what can be accomplished on inexpensive, mass-market hardware with the proper programming perspective. In sort, it's a splendid example of appropriate technology for PCs. Wu Antialiasing To recap briefly, antialiasing is the process of smoothing lines and edges so that they appear less jagged. Antialiasing is partly an aesthetic issue, because it makes images more attractive. It's partly an accuracy issue, because it makes it possible to position and draw images with effectively more precision than the resolution of the display. Finally, it's partly a flat-out necessity, to avoid the horrible, crawling, jagged edges of temporal aliasing when performing animation. The basic premise of Wu antialiasing is almost ridiculously simple: As the algorithm steps one pixed unit at a time along the major (longer) axis of a line, it draws the two pixels bracketing the line along the minor acis at each point. Each of the two bracketing pixels is drawn with a weighted fraction of the full intensity of the drawing color, with the weighting for each pixel equal to one minus the pixel's distance along the minor axis from the ideal line. Figure 1 illustrates this concept. The intensities of the two pixels that bracket the line are selected so that they always sum to exactly 1; that is, to the intensity of one fully illuminated pixel of the drawing color. The presence of aggregate full-pixel intensity means that at each step, the line has the same brightness it would have if a single pixel were drawn at precisely the correct location. Moreover, thanks to the distribution of the intensity weighting, that brightness is centered at the ideal line. Not coincidentally, a line drawn with pixel pairs of aggregate single-pixel intensity, centered on the ideal line, is perceived by the eye not as a jagged collection of pixel pairs, but as a smooth line centered on the ideal line. Thus, by weighting the bracketing pixels properly at each step, we can readily produce what looks like a smooth line at precisely the right location, rather than the jagged pattern of line segments that non-antialiased line-drawing algorithms such as Bresenham's trace out. You might expect that the implementation of Wu antialiasing would fall into two distinct areas: tracing out the line (that is, finding the appropriate pixel pairs to draw) and calculating the appropriate weightings for each pixel pair. Not so, however. The weighting calculations involved only a few shifts, XORs, and adds; for all practical purposes, tracing and weighting are rolled into one step--and a very fast step it is. How fast is it? On a 33-MHz 486 with a fast VGA, a good but not maxed-out assembler implementation of Wu antialiasing draws a more than respectable 5000 150-pixel-long vectors per second. That's especially impressive considering that about 1,500,000 actual pixels are drawn per second, meaning that Wu antialiasing is drawing at over 50 percent of the maximum memory bandwidth--and hence more than half the faster theoretically possible drawing speed--of an AT-bus VGA. In short, Wu antialiasing is about as fast an antialiased line approach as you could ever hope to find for the VGA. Tracing and Intensity in One Horizontal, vertical, and diagonal lines do not require Wu antialiasing because they pass through the center of every pixel they meet; such lines can be drawn with fast, special-case code. For all other cases, Wu lines are traced out one step at a time along the major axis by means of a simple, fixed-point algorithm. The move along the minor axis with respect to a one-pixel move along the major axis (the line slope for lines with slopes less than 1, 1/slope for lines with slopes greater than 1) is calculated with a single integer divide. This value, called the "error adjust," is stored as a fixed-point fraction, in 0.16 format (that is, all bits are fractional, and the decimal point is just to the left of bit 15). An error accumulator, also in 0.16 format, is initialized to 0. Then the first pixel is drawn; no weighting is needed, because the line intersects its endpoints exactly. Now the error adjust is added to the error accumulator. The error accumulator indicates how far between pixels the line has progressed along the minor axis at any given step; when the error accumulator turns over, it's time to advance one pixel along the minor axis. At each step along the line, the major-axis coordinate advances by one pixel. The two bracketing pixels to draw are simply the two pixels nearest the line along the minor axis. For instance, if X is the current major-axis coordinate and Y is the current minor-axis coordinate, then the two pixels to be drawn are (X,Y) and (X,Y + 1). In short, the derivation of the pixels at which to draw involves nothing more complicated than advancing one pixel along the major axis, adding the error adjust to the error accumulator, and advancing one pixel along the minor axis when the error accumulator turns over. So far, nothing special; but now we come to the true wonder of Wu antialiasing. We know which pair of pixels to draw at each step along the line, but we also need to generate the two proper intensities, which must be inversely proportional to distance from the ideal line and sum to 1, and that's a potentially time-consuming operation. Let's assume, however, that the number of possible intensity levels to be used for weighting is the value NumLevels = [2.sup.n] for some integer n, with the minimum weighting (0 percent intensity) being the value [2.sup.n-1], and the maximum weighting (100 percent intensity) being the value 0. Given that, lo and behold, the most significant n bits of the error accumulator select the proper intensity value for one element of the pixel pair, as shown in Figure 2. Better yet, [2.sup.n-1] minus the intensity of the first pixel selects the intensity of the other pixel in the pair, because the intensities of the two pixels must sum to 1; as it happens, this result can be obtained simply by flipping the n least-significant bits of the first pixel's value. All this works because what the error accumulator accumulates is precisely the ideal line's current distance between the two bracketing pixels. The intensity calculations take longer to describe than they do to perform. All that's involved is a shift of the error accumulator to right-justify the desired intensity weighting bits, and then an XOR to flip the least-significant n bits of the first pixel's value in order to generate the second pixel's value. Listing One (page 154) illustrates just how efficient Wu antialiasing is; the intensity calculations take only three statements; and the entire Wu line-drawing loop is only nine statements long. Of course, a single C statement can hide a great deal of complexity, but Listing Six (page 156), the inner loop of an assembler implementation, shows that only 15 instructions are required per step along the major axis--and the number of instructions could be reduced to ten by special-casing and loop unrolling. Make no mistake about it, Wu antialiasing is fast. Sample Wu Antialiasing The true test of any antialiasing technique is how good it looks, so lt's have a look at Wu antialiasing in action. Listing One is a C implementation of Wu antialiasing. Listing Two (page 155) is a sample program that draws a variety of Wu-antialiasing lines, followed by non-antialiased lines, for comparison. Listing Three (page 156) contains DrawPixel() and SetMode() functions for mode 13h, the VGA's 320x200 256-color mode. Finally, Listing Four (page 156) is a simple, non-antialiased line-drawing routine. Link these four listings together and run the resulting program to see both Wu-antialiased and non-antialiased lines. Listing One isn't particularly fast, because it calls DrawPixel() for each pixel. On the other hand, DrawPixel() makes it easy to try out Wu antialiasing in a variety of modes; just adapt the code in Listing Three for the 256-color mode you want to support. For example, Listing Five (page 156) shows code to draw Wu-antialiased lines in 640x480 256-color mode on a Super-VGA built around the Tseng Labs ET4000 chip with at least 512K of display memory installed. It's well worth checking out Wu antialiasing at 640x480. Although antialiased lines look much smoother than normal lines at 320 x 200 resolution, they're far from perfect, because the pixels are so big that the eye can't blend them properly. At 640x480, however, Wu-antialiased lines look fabulous; from a couple of feet away, they look as straight and smooth as if they were drawn with a ruler. Listing One requires that the DAC palette be set up so that a NumLevel-long block of palette entries contains linearly decreasing intensities of the drawing color. The size of the block is programmable, but must be a power of two. The more intensity levels, the better. Wu says that 32 intensities is enough; on my system, eight and even four levels looked pretty good. I found that gamma correction, which gives linearly spaced intensity steps, improved antialiasing quality significantly. Fortunately, we can program the palette with gamma-corrected values, so our drawing code doesnt's have to do eny extra work. Listing One isn't very fast, so I implemented Wu antialiasing in assembler, hard-coded for mode 13h. The inner loop of the assmbler code is shown in Listing Six; the full assembler routine will be available as part of the code archive from this issue of DDJ. (See "Availability" on page 3.) High-speed graphics code and fast VGAs go together like peanut butter and jelly, which is to say very well indeed; the assembler implementation ran more than twice as fast on my 486 after I ran the SETBUS utility from last month to put the VGA into 16-bit mode. Enough said! Notes on Wu Antialiasing Wu antialiasing can be applied to any curve for which it's possible to calculate at each step the positions and intensities of two bracketing pixels, although the implementation will generally be nowhere near as efficient as it is for lines. However, Wu's anticle in Computer Graphics does describe an efficient algorithm for drawing antialiased circles. Wu also describes a technique for antialiasing solids, such as filled circles anf polygons. Wu's approach biases the edges of filled objects outward. Although this is no good for adjacent polygons of the sort used in rendering, it's certainly possible to design a more accurate polygon-antialiasing approach around Wu's basic weighting technique. The results would not be quite so good as more sophisticated antialiasing techniques, but they woudl be much faster. In general, in fact, the results obtained by Wu antialiasing are only so-so, by theoretical measures. Wu antialiasing amounts to a simple box filter placed over a step approximation of a line, and that process introduces a good deal of deviation from the ideal. On the other hand, Wu notes that even a 10 percent error in intensity doesn't lead to noticeable loss of image quality, and for Wu-antialiased lines up to 1K pixels in length, the error is under 10 percent. If it looks good, it is good--and it looks good. With a 16-bit error accumulator, fixed-point inaccuracy becomes a problem for Wu-antialiased lines longer than 1K. For such lines, you should switch to using 32-bit error values, which would let you handle lines of any practical length. In the listings, I have chosen to truncate, rather than round, the error-adjust value. This increases the intensity error of the line but guarantees that fixed-point inaccuracy won't cause the minor axis to advance past the endpoint. Over-running the endpoint would result in the drawing of pixels outside the line's bounding box, and potentially even in an attempt to access pixels off the edge of the bitmap. Finally, I should mention tha, as published, Wu's algorithm draws lines symmetrically, from both ends at once. I haven't done this for a number of reasons, not least of which is that symmetric drawing is an inefficient way to draw lines that span banks on banked Super-VGAs. Banking aside, however, symmetric drawing is potentially faster, because it eliminates half of all calculations; in so doing, it cuts intensity error in half, as well. Matrix Orthogonalization Made Easy Peter Brooks of MicroMind passed along the following nifty idea. Fixed-point rotation matrices tend to drift from the proper values over the course of repeated concatenations, due to fixed-point error, with the colums gradually ceasing to be mutually perpendicular (orthogonal). It is essential that rotation matrices remain orthogonal, however, in order to perform nondistorting rotations. One way to deal with this is to periodically recalculate one of the column as the cross-product of the other two, which works because the cross-product of two vectors is an orthogonal vector. Peter's insight was: Why not recalculate one colum as the cross-product of the other two every time you recalculate a rotation matrix? In other words, do the matrix multiplication to generate two columns of the result matrix, then just calculate the third column as the cross-product of the other two. The row or column calculated as the cross-product should be rotated regularly. Not only does this guarantee an orthogonal matrix at all times, but it also turns out to be faster, because it takes fewer operations to calculate a cross-product than to recalculate a matrix column. (All of the above applies equally well if you substitute "row" for "column.") This sounds like a great idea to me. My only question is whether one should alternate between rows and columns, in order to distribute error more evenly. Any thoughts, readers? Next Time We didn't quite make it back to X-Sharp this month, although the topics were certainly related. Next month, for sure. In the meantime, keep those excellent suggestions coming. And stay the heck away from fried, stewed chicken.