Beginner's Guide to PowerPC Assembly Language


Chapter 5

The Macintosh

This is the trickiest chapter by far. I will attempt to explain just what a Macintosh is (from a programmer's perspective), how it is programmed from a low-level point of view, and where some potential problems can occur.

The Macintosh, or more lovingly, the Mac, debuted way back in 1984 (which is about two centuries in computing time). At the time it was heralded as a complete and utter breakthrough. Today, it can be classified as a high-powered, multimedia workhorse. Some would argue it is not a multitasking system, but from a programmer's point of view it is very much multitasking. The kind of multitasking I am talking about is that of being able to run multiple processes sharing a finite amount of hardware resources. For this very reason, even a game can't just ditch the OS and get on with it. For example, you may grab a serial port directly and start sending data. But if another process running in the background asks the OS for the same serial port, the OS has no way of knowing you have grabbed it. The result is garbled data. With the advent of real-time systems such as Open Transport networking and multithreaded processes, even an assembly language programmer has to follow the rules. Granted, an assembly language programmer can do things a C programmer simply can't. Even in this day and age, certain parts of the OS can only be talked to from C with some assembly language "glue". Of course, the assembly language programmer doesn't need this glue, and so gets faster execution.

For the reasons outlined above, you need to know how to talk to the OS, and what facilities it provides. Fortunately, under PowerPC, accessing the OS is far simpler than it is for 68K, where you normally push parameters on the stack, but not always. A case in point is the Memory Manager when called from 68K code: it generally takes a pointer in a0 and any size data in d0. In PPC code, parameters are passed in registers. The first integer parameter goes in r3, the next in r4, and so on up to r10 (anything beyond the first eight integer parameters spills onto the stack, but you will rarely meet that). Floating point parameters are passed in f1 onwards. Any budding Mac programmer NEEDS to get hold of Inside Mac: Toolbox Essentials and Inside Mac: Overview. These are totally necessary reading. Believe me, if you break the rules on the Mac, even though your program may run, your users will give you a very hard time :-)

Good, that's the OS introduction out of the way. From the above you will hopefully see why an understanding of the OS is necessary. So what is the MacOS? As noted previously, the Mac has a long and somewhat uneven heritage. Many projects have been started at Apple and only half-heartedly incorporated into the OS (PowerTalk, GX printing etc.). Many other projects have been highly successful and become "core" - for example the Sound Manager. Thus we have an OS with many "core" components and many "not so core" items. This really doesn't matter to us. Which technologies you decide to use is up to you. The important thing is that you do need to use some of the OS. Why? There have been many different motherboards, processors and chips used in the Mac over the years. It's not like the Amiga with a standard chipset - you simply don't know what motherboard your pride and joy will end up running on. This may not sound like a whole load of fun, but don't worry - you need to use the OS, yes, but you don't need to do everything through it. I know you may be thinking that calling the OS is slow because of all those parameters you have to pass and set up. True. The trick is to use the OS just where you really need to, or at least to use the OS to get the machine into the state you need it in. Just bear in mind that it is a multitasking environment and all will be fine.

Now to make this easier for all of us, I'm going to show you two ways of calling the OS, and from then on, use the second. First the underlying theory.

On the PowerMac, all OS functions are exported from fragments (typically shared libraries). This means that your PowerPC program needs to be linked (at run time) with the OS exports. Then to call an OS function, your code has to set up the TOC to that of the shared library, find the address of the function, and then branch to it. Of course, it also has to set up a stack frame, and save some important registers (like our toc for example!). Here's the base, raw code to do it:

	lwz	r12,the_function(rtoc)	*load transition vector
	stw	R2,20(sp)	*save my RTOC
	lwz	r0,0(r12) 	*get callee address
	mtctr	r0		*prepare branch
	lwz	R2,4(r12)	*set callee RTOC
	bctrl			*bsr to callee

	lwz	r2,20(sp)	*get my toc back

Replace the text "the_function" with whatever OS function you are calling. The name of the function must have previously been imported via Fantasm's import directive. So, let's take BlockMove as an example - the code to call BlockMove (which moves a chunk of memory) would be:

	ifnd	BlockMove	;if BlockMove hasn't been defined
	import	BlockMove	;import it
	endif				;of import check
	lwz	r12,BlockMove(rtoc)	*load transition vector
	stw	R2,20(sp)	*save my RTOC
	lwz	r0,0(r12) 	*get callee address
	mtctr	r0		*prepare branch
	lwz	R2,4(r12)	*set callee RTOC
	bctrl			*bsr to callee

	lwz	r2,20(sp)	*get my toc back

BlockMove is defined as:

BlockMove(srcPtr, destPtr, byteCount);

So, we move our srcptr into r3 (where the data is now), we load r4 with our destptr (where we want the data copied to) and we load r5 with the number of bytes to copy. (As you can see, BlockMove would be better called "BlockCopy" but there you go...).
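If the "move versus copy" point seems fuzzy, here is a minimal sketch in C rather than assembly. It is a model only: block_move_model is a made-up stand-in, not the real BlockMove, and it simply leans on the C library's memmove. What it shows is the argument order (source first, then destination, then count) and that the source is still intact afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for BlockMove, for illustration only.
 * Argument order follows the text: source, destination, byte count. */
void block_move_model(const void *src_ptr, void *dest_ptr, size_t byte_count)
{
    assert(src_ptr != NULL && dest_ptr != NULL);
    /* memmove copies the bytes; the source block is left as it was. */
    memmove(dest_ptr, src_ptr, byte_count);
}

/* Copies fred to harry and reports (1 or 0) whether both now hold the data. */
int fred_is_intact_after_copy(void)
{
    char fred[8] = "1234567";
    char harry[8] = {0};

    block_move_model(fred, harry, sizeof fred);
    return strcmp(fred, "1234567") == 0 && strcmp(harry, "1234567") == 0;
}
```

So after the call both fred and harry hold the data, which is why "BlockCopy" would have been the more honest name.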

So, if we wanted to move 1000 bytes from fred to harry the complete code would be:

	lwz	r3,fred(rtoc)	*ptr to fred
	lwz	r4,harry(rtoc)	*ptr to harry
	li	r5,1000			*1000 bytes to copy
	ifnd	BlockMove	;if BlockMove hasn't been defined
	import	BlockMove	;import it
	endif				;of import check
	lwz	r12,BlockMove(rtoc)	*load transition vector
	stw	R2,20(sp)	*save my RTOC
	lwz	r0,0(r12) 	*get address of BlockMove
	mtctr	r0		*prepare branch
	lwz	R2,4(r12)	*set BlockMove's RTOC
	bctrl			*branch and link to BlockMove

	lwz	r2,20(sp)	*get my toc back
	*carry on with your code
fred:	ds.b	1000
harry:	ds.b	1000

As you can see, that's a whole lot of code to type every time we want to call something in the OS! It's also very error prone. So we don't do it that way. You can if you want, but it's not recommended - I just wanted to show you the mechanics.
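For those who think more easily in C, the mechanics can be modelled roughly as below. This is only a sketch under assumptions: TransitionVector, current_toc and callee_model are names I have made up for the illustration, the TOC values are arbitrary, and on a modern host the struct will not have the 0 and 4 byte offsets that the real thing has on a 32 bit PowerPC. The point is just the order of events: save our TOC, install the callee's, branch and link, restore ours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a transition vector: callee address first, callee TOC second. */
typedef struct {
    int32_t (*code)(int32_t arg);  /* stands in for the word at 0(r12) */
    uint32_t toc;                  /* stands in for the word at 4(r12) */
} TransitionVector;

uint32_t current_toc = 0x1000;     /* models r2; the value is arbitrary */
uint32_t toc_seen_by_callee = 0;   /* lets us check what the callee saw */

/* Hypothetical "OS function": records the TOC it was entered with. */
int32_t callee_model(int32_t arg)
{
    toc_seen_by_callee = current_toc;
    return arg + 1;
}

/* Mirrors the raw calling sequence one step at a time. */
int32_t call_through_vector(const TransitionVector *tv, int32_t arg)
{
    assert(tv != NULL && tv->code != NULL);

    uint32_t saved_toc = current_toc;  /* stw r2,20(sp): save my RTOC  */
    current_toc = tv->toc;             /* lwz r2,4(r12): callee's RTOC */
    int32_t result = tv->code(arg);    /* mtctr r0 + bctrl             */
    current_toc = saved_toc;           /* lwz r2,20(sp): my toc back   */
    return result;
}
```

Forget the final restore and everything that later loads through rtoc reads from somebody else's table, which is exactly the kind of slip the hand-typed version invites.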

We roll all the common code into a macro, and use that instead:-

	lwz	r3,fred(rtoc)	*ptr to fred
	lwz	r4,harry(rtoc)	*ptr to harry
	li	r5,1000		*1000 bytes to copy
	Xcall	BlockMove

Easier? Course it is. The only requirement is that you must include the right .def file in your project, either as a globinc, or by including the file in your source file with an includeh directive. If you look in the Anvil folder "Anvil low level defs" you'll find lots of these files. How do you find the right file? Easy: use Anvil's "Search all files in folder" feature. Open any file from the low level defs folder with Anvil. Now bring up the find dialog and type "blockmove" in the find field, next click the "Search all files in folder" check box and then click OK. Within a few seconds, Anvil will have found the file that contains BlockMove; in this case, "memory.def". Def files are Fantasm's low level equivalent of C's header files.

Fantasm 5.1 expands on this concept by removing the need to include the low level def file, through the use of zillions of macros - one for each OS function. In the above case, BlockMove would be called as:

	lwz	r3,fred(rtoc)	*ptr to fred
	lwz	r4,harry(rtoc)	*ptr to harry
	li	r5,1000		*1000 bytes to copy
	OSBlockMove	r3,r4,r5

It doesn't look much different, right? True, but what you get is error checking: the macro knows how many parameters to expect, and so can fail if you pass too few or too many. It also means you don't have to get the parameters into the right registers yourself. For example,

OSBlockMove r3,r7,r8

would be perfectly acceptable. The macros provide the same facilities under 68K assembly language too, so you may want to take a look at this when 5.1 is published.

Note about BlockMove - if you only have a small amount of data to copy, do it yourself. By the time you've called BlockMove, you could've done it already. For large amounts of data, BlockMove is hard to beat.

Now we know how to call the OS. Good. You are now asking yourself "I wonder how many functions are in the OS?" The answer is "Thousands" and I can pre-empt your next question "So how do I find out which one I need?" with the following answer.
Download Inside Mac for starters. They are all free from Apple (go to our links page where you'll find links to them). At the back of each chapter is a list of functions described in that chapter along with the parameters. Alternatively, all good booksellers will sell you the volumes in paper form or you can get them all on CD from Apple. If you are really serious about it you may want to buy either Think Reference from MacTech (which is the one we use) or Macintosh Toolbox Assistant from Apple ($89 last time I checked). You just type in the first couple of letters of the function you are interested in and it'll find the function for you. You'll get a description along with things to watch for and of course the required parameters and any return data.

So, what kind of OS functions are you likely to need? Well, first off, any application HAS to do certain initialization, otherwise you'll call an OS function and it'll crash. Luckily, to save you time, a library function called init_mac is supplied. Call this at startup and you have no problems.

bl init_mac

Make sure you have added the Application library to your project, else it won't link (it'll fail with a "Where is this init_mac??" error, or words to that effect).

After this, you are in a position to get on with your program. Nearly every type of program must have a window to draw into. You can't just draw "anywhere". You must draw into a graphics port. More often than not these days, you want to use color, so it needs to be a color graphics port - a CGrafPort. Windows can be created in one of two ways - either manually, or you can get one from a resource. Either way is fine - you end up with the same result. From a resource is the quickest and least code intensive of the two. If you open ResEdit, create a new file and then create a "WIND" resource, you'll find you can edit your window graphically. Be sure the "initially visible" checkbox is set and leave the ID at 128 for the purposes of this example. If you now save the file and then add it to your Anvil project, the window you have defined will be copied to your application when built.

The function we need to use to get the window is called GetNewCWindow and looks like this:

pCWindow = GetNewCWindow(windowID, wStorage, behind);
windowID is the resource ID of the window; 128 in our case.
wStorage tells the Window Manager where to put the window record. If you set it to null (0), the storage will be allocated for you.
behind is the windowptr (the same kind of pointer this call returns) of the window you want this window to appear behind. If you set it to -1, the window is placed at the front.

Now we can load the window with the following code:

	li	r3,128	*the window ID in the resource fork
	li	r4,0	*Let the OS allocate storage for it.
	li	r5,-1	*We want it to the front
	Xcall	GetNewCWindow

After this code, r3 will contain either a valid pointer to the window, or zero if the call failed for some reason. You should store the pointer to the window somewhere safe, such as in a global variable. Note that this call gets the window, but does not set the current port; we will do that next, once we have stored the pointer and checked it for failure:

	cmpwi	r3,0
	stw	r3,my_window_ptr(`bss)
	beq	error

Note the optimization of storing the window pointer between the compare and the branch. A stw does not alter the condition register, so the beq still tests the result of the cmpwi. This means that we can store a zero as the pointer (if there was an error), but heck, if we have an error anyway it doesn't really matter. The pointer to the window actually points to the graphics port (CGrafPort) for this window, so if we pass that to SetPort we then have a valid drawing environment:

	lwz	r3,my_window_ptr(`bss)
	Xcall	SetPort

And to prove it, we can now print something:

	lwz	r3,my_text(rtoc)	;a Pascal type string, defined with the pstring directive
	Xcall	DrawString

Finally, we can wait for the mouse button to be pressed before quitting:

wait:	Xcall	Button
	cmpwi	r3,0
	beq		wait

And to quit, we call the macro "tidy_up" (assuming "start_up" was called at the start).

error:	tidy_up

If you were to run this program, you'd get a window, and it would quit when you pressed the mouse button, but you wouldn't see any text. Why? Well, we haven't told the OS where we want to draw, so it draws at 0,0 (x,y coords), the top left corner of the window. The pen position marks the baseline of the text and the characters extend upwards from it, so our text is drawn out of view above the top of the window. We need to move the drawing coordinates (the pen position) to a suitable location before drawing:

	li	r3,4	*X coordinate
	li	r4,20	*Y coordinate
	Xcall	MoveTo

Now, we will see the text.

Looping

One of the things you need to do in any programming language is change the flow of instructions depending on the result of an operation. Consider the change to our program below:

**let's set up our x and y coordinate variables
	li	r23,4	*x
	li	r24,20	*y
**let's set up our loop counter
	li	r25,10
**let's draw some text
draw_loop:
	mr	r3,r23	*x
	mr	r4,r24	*y
	Xcall	MoveTo	*move the pen
	lwz	r3,my_text(rtoc)
	Xcall	DrawString	*Draw the string
	subic.	r25,r25,1	*decrement loop counter 
	addi	r23,r23,4	*increment x coordinate
	addi	r24,r24,4	*increment y coordinate
	bne	draw_loop		*if our loop counter isn't zero, goto draw_loop

Can you visualize the result of this? Whilst thinking about it, note that I have placed two instructions between the subtract with record instruction (subic.) and the conditional branch (bne). The adds in between these two instructions are effectively "free". The processor would take five cycles to determine whether to take the branch or not, so by opting for a subic. rather than using the counter register (which may at first seem the obvious choice) to control the loop, I have incremented my printing coords "for free". This works because addi does not touch the condition register, so the bne still sees the result of the subic. Any integer instruction with a dot after it will set condition register field 0 (cr0). Any conditional branch without a cr field assumes cr0, so we could write bne cr0,draw_loop and it would mean the same thing.

Note that a straight subi can't be used to set the cr0 field of the condition code register. One must use a subic.
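If you would rather check your mental picture than trust it, the loop can be modelled in a few lines of C. This is a sketch only: PenPos and draw_loop_positions are invented names, and instead of calling MoveTo and DrawString it just records where the pen would have been put on each pass. The register names in the comments map each line back to the assembly.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x; int y; } PenPos;

/* Records the pen position used on each pass of the loop.
 * Returns the number of passes made. out must have room for at least 10. */
int draw_loop_positions(PenPos *out, int max_out)
{
    int x = 4;         /* r23 */
    int y = 20;        /* r24 */
    int counter = 10;  /* r25 */
    int n = 0;

    assert(out != NULL && max_out >= 10);
    do {
        out[n].x = x;  /* MoveTo, then DrawString, would happen here */
        out[n].y = y;
        n++;
        counter -= 1;  /* subic. r25,r25,1 : the result sets cr0     */
        x += 4;        /* addi r23,r23,4   : leaves cr0 alone        */
        y += 4;        /* addi r24,r24,4   : leaves cr0 alone        */
    } while (counter != 0);  /* bne draw_loop */
    return n;
}
```

Like the assembly, this tests the counter at the bottom, so the body always runs at least once.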

Anyway, back to the program - it prints ten strings, each offset in x and y slightly. OK, cool. Now how about we want it continually printing these strings until we press the mouse button. What we need to do is: after printing the strings, erase them, check the mouse, and if not pressed reprint them, erase them, check the mouse, etc.

How do we erase them? Well the obvious choice is the OS function EraseRect. It takes a rectangle defined as four 16 bit values of top, left, bottom, right. This definition is pretty much a Mac standard as far as rectangles go, so you may as well get it into your head now. Top, left, bottom, right. You just need to learn it.
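To help it stick, here is the layout written out as a C struct. RectModel is my own name for the illustration (the real definition lives in Apple's headers), and I am assuming signed 16 bit fields. The two helpers are hypothetical, just there to show that width and height fall out of the four edges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Four 16 bit values, always in this order: top, left, bottom, right. */
typedef struct {
    int16_t top;
    int16_t left;
    int16_t bottom;
    int16_t right;
} RectModel;

/* Width is the distance between the left and right edges. */
int rect_width(const RectModel *r)
{
    assert(r != NULL);
    return r->right - r->left;
}

/* Height is the distance between the top and bottom edges. */
int rect_height(const RectModel *r)
{
    assert(r != NULL);
    return r->bottom - r->top;
}
```

Note that the vertical coordinate comes first, which is the opposite way round from the x,y order that MoveTo takes.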

Now we could say, OK, I know the size of that window, so I can define the rectangle to erase as a constant set of data with a dc.h directive. Wrong :-) Yes, you can do this, but what happens if the user changes the size of your window? If you remember back a few paragraphs, I said that a windowptr is really just a CGrafPort pointer. If we look at the CGrafPort structure, we can find something useful. Actually, it may be of benefit here to examine just how we find out about a given structure. In this case we are talking graphics. This means that books like Inside Mac: Memory probably won't help much. Inside Mac: Toolbox Essentials might, as it covers all the really important things, and graphics ports are pretty important. So, I load up my Inside Mac CD and open TB Essentials. (You can buy them on CD, or if you have a "burner" you can make your own IM CD after downloading them from Apple. Tip: just write them as a session; that way you can add them to your CD as they are published or updated.)

I go to the Window Manager section and find the definition of a CWindowRecord. Sure enough, the first entry is the graphics port, but this document doesn't expand on the graphics port structure. It does however tell me that the structure is defined in Inside Macintosh: Imaging. So, now I load that one up, go to the table of contents and immediately find the definition I'm looking for (actually I just type "cgrafp" into Think Reference :-)).

CGrafPort =
RECORD
	device:       Integer;        {device ID for font selection}
	portPixMap:   PixMapHandle;   {handle to PixMap record}
	portVersion:  Integer;        {highest 2 bits always set}
	grafVars:     Handle;         {handle to a GrafVars record}
	chExtra:      Integer;        {added width for nonspace characters}
	pnLocHFrac:   Integer;        {pen fraction}
	portRect:     Rect;           {port rectangle}
	visRgn:       RgnHandle;      {visible region}
	clipRgn:      RgnHandle;      {clipping region}
	bkPixPat:     PixPatHandle;   {background pattern}
	rgbFgColor:   RGBColor;       {requested foreground color}
	rgbBkColor:   RGBColor;       {requested background color}
	pnLoc:        Point;          {pen location}
	pnSize:       Point;          {pen size}
	pnMode:       Integer;        {pattern mode}
	pnPixPat:     PixPatHandle;   {pen pattern}
	fillPixPat:   PixPatHandle;   {fill pattern}
	pnVis:        Integer;        {pen visibility}
	txFont:       Integer;        {font number for text}
	txFace:       Style;          {text's font style}
	txMode:       Integer;        {source mode for text}
	txSize:       Integer;        {font size for text}
	spExtra:      Fixed;          {added width for space characters}
	fgColor:      LongInt;        {actual foreground color}
	bkColor:      LongInt;        {actual background color}
	colrBit:      Integer;        {plane being drawn}
	patStretch:   Integer;        {used internally}
	picSave:      Handle;         {picture being saved, used internally}
	rgnSave:      Handle;         {region being saved, used internally}
	polySave:     Handle;         {polygon being saved, used internally}
	grafProcs:    CQDProcsPtr;    {low-level drawing routines}
END;

So, we can see that at offset 16 in the CGrafPort (2 + 4 + 2 + 4 + 2 + 2 bytes for the six fields before it) is the rectangle (portRect) that defines the current size of the CGrafPort. Thus we should pass the address of this rectangle to the EraseRect function. Now, it doesn't matter what size the window is, we will always erase the whole visible part of it (assuming you haven't messed up the current clipping rectangle, but we haven't come to that yet).
That may sound complicated, but the code is trivial:

	lwz	r3,my_window_ptr(`bss)	*our windowptr
	addi	r3,r3,16	*point to the portrect
	Xcall	EraseRect

See! This will erase the window. Here's a useful MacsBug tip in case you aren't aware of the power of MacsBug. It contains many templates for popular Mac data structures. So, if you have r3 pointing at a CGrafPort, in MacsBug you can type:

dm r3,cgrafport

And MacsBug will display the grafport with all its field names and contents. "dm" is the MacsBug command to display memory. "sm" is the command to set memory.
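If you don't have MacsBug to hand, the offset of 16 used for portRect can also be double-checked by adding up the sizes of the fields that come before it in the record. The sketch below does that in C. The sizes are my assumption of the classic layout (an Integer is 2 bytes, a handle is 4), and the table and function names are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* The fields that come before portRect, with their assumed sizes in bytes. */
static const struct {
    const char *name;
    size_t size;
} fields_before_port_rect[] = {
    { "device",      2 },  /* Integer      */
    { "portPixMap",  4 },  /* PixMapHandle */
    { "portVersion", 2 },  /* Integer      */
    { "grafVars",    4 },  /* Handle       */
    { "chExtra",     2 },  /* Integer      */
    { "pnLocHFrac",  2 },  /* Integer      */
};

/* Sums the sizes above to give the byte offset of portRect. */
size_t port_rect_offset(void)
{
    size_t count = sizeof fields_before_port_rect
                 / sizeof fields_before_port_rect[0];
    size_t offset = 0;

    for (size_t i = 0; i < count; i++) {
        offset += fields_before_port_rect[i].size;
    }
    return offset;
}
```

The same trick works for any field in the record: keep adding sizes until you reach the one you want.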

Right, so now we can erase the window. All we need to do now is change the mouse button wait loop to branch back to the start of the drawing code rather than just waiting. Here is the program in total as it looks at this stage:

	includeh	windows.def
	includeh	quickdraw.def
	includeh	quickdrawtext.def
	includeh	events.def

bss:	reg	r30
chapter5:	entry
	start_up
	bl	init_mac
	li	r3,128	*the window ID in the resource fork
	li	r4,0	*Let the OS allocate storage for it.
	li	r5,-1	*We want it to the front
	Xcall	GetNewCWindow
	cmpwi	r3,0	
	stw	r3,my_window_ptr(`bss)	
	beq	error
	Xcall	SetPort
mouse_loop:
**let's set up our x and y coordinate variables
	li	r23,4	*x
	li	r24,20	*y
**let's set up our loop counter
	li	r25,10
**let's draw some text
draw_loop:
	mr	r3,r23
	mr	r4,r24
	Xcall	MoveTo
	lwz	r3,my_text(rtoc)
	Xcall	DrawString
	subic.	r25,r25,1
	addi	r23,r23,4
	addi	r24,r24,4
	bne	draw_loop
**Erase the window
	lwz	r3,my_window_ptr(`bss)
	addi	r3,r3,16	*point to the portrect
	Xcall	EraseRect	
	Xcall	Button		*Check the mouse button
	cmpwi	r3,0	
	beq		mouse_loop
error:	tidy_up	*end of program
*****
**Data
my_text:	pstring	"123"	*The text we print
	align
*****
**Linkage
	extern	init_mac	*This is a static library function

Hopefully, by now you are getting into the swing of calling the OS. I'm not going to dwell on it too much longer. You just have to learn the various functions available. Obviously in the rest of this series we will be seeing more OS functions, but I won't be detailing them too heavily. The Mac OS is big with a capital B. You just have to learn as you go.

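NOTE: We used EraseRect to clear the window. We could have used PaintRect to achieve a similar result, but PaintRect fills with the current pen pattern and foreground color, whereas EraseRect fills with the background, which is what we want here. EraseRect is also quicker than PaintRect.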

Colors

The Mac, when drawing, uses two colors: the foreground color and the background color. EraseRect works with the background color. There are two OS calls to set these colors - RGBForeColor and RGBBackColor. Both take a pointer to three 16 bit unsigned values, which specify the red, green and blue intensities of the color, in that order. So 0xffff,0xffff,0xffff is white, whilst 0x0,0x0,0x0 is black. 0xffff,0,0 is max red; 0,0xffff,0 is max green.
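Written out as a C struct, the color record looks something like the sketch below. RGBColorModel and the constant names are mine, chosen for the illustration rather than taken from Apple's headers, and is_grey is a hypothetical helper that just shows how the three components relate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Three unsigned 16 bit intensities, in the order red, green, blue. */
typedef struct {
    uint16_t red;
    uint16_t green;
    uint16_t blue;
} RGBColorModel;

const RGBColorModel kWhite    = { 0xFFFF, 0xFFFF, 0xFFFF };
const RGBColorModel kBlack    = { 0x0000, 0x0000, 0x0000 };
const RGBColorModel kMaxRed   = { 0xFFFF, 0x0000, 0x0000 };
const RGBColorModel kMaxGreen = { 0x0000, 0xFFFF, 0x0000 };

/* Returns 1 when all three components match (black, white or a grey). */
int is_grey(const RGBColorModel *c)
{
    assert(c != NULL);
    return c->red == c->green && c->green == c->blue;
}
```

The whole record is 6 bytes, with red at offset 0, green at 2 and blue at 4.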

Armed with this information we can start doing some funky color stuff. By altering the foreground and background colors we can alter the color of the text and the background of the window. Suppose we just incremented the value of the red component of the background color on each loop (and the other two components were set to zero). What would be the visual effect? Let's try it and see. We need to define some new data - the background color - we'll call it my_bg_color. Then we need some code after we erase the rectangle to change the background (bg) color:

**Change the bg color
	lwz	r3,my_bg_color(rtoc)
	
	lhz	r4,(r3)	*get red
	addi	r4,r4,220	*add 220 to it
	sth	r4,(r3)	*store red
	Xcall	RGBBackColor	r3

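Of course, the red slowly fades up to maximum red and then switches back to (almost) black, because sth only stores the low 16 bits of r4, so the value wraps around once it passes 0xffff. But what would be the result of the following code?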

**Change the bg color
	lwz	r3,my_bg_color(rtoc)
	
	lhz	r4,(r3)	*get red
	addi	r4,r4,220	*add to it
	sth	r4,(r3)	*store red
	
	lhz	r4,2(r3)	*get green
	subi	r4,r4,280
	sth	r4,2(r3)	*Store green
	
	lhz	r4,4(r3)	*get blue
	subi	r4,r4,180
	sth	r4,4(r3)	*Store blue
	Xcall	RGBBackColor	r3

Believe me, you don't want to work through too many steps. The end result is almost a random color fade and switch effect - but of course it isn't random. Best tried on a monitor capable of millions of colors.
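If you do want to see a few steps without doing the sums by hand, a small C model will do them for you. This is a sketch under assumptions: BgColor, step_bg_color and bg_color_after are invented names, and I have assumed my_bg_color starts out as black. Each channel is an unsigned 16 bit value, so adding or subtracting simply wraps around, just as the sth does in the assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t red;
    uint16_t green;
    uint16_t blue;
} BgColor;

/* One pass of the color changing code. The casts keep only the low
 * 16 bits, which is what storing a halfword with sth does. */
void step_bg_color(BgColor *c)
{
    assert(c != NULL);
    c->red   = (uint16_t)(c->red   + 220u);  /* addi r4,r4,220 */
    c->green = (uint16_t)(c->green - 280u);  /* subi r4,r4,280 */
    c->blue  = (uint16_t)(c->blue  - 180u);  /* subi r4,r4,180 */
}

/* Background color after a given number of passes, starting from black. */
BgColor bg_color_after(unsigned steps)
{
    BgColor c = { 0, 0, 0 };

    while (steps-- > 0) {
        step_bg_color(&c);
    }
    return c;
}
```

After the very first pass green and blue have already wrapped round to nearly full intensity, while red takes 298 passes to wrap. The three channels go round at different rates, which is where the apparently random look comes from.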

And that's it for this chapter. The above code snippet is pretty unoptimized as far as PPC code goes - can you optimize it? Answer next chapter (tip: what if the color component we are loading to modify isn't in the level 1 cache and hence isn't immediately available?). Another question: can you change the color changing code so that abrupt color changes do not occur, so that it's all nice and smooth? Final question: how could we change the background color without erasing (in 256 indexed color mode only)? Tip: check in the examples folder of your Fantasm 5 CD.

Don't worry if you find these questions baffling - we'll cover them all next time along with more application goodies such as menus and events with the emphasis being on fun.

Till then,

Code On!

The project for this chapter can be downloaded from here (Fantasm 5 project) - 8k.

Reproduction in whole or part prohibited without permission.
