tpasm Manual (October 24, 2002)

tpasm began as a replacement for mpasm (an assembler for Microchip's PIC
processors). Then it got out of control. Now it is a cross assembler for
a variety of common microprocessors (including the PICs). It was written
because mpasm only runs on one platform -- one which I have difficulty
using.

tpasm's feature set and syntax are a conglomeration of features from
many other assemblers. It bears enough similarity to mpasm that porting
mpasm source to it should not be very painful.

tpasm Features:
----- --------

 - true multi-pass assembly (will take as many passes as needed)
 - multiple segments
 - sophisticated expressions
 - macros, repeats, conditionals
 - arbitrary length labels
 - local labels
 - supporting new processors is reasonably straightforward
 - can switch between processors during assembly
 - tpasm is free software

Command Line:
------- ----

Usage: tpasm [opts] sourceFile [opts]

Assembly Options:
   -o type fileName   Output 'type' data to fileName
                      Multiple -o options are allowed, all will be processed
                      No output is generated by default
   -I dir             Append dir to the list of directories searched by include
   -d label value     Define a label to the given value
   -P processor       Choose initial processor to assemble for
   -n passes          Set maximum number of passes (default = 32)
   -l listName        Create listing to listName
   -w                 Do not report warnings
   -p                 Print diagnostic messages to stderr
Information Options:
   -show_procs        Dump the supported processor list
   -show_types        Dump the output file types list

Options are case sensitive and are detailed below:

-o
    Selects the type and file name for tpasm's output. Multiple -o
    options can be specified to generate multiple output files.
    The current list of available output types is:

      intel     Intel format segment dump
      srec      Motorola S-Record format segment dump (16 bit)
      srec32    Motorola S-Record format segment dump (32 bit)
      sunplus   Sunplus format symbol listing
      text      Textual symbol file listing

    As you can see from the above, -o is used to specify output hex
    files as well as symbol files. Any combination is allowed.
    NOTE: if no -o options are specified, tpasm will assemble, but
    produce no output.

-I
    Adds a directory that the INCLUDE pseudo-op will search when
    looking for include files which are given in angle brackets.

      INCLUDE <includeFile>     // search for it
      INCLUDE "includeFile"     // don't search for it

-d
    Defines a label which is used in the assembly just as a label
    created by the EQU directive.

-P
    Tells tpasm which processor it should use by default. This is not
    required, as it can also be specified in the source with the
    PROCESSOR directive.

-n
    Is used under unusual circumstances to alter the number of passes
    that tpasm will make before deciding that the source code cannot be
    resolved. Normally this need not be specified.

-l
    Selects the name of the file where tpasm will generate a listing.
    If no listing file is specified, tpasm will not generate a listing.

-w
    Will cause tpasm to omit reporting warnings. Warnings will also be
    suppressed in the listing output if one is generated.

-p
    Is used mainly to see what the internals of the assembler are doing
    as a debugging aid. If your source code is not resolving in a
    reasonable number of passes, this can be used to help pinpoint the
    labels which are not resolving.

-show_procs
    Causes tpasm to dump the list of supported processors. If this
    option is present on the command line, tpasm will not attempt to
    assemble anything.

-show_types
    Causes tpasm to dump the list of supported output file types. If
    this option is present on the command line, tpasm will not attempt
    to assemble anything.

Assembly Syntax:
-------- ------

Lines of assembly source files all have a similar syntax.
Namely:

  [label] [opcode [operands]] [;comments]

The comment field can be introduced either with a ';' or a '//'.
Comments may appear on lines by themselves, and blank lines are allowed.

Labels are case sensitive, and may be of arbitrary length. Label
definitions must begin in column 0.

Opcodes (including macros) are always matched in a case insensitive way,
and must be preceded by white space.

Operands must be separated from opcodes by white space.

Comments do not need to be preceded by white space.

Some pseudo-ops do not allow labels.

Labels:
------

tpasm labels consist of one of the characters from the set:
  A-Z, a-z, or _
followed by any number of characters from the set:
  A-Z, a-z, _, or 0-9

Local labels are preceded by a '.', or '@'. Local labels are in scope
between non-local labels, or the edges of macros that contain them.

Pseudo-Ops:
------ ---

tpasm supports a standard set of pseudo-ops (ones which are available no
matter which processor is selected), and a supplementary set -- based on
the chosen processor. Pseudo-ops are case insensitive.

The standard set consists of:

INCLUDE "fileNameString"
INCLUDE <fileNameString>
  Include another source file into the assembly. The inclusion of
  another source file will not cause local labels to go out of scope,
  so it is possible to reference a local label across include files.
  NOTE: placing the file name in <>'s will cause the assembler to
  search the include path for it.
  NOTE: includes are nestable to 256 levels deep. This restriction is
  meant to keep self-including source trees from causing the
  assembler's stack to overflow.

SEG     "segmentNameString"
SEGU    "segmentNameString"
  The SEG pseudo-op creates or sets the current segment. If
  segmentNameString was previously created, the assembler just sets the
  segment back to it. If it was not previously created, the assembler
  creates it and sets the segment to it. Newly created segments have a
  default origin of 0.
  A segment is nothing more than an addressed area which the assembler
  knows about. When code is generated by the assembler, it is placed
  into the current segment. Each segment has an 'origin' which tells
  the current place data is to be written into it. When the assembler
  switches between segments, it keeps track of the origin of each.
  Segments may be 'initialized' (the SEG pseudo-op) in which case data
  written to the segment is placed into the output file, or
  'uninitialized' (the SEGU pseudo-op) in which case the data written
  to the segment is discarded. Uninitialized segments can be useful for
  (among other things) assigning RAM locations to labels.
  The segmentNameString is case sensitive.
  At the moment (I may change this later) the assembler automatically
  creates a segment called "code" and sets its origin to 0 upon
  execution.
  An arbitrary number of segments may be created.

ORG     exp
  This sets the origin for the currently selected segment. The origin
  is the location where generated code will be placed within the
  segment.

RORG    exp
  This sets the origin for code generation of the current segment.
  This origin tells the assembler that code which is being generated
  should be made to _appear_ to start at the location given by exp.
  This is useful if you need to generate code which is meant to be
  copied before it is executed. The generated code is still placed into
  the segment at locations based on the last ORG statement.

ALIGN   exp
  Align the ORG-based origin to the current or next address which
  satisfies: (address mod exp == 0).
  NOTE: if exp is unresolved, or evaluates to 0, or 1, this does
  nothing.

label   EQU     exp
  Assign label to the result of exp. Once a label has been EQU'd, it
  may not be reassigned to a different value.

label   SET     exp
  Similar to EQU, except that you may use SET to redefine label to
  other values at any time.

label   UNSET
  Remove label from the assembler's name space as if it had never been
  SET.

label   ALIAS   replacement
  Assign replacement to label.
  Operates similarly to EQU except replacement is not an expression --
  it is a simple text substitution. Labels defined with ALIAS are only
  expanded when they appear as operands to instructions (they are not
  expanded in the arguments to pseudo-ops). Unlike EQU, an ALIAS must
  be declared before it is used. Since the expansion takes place in the
  text domain before any meaning is applied to the operands by the
  instruction, be careful to use unique enough labels so that tpasm
  does not replace unexpected strings within your code.
  Label must only consist of characters which are valid for labels. If
  replacement is not quoted, it too must contain only characters which
  are valid for labels. However, if replacement is placed in double
  quotes, it may contain any character.
  ALIAS replacements are not recursive.

label   UNALIAS
  Remove a label that was defined with ALIAS.

label   MACRO   param1,param2,...
  Begin the definition of an assembler macro. Label becomes the name of
  the macro being defined. When the macro is invoked, the string param1
  is replaced with the first macro argument, param2 with the second,
  and so on. There can be an arbitrary number of parameters.

  Example:

    test    MACRO   var1,var2
            ADD     var1,var2
            ENDM

  The opcode "test"

            test    A,$14

  then expands as:

            ADD     A,$14

  NOTE: when a macro is expanded, a new local label scope is created so
  that macros can contain local labels which do not interfere with the
  code surrounding the invocation.
  Also, macros are recursive. It is possible to invoke, or even define
  another macro from within a macro.

ENDM
  Marks the end of a macro definition.

IF      exp
  Used for conditional assembly. Code between the IF and the first
  matching ELSE or ENDIF will be interpreted by the assembler if exp is
  resolved, and non-zero.
  NOTE: if exp is not resolved, neither the code following the IF nor
  any code given by an associated ELSE will be interpreted.

IFDEF   label
  Used for conditional assembly.
  Code between the IFDEF and ELSE or ENDIF will be interpreted by the
  assembler if 'label' is defined.
  NOTE: as soon as a label is defined on any pass of the assembly (even
  if it is not resolved), subsequent IFDEF invocations for that label
  (in this, and subsequent passes) will see it as defined.

IFNDEF  label
  Used for conditional assembly. Code between the IFNDEF and ELSE or
  ENDIF will be interpreted by the assembler if 'label' is NOT defined.
  NOTE: as soon as a label is defined on any pass of the assembly (even
  if it is not resolved), subsequent IFNDEF invocations for that label
  (in this, and subsequent passes) will see it as defined.

ELSE
  Used for conditional assembly. Code between the ELSE and ENDIF will
  be interpreted by the assembler if the preceding IF, IFDEF, or IFNDEF
  evaluated to FALSE.

ENDIF
  Marks the end of a conditional assembly block.
  NOTE: all conditionals are nestable to any level.

SWITCH  exp
  Used for conditional assembly. exp is evaluated, and then compared to
  each of the following CASEs. If exp is resolved, and matches a given
  CASE statement, code between the CASE and either a BREAK, or ENDS is
  interpreted by the assembler.

CASE    exp
  Used between the SWITCH and ENDS pseudo-ops, exp is evaluated and
  compared with the result of the evaluation of the expression given in
  the SWITCH. If both expressions are resolved, and evaluate to the
  same result, the code after the CASE up until a BREAK or ENDS is
  interpreted by the assembler.

BREAK
  Ends any CASE that preceded it.
  NOTE: it is possible to have multiple CASEs before a break:

            SWITCH  value
            CASE    1
            CASE    2
            MESSAGE "value was 1 or 2"
            BREAK
            CASE    3
            MESSAGE "value was 3"
            CASE    4
            MESSAGE "value was 3 or 4"
            ENDS

ENDS
  Marks the end of a SWITCH.

REPEAT  exp
  Used to duplicate code. REPEAT evaluates exp and interprets the code
  between the REPEAT and ENDR pseudo-ops exp number of times (including
  0). If exp is not resolved, the code after REPEAT is ignored.
  An example (lifted from the dasm manual):

    Y       SET     0
            REPEAT  10
    X       SET     0
            REPEAT  10
            DB      X,Y
    X       SET     X+1
            ENDR
    Y       SET     Y+1
            ENDR

  generates an output table:
    0,0 1,0 2,0 ... 9,0 0,1 1,1 2,1 ... 9,1, etc...

ENDR
  Marks the end of a REPEAT.

ERROR   "message"
  When interpreted by the assembler, causes "message" to be printed out
  as if an error had occurred in the assembly.

WARNING "message"
  When interpreted by the assembler, causes "message" to be printed out
  as if a warning had occurred in the assembly.

MESSG   "message"
MESSAGE "message"
  Causes the assembler to print "message" to the console. This message
  is only printed during the final pass of assembly.

LIST
  If listing has been enabled by the -l command line option, this will
  enable listing output. (See NOLIST below).

NOLIST
  If listing has been enabled by the -l command line option, this will
  disable listing output. This is useful for the contents of include
  files which you do not want to appear in the listing output.

EXPAND
  Allows macros and repeats to be expanded into the listing output.

NOEXPAND
  Prohibits macros and repeats from generating listing output during
  expansion.

PROCESSOR processorString
  Tells the assembler to start assembling for the given processor. This
  does not change the current segment or the origin. There can be an
  arbitrary number of PROCESSOR pseudo-ops in the source being
  assembled.

END
  Tells the assembler to stop assembling the current file, or macro. If
  END is seen in an include file, it stops assembly of that file only
  (assembly resumes after the point that the file was included). If END
  is seen during macro expansion, it stops expansion of the macro.

Expressions:
-----------

tpasm evaluates all expressions using 32 bit quantities. The following
operators are available:

Unary operators:

  .strlen.  Returns the length of the string which follows it.
            For example: .strlen."this" evaluates to 4.
  high      Returns the high byte of the low word of the expression
            following it.
  low       Returns the low byte of the low word of the expression
            following it.
  msw       Returns the high word of the expression following it.
  lsw       Returns the low word of the expression following it.
  !         Logical not.
  ~         Bitwise not.
  -         Negation.

Unary operators always have the highest precedence.

Binary operators in descending order of precedence:
(blank lines separate precedence groups)

  *         Multiplication.
  /         Division.
  %         Modulus.

  +         Addition.
  -         Subtraction.

  <<        Left shift.
  >>        Right shift.

  <         Less than.
  >         Greater than.
  <=        Less than or equal to.
  >=        Greater than or equal to.

  ==        Tests for equality.
  !=        Tests for inequality.

  &         Bitwise and.

  ^         Bitwise xor.

  |         Bitwise or.

  &&        Logical and.

  ||        Logical or.

Grouping:

  (         Begin group.
  )         End group.

Constants:
---------

  ####      interpreted in base 10
  0b###     binary
  0o###     octal
  0d###     decimal
  0x###     hex
  A'cccc'   ascii (1 to 4 characters)
  B'###'    binary
  O'###'    octal
  D'###'    decimal
  H'###'    hex
  ###b      binary
  ###B      binary
  ###o      octal
  ###O      octal
  ###d      decimal
  ###D      decimal
  ###h      hex (first digit must be 0-9)
  ###H      hex (first digit must be 0-9)
  %###      binary
  .###      decimal
  $###      hex
  'cccc'    ascii (1 to 4 characters)
  "string"  string (not zero terminated)

Symbols:
-------

  $         Current program counter (as of the beginning of the
            instruction) including the effects of RORG. This is the
            symbol normally used to get the current PC.
  $$        Current program counter (as of the beginning of the
            instruction) not including the effects of RORG. This is
            used if you need to know the actual PC from within an
            RORG'd segment. It can also be used to cancel the effects
            of RORG by issuing "RORG $$". This works, since the
            relative origin will now become the absolute origin. If
            you don't use RORG, you'll never need $$.

Additional Pseudo-ops for various processors:
---------- ---------- --- ------- ----------

PIC Family
--- ------

DB      val[,val,val...,val]        Define byte
  (This description lifted from the MPASM manual.)
  Reserve program memory words with packed 8-bit values.
  Multiple expressions continue to fill bytes consecutively until the
  end of expressions. Should there be an odd number of expressions, the
  last byte will be 0.

DW      val[,val,val...,val]        Define word
  Reserve program memory words with 16-bit values.

DATA
  Synonym for DW.

DT      val[,val,val...,val]        Define table
  Generates a series of RETLW instructions, one for each value. Each
  value must be 8 bits in size. Each character in a string is stored in
  its own RETLW instruction.

DS      val                         Define space
  Move the PC forward by val.

__CONFIG val
  Set processor configuration bits.

__IDLOCS val
  Sets the four ID locations to the hexadecimal digits of val.

__MAXRAM val
  Define the absolute maximum valid RAM address.

__BADRAM val[-val][,val[-val]...]
  Set locations which are not valid RAM.

BANKSEL
BANKISEL
PAGESEL
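
Illustration of DB and DT (not part of tpasm):
------------ -- -- --- --

The DB and DT descriptions above can be modelled with a short Python
sketch. This is only a model of the documented behavior, not tpasm's
implementation. The names pack_db, expand_dt and high_byte_first are
hypothetical, and the placement of the first byte in the high half of
each word is an assumption: the manual states only that bytes are packed
and that an odd trailing byte is padded with 0. The DT model lists the
RETLW instructions symbolically and does not encode opcodes.

```python
def _check_byte(value):
    """Reject anything that does not fit in 8 bits."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value %r does not fit in 8 bits" % (value,))
    return value


def pack_db(values, high_byte_first=True):
    """Model of DB: pack 8-bit values two per 16-bit word.

    An odd number of values is padded with a final 0 byte, as the manual
    states. The byte order inside a word is an ASSUMPTION controlled by
    high_byte_first (hypothetical parameter).
    """
    data = [_check_byte(v) for v in values]
    if len(data) % 2:
        data.append(0)  # odd count: the last byte will be 0
    words = []
    for i in range(0, len(data), 2):
        first, second = data[i], data[i + 1]
        if high_byte_first:
            words.append((first << 8) | second)
        else:
            words.append((second << 8) | first)
    return words


def expand_dt(values):
    """Model of DT: one RETLW per value, one per character of a string."""
    table = []
    for v in values:
        if isinstance(v, str):
            # Each character in a string gets its own RETLW.
            table.extend(("RETLW", _check_byte(ord(c))) for c in v)
        else:
            table.append(("RETLW", _check_byte(v)))
    return table


if __name__ == "__main__":
    print([hex(w) for w in pack_db([0x12, 0x34, 0x56])])
    print(expand_dt(["AB", 7]))
```

Under the stated byte-order assumption, three DB values occupy two
words, the second padded with a 0 byte, and a DT line holding a two
character string plus one number yields three RETLW entries.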