TSC: The Programmer's View - Part 2
Assembly Language Programming

 

Writing in machine language is rarely done today because the process is tedious and error-prone. Imagine trying to write a program that automates balancing a checkbook using only machine language instructions. It would take weeks to write the thousands of instructions needed, and some of them would inevitably be incorrect. Finding an error within a million zeroes and ones is nearly impossible.

To circumvent this problem, early computer programmers created a language that is more user-friendly than machine language but still provides access to a computer's individual instructions. They called it assembly language. Assembly language lets the programmer use an instruction's English mnemonic instead of its opcode and allows binary immediate values to be replaced with their decimal equivalents. It also allows symbolic names, like LOOP or DONE, to be used in place of memory addresses and branch offsets, thus freeing the programmer from having to remember where every piece of data is in memory and from having to calculate branch and jump offsets. A program called an assembler translates the assembly instructions into binary machine code.
 

Some TSC Assembly Code

The program below shows the assembly code for a modified version of the power-of-two multiplication program we wrote in Part 1. We will use this code as an example throughout this section.

***************************************
* Multiply a value by a power of two. *
* The result is placed at ANSWER.     *
***************************************

* Code section

        .ORG 0
BEGIN   LAD $2, DATA
        LWD $0, $2, 0   ;The power value
        LWD $1, $2, 1   ;The number to be multiplied
LOOP    SHL $1, $1
        ADI $0, $0, -1
        BGZ $0, LOOP
        LAD $2, ANSWER
        SWD $1, $2, 0
        HLT

* Data section

        .ORG 0x0100
DATA    .BSC 4, 3       ;Data for 16 x 3
ANSWER  .BSS 1

        .END BEGIN
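
For readers who find a high-level language easier to follow, the loop in the sample program can be mirrored in Python. This is only a sketch: it assumes that SHL shifts a register left by one bit, that BGZ branches while the register is greater than zero, and that registers hold 16-bit values. The function name is hypothetical and is not part of TSC.

```python
def multiply_by_power_of_two(power, value):
    """Mirror the TSC loop: shift `value` left once per pass while counting `power` down."""
    r0, r1 = power, value            # $0 and $1 after the two LWD instructions
    while True:
        r1 = (r1 << 1) & 0xFFFF      # SHL $1, $1 (16-bit register is an assumption)
        r0 -= 1                      # ADI $0, $0, -1
        if not r0 > 0:               # BGZ $0, LOOP falls through once $0 is not > 0
            break
    return r1                        # the value SWD stores at ANSWER


print(multiply_by_power_of_two(4, 3))  # prints 48, i.e. 3 x 16
```

Because the test sits at the bottom of the loop, the shift runs at least once under these assumptions, so a power value of zero would still double the number.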

        

Assembly Instruction Format

An assembly instruction follows this four part format:

          LABEL     MNEMONIC     OPERANDS     COMMENTS

Each part of an assembly instruction must be separated by a space or a tab or some combination of both. The label and comment parts are optional.
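
The four-part format can be illustrated with a small Python sketch that splits one line into its parts. The text above does not say how the assembler tells a label from a mnemonic, so this sketch assumes that a label, when present, starts in the first column (as in the sample code). The function name is hypothetical, and whole-line asterisk comments are assumed to have been filtered out beforehand.

```python
def split_line(line):
    """Split one assembly line into (label, mnemonic, operands, comment)."""
    code, _, comment = line.partition(";")
    # Assumption: a label is present only when the line starts in column one.
    has_label = bool(code) and not code[0].isspace()
    fields = code.split(None, 2 if has_label else 1)
    label = fields.pop(0) if has_label and fields else None
    mnemonic = fields.pop(0) if fields else None
    operands = [op.strip() for op in fields[0].split(",")] if fields else []
    return label, mnemonic, operands, comment.strip() or None


print(split_line("LOOP    SHL $1, $1"))
print(split_line("        LWD $0, $2, 0   ;The power value"))
```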

The Label

A label is a symbolic name denoting where the instruction is stored in memory. It can consist of letters and digits but must start with a letter.

A label is used in one of three cases:

  1. The label location is the target of a branch or jump instruction. The LOOP label in the sample code is the target of BGZ. When a label is used in the offset field of a branch instruction, the assembler calculates the offset between the instruction following the branch and the label.
  2. The label location contains values that are loaded or stored. The location represented by the DATA label holds the data to be operated on and the location at ANSWER is where the result is stored.
  3. The label serves as a reference for the programmer. The BEGIN label in the sample code reminds the programmer that program execution should start at that instruction. No instruction in the program actually references it.
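
The offset calculation described in the first case can be sketched in Python. The addresses used in the example are assumptions: each instruction is taken to occupy one memory location, and the LAD at BEGIN is taken to occupy two locations because it is a pseudo-instruction that expands into two real instructions (see the Pseudo-instructions section). The function name is hypothetical.

```python
def branch_offset(branch_address, target_address):
    """Offset measured from the instruction that follows the branch."""
    return target_address - (branch_address + 1)


# Assumed layout of the sample program:
#   LAD at 0-1, LWD at 2 and 3, LOOP (SHL) at 4, ADI at 5, BGZ at 6
print(branch_offset(6, 4))  # prints -3
```

A negative offset means the branch goes backward, which is what a loop needs.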

The Mnemonic and Operands

Every instruction requires a mnemonic, and every instruction except HLT requires at least one operand. The mnemonic is an English-language replacement for an instruction's opcode. It is used because a word is usually easier to recognize than an opcode value.

The operands tell the instruction the location of the data to be operated on. Depending on the instruction, one to three operands are required. Operands are always separated from each other by commas.

There are three types of operands: a register location, a constant, and an expression that reduces to a constant. If the operand is a register location, the register number is preceded by a dollar sign (i.e., $0, $1, $2, $3). Constant operands are used in the immediate and target fields. They can be written in either decimal or hex; hex numbers are preceded by '0x'.

A constant operand can initially be a two-operand mathematical expression. The expression operands can be decimal or hex numbers or symbolic names. The available expression operators are +, -, *, and /. The assembler reduces the expression to a constant value before inserting it into the instruction. Some examples of expression usage are:

ADI $0, $0, A * 5
LAD $2, ARRAY + 0x0009
JMP L - 1

Expressions can be used wherever a constant is used, including in all the assembler directives. The one exception is the branch instructions: expressions would complicate offset resolution, so they are not allowed there.
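
The reduction of an expression to a constant can be sketched in Python. This is an illustration, not the TSC Assembler's actual code: the function names and the symbol values are hypothetical, and treating / as integer division is an assumption.

```python
import operator
import re

OPERATORS = {"+": operator.add, "-": operator.sub,
             "*": operator.mul, "/": operator.floordiv}  # integer division assumed


def operand_value(token, symbols):
    """Look up a symbolic name, or convert a decimal or 0x-prefixed hex number."""
    if token in symbols:
        return symbols[token]
    return int(token, 16) if token.lower().startswith("0x") else int(token)


def reduce_expression(text, symbols):
    """Reduce a one- or two-operand expression to a constant."""
    match = re.fullmatch(r"\s*(\w+)\s*(?:([-+*/])\s*(\w+)\s*)?", text)
    if match is None:
        raise ValueError(f"not a one- or two-operand expression: {text!r}")
    left, op, right = match.groups()
    value = operand_value(left, symbols)
    if op is not None:
        value = OPERATORS[op](value, operand_value(right, symbols))
    return value


symbols = {"A": 3, "ARRAY": 0x0100}          # hypothetical symbol values
print(reduce_expression("A * 5", symbols))   # prints 15
print(hex(reduce_expression("ARRAY + 0x0009", symbols)))  # prints 0x109
```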

Comments

It is good programming practice to place comments at strategic places in your assembly code indicating why the code is there. Comments in TSC assembly are preceded by a semicolon or an asterisk. The asterisk can only be used if it's the first character on a line. Commenting your assembly programs does not affect the execution of the program in any way. When the assembler encounters a semicolon, it ignores the rest of the line. Likewise, if the assembler sees an asterisk at the beginning of a line, it discards the entire line. See the sample code for an example of using comments.
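
The two comment rules can be sketched in Python as a function that returns only the part of a line the assembler would keep. The function name is hypothetical.

```python
def strip_comment(line):
    """Return the part of a line that remains after comments are discarded."""
    if line.startswith("*"):          # asterisk in the first column: discard the line
        return ""
    return line.split(";", 1)[0].rstrip()   # semicolon: ignore the rest of the line


print(repr(strip_comment("* Code section")))
print(repr(strip_comment("        LWD $0, $2, 0   ;The power value")))
```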
 

Pseudo-instructions

As assembly programmers became proficient, they noticed that some sequences of instructions were used over and over again throughout their code. Instead of repeatedly typing in these sequences, they created pseudo-instructions. A pseudo-instruction is a one-line command that replaces a group of real instructions. Pseudo-instructions often look like real instructions, but they are not supported by the hardware. When the assembler encounters a pseudo-instruction in a program file, it simply replaces it with the sequence of instructions it represents. Pseudo-instructions not only make a program a little easier to write, they also leave the code less cluttered, which makes the program easier to read.

In TSC, it is often desirable to load a 16-bit constant into a register. If the constant is not found in memory, it has to be created. Since there is no TSC immediate instruction that can load a value that large, the assembly language programmer has to use LHI followed by an ORI. For example, the only way to put the number 0x1289 into $3 is to execute

  LHI $3, 0x12
  ORI $3, $3, 0x89

The LHI-ORI instruction sequence is repeated so often that a Load Address (LAD) pseudo-instruction has been created to represent it. (Don't let the word "address" in the name confuse you. Although many 16-bit values used in TSC programs are the literal addresses of data in memory, the LAD pseudo-instruction can load any 16-bit value. It doesn't matter whether the value is being "used" as an address or not.) Using LAD, the sequence above is written as

  LAD $3, 0x1289

or, if the label LOOP is equal to 0x1289, as

  LAD $3, LOOP

When the TSC Assembler sees a LAD pseudo-instruction, it takes it out of the code list, splits the constant operand into two parts, creates the LHI-ORI instruction pair, and places the pair where the LAD used to be. The sample code above demonstrates the use of LAD in a real program.
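
The expansion can be sketched in Python. Following the 0x1289 example above, the constant is split into its upper and lower 8 bits; the function name is hypothetical and the sketch only produces the text of the two instructions, not their machine code.

```python
def expand_lad(register, value):
    """Replace LAD register, value with the equivalent LHI-ORI pair."""
    high = (value >> 8) & 0xFF    # upper half of the 16-bit constant, for LHI
    low = value & 0xFF            # lower half of the 16-bit constant, for ORI
    return [f"LHI {register}, 0x{high:02x}",
            f"ORI {register}, {register}, 0x{low:02x}"]


print(expand_lad("$3", 0x1289))  # prints ['LHI $3, 0x12', 'ORI $3, $3, 0x89']
```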

Currently, LAD is the only pseudo-instruction supported by the TSC Assembler.
 

Assembler Directives

Directives are commands within an assembly code file that "direct" the assembler: they control the assembly process and reserve memory locations for program data. Directives do not produce executable instructions when assembled. In TSC, all directives are preceded by a period to distinguish them from instructions.

The TSC directives are:
 
       

ORG

Origin

Assembler Format

  .ORG address

Description

  The ORG directive provides the assembler with the memory address where the next instruction is to be placed. ORG is often on the first line of the program but is not required there (the TSC Assembler defaults to address zero if ORG is not found). ORG is the means by which separate program segments are created.

The address operand can be a literal number (in hex or decimal), a symbolic name, or a mathematical expression. If the operand is a symbolic name (label), it must have been previously defined. ORG cannot have a label because it does not translate into an instruction or reserved memory location.


Examples

 
.ORG 0x56ff    ;Next instruction placed at 0x56ff
.ORG START     ;Next instruction placed at the value
               ;of the label START
.ORG START+3   ;Next instruction placed at START+3

See the sample code for an example of how ORG is used to create segments.
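
One way to picture ORG is as a command that resets the assembler's running address counter. The Python sketch below walks the sample program and records the address of each label. It rests on several assumptions: each instruction occupies one memory location, LAD occupies two (it expands to LHI and ORI), and operands of directives are already reduced to numbers. The function name and the tuple layout are hypothetical.

```python
def assign_addresses(lines):
    """Return {label: address} for a list of (label, mnemonic, operands) tuples."""
    symbols, counter = {}, 0            # the counter defaults to address zero
    for label, mnemonic, operands in lines:
        if mnemonic == ".ORG":
            counter = operands[0]       # whatever follows is placed at this address
            continue
        if mnemonic == ".END":
            break                       # anything after END is discarded
        if label is not None:
            symbols[label] = counter
        if mnemonic == "LAD":
            counter += 2                # assumed: expands to LHI + ORI
        elif mnemonic == ".BSS":
            counter += operands[0]      # one location per reserved word
        elif mnemonic == ".BSC":
            counter += len(operands)    # one location per listed value
        else:
            counter += 1                # assumed: one location per instruction
    return symbols


program = [
    (None, ".ORG", [0]),
    ("BEGIN", "LAD", ["$2", "DATA"]),
    (None, "LWD", ["$0", "$2", 0]),
    (None, "LWD", ["$1", "$2", 1]),
    ("LOOP", "SHL", ["$1", "$1"]),
    (None, "ADI", ["$0", "$0", -1]),
    (None, "BGZ", ["$0", "LOOP"]),
    (None, "LAD", ["$2", "ANSWER"]),
    (None, "SWD", ["$1", "$2", 0]),
    (None, "HLT", []),
    (None, ".ORG", [0x0100]),
    ("DATA", ".BSC", [4, 3]),
    ("ANSWER", ".BSS", [1]),
    (None, ".END", ["BEGIN"]),
]
symbols = assign_addresses(program)
print({name: hex(address) for name, address in symbols.items()})
```

Under these assumptions the code segment starts at address zero and the data segment at 0x0100, so DATA is 0x0100 and ANSWER is 0x0102.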




       

END

Physical End of Code

Assembler Format

  .END operand

Description

  The END directive marks the physical end of the program. All code after END is discarded. All TSC programs must have an END directive.

END cannot have a label because it does not translate into an instruction or reserved memory location. The operand is optional and is ignored by the assembler.


Examples

 
.END         ;Pretty simple
.END BEGIN   ;The operand indicates where the
             ;program began




       

BSS

Block Storage - Space

Assembler Format

  .BSS operand

Description

  The BSS directive reserves blocks of memory for data storage. The operand indicates the number of memory locations to be reserved. The TSC Assembler initializes all locations in the block to 0x0000.

A label is optional but BSS is generally useless without one. The operand can be a literal number (in hex or decimal), a symbolic name, or a mathematical expression although expressions and symbolic names usually are not useful in the context of BSS. If the operand is a symbolic name (label), it must have been previously defined. The operand must be positive.


Examples

 
A    .BSS 10      ;The A block has 10 words
B    .BSS 0x0011  ;The B block has 17 words

See the sample code for an example of how the locations reserved by BSS are accessed by other instructions.
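
The effect of BSS on memory can be sketched in Python, with memory modeled as a dictionary from address to word. The function name and the dictionary model are hypothetical; the address 0x0102 for ANSWER is an assumption based on DATA occupying the two locations starting at 0x0100.

```python
def reserve_bss(memory, address, count):
    """Reserve `count` locations starting at `address`, each initialized to 0x0000."""
    if count <= 0:
        raise ValueError("the BSS operand must be positive")
    for offset in range(count):
        memory[address + offset] = 0x0000
    return address + count          # address of the next free location


memory = {}
next_free = reserve_bss(memory, 0x0102, 1)   # ANSWER  .BSS 1
print(memory, hex(next_free))
```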




       

BSC

Block Storage - Constants

Assembler Format

  .BSC operand list

Description

  The BSC directive reserves blocks of memory for data storage and initializes the locations in the block to the values in the comma-delimited operand list (comma-delimited means the elements in a list are separated by commas). The number of elements in the operand list determines the number of memory locations to be reserved.

A label is optional but BSC is generally useless without one. The elements in the operand list can be literal numbers (in hex or decimal), symbolic names, or mathematical expressions. If an element is a symbolic name (label), it must have been previously defined. The elements can be positive or negative.


Examples

 
NEG2 .BSC -2   ;-2 stored at NEG2
X    .BSC 0x0005, 10, 0x000f, -5
               ;The X block has 4 words initialized
               ;to the values in the list
Y    .BSC X+1, X+2   ;The value X+1 is stored at Y
               ;The value X+2 is stored at Y+1

See the sample code for an example of how the values defined by BSC are accessed by other instructions.
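
The effect of BSC on memory can be sketched in Python in the same spirit. The text does not say how negative values are stored, so this sketch assumes 16-bit words in two's complement form. The function name, the dictionary model of memory, and the starting address are all hypothetical.

```python
def store_bsc(memory, address, values):
    """Store each value in its own location, one after another, starting at `address`."""
    for offset, value in enumerate(values):
        memory[address + offset] = value & 0xFFFF   # assumed 16-bit two's complement
    return address + len(values)    # address of the next free location


memory = {}
next_free = store_bsc(memory, 0x0200, [0x0005, 10, 0x000f, -5])  # the X block
print([hex(memory[a]) for a in sorted(memory)], hex(next_free))
```

Under that assumption, -5 is stored as 0xfffb and the block occupies four consecutive locations.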




       

EQU

Equate

Assembler Format

  label .EQU operand

Description

  The EQU directive assigns the operand value to the label. The value can represent a memory location or a data constant.

The operand can be a literal number (in hex or decimal), a symbolic name, or a mathematical expression. If the operand is a symbolic name (label), it must have been previously defined. The value of the operand can be positive or negative.

EQU is not needed to write TSC programs but is provided as a convenience to the advanced assembly programmer.


Examples

 
* EQU used to duplicate label values
X     OR $3, $2, $0
Y     .EQU X           ;Y=X

* EQU used to create constants
A     .EQU 56          ;A=56
B     .EQU A+4         ;B=60
      ADI $2, $2, B-A  ;Add 4 to $2

* EQU used to create indexing constants
ESIZE .EQU 4    ;Size of each element in array
INDEX .EQU 7    ;Index of an element in the array
       ...      ;Some code here
      LWD $1, $3, INDEX * ESIZE
                ;$3 contains the address of the first
                ;element in the array.  4*7 is added
                ;to that address to get the element
                ;at index 7
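
EQU can be pictured as adding an entry to the assembler's table of symbolic names. The Python sketch below models that table as a dictionary and enforces the rule that a symbolic operand must already be defined. The function name and the address chosen for X are hypothetical, and operands are limited to a number or a single name to keep the sketch short.

```python
def define_equ(symbols, label, operand):
    """Assign `operand` (a number or a previously defined name) to `label`."""
    if isinstance(operand, str):
        if operand not in symbols:
            raise KeyError(f"{operand} must be defined before it is used")
        operand = symbols[operand]
    symbols[label] = operand


symbols = {"X": 0x0040}              # hypothetical address of the instruction at X
define_equ(symbols, "Y", "X")        # Y     .EQU X
define_equ(symbols, "ESIZE", 4)      # ESIZE .EQU 4
define_equ(symbols, "INDEX", 7)      # INDEX .EQU 7
print(symbols["Y"] == symbols["X"], symbols["INDEX"] * symbols["ESIZE"])  # prints True 28
```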


[DISECT THE SAMPLE PROGRAM HERE]



Updated on 8 May 2000.   Send comments to cs143@pel.cs.byu.edu

© 1999, 2000, Performance Evaluation Laboratory, Brigham Young University. Reproduction of all or part of this work is permitted for non-profit educational or research use provided this copyright notice remains intact. All other rights reserved.