Burroughs B6x00-7x00 instruction set

The Burroughs B6x00-7x00 instruction set includes the set of valid operations for the Burroughs B6500,^[1] B7500 and later Burroughs large systems, including the current (as of 2006) Unisys Clearpath/MCP systems; it does not include the instructions for other Burroughs large systems, including the B5000, B5500, B5700 and B8500. These machines have a distinctive design and instruction set. Each word of data is associated with a type, and the effect of an operation on that word can depend on the type. Further, the machines are stack^[a] based to the point that they have no user-addressable registers.

Overview

In keeping with the unusual architecture of these systems, the instruction set is also distinctive. Programs are made up of 8-bit syllables, which may form a Name Call, a Value Call, or an operator; an operator may be from one to twelve syllables in length. There are fewer than 200 operators, all of which fit into 8-bit syllables. If we ignore the powerful string scanning, transfer, and edit operators, the basic set is only about 120 operators. If we also remove the operators reserved for the operating system, such as MVST and HALT, the set of operators commonly used by user-level programs is fewer than 100. The Name Call and Value Call syllables contain address couples; the operator syllables either use no addresses or use control words and descriptors on the stack.

Since there are no programmer-addressable registers, most of the register manipulating operations required in other architectures are not needed, nor are variants for performing operations between pairs of registers, since all operations are applied to the top of the stack. This also makes code files very compact, since operators are zero-address and do not need to include the address of registers or memory locations in the code stream. Some of the code density was due to moving vital operand information elsewhere, to 'tags' on every data word or into tables of pointers. Many of the operators are generic or polymorphic depending on the kind of data being acted on as given by the tag. The generic opcodes required fewer opcode bits but made the hardware more like an interpreter, with less opportunity to pipeline the common cases.

For example, the instruction set has only one ADD operator. The hardware has to fetch the operands to discover whether an integer add or a floating point add is required. Typical architectures require a separate operator for each data type, for example add.i, add.f, add.d, add.l for integer, float, double, and long data types. This architecture only distinguishes single and double precision numbers; integers are just reals with a zero exponent. When one or both of the operands has a tag of 2, a double precision add is performed; otherwise tag 0 indicates single precision. Thus the tag itself is the equivalent of the .i, .f, .d, and .l operator extensions. This also means that the code and data can never be mismatched.
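The tag-driven choice of precision can be sketched in Python. This is only an illustration of the idea: the `Word` class and the `add` function are hypothetical names, and Python floats stand in for the machine's number formats.

```python
from dataclasses import dataclass

TAG_SINGLE = 0   # single precision operand (tag 0)
TAG_DOUBLE = 2   # double precision operand (tag 2)

@dataclass(frozen=True)
class Word:
    tag: int
    value: float

def add(b: Word, a: Word) -> Word:
    """One ADD for every numeric type: the operand tags pick the precision."""
    for w in (a, b):
        if w.tag not in (TAG_SINGLE, TAG_DOUBLE):
            raise TypeError(f"ADD applied to a non-numeric word (tag {w.tag})")
    # If either operand is double precision, the result is double precision.
    tag = TAG_DOUBLE if TAG_DOUBLE in (a.tag, b.tag) else TAG_SINGLE
    return Word(tag, b.value + a.value)
```

The code stream carries the same `add` in every case; only the data decides which kind of addition is performed.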

Two operators are important in the handling of on-stack data – Value Call (VALC) and Name Call (NAMC). These are two-bit operators, 00 being VALC and 01 being NAMC. The following six bits of the syllable, concatenated with the following syllable, provide the address couple. Thus VALC covers syllable values 0000 to 3FFF and NAMC 4000 to 7FFF.
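As a minimal sketch of this encoding, the following Python function decodes a pair of syllables into the operator name and the 14-bit address couple. The function name `decode_call` is an assumption made for illustration.

```python
def decode_call(first: int, second: int) -> tuple[str, int]:
    """Decode a two-syllable VALC or NAMC into (mnemonic, 14-bit address couple)."""
    if not (0 <= first <= 0xFF and 0 <= second <= 0xFF):
        raise ValueError("syllables are 8-bit values")
    top_two = first >> 6                       # the two-bit operator field
    if top_two == 0b00:
        name = "VALC"
    elif top_two == 0b01:
        name = "NAMC"
    else:
        raise ValueError("not a VALC or NAMC syllable")
    couple = ((first & 0x3F) << 8) | second    # six bits followed by eight bits
    return name, couple
```

Any first syllable of 0x80 or above is rejected here because it belongs to the remaining operators rather than to VALC or NAMC.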

VALC is another polymorphic operator. If it hits a data word, that word is loaded to the top of stack. If it hits an IRW, that is followed, possibly in a chain of IRWs until a data word is found. If a PCW is found, then a function is entered to compute the value and the VALC does not complete until the function returns.

NAMC simply loads the address couple onto the top of the stack as an IRW (with the tag automatically set to 1).
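The behaviour of the two operators can be illustrated with a toy model in Python. Everything here is an assumption made for the sketch: memory is a dictionary keyed by address couple, an IRW holds the address couple it refers to, and a PCW holds a Python callable in place of a real procedure entry.

```python
from dataclasses import dataclass
from typing import Any

TAG_DATA, TAG_IRW, TAG_PCW = 0, 1, 7

@dataclass
class Word:
    tag: int
    payload: Any   # number for data, address couple for an IRW, callable for a PCW

def namc(stack: list, couple: tuple) -> None:
    """NAMC: push the address couple as an IRW (tag 1)."""
    stack.append(Word(TAG_IRW, couple))

def valc(stack: list, memory: dict, couple: tuple) -> None:
    """VALC: push the value reached from the address couple."""
    word = memory[couple]
    while True:
        if word.tag == TAG_IRW:        # follow the chain of references
            word = memory[word.payload]
        elif word.tag == TAG_PCW:      # enter the function and wait for its result
            word = Word(TAG_DATA, word.payload())
        else:                          # anything else is treated as data here
            stack.append(word)
            return
```

The sketch ignores descriptors and the other control words; it only shows why a single VALC may take one memory reference, several, or a whole procedure call.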

Static branches (BRUN, BRFL, and BRTR) used two additional syllables of offset. Thus arithmetic operations occupied one syllable, addressing operations (NAMC and VALC) occupied two, branches three, and long literals (LT48) five. As a result, code was much denser than on a conventional RISC architecture, in which each operation occupies four bytes. Better code density meant fewer instruction cache misses and hence better performance running large-scale code.

In the following operator explanations, remember that A and B are the top two stack registers. Double precision extensions are provided by the X and Y registers; thus the top two double precision operands are given by AX and BY. (In most cases, a reference to A and B implies AX and BY as well.)

B6x00/7x00 Address Couple
Current LL	Lexical Level bits	Index bits
0-1	13	12-0
2-3	13-12	11-0
4-7	13-11	10-0
8-15	13-10	9-0
16-31	13-9	8-0
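The table can be read as a rule: the lexical level field is just wide enough to name every level up to the current one, and the index takes the rest of the 14 bits. The Python sketch below applies that rule. It assumes the level field is read as a plain binary number with its most significant bit first, which the table does not state, and the function name is hypothetical.

```python
def split_address_couple(couple: int, current_ll: int) -> tuple[int, int]:
    """Split a 14-bit address couple into (lexical level, index)."""
    if not 0 <= couple < (1 << 14):
        raise ValueError("address couple must fit in 14 bits")
    if not 0 <= current_ll <= 31:
        raise ValueError("lexical level must be in 0..31")
    # Bits needed to name any level from 0 to current_ll: 1, 2, 3, 4 or 5.
    ll_bits = max(1, current_ll.bit_length())
    index_bits = 14 - ll_bits
    level = couple >> index_bits
    index = couple & ((1 << index_bits) - 1)
    return level, index
```

For code running at lexical level 5, the field is three bits wide, so the couple (2, 6) is stored as `(2 << 11) | 6`.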

Tags and control words

In the B6500, a word has 48 bits of data and three tag bits; the tag is held outside the 48-bit data word. The data bits are bits 0–47 and the tag is in bits 48–50. Bit 48 is the read-only bit, so odd tags indicate control words that cannot be written by a user-level program. Code words are given tag 3. Here is a list of the tags and their functions:

Tag	Word kind	Description
0	Data	All kinds of user and system data (text data and single precision numbers)
2	Double	Double precision data
4	SIW	Step Index Word (used in loops)
6		Uninitialized data
6	SCW	Software Control Word (used to cut back the stack)
1	IRW	Indirect Reference Word
1	SIRW	Stuffed Indirect Reference Word
3	Code	Program code word
3	MSCW	Mark Stack Control Word
3	RCW	Return Control Word
3	TOSCW	Top of Stack Control Word
3	SD	Segment Descriptor
5	Descriptor	Data block descriptors
7	PCW	Program Control Word
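The word layout described above (data in bits 0–47, tag in bits 48–50, bit 48 acting as the read-only bit) can be modelled with a few bit operations. This is a sketch using Python integers; the helper names are assumptions, not part of the architecture.

```python
DATA_BITS = 48
DATA_MASK = (1 << DATA_BITS) - 1

def pack_word(tag: int, data: int) -> int:
    """Build a 51-bit word: data in bits 0-47, tag in bits 48-50."""
    if not 0 <= tag <= 7:
        raise ValueError("tag must fit in three bits")
    if not 0 <= data <= DATA_MASK:
        raise ValueError("data must fit in 48 bits")
    return (tag << DATA_BITS) | data

def tag_of(word: int) -> int:
    """Return the three tag bits."""
    return (word >> DATA_BITS) & 0b111

def data_of(word: int) -> int:
    """Return the 48 data bits."""
    return word & DATA_MASK

def is_protected(word: int) -> bool:
    """Bit 48 is the low tag bit; when set, user code may not overwrite the word."""
    return bool((word >> DATA_BITS) & 1)
```

With this layout, testing whether a word is protected is a single-bit test, which is why all the odd tags (1, 3, 5 and 7) share the property.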

The current incarnation of these machines, the Unisys ClearPath, has extended the tag to four bits. The microcode level that specified four-bit tags was referred to as level Gamma.

Even-tagged words are user data, which can be modified by a user program as user state. Odd-tagged words are created and used directly by the hardware and represent a program's execution state. Since these words are created and consumed by specific instructions or by the hardware, their exact format can change between hardware implementations without user programs needing to be recompiled: the same code stream produces the same results even though the system word format may have changed.

Tag 1 words represent on-stack data addresses. The normal IRW simply stores an address couple to data on the current stack. The SIRW references data on any stack by including a stack number in the address.

Tag 5 words are descriptors, which are more fully described in the next section. Tag 5 words represent off-stack data addresses.

Tag 7 is the program control word which describes a procedure entry point. When operators hit a PCW, the procedure is entered. The ENTR operator explicitly enters a procedure (non-value-returning routine). Functions (value-returning routines) are implicitly entered by operators such as value call (VALC). Global routines are stored in the D[2] environment as SIRWs that point to a PCW stored in the code segment dictionary in the D[1] environment. The D[1] environment is not stored on the current stack because it can be referenced by all processes sharing this code. Thus code is reentrant and shared.

Tag 3 represents code words themselves, which won't occur on the stack. Tag 3 is also used for the stack control words MSCW, RCW, TOSCW.

Figure 9.2 From the ACM Monograph in the References. *Elliot Organick 1973.*

Display registers

A stack hardware optimization is the provision of D (or "display") registers. These are registers that point to the start of each called stack frame. They are updated automatically as procedures are entered and exited and are not accessible by any software. There are 32 D registers, which limits lexical nesting to 32 levels.

Consider how we would access a lexical level 2 (D[2]) global variable from lexical level 5 (D[5]). Suppose the variable is 6 words away from the base of lexical level 2. It is thus represented by the address couple (2, 6). If we don't have D registers, we have to look at the control word at the base of the D[5] frame, which points to the frame containing the D[4] environment. We then look at the control word at the base of this environment to find the D[3] environment, and continue in this fashion until we have followed all the links back to the required lexical level. This is not the same path as the return path back through the procedures which have been called in order to get to this point. (The architecture keeps both the data stack and the call stack in the same structure, but uses control words to tell them apart.)

As you can see, this is quite inefficient just to access a variable. With D registers, the D[2] register points at the base of the lexical level 2 environment, and all we need to do to generate the address of the variable is to add its offset from the stack frame base to the frame base address in the D register. (There is an efficient linked list search operator LLLU, which could search the stack in the above fashion, but the D register approach is still going to be faster.) With D registers, access to entities in outer and global environments is just as efficient as local variable access.
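The difference between the two lookups can be sketched in Python. The `Frame` class, the frame base addresses and the function names are all hypothetical; the point is only to contrast following links frame by frame with indexing a display.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Frame:
    level: int                      # lexical level of this environment
    base: int                       # address of the frame's first word
    enclosing: Optional["Frame"]    # link to the next outer lexical level

def address_by_links(current: Frame, level: int, offset: int) -> tuple[int, int]:
    """Walk the links outward; returns (address, number of links followed)."""
    frame, hops = current, 0
    while frame.level != level:
        if frame.enclosing is None:
            raise LookupError(f"no environment at level {level}")
        frame, hops = frame.enclosing, hops + 1
    return frame.base + offset, hops

def address_by_display(display: list, level: int, offset: int) -> int:
    """With D registers the address is one indexed read plus one add."""
    return display[level] + offset

# Made-up frame bases for lexical levels 2 to 5.
f2 = Frame(2, 100, None)
f3 = Frame(3, 140, f2)
f4 = Frame(4, 180, f3)
f5 = Frame(5, 200, f4)
display = [0, 0, f2.base, f3.base, f4.base, f5.base]
```

Resolving the couple (2, 6) from level 5 follows three links in the first form and none in the second, while both arrive at the same address.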

| D[n] Tag | Data       | Address couple, comments

| 0        | n          | (4, 1) The integer n (declared on entry to a block, not a procedure)
|-----------------------|
| D[4]==>3 | MSCW       | (4, 0) The Mark Stack Control Word containing the link to D[3].
|=======================|
| 0        | r2         | (3, 5) The real r2
|-----------------------|
| 0        | r1         | (3, 4) The real r1
|-----------------------|
| 1        | p2         | (3, 3) A SIRW reference to g at (2,6)
|-----------------------|
| 0        | p1         | (3, 2) The parameter p1 from value of f 
|-----------------------|
| 3        | RCW        | (3, 1) A return control word
|-----------------------|
| D[3]==>3 | MSCW       | (3, 0) The Mark Stack Control Word containing the link to D[2].
|=======================|
| 5        | a          | (2, 7) The array a  ======>[ten word memory block]
|-----------------------|
| 0        | g          | (2, 6) The real g 
|-----------------------|
| 0        | f          | (2, 5) The real f 
|-----------------------|
| 0        | k          | (2, 4) The integer k 
|-----------------------|
| 0        | j          | (2, 3) The integer j 
|-----------------------|
| 0        | i          | (2, 2) The integer i
|-----------------------|
| 3        | RCW        | (2, 1) A return control word
|-----------------------|
| D[2]==>3 | MSCW       | (2, 0) The Mark Stack Control Word containing the link to the previous stack frame.
|=======================| — Stack bottom

If we had invoked the procedure p as a coroutine, or through a process instruction, the D[3] environment would have become a separate D[3]-based stack. This means that asynchronous processes still have access to the D[2] environment, as implied in ALGOL program code. Taking this one step further, a totally different program could call another program's code, creating on top of its own process stack a D[3] stack frame that points to another process's D[2] environment. In an instant the whole address space of the code's execution environment changes: the D[2] environment on the caller's own process stack is no longer directly addressable, and the D[2] environment in the other process stack becomes directly addressable instead. This is how library calls are implemented. In such a cross-stack call, the calling code and called code could even originate from programs written in different source languages and compiled by different compilers.

The D[1] and D[0] environments do not occur in the current process's stack. The D[1] environment is the code segment dictionary, which is shared by all processes running the same code. The D[0] environment represents entities exported by the operating system.

Stack frames do not even have to exist in a process stack. This feature was used early on for file I/O optimization: the FIB (file information block) was linked into the display registers at D[1] during I/O operations. In the early nineties, this ability was implemented as a language feature as STRUCTURE BLOCKs and, combined with library technology, as CONNECTION BLOCKs. The ability to link a data structure into the display register address scope implemented object orientation. Thus, the B6500 actually used a form of object orientation long before the term was ever used.

On other systems, the compiler might build its symbol table in a similar manner, but eventually the storage requirements would be collated and the machine code would be written to use flat memory addresses of 16, 32 or even 64 bits. These addresses might contain anything, so a write to the wrong address could damage anything. On these machines, instead, the two-part address scheme was implemented by the hardware. At each lexical level, variables were placed at displacements up from the base of the level's stack, typically occupying one word; double precision or complex variables would occupy two. Arrays were not stored in this area; only a one-word descriptor for the array was. Thus, at each lexical level the total storage requirement was not great: dozens, hundreds or a few thousand words in extreme cases, certainly not a count requiring 32 bits or more.

This was reflected in the form of the VALC instruction (value call), which loaded an operand onto the stack. The opcode was two bits long, and the remaining six bits of the syllable were concatenated with the following syllable to give a fourteen-bit addressing field. The code being executed would be at some lexical level, say six: this meant that only lexical levels zero to six were valid, and so just three bits were needed to specify the lexical level desired. The address part of the VALC operation thus reserved just three bits for that purpose, with the remaining eleven available for referring to entities at that and lower levels. A deeply nested procedure (thus at a high lexical level) would have fewer bits available to identify entities: for level sixteen upwards, five bits would be needed to specify the choice of levels 0–31, leaving nine bits to identify no more than the first 512 entities of any lexical level. This is much more compact than addressing entities by their literal memory address in a 32-bit addressing space. Further, only the VALC opcode loaded data: opcodes for ADD, MULT and so forth did no addressing, working entirely on the top elements of the stack.

Much more important is that this method meant that many errors possible on systems employing flat addressing could not occur, because they were simply unspeakable even at the machine code level. A task had no way to corrupt memory in use by another task, because it had no way to develop its address. Offsets from a specified D register would be checked by the hardware against the stack frame bound: rogue values would be trapped. Similarly, within a task, an array descriptor contained information on the array's bounds, and so any indexing operation was checked by the hardware: put another way, each array formed its own address space. In any case, the tagging of all memory words provided a second level of protection: a misdirected assignment of a value could only go to a data-holding location, not to one holding a pointer or an array descriptor, and certainly not to a location holding machine code.
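The idea that each array forms its own address space can be sketched in Python. The class below is a toy stand-in for a descriptor, not the hardware format: the names `ArrayDescriptor` and `BoundsError` are hypothetical, and an exception stands in for the hardware trap.

```python
class BoundsError(Exception):
    """Stands in for the hardware interrupt raised on an invalid index."""

class ArrayDescriptor:
    """Toy descriptor: the only route to the array's storage."""

    def __init__(self, length: int):
        self._length = length
        self._block = [0] * length        # the off-stack memory block

    def _check(self, index: int) -> None:
        # Every access is checked against the bounds held in the descriptor.
        if not 0 <= index < self._length:
            raise BoundsError(f"index {index} outside 0..{self._length - 1}")

    def load(self, index: int) -> int:
        self._check(index)
        return self._block[index]

    def store(self, index: int, value: int) -> None:
        self._check(index)
        self._block[index] = value
```

Because code can reach the block only through `load` and `store`, an out-of-range index has nothing it could address: the access is refused rather than landing in a neighbouring object.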

Arithmetic operators

ADD: Add top two stack operands (B := B + A or BY := BY + AX if double precision)
SUBT: Subtract (B - A)
MULT: Multiply with single or double precision result
MULX: Extended multiply with forced double precision result
DIVD: Divide with real result
IDIV: Divide with integer result
RDIV: Return remainder after division
NTIA: Integerize truncated
NTGR: Integerize rounded
NTGD: Integerize rounded with double precision result
CHSN: Change sign
JOIN: Join two singles to form a double
SPLT: Split a double to form two singles
ICVD: Input convert destructive – convert BCD number to binary (for COBOL)
ICVU: Input convert update – convert BCD number to binary (for COBOL)
SNGL: Set to single precision rounded
SNGT: Set to single precision truncated
XTND: Set to double precision
PACD: Pack destructive
PACU: Pack update
USND: Unpack signed destructive
USNU: Unpack signed update
UABD: Unpack absolute destructive
UABU: Unpack absolute update
SXSN: Set external sign
ROFF: Read and clear overflow flip flop
RTFF: Read true/false flip flop
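The operand order of these zero-address operators (the result replaces B, computed as B op A, where A is the top of stack) can be seen in a small Python evaluator. It is a sketch under simplifying assumptions: only a handful of operators are modelled, Python numbers replace tagged words, and VALC takes a variable name instead of an address couple.

```python
BINARY = {
    "ADD":  lambda b, a: b + a,
    "SUBT": lambda b, a: b - a,
    "MULT": lambda b, a: b * a,
    "DIVD": lambda b, a: b / a,    # divide with real result
}

def run(program: list, variables: dict) -> list:
    """Run (mnemonic, operand) pairs on a toy evaluation stack and return it."""
    stack = []
    for op, arg in program:
        if op == "VALC":
            stack.append(variables[arg])
        elif op == "CHSN":
            stack.append(-stack.pop())
        elif op in BINARY:
            a = stack.pop()        # A: top of stack
            b = stack.pop()        # B: second from top
            stack.append(BINARY[op](b, a))
        else:
            raise ValueError(f"operator {op} is not modelled")
    return stack

# (x - y) * z in postfix order
program = [("VALC", "x"), ("VALC", "y"), ("SUBT", None),
           ("VALC", "z"), ("MULT", None)]
```

Running `program` with x = 10, y = 4 and z = 3 leaves the single value 18 on the stack. None of the arithmetic operators carries an address; only the VALC entries do.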

Comparison operators

LESS: Is B < A?
GREQ: Is B >= A?
GRTR: Is B > A?
LSEQ: Is B <= A?
EQUL: Is B = A?
NEQL: Is B <> A?
SAME: Does B have the same bit pattern as A, including the tag

Logical operators

LAND: Logical bitwise and of all bits in operands
LOR: Logical bitwise or of all bits in operands
LNOT: Logical bitwise complement of all bits in operand
LEQV: Logical bitwise equivalence of all bits in operands

Branch and call operators

BRUN: Branch unconditional (offset given by following code syllables)
DBUN: Dynamic branch unconditional (offset given in top of stack)
BRFL: Branch if last result false (offset given by following code syllables)
DBFL: Dynamic branch if last result false (offset given in top of stack)
BRTR: Branch if last result true (offset given by following code syllables)
DBTR: Dynamic branch if last result true (offset given in top of stack)
EXIT: Exit current environment (terminate process)
STBR: Step and branch (used in loops; operand must be SIW)
ENTR: Execute a procedure call as given by a tag 7 PCW, resulting in an RCW at D[n] + 1
RETN: Return from current routine to place given by RCW at D[n] + 1 and remove the stack frame

Bit and field operators

BSET: Bit set (bit number given by syllable following instruction)
DBST: Dynamic bit set (bit number given by contents of B)
BRST: Bit reset (bit number given by syllable following instruction)
DBRS: Dynamic bit reset (bit number given by contents of B)
ISOL: Field isolate (field given in syllables following instruction)
DISO: Dynamic field isolate (field given in top of stack words)
FLTR: Field transfer (field given in syllables following instruction)
DFTR: Dynamic field transfer (field given in top of stack words)
INSR: Field insert (field given in syllables following instruction)
DINS: Dynamic field insert (field given in top of stack words)
CBON: Count binary ones in the top of stack word (A or AX)
SCLF: Scale left
DSLF: Dynamic scale left
SCRT: Scale right
DSRT: Dynamic scale right
SCRS: Scale right save
DSRS: Dynamic scale right save
SCRF: Scale right final
DSRF: Dynamic scale right final
SCRR: Scale right round
DSRR: Dynamic scale right round
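A few of these operators can be approximated on a 48-bit Python integer. The sketch below covers bit set, bit reset, count of ones and field isolate only. The field convention used for `isolate` (a field named by its highest-numbered bit and its length, with bit 0 as the least significant bit, returned right-justified) is an assumption for illustration, as are the function names.

```python
WORD_BITS = 48
WORD_MASK = (1 << WORD_BITS) - 1

def _check_bit(bit: int) -> None:
    if not 0 <= bit < WORD_BITS:
        raise ValueError("bit number must be in 0..47")

def bit_set(word: int, bit: int) -> int:
    """In the spirit of BSET: turn one bit on."""
    _check_bit(bit)
    return (word | (1 << bit)) & WORD_MASK

def bit_reset(word: int, bit: int) -> int:
    """In the spirit of BRST: turn one bit off."""
    _check_bit(bit)
    return word & ~(1 << bit) & WORD_MASK

def count_ones(word: int) -> int:
    """In the spirit of CBON: count the one bits in the word."""
    return bin(word & WORD_MASK).count("1")

def isolate(word: int, start: int, length: int) -> int:
    """In the spirit of ISOL: extract a field and right-justify it."""
    low = start - length + 1          # lowest-numbered bit of the field
    if length < 1 or low < 0 or start >= WORD_BITS:
        raise ValueError("field does not fit in the word")
    return (word >> low) & ((1 << length) - 1)
```

The static forms (BSET, BRST, ISOL) would take `bit`, `start` and `length` from the code stream, while the dynamic forms (DBST, DBRS, DISO) would take them from the stack; the sketch does not distinguish the two.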

Literal operators

LT48: Load following code word onto top of stack
LT16: Set top of stack to following 16 bits in code stream
LT8: Set top of stack to following code syllable
ZERO: Shortcut for LT48 0
ONE: Shortcut for LT48 1

Descriptor operators

INDX: Index create a pointer (copy descriptor) from a base (MOM) descriptor
NXLN: Index and load name (resulting in an indexed descriptor)
NXLV: Index and load value (resulting in a data value)
EVAL: Evaluate descriptor (follow address chain until data word or another descriptor found)

Stack operators

PUSH: Push down stack register
DLET: Pop top of stack
EXCH: Exchange top two words of stack
RSUP: Rotate stack up (top three words)
RSDN: Rotate stack down (top three words)
DUPL: Duplicate top of stack
MKST: Mark stack (build a new stack frame resulting in an MSCW on the top, followed by NAMC to load the PCW, then parameter pushes as needed, then ENTR)

IMKS: Insert an MSCW in the B register.
VALC: Fetch a value onto the stack as described above
NAMC: Place an address couple (IRW stack address) onto the stack as described above
STFF: Convert an IRW as placed by NAMC into an SIRW which references data in another stack.
MVST: Move to stack (process switch only done in one place in the MCP)

Store operators

STOD: Store destructive (store the value in the B register at the memory addressed by the A register, then delete the value from the stack; if the target word has an odd tag, throw a memory protect interrupt)

STON: Store non-destructive (Same as STOD but value is not deleted – handy for F := G := H := J expressions).
OVRD: Overwrite destructive, STOD ignoring read-only bit (for use in MCP only)
OVRN: Overwrite non-destructive, STON ignoring read-only bit (for use in MCP only)
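The four store operators differ along two axes: whether the value stays on the stack, and whether the read-only bit of the target is honoured. The Python sketch below models both. It assumes a stack whose top entry is the target address with the `(tag, value)` word beneath it, a dictionary for memory, and an exception in place of the memory protect interrupt; all names are hypothetical.

```python
class MemoryProtect(Exception):
    """Stands in for the memory protect interrupt."""

def _store(stack: list, memory: dict, destructive: bool, override: bool) -> None:
    address = stack.pop()          # A register: where to store
    word = stack.pop()             # B register: the (tag, value) word to store
    target_tag, _ = memory[address]
    if target_tag & 1 and not override:
        raise MemoryProtect(f"word at {address} has odd tag {target_tag}")
    memory[address] = word
    if not destructive:
        stack.append(word)         # STON/OVRN keep the value for a further store

def stod(stack, memory): _store(stack, memory, destructive=True, override=False)
def ston(stack, memory): _store(stack, memory, destructive=False, override=False)
def ovrd(stack, memory): _store(stack, memory, destructive=True, override=True)
def ovrn(stack, memory): _store(stack, memory, destructive=False, override=True)
```

A chained assignment such as F := G := H := J then becomes: push the value once, and for each target push its address and issue a store, using the non-destructive form for every target except the last.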

Load operators

The Load instruction could find itself tripping on an indirect address, or worse, a disguised call to a call-by-name thunk routine.

LOAD: Load the value given by the address (tag 5 or tag 1 word) on the top of stack, following an address chain if necessary

LODT: Load transparent – load the word referenced by the address on the top of stack

Transfer operators

These were used for string transfers, usually until a certain character was detected in the source string. All these operators are protected from buffer overflows by being limited by the bounds in the descriptors.

TWFD: Transfer while false, destructive (forget pointer)
TWFU: Transfer while false, update (leave pointer at end of transfer for further transfers)
TWTD: Transfer while true, destructive
TWTU: Transfer while true, update
TWSD: Transfer words, destructive
TWSU: Transfer words, update
TWOD: Transfer words, overwrite destructive
TWOU: Transfer words, overwrite update
TRNS: Translate – transfer a source buffer into a destination converting characters as given in a translate table.
TLSD: Transfer while less, destructive
TLSU: Transfer while less, update
TGED: Transfer while greater or equal, destructive
TGEU: Transfer while greater or equal, update
TGTD: Transfer while greater, destructive
TGTU: Transfer while greater, update
TLED: Transfer while less or equal, destructive
TLEU: Transfer while less or equal, update
TEQD: Transfer while equal, destructive
TEQU: Transfer while equal, update
TNED: Transfer while not equal, destructive
TNEU: Transfer while not equal, update
TUND: Transfer unconditional, destructive
TUNU: Transfer unconditional, update

Scan operators

These were used for scanning strings, which is useful in writing compilers. All these operators are protected from buffer overflows by being limited by the bounds in the descriptors.

SWFD: Scan while false, destructive
SISO: String isolate
SWTD: Scan while true, destructive
SWTU: Scan while true, update
SLSD: Scan while less, destructive
SLSU: Scan while less, update
SGED: Scan while greater or equal, destructive
SGEU: Scan while greater or equal, update
SGTD: Scan while greater, destructive
SGTU: Scan while greater, update
SLED: Scan while less or equal, destructive
SLEU: Scan while less or equal, update
SEQD: Scan while equal, destructive
SEQU: Scan while equal, update
SNED: Scan while not equal, destructive
SNEU: Scan while not equal, update

CLSD: Compare characters less, destructive
CLSU: Compare characters less, update
CGED: Compare characters greater or equal, destructive
CGEU: Compare characters greater or equal, update
CGTD: Compare characters greater, destructive
CGTU: Compare characters greater, update
CLED: Compare characters less or equal, destructive
CLEU: Compare characters less or equal, update
CEQD: Compare characters equal, destructive
CEQU: Compare characters equal, update
CNED: Compare characters not equal, destructive
CNEU: Compare characters not equal, update

System

SINT: Set interval timer
EEXI: Enable external interrupts
DEXI: Disable external interrupts
SCNI: Scan in – initiate IO read, this changed on different architectures
SCNO: Scan out – initiate IO write, this changed on different architectures
STAG: Set tag (not allowed in user-level processes)
RTAG: Read tag
IRWL: Hardware pseudo operator
SPRR: Set processor register (highly implementation dependent, only used in lower levels of MCP)
RPRR: Read processor register (highly implementation dependent, only used in lower levels of MCP)
MPCW: Make PCW
HALT: Halt the processor (operator requested or some unrecoverable condition has occurred)

Other

VARI: Escape to extended (variable instructions which were less frequent)
OCRX: Occurs index builds an occurs index word used in loops
LLLU: Linked list lookup – Follow a chain of linked words until a certain condition is met
SRCH: Masked search for equal – Similar to LLLU, but testing a mask in the examined words for an equal value
TEED: Table enter edit destructive
TEEU: Table enter edit, update
EXSD: Execute single micro destructive
EXSU: Execute single micro update
EXPU: Execute single micro, single pointer update
NOOP: No operation
NVLD: Invalid operator (hex code FF)
User operators: unassigned operators could cause interrupts into the operating system so that algorithms could be written to provide the required functionality

Edit operators

These were special operators for sophisticated string manipulation, particularly for business applications.

MINS: Move with insert – insert characters in a string
MFLT: Move with float
SFSC: Skip forward source character
SRSC: Skip reverse source characters
RSTF: Reset float
ENDF: End float
MVNU: Move numeric unconditional
MCHR: Move characters
INOP: Insert overpunch
INSG: Insert sign
SFDC: Skip forward destination character
SRDC: Skip reverse destination characters
INSU: Insert unconditional
INSC: Insert conditional
ENDE: End edit

Notes

[a] The lexical level in a syllable may refer either to a marked point in the stack of the current task or to a marked point in the stack of a parent task. The term "the stack" may refer to multiple related stacks, collectively known as a saguaro stack.

References

[1] Burroughs (September 1969), Burroughs B6500 Information Processing System Reference Manual (PDF), 1043676
