The M-code Instruction Set
==========================
Robert Smith June 1988
POPC is a user program that extends the normal POP-11 compiler to form SYSPOP-11, the systems programming language of POPLOG. These extensions allow such facilities as definition of structures, manipulation of pointers and machine integers, etc, so that the language starts to bear a strong similarity to 'C'. When compiling a system source file, calls are made to the VM interface in the same manner as for user programs, but rather than optimisation occurring as the file is processed, an unoptimised (but slightly modified) code-list of VM instructions for each procedure is passed to POPC. POPC optimises this code stream and translates it to an intermediate representation called M-code (this corresponds to a multiple-operand machine instruction set with generalised addressing modes). For each M-code there is a procedure responsible for translating that M-code into the equivalent target machine assembler.
This note describes the instructions and operands of M-code. A later document will describe other aspects of the system code generation process such as register declaration and use, target code emission, inline code expansions, etc. The M-code to target assembler translation routines are located in $popsrc/syscomp/genproc.p for a given machine.
The following descriptions are not completely machine independent: there is an implicit assumption that a 32-bit machine is used. The notes that follow some instructions assume byte-addressability (the case for all current POPLOG hosts), which allows the following tagging scheme:
A pointer to an object (all objects word aligned, thus same as object address)

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                     30-bit word index                     |0|0|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

POP integer

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                   30-bit signed integer                   |1|1|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

POP decimal

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      30-bit decimal                       |0|1|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
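The tagging scheme can be illustrated with a small model. The following is a minimal Python sketch, not POPLOG code: the function names are hypothetical, words are modelled as unsigned 32-bit Python integers, and the tag value 10 is treated as unassigned because the diagrams do not show it.

```python
MASK32 = 0xFFFFFFFF  # words are modelled as unsigned 32-bit values


def make_pint(n):
    """Tag a Python int as a POP integer: 30-bit value, low bits 11."""
    assert -(1 << 29) <= n < (1 << 29), "value does not fit in 30 bits"
    return ((n << 2) | 0b11) & MASK32


def pint_value(word):
    """Recover the signed 30-bit value from a POP integer word."""
    assert word & 0b11 == 0b11, "not a POP integer"
    value = word >> 2
    # Sign-extend from 30 bits.
    return value - (1 << 30) if value & (1 << 29) else value


def classify(word):
    """Classify a word by its two low-order tag bits."""
    tag = word & 0b11
    if tag == 0b00:
        return "pointer"
    if tag == 0b11:
        return "pint"
    if tag == 0b01:
        return "decimal"
    return "unassigned"  # tag 10 does not appear in the diagrams
```

Because every object is word aligned, a pointer already has 00 in its two low bits, so on a byte-addressable machine the tagged pointer and the object address coincide.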
The M-code instruction set has a total of 47 instructions. I have divided them into 7 major classes, based in some cases on physical rather than logical connections. The instruction groups are:
Data Movement Instructions:
M_MOVE M_MOVEs M_MOVEb M_MOVEbit M_MOVEss M_MOVEsb M_MOVEsbit M_UPDs M_UPDb M_UPDbit
Arithmetic Instructions:
M_ADD M_SUB M_MULT M_NEG M_PADD M_PSUB M_PADD_TEST M_PSUB_TEST M_PTR_ADD_OFFS M_PTR_SUB_OFFS M_PTR_SUB
Logical and Shift Instructions:
M_BIS M_BIC M_BIM M_LOGCOM M_ASH
Compare and Test Instructions:
M_BIT M_CMP M_TEST M_PCMP M_PTR_CMP M_CMPKEY
Branch Instructions:
M_BRANCH M_BRANCH_std M_BRANCH_ON M_BRANCH_ON_INT
Procedure Call and Stack Frame Instructions:
M_CALL M_CALLSUB M_CALL_WITH_RETURN M_RETURN M_CHAIN M_CHAINSUB M_CREATE_SF M_UNWIND_SF
Miscellaneous:
M_LABEL M_ERASE M_END
The remainder of this section describes the abstract syntax and operation of each of the instructions. The syntax is abstract because each instruction really appears as a vector whose first element is (a pointer to) an M-code translation routine. The descriptions are not complete because some instructions (e.g. the stack frame instructions) use data from variables rather than arguments. Addressing modes and test conditions are dealt with in the next section.
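The idea of an instruction being a vector headed by its own translation routine can be sketched as follows. This is a Python illustration only: the routine names and the text they emit are hypothetical stand-ins, not the contents of genproc.p.

```python
def translate_move(src, dest):
    # Hypothetical stand-in for a real M_MOVE translation routine.
    return [f"move {src} -> {dest}"]


def translate_add(src1, src2, dest):
    # Hypothetical stand-in for a real M_ADD translation routine.
    return [f"add {src2} + {src1} -> {dest}"]


def translate(code_list):
    """Walk an M-code list; each instruction carries its own translator."""
    output = []
    for instruction in code_list:
        routine, *operands = instruction  # first element is the routine
        output.extend(routine(*operands))
    return output
```

With this layout no central opcode table is needed: the code generator simply applies the first element of each vector to the remaining elements.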
M_MOVE Move Word
Syntax: M_MOVE src dest
Description: Move the contents of -src- to -dest-.
Operation: dest:int = src:int
M_MOVEs Move Unsigned Short
Syntax: M_MOVEs src dest
Description: Move the unsigned short at -src- to word at -dest-.
Operation: dest:<15:0> = src:short
           dest:<31:16> = 0
M_MOVEb Move Unsigned Byte
Syntax: M_MOVEb src dest
Description: Move the unsigned byte at -src- to word at -dest-.
Operation: dest:<7:0> = src:byte
           dest:<31:8> = 0
M_MOVEbit Move Unsigned Bit Field
Syntax: M_MOVEbit size pos base dest
Description: Move the unsigned bit field in word -base- starting at bit -pos- and extending up for -size- bits to the word at -dest-.
Operation: dest:<size-1:0> = base:<size+pos-1:pos>
           dest:<31:size> = 0
M_MOVEss Move Signed Short
Syntax: M_MOVEss src dest
Description: Move the signed short at -src- to word at -dest-.
Operation: dest:<15:0> = src:short
           dest:<31:16> = src:<15>
M_MOVEsb Move Signed Byte
Syntax: M_MOVEsb src dest
Description: Move the signed byte at -src- to word at -dest-.
Operation: dest:<7:0> = src:byte
           dest:<31:8> = src:<7>
M_MOVEsbit Move Signed Bit Field
Syntax: M_MOVEsbit size pos base dest
Description: Move the signed bit field in word -base- starting at bit -pos- and extending up for -size- bits to the word at -dest-.
Operation: dest:<size-1:0> = base:<size+pos-1:pos>
           dest:<31:size> = base:<size+pos-1>
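The difference between the unsigned and signed bit field moves is only in how the upper bits of the destination are filled. A minimal Python sketch, assuming words are modelled as unsigned 32-bit integers and using hypothetical function names:

```python
MASK32 = 0xFFFFFFFF


def move_bit(size, pos, base):
    """M_MOVEbit: extract base<size+pos-1:pos>, upper bits zero."""
    return (base >> pos) & ((1 << size) - 1)


def move_sbit(size, pos, base):
    """M_MOVEsbit: extract the field and copy its top bit upwards."""
    field = move_bit(size, pos, base)
    if field & (1 << (size - 1)):   # top bit of the field is set
        field -= 1 << size          # sign-extend
    return field & MASK32           # back to an unsigned 32-bit word
```

For example, a 4-bit field holding 1111 is read as 15 by the unsigned move and as a word of all ones (minus one) by the signed move.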
M_UPDs Update Short
Syntax: M_UPDs src dest
Description: Move the least significant short at -src- to word at -dest-.
Operation: dest:<15:0> = src:<15:0>
           dest:<31:16> = unaffected
M_UPDb Update Byte
Syntax: M_UPDb src dest
Description: Move the least significant byte at -src- to word at -dest-.
Operation: dest:<7:0> = src:<7:0>
           dest:<31:8> = unaffected
M_UPDbit Update Bit Field
Syntax: M_UPDbit size pos base src
Description: Move the -size- least significant bits from the word at -src- to the bit field in -base- starting at bit position -pos- and extending up for -size- bits.
Operation: base:<size+pos-1:pos> = src:<size-1:0>
           base:<pos-1:0> = unaffected
           base:<31:size+pos> = unaffected
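The update instructions write a field while leaving the neighbouring bits alone. The following Python sketch models M_UPDbit on an unsigned 32-bit word; the function name is hypothetical and the new value of -base- is returned rather than stored.

```python
MASK32 = 0xFFFFFFFF


def upd_bit(size, pos, base, src):
    """M_UPDbit: return base with base<size+pos-1:pos> replaced by src<size-1:0>."""
    field_mask = ((1 << size) - 1) << pos
    cleared = base & ~field_mask            # bits outside the field are unaffected
    inserted = (src << pos) & field_mask    # only the low `size` bits of src are used
    return (cleared | inserted) & MASK32
```

M_UPDs and M_UPDb behave like the special cases size = 16 and size = 8 with pos = 0.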
M_ADD Add Machine Integers
Syntax: M_ADD src1 src2 dest
Description: Add machine integer contents of -src1- to machine integer contents of -src2- and put machine integer result in -dest-.
Operation: dest:int = src2:int + src1:int
M_SUB Subtract Machine Integers
Syntax: M_SUB src1 src2 dest
Description: Subtract machine integer contents of -src1- from machine integer contents of -src2- and put machine integer result in -dest-.
Operation: dest:int = src2:int - src1:int
M_MULT Multiply Machine Integers
Syntax: M_MULT src1 src2 dest
Description: Multiply machine integer contents of -src2- by machine integer contents of -src1- and put machine integer result in -dest-.
Operation: dest:int = src2:int * src1:int
M_NEG Negate Machine Integer
Syntax: M_NEG src dest
Description: Negate machine integer contents of -src- and put machine integer result in -dest-.
Operation: dest:int = 0:int - src:int
M_PADD Add POP Integers
Syntax: M_PADD src1 src2 dest
Description: Add POP integer contents of -src1- to POP integer contents of -src2- and put POP integer result in -dest-.
Operation: dest:pint = src2:pint + src1:pint
Notes: With normal POP integer representation and machine arithmetic: dest = src2 + (src1 - 0x3)
M_PSUB Subtract POP Integers
Syntax: M_PSUB src1 src2 dest
Description: Subtract POP integer contents of -src1- from POP integer contents of -src2- and put POP integer result in -dest-.
Operation: dest:pint = src2:pint - src1:pint
Notes: With normal POP integer representation and machine arithmetic: dest = src2 - (src1 - 0x3)
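The notes for M_PADD and M_PSUB follow from the tag bits: a POP integer n is held as 4n + 3, so removing the tag from one operand before the machine operation leaves a correctly tagged result. A minimal Python sketch that checks this, with hypothetical helper names and 32-bit wraparound modelled by masking:

```python
MASK32 = 0xFFFFFFFF
PINT_TAG = 0x3


def tag(n):
    """Represent n as a tagged POP integer (4n + 3, modulo 2**32)."""
    return ((n << 2) | PINT_TAG) & MASK32


def m_padd(src1, src2):
    """dest = src2 + (src1 - 0x3), in machine arithmetic."""
    return (src2 + (src1 - PINT_TAG)) & MASK32


def m_psub(src1, src2):
    """dest = src2 - (src1 - 0x3), in machine arithmetic."""
    return (src2 - (src1 - PINT_TAG)) & MASK32
```

In both cases exactly one tag survives: (4a + 3) + 4b = 4(a + b) + 3 and (4a + 3) - 4b = 4(a - b) + 3.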
M_PADD_TEST Add POP Integers With Test
Syntax: M_PADD_TEST src1 src2 cond label
Description: Add POP integer contents of -src1- to POP integer contents of -src2- and push the POP integer result on the stack. If the -cond- is true then branch to the -label- else continue.
Operation: push (src2:pint + src1:pint) on user stack
           if cond then PC = label
Notes: Calculation as for M_PADD. In practice -cond- is always an overflow test.
M_PSUB_TEST Subtract POP Integers With Test
Syntax: M_PSUB_TEST src1 src2 cond label
Description: Subtract POP integer contents of -src1- from POP integer contents of -src2- and push the POP integer result on the stack. If the -cond- is true then branch to the -label- else continue.
Operation: push (src2:pint - src1:pint) on user stack
           if cond then PC = label
Notes: Calculation as for M_PSUB. In practice -cond- is always an overflow test.
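Since the test is in practice an overflow test, the interesting case is a sum that no longer fits in a 32-bit word. The following Python sketch models M_PADD_TEST with an overflow condition; the names are hypothetical, and the pushed result and the branch decision are simply returned as a pair.

```python
MASK32 = 0xFFFFFFFF


def to_signed(word):
    """Interpret an unsigned 32-bit word as a signed machine integer."""
    return word - (1 << 32) if word & 0x80000000 else word


def tag(n):
    """Represent n as a tagged POP integer (low bits 11)."""
    return ((n << 2) | 0x3) & MASK32


def m_padd_test_ovf(src1, src2):
    """Return (word pushed, whether the overflow branch is taken)."""
    exact = to_signed(src2) + (to_signed(src1) - 0x3)
    overflow = not (-(1 << 31) <= exact < (1 << 31))
    return exact & MASK32, overflow
```

The largest POP integer, 2**29 - 1, is held as 0x7FFFFFFF, so adding even 1 to it overflows the machine word and takes the branch.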
M_PTR_ADD_OFFS Add Pointer Offset
Syntax: M_PTR_ADD_OFFS type off base dest
Description: Add offset -off- to pointer in -base- to form pointer at -dest-. Pointers and offsets of type -type-.
Operation: dest:ptr = base:ptr + off:offs
Notes: As machine arithmetic for byte-addressable machines. -type- is irrelevant.
M_PTR_SUB_OFFS Subtract Pointer Offset
Syntax: M_PTR_SUB_OFFS type off base dest
Description: Subtract offset -off- from pointer in -base- to form pointer at -dest-. Pointers and offsets of type -type-.
Operation: dest:ptr = base:ptr - off:offs
Notes: As machine arithmetic for byte-addressable machines. -type- is irrelevant.
M_PTR_SUB Subtract Pointers
Syntax: M_PTR_SUB type ptr1 ptr2 dest
Description: Subtract pointer -ptr1- from pointer -ptr2- to form offset in -dest-. Pointers and offsets of type -type-.
Operation: dest:offs = ptr2:ptr - ptr1:ptr
Notes: As machine arithmetic for byte-addressable machines. -type- is irrelevant.
M_BIS Bit Set
Syntax: M_BIS src1 src2 dest
Description: Set bits in -src2- that are set in -src1- and put the result in -dest-.
Operation: dest:int = src2:int || src1:int
M_BIC Bit Clear
Syntax: M_BIC src1 src2 dest
Description: Clear bits in -src2- that are set in -src1- and put the result in -dest-.
Operation: dest:int = src2:int && ~~ src1:int
M_BIM Bit Mask
Syntax: M_BIM src1 src2 dest
Description: Clear bits in -src2- that are clear in -src1- and put the result in -dest-.
Operation: dest:int = src2:int && src1:int
M_LOGCOM Complement
Syntax: M_LOGCOM src dest
Description: Put complement of -src- in -dest-.
Operation: dest:int = ~~ src:int
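The operation lines above use POP-11's bitwise operators (||, && and ~~), which should not be read as C's logical operators. A minimal Python sketch of the four instructions on unsigned 32-bit words, with hypothetical function names:

```python
MASK32 = 0xFFFFFFFF


def m_bis(src1, src2):
    """Bit set: bitwise OR."""
    return (src2 | src1) & MASK32


def m_bic(src1, src2):
    """Bit clear: AND with the complement of src1."""
    return (src2 & ~src1) & MASK32


def m_bim(src1, src2):
    """Bit mask: bitwise AND."""
    return (src2 & src1) & MASK32


def m_logcom(src):
    """Complement every bit of the word."""
    return ~src & MASK32
```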
M_ASH Shift Arithmetic
Syntax: M_ASH count src dest
Description: Perform arithmetic shift of -count- bits on -src- and put result in -dest-. A positive -count- gives a shift to the left. Zeroes are shifted in from the right, and the sign bit from the left.
Operation: dest:int = src:int << count:int (arithmetic shift)
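The two shift directions can be sketched as below. This is a Python model with hypothetical names, assuming that a negative -count- shifts right by the corresponding number of bits and that results are truncated to 32 bits.

```python
MASK32 = 0xFFFFFFFF


def to_signed(word):
    """Interpret an unsigned 32-bit word as a signed machine integer."""
    return word - (1 << 32) if word & 0x80000000 else word


def m_ash(count, src):
    """Arithmetic shift: left for positive count, right for negative."""
    value = to_signed(src)
    if count >= 0:
        shifted = value << count      # zeroes enter from the right
    else:
        shifted = value >> -count     # copies of the sign bit enter from the left
    return shifted & MASK32
```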
M_BIT Bit Test
Syntax: M_BIT mask src cond label
Description: Form the logical AND of -mask- and -src-. If the result makes -cond- true then jump to the -label-, else continue.
Operation: src:int && mask:int (sets condition codes)
           if cond then PC = label
M_TEST Test Machine Integer
Syntax: M_TEST src cond label
Description: If -src- compared with zero gives -cond- true then jump to the -label-, else continue.
Operation: src:int - 0:int (sets condition codes)
           if cond then PC = label
M_CMP Compare Machine Integers
Syntax: M_CMP src1 src2 cond label
Description: Compare machine integers -src1- and -src2-. If -cond- is true then jump to -label-, else continue.
Operation: src2:int - src1:int (sets condition codes)
           if cond then PC = label
M_PCMP Compare POP Integers
Syntax: M_PCMP src1 src2 cond label
Description: Compare POP integers -src1- and -src2-. If -cond- is true then jump to -label-, else continue.
Operation: src2:pint - src1:pint (sets condition codes)
           if cond then PC = label
Notes: As machine integer compare for current implementations.
M_PTR_CMP Compare Pointers
Syntax: M_PTR_CMP type src1 src2 cond label
Description: Compare pointers -src1- and -src2-. If -cond- is true then jump to -label-, else continue. The pointers are of type -type-.
Operation: src2:ptr - src1:ptr (sets condition codes)
           if cond then PC = label
Notes: As machine integer compare for current implementations.
M_CMPKEY Compare Key
Syntax: M_CMPKEY key src cond label
Description: Compare the key -key- with the keyfield of the object -src-. If -cond- is true then jump to -label-, else continue. If the object is simple then the key will not match.
Operation: if issimple(src) then
               set condition codes 'not equal'
           else
               key(src):key - key:key (set condition codes)
           endif
           if cond then PC = label
Notes: Only EQ and NEQ conditions are sensible.
M_BRANCH Branch
Syntax: M_BRANCH label
Description: Transfer control to code at -label-.
Operation: PC = label
M_BRANCH_std Standard Branch
Syntax: M_BRANCH_std label
Description: Transfer control to code at -label-. Equivalent to M_BRANCH at the M-code level, but guaranteed to generate target branch code of fixed size. Used in procedure code to standardize the separation between two code entry points.
Operation: PC = label
M_BRANCH_ON Branch On POP Integer
Syntax: M_BRANCH_ON switch label_list else_label
Description: Transfer control to one of the labels in -label_list- given by the value of the POP integer -switch-, where a value of 1 implies the first label. If the -switch- is out of range then jump to -else_label- (if false then continue).
Operation: if switch >= 1 and switch <= length(label_list) then
               goto label_list(switch)
           elseif else_label then
               goto else_label
           endif
M_BRANCH_ON_INT Branch On Machine Integer
Syntax: M_BRANCH_ON_INT switch label_list else_label
Description: Transfer control to one of the labels in -label_list- given by the value of the machine integer -switch-, where a value of 1 implies the first label. If the -switch- is out of range then jump to -else_label- (if false then continue).
Operation: if switch >= 1 and switch <= length(label_list) then
               goto label_list(switch)
           elseif else_label then
               goto else_label
           endif
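The selection rule shared by M_BRANCH_ON and M_BRANCH_ON_INT can be sketched as follows. This is a Python model with hypothetical names: labels are plain strings, -switch- is taken as an already decoded integer, and returning None stands for falling through to the next instruction.

```python
def branch_target(switch, label_list, else_label):
    """Pick the label to jump to, or None to continue in sequence."""
    if 1 <= switch <= len(label_list):
        return label_list[switch - 1]   # a value of 1 selects the first label
    if else_label:
        return else_label               # out of range, else label supplied
    return None                         # out of range, else_label is false
```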
M_CALL Call POP Procedure
Syntax: M_CALL call
Description: Call execute address of POP procedure -call-.
Operation: push return PC on call stack
           PC = call
M_CALLSUB Call Assembler Routine
Syntax: M_CALLSUB call
Description: Call assembler routine with entry address -call-.
Operation: push return PC on call stack
           PC = call
M_CALL_WITH_RETURN Call POP Procedure With Given Return Address
Syntax: M_CALL_WITH_RETURN call return
Description: Push supplied return address -return- and chain execute address of POP procedure -call-.
Operation: push return address on call stack
           PC = call
M_CALLER_RETURN Access/Update Caller's Return Address
Syntax: M_CALLER_RETURN update operand
Description: If -update- is false, move the return address into the caller of the current procedure to destination -operand-. If -update- is true, set the caller's return address to the value from source -operand-.
Operation: operand = caller's return address (update false)
           caller's return address = operand (update true)
Notes: Optional: if not defined, caller's return address is assumed to be in an ordinary memory location in the current stack frame. (Currently used only in the SPARC implementation, where caller's return address is held in a register, and is offset by -8 from the actual return address.)
M_RETURN Return
Syntax: M_RETURN
Description: Return from POP procedure.
Operation: Pop PC from call stack
M_CHAIN Chain POP Procedure
Syntax: M_CHAIN chain
Description: Chain execute address of POP procedure -chain-.
Operation: PC = chain
M_CHAINSUB Chain Assembler Routine
Syntax: M_CHAINSUB chain
Description: Chain assembler routine with entry address -chain-.
Operation: PC = chain
M_CREATE_SF Create Stack Frame
Syntax: M_CREATE_SF
Description: Create stack frame for POP procedure on call stack.
Operation: save machine registers
           save dynamic locals
           allocate and zero locals on stack
           allocate space for non pop variables
           save owner pointer
Notes: Instruction uses information from variables rather than arguments.
M_UNWIND_SF Unwind Stack Frame
Syntax: M_UNWIND_SF
Description: Unwind stack frame
Operation: remove owner pointer and stack variables
           restore dynamic locals
           restore machine registers
Notes: Instruction uses information from variables rather than arguments.
M_LABEL Label
Syntax: M_LABEL label
Description: Define label -label- of next M-code instruction.
Operation: Define assembler label
M_ERASE Erase
Syntax: M_ERASE dest
Description: Erase one element from stack specified by auto-indexed operand -dest-.
Operation: If -dest- is auto-indexed operand then modify index by normal offset.
M_END End
Syntax: M_END
Description: End of M-code instruction stream.
Operation: None
The representation of addressing modes in M-code instructions has already been described by Simon Nichols. The following table is adapted from his document.
M-code operand type     VAX Addressing mode         Example translation
-------------------     -------------------         -------------------
Word                    Register                    "R1"        --> R1
Integer                 Immediate                   4           --> $4
Ref <ref 'label'>       Immediate                   <ref 'foo'> --> $foo
String                  Absolute                    'label'     --> label
Vector
  {^reg 0}              Register Deferred           {r1 0}      --> (r1)
  {^reg ^disp}          Displacement                {r1 4}      --> 4(r1)
  {^reg ^false}         Autodecrement               {r1 false}  --> -(r1)
  {^reg ^true}          Autoincrement               {r1 true}   --> (r1)+
  {^reg ^disp ^reg'}    Based Indexed with          {a1 4 d2}   --> a1@(4,d2:L)
                        Displacement (68K)
Pair
  [^operand'|^disp]     Autoincrement deferred      [{r1 0}|0]  --> @0(r1)
                        or displacement deferred
                        (effective address is
                        value of operand plus
                        displacement)
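The table can be read as a dispatch on the type of each operand. The following Python sketch reproduces only the VAX rows of the table; it is not the real translation code. The classes Reg, Ref and Pair are hypothetical stand-ins for POP-11 words, refs and pairs, a Python tuple stands in for a vector, and the 68K three-element vector is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reg:          # stands in for a POP-11 word naming a register
    name: str


@dataclass(frozen=True)
class Ref:          # stands in for a POP-11 ref holding a label
    label: str


@dataclass(frozen=True)
class Pair:         # stands in for a POP-11 pair [operand|disp]
    operand: tuple
    disp: int


def vax_operand(op):
    """Translate one M-code operand to VAX assembler syntax."""
    if isinstance(op, Reg):
        return op.name                      # register
    if isinstance(op, Ref):
        return f"${op.label}"               # immediate label
    if isinstance(op, str):
        return op                           # absolute
    if isinstance(op, int) and not isinstance(op, bool):
        return f"${op}"                     # immediate integer
    if isinstance(op, tuple) and len(op) == 2:
        reg, mode = op
        if mode is False:
            return f"-({reg.name})"         # autodecrement
        if mode is True:
            return f"({reg.name})+"         # autoincrement
        if mode == 0:
            return f"({reg.name})"          # register deferred
        return f"{mode}({reg.name})"        # displacement
    if isinstance(op, Pair) and len(op.operand) == 2 and op.operand[1] == 0:
        # Only the table's own example shape is handled here.
        return f"@{op.disp}({op.operand[0].name})"
    raise TypeError(f"unsupported operand: {op!r}")
```

The checks for false and true come before the check for 0 because Python treats False as equal to 0; POP-11 has no such overlap.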
The conditions referred to in the compare instructions will be one of the following:
EQ      equal to
NEQ     not equal to
LT      signed less than
LEQ     signed less than or equal to
GT      signed greater than
GEQ     signed greater than or equal to
ULT     unsigned less than
ULEQ    unsigned less than or equal to
UGT     unsigned greater than
UGEQ    unsigned greater than or equal to
NEG     negative
POS     positive
OVF     overflow
NOVF    not overflow
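The difference between the signed and unsigned conditions shows up when the top bit of an operand is set. The following Python sketch evaluates the ten comparison conditions for M_CMP; it assumes, reading the operation line "src2 - src1" literally, that the condition describes -src2- relative to -src1-. The names are hypothetical, and NEG, POS, OVF and NOVF are left out.

```python
import operator

MASK32 = 0xFFFFFFFF

SIGNED = {
    "EQ": operator.eq, "NEQ": operator.ne,
    "LT": operator.lt, "LEQ": operator.le,
    "GT": operator.gt, "GEQ": operator.ge,
}
UNSIGNED = {
    "ULT": operator.lt, "ULEQ": operator.le,
    "UGT": operator.gt, "UGEQ": operator.ge,
}


def to_signed(word):
    """Interpret an unsigned 32-bit word as a signed machine integer."""
    return word - (1 << 32) if word & 0x80000000 else word


def cmp_cond(src1, src2, cond):
    """True if comparing src2 with src1 satisfies cond (assumed order)."""
    if cond in SIGNED:
        return SIGNED[cond](to_signed(src2 & MASK32), to_signed(src1 & MASK32))
    if cond in UNSIGNED:
        return UNSIGNED[cond](src2 & MASK32, src1 & MASK32)
    raise ValueError(f"condition not modelled: {cond}")
```

The word 0xFFFFFFFF is minus one under the signed conditions and the largest possible value under the unsigned ones, so it is LT 1 but UGT 1.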
The pointer types that are generated for some M-codes are:
As we have seen, for byte addressable machines these can be ignored.
The operation descriptions of the M-codes made rather cavalier use of types. They are listed here with the intention that one day they may become more formal:
int     32-bit integer
short   16-bit integer
byte    8-bit integer
<n:m>   Bit field from bit n down to bit m in 32-bit integer
pint    POP integer
ptr     Pointer
offs    Offset
key     Key