DOC POPLOG Porting Guide

**** N.B. THIS IS OUT OF DATE IN A NUMBER OF RESPECTS ****

POPLOG Porting Guide
====================

                            John Gibson
              School of Cognitive and Computing Sciences,
                         University of Sussex

Contents
--------

1. Compiling and Building the POPLOG Core
2. Machine-Specific Tasks
    2.1 General Considerations
        2.1.1 Representation of POPLOG Items
        2.1.2 Memory Resources
        2.1.3 Procedures
        2.1.4 Control Stack Frames
        2.1.5 The User Stack
        2.1.6 Register Allocation in General
    2.2 Changes to the Source Code
        2.2.1 Hand-coded Assembler Files
        2.2.2 POPC Code Generation
        2.2.3 Run-time Code Generation
3. Operating System-Specific Tasks
    3.1 I/O in POPLOG
    3.2 Other O/S Utilities
    3.3 External Procedures

This document is a guide to porting the Sussex University POPLOG system to new machine/operating system environments. It should be read in conjunction with the POPLOG files REF VMCODE and REF SYSPOP11.

     While describing the architectural features of the system that  are
generally  relevant  to  the  porting enterprise, it is not a manual and
therefore does not (necessarily)  address  issues  in  detail;  it  will
certainly  not  cover  many contingencies encountered in doing an actual
port. It is a regrettable fact that most modern  machines  are  designed
around  the  needs  of  languages  like FORTRAN, PASCAL, C, etc, and not
around those of systems like POPLOG; for this reason, porting the system
may  have  something of the flavour of fitting a square peg into a round
hole.

     Essentially, the work involved splits into  machine-specific  tasks
and  operating system-specific tasks; a port to a given host may involve
either or both of these, depending on whether an implementation  already
exists  for  one or the other. The table below subdivides the activities
and gives approximate timescales for the component parts:

John Gibson July 86

    Machine-Specific                Man-months
    ----------------                ----------
        Hand-Coded Assembler     :  ?
        POPC Code Generation     :  ?
        Run-Time Code Generation :  ?

    O/S-Specific                    Man-months
    ------------                    ----------
        I/O Interface            :  ?
        Other O/S Utilities      :  ?
        External Procedures      :  ?

1. Compiling and Building the POPLOG Core
-----------------------------------------

The core POPLOG system (that is, excluding libraries, PROLOG and Common LISP sources) comprises roughly 140 source files, sub-divided as follows:

    Type                Extension   Files   Lines
    ----                ---------   -----   -----
    Assembler source    .s             10    2000
    C source            .c              1     300
    sysPOP-11 source    .p            326   60000
    sysPOP-11 header    .ph             7    2000

The system is built by compiling the Assembler, C and sysPOP-11 sources into object files, which are then linked to produce the base executable binary image. For the Assembler and C source, compilation is done with the host system's assembler and C compiler. Compilation of the sysPOP-11 sources (which 'include' the sysPOP-11 header files) is more complicated, and we shall describe this in detail.

     All POPLOG compilers, including POP-11, compile  source  code  into
POPLOG Virtual Machine (VM) instructions; in the normal run-time system,
the resulting list of VM code is further compiled into host machine-code
inside  procedure  records  (which  are  what  actually  get  executed).
However, the VM has a facility  whereby  the  instruction  list  can  be
diverted  from  its  normal route, and passed instead to a special user-
defined procedure, which can then process it in any desired way.

     This facility is  used by the  system compiler POPC  to compile  VM
code into symbolic  assembler (the  principal reason  for providing  the
facility). POPC  takes  a  system  source file,  compiles  it  with  the
ordinary POPLOG POP-11  compiler, and then  translates the resulting  VM
instruction list into a file of  assembler (which can then be  assembled
into object format in the normal way).

     In fact,  POPC  does  rather  more  than  this:  by  allowing  some
additional POP-11  syntax constructs,  and by  recognising and  trapping
references to certain  special identifier names,  it defines the  "sys-"
extended dialect of POP-11. The sysPOP-11 dialect enables system  source
code to perform  operations that  are not possible  in standard  POP-11,
such as manipulating raw machine-level values and pointers (in much  the
same way as a  C program can, for  example). POPC produces in-line  code
for many of these  operations, and does a  large amount of  optimisation
(see REF SYSPOP11).

     In addition to outputting a  file of symbolic assembler  (extension
'.a'), POPC also produces a POPLOG 'symbol table' file (extension  '.w')
that records  the usage  of  permanent variables  and constants  in  the
source being compiled. After all sysPOP-11 sources have been  processed,
the set  of '.w'  files is  input to  a linking  program POPLINK,  which
outputs (in several more assembler files) all necessary definitions  for
identifier cells and word  records used, together  with the POPLOG  word
dictionary.

     The compilation process for sysPOP-11 files is  summarised  in  the
diagram below:
             ------------------
            < sysPOP-11 Source >
             ------------------
                     |
                    \|/
             ------------------
             | POP-11 Compiler|
             ------------------
                     |
                    \|/
             ------------------
             |    VM Code     |
             ------------------
                     |
                    \|/
             ------------------
      -------|      POPC      |
      |      ------------------
      |              |                 |
      |             \|/               \|/
      |        -------------     -------------
      |       < Symbol File >   < Symbol File >
      |        -------------     -------------
      |              |                 |
      |              -------------------
      |                       |
      |                      \|/
      |                 ------------
      |                 | POPLINK  |
      |                 ------------
      |                       |
     \|/                     \|/
 -----------             -----------
< Assembler >           < Assembler >
 -----------             -----------

     Note that this process  requires a POPLOG system  to run both  POPC
and POPLINK (both of which are  ordinary POP-11 programs). Thus any  new
port necessitates an initial cross-compilation stage, where the  sources
are compiled on a system already running POPLOG, and the complete set of
assembler files  transferred  to the  target  machine for  assembly  and
linking. Once the target POPLOG has been made to work, it can be used to
compile its own source files.

     [N.B.  POPLOG  uses  a  large  number  of  global  linker  symbols,
currently  5000-6000;  a  recurring  problem  in past ports has been the
inability of linkers to cope with this many. It is  advisable  to  check
the capabilities of the target system linker at an early stage.]

The source files for the POPC and POPLINK programs reside in the "syscomp" subdirectory of the source directory; these are standard POP-11 (extension '.p'). Of the 13 or so files, up to 4 may need changing for a particular implementation (see 'POPC Code Generation' below). In particular, the file 'sysdefs.p' is specific to each POPLOG system, and defines various constants of the implementation.

2. Machine-Specific Tasks
-------------------------

In porting POPLOG to a new CPU, there are a number of design decisions that must be made at the outset, in particular the representation of POPLOG items and the allocation of machine registers. While the source code for the system is designed to be as portable as possible (and that portability is being improved all the time), the complexities of the system and the need for efficient execution make total generality virtually impossible. Thus certain assumptions have to be made regarding, e.g., the layout of procedures and data in memory, the execution protocol for procedures, control stack format, and so on.

2.1 General Considerations

2.1.1 Representation of POPLOG Items

POPLOG manipulates all data in the form of fixed-length items, whose size should correspond to the 'natural' wordlength of the target machine (e.g. 32 bits). An item must be self-identifying, and must be capable of representing both simple objects (implicit data, i.e. integers and single-length floating point), and compound objects (pointers to data structures in memory), with the additional requirement that compound items are directly useable as machine addresses. Thus the encoding scheme must map pointers to themselves; only integers and single-length floats may be given a transformed representation.

     For example, in all current implementations (all of  which  are  on
32-bit byte-addressable machines), encoding is achieved with one tag bit
to distinguish  simple  from  compound,  and  a  second  to  distinguish
integers  from  single  floats.  Since  all  POPLOG  data structures are
rounded to an exact  number  of  32-bit  words,  and  begin  on  a  word
boundary, the address of any structure is a multiple of 4, having 2 zero
bits at the bottom; bit 0 is then used for the first tag (0 =  compound,
1 = simple), and bit 1 for the second (0 = single float, 1 = integer).

            pointer            single float         integer
        -----------------    ----------------   ----------------
        |           |0|0|    |          |0|1|   |          |1|1|
        -----------------    ----------------   ----------------

     This does mean, of course, that two bits are lost from  the  normal
machine  representation of integers and single floats. A machine integer
M is mapped to a POPLOG integer P by the relation P =  4M+3,  while  the
mapping for a single float depends on the machine and its floating-point
format (essentially, the single float has to be massaged into  a  30-bit
format  with  the  2 least significant bits lost from the mantissa). See
the sections below for  a  discussion  of  arithmetic  on  integers  and
floats.
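The integer half of this scheme can be modelled in a few lines. The following is only an illustrative sketch in Python, assuming 32-bit items; the function names are invented for the example and are not POPLOG identifiers, and no check is made that a value fits in the 30 bits available.

```python
# Illustrative model of the two-bit tagging scheme for 32-bit items.
# All names here are hypothetical, chosen for this sketch only.

WORD_MASK = 0xFFFFFFFF  # an item is one 32-bit word


def pop_int(m):
    """Encode machine integer M as a POPLOG integer P = 4M + 3."""
    return (4 * m + 3) & WORD_MASK


def machine_int(p):
    """Decode a POPLOG integer back to a signed machine integer."""
    if p & 0x80000000:       # reinterpret the 32-bit word as signed
        p -= 1 << 32
    return (p - 3) >> 2      # p - 3 is an exact multiple of 4


def is_simple(item):
    """Bit 0 set: the item is an integer or a single float."""
    return (item & 1) == 1


def is_compound(item):
    """Bit 0 clear: the item is a pointer, usable directly as an address."""
    return (item & 1) == 0


def is_integer(item):
    """Bits 1 and 0 both set: the item is an integer."""
    return (item & 3) == 3


def add_pop_ints(p1, p2):
    """Add two tagged integers directly: (4a+3) + (4b+3) - 3 = 4(a+b) + 3."""
    return (p1 + p2 - 3) & WORD_MASK
```

The last function follows arithmetically from the relation P = 4M+3 and shows why addition need not untag its operands; it is not a statement about how the POPLOG arithmetic routines are actually coded.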

     This encoding scheme would still  be  valid  on  a  machine  having
addresses  in multiples of 16-bit units, where addresses of 32-bit words
are only a multiple of 2; bit 0 can  still  be  used  to  distinguish  a
simple object from a pointer.  On the other hand, it would not be viable
on a machine that addressed 32-bit words in 32-bit  units,  where  word
addresses would go up in ones. Assuming that addresses on such a machine
were positive as machine integers, an alternative scheme  would  be  to
have simple items encoded as negative integers.

     As yet, POPLOG has not been implemented on a machine that  provides
direct  support  for  tagging.  Making full use of a tagged architecture
would probably require some changes to the source code.

2.1.2 Memory Resources

POPLOG assumes two main areas of contiguous memory, sub-divided as follows:

            Area A              Area B
            ------              ------
            heap                control stack
            user stack          PROLOG trail
                                PROLOG continuation stack

In Area A, the heap is space for creating general data structures, while the user stack is used for passing procedure arguments and results. The user stack is at the top of the area (higher memory addresses) growing downwards, while the heap is at the bottom, growing upwards; the area is assumed to be extensible (not automatically, but by explicit operating system call), so that more space can be allotted to either part when necessary.

            --------------  - high address (extends this way)
            |            |
            | user stack |
            |- - - - - - |
            |            |
            |- - - - - - |
            |            |
            |    heap    |
            |            |
            --------------  - low address

Overflow detection in this area happens in two ways. First, the heap allocator always checks that sufficient room is available below the user stack when allocating space for a record. Second, the source code supports a default system of maintaining a buffer area between the two parts, which allows the user stack some leeway in overflowing; checks for user stack overflow into the buffer area are then performed at appropriate places (such as on entry to, or on backward jumps in, user-defined procedures, and in system procedures that push a large number of items onto the stack). However, where the host machine/operating system supports it, the user stack overflow checks can be eliminated by creating an inaccessible memory page between the user stack and the heap, and dealing with overflow as a memory access violation. (Currently, this technique is used only in the VAX VMS implementation of POPLOG.)
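The two kinds of check can be pictured with a toy model of Area A. This is a sketch only: the class and method names are invented, addresses are counted in words, and it is assumed here that the allocator keeps the buffer area free as well. A real system would respond to a failed allocation by garbage collecting or extending the area rather than simply raising an error.

```python
class AreaA:
    """Toy model: heap grows up from the bottom, user stack down from the top."""

    def __init__(self, size, buffer_words):
        self.heap_top = 0                 # next free heap word (grows upwards)
        self.stack_ptr = size             # user stack pointer (grows downwards)
        self.buffer_words = buffer_words  # leeway kept between heap and stack

    def alloc(self, nwords):
        """Heap allocation: always checks for room below the user stack."""
        if self.heap_top + nwords > self.stack_ptr - self.buffer_words:
            raise MemoryError("no room below the user stack")
        addr = self.heap_top
        self.heap_top += nwords
        return addr

    def push(self):
        """Pushing is unchecked, so the stack may stray into the buffer."""
        self.stack_ptr -= 1

    def check_user_stack(self):
        """Explicit check, made only at selected places such as procedure entry."""
        if self.stack_ptr < self.heap_top + self.buffer_words:
            raise OverflowError("user stack has overflowed into the buffer")
```

The point of the buffer is visible in `push`, which does no checking at all: the stack is allowed to run a little way into the buffer before the next explicit check catches it.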

     The other important consideration in regard to Area A is user stack
underflow.  Because  explicitly  coded  checks  for  this  would degrade
running efficiency by too great an amount, it is simply assumed that the
area  (and  therefore the user stack) is followed by inaccessible memory
in such a way that any attempt to access a non-existent item on the stack will cause an access violation. This can be achieved either by aligning the end of the area at the boundary of the memory space allocated by the host system, or (as above, if the host permits it) by creating an inaccessible page there.

     [For example, in Unix implementations, Area A is  the  normal  data
segment,  extended by means of the brk system call. An attempt to access
over the end results in a segmentation violation (signal 11), which  the
POPLOG  signal  handler then interprets as user stack underflow. A small
wrinkle in this scheme is that  the  brk  call  usually  rounds  up  the
requested  quantity  of  memory  to  be  a multiple of the page size (or
memory management segment size, etc), and unless this is accounted  for
the  user stack will not actually reside at the end of available memory.
The relevant part of the POPLOG source allows for the  insertion  of  an
implementation-dependent code fragment to handle this.]
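The arithmetic involved in allowing for this rounding is simple. The sketch below assumes that the system rounds the requested end address up to the next multiple of a known page size; the function name is invented for the example.

```python
def round_up_to_page(address, page_size):
    """Round an address up to the next multiple of page_size."""
    return -(-address // page_size) * page_size   # ceiling division


# With 4096-byte pages, a request to end the area at byte 5000 actually
# makes memory accessible up to byte 8192, so the user stack base has to
# be placed at the rounded address if underflow is to be trapped.
requested_end = 5000
actual_end = round_up_to_page(requested_end, 4096)
```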

     We turn now to Area B, containing the  control  stack  for  holding
procedure  execution  stack frames, and two stacks for PROLOG, the trail
and the continuation stack.  In this coexistence regime, the  trail  and
the  continuation stack grow towards each other, while the control stack
grows away from both of them, as shown:

            ----------------------
            |                    |
            | continuation stack |
            |- - - - - - - - - - |
            |                    |
            |- - - - - - - - - - |
            |       trail        |
            |--------------------|
            |                    |
            |   control stack    |
            |                    |
            |- - - - - - - - - - |
            |                    |  (extends this way)

It is assumed that the area as a whole will correspond to the normal execution stack space of the host system (i.e. where subroutine jump instructions push return addresses, etc), and will be extended automatically by the operating system when necessary; to allow compatibility with the conventional direction of growth, the area may be situated either way up in memory (i.e. with high addresses at the top of the diagram and low at the bottom, or vice-versa).

     This apart, all stack management in  the  area  is  controlled  by
explicit  checks  in the source code, which allows for the control stack
and trail to be shifted together in the growth direction to create  more
room for the PROLOG stacks.

2.1.3 Procedures

All code in POPLOG is packaged up inside procedure records which, aside from being 'executable', have exactly the same status as all other data structures in the system (i.e. they are 'first-class' objects). In addition to the executable code it contains, a procedure has a header which maintains various associated data (such as its pdprops and updater fields, details of its stack frame layout, etc), and a literal table containing constant values used in the code. This said, we must however distinguish two (potentially) different classes: system procedures and user procedures.

     One reason for this distinction is that POPLOG performs incremental
compilation, source code from user  programs being compiled through  the
VM to produce procedure  records in the heap  at run-time (see  Run-time
Code Generation  below).  Like all  other  heap objects,  procedures  so
manufactured must  be relocatable  by the  garbage collector,  and  this
means that all code  in such a  procedure must be  position-independent,
and must reference  other relocatable  structures only  via the  literal
table  (which  the  garbage  collector  knows  about,  and  can   update
appropriately). The  constraint does  not,  however, apply  to  built-in
system procedures  generated  by  POPC,  because  they  never  reference
relocatable heap structures directly.

     Another reason concerns the nature of the executable code  in  user
procedures:  it is assumed that system procedures will always use native
machine code, and, from the efficiency point of view, this is  the  most
desirable  choice for user procedures also. But this may not be possible
if, for example, the machine  enforces  separate  instruction  and  data
spaces, and in this case an interpreted code must be used instead.

     [Although one (now redundant) POPLOG implementation (for the  Zilog
Z8000)  used  this  method,  there  is currently no standard for such an
interpreted code because all  current  POPLOG  implementations  generate
machine code.]

     In all current systems, procedures are laid out as shown below; the
length of the literal table varies between different procedures, and the
(fixed-length) header contains a pointer to the start of the code:

            ----------------------
            |       header       |
            |    information   --|-----  (pointer to code start)
            |--------------------|    |
            |                    |    |
            |   literal table    |    |
            |                    |    |
            |--------------------|    |
            |                    |--<--
            |     executable     |
            |        code        |
            |                    |
            ----------------------

For system procedures containing machine code, a potential problem with this scheme is that some host systems may not allow data (i.e. the header and literal table) to be in the same memory space as the code (either because the machine enforces separate I/D spaces, or because the operating system will not allow data in a shareable code segment). An alternative scheme is to have the header in data space pointing to a separate code segment, but this will require that the execution protocol for a procedure loads the literal table address into a known register, etc, before calling the code.

2.1.4 Control Stack Frames

It is assumed that the machine supports an execution stack protocol for function/procedure argument passing, local variables and return addresses, and that it maintains a stack pointer register for this purpose. POPLOG cannot use this protocol, but must instead use the stack pointer to create control stack frames according to its own standard. There are several good reasons for this: first, POPLOG procedures support dynamic binding of local variables, and the host protocol will not normally cater for this; second, POPLOG procedures pass arguments on the user stack, and will not therefore require the apparatus for passing them on the control stack that the host protocol makes available. The most important reason, however, is that various parts of POPLOG (e.g. the garbage collector, abnormal procedure exit mechanisms, etc) work with stack frames as explicit data structures, which must therefore be in a standard format. The format is shown in the diagram below: if the stack grows downwards high addresses are at the top and low at the bottom, and the other way round if it grows upwards.

            --------------------------------
            |   saved non-pop registers    |
            |------------------------------|
            |     saved pop registers      |
            |------------------------------|
            |   saved pop dynamic locals   |
            |------------------------------|
            | saved non-pop dynamic locals |
            |------------------------------|
            |      pop on-stack vars       |
            |------------------------------|
            |    non-pop on-stack vars     |
            |------------------------------|
            |   owner procedure address    |
            |------------------------------|
            |        return address        |
            --------------------------------

The return address and owner procedure fields are single-word, while each of the others is an arbitrary number of words long; the lengths of these parts (plus the overall length of the frame) can be determined from information held in the owner procedure header. Note the distinction here between 'pop' and 'non-pop' variable values: 'pop' values are cells containing proper POPLOG objects (encoded as discussed above), and are processed by the garbage collector; 'non-pop' values on the other hand are things which the garbage collector should leave alone, and may contain arbitrary machine integers, offsets, addresses, etc. (Non-pop variables are one of the facilities provided by the sysPOP-11 dialect.)

     Because the length of a stack frame for a particular  procedure  is
fixed,  and  is  available from the header information of the procedure,
there is no need for the 'frame pointer' mechanism used by many standard
calling  protocols  to  access the frame base and to chain each frame to
its predecessor. This may mean that a dedicated frame  pointer  register
can be used for other purposes.
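To illustrate how a frame can be walked without a frame pointer, the sketch below derives the frame length and the word offset of each part from per-procedure counts. The record and its field names are hypothetical stand-ins for the information held in the procedure header; offsets are counted in words from the return address end of the frame.

```python
from dataclasses import dataclass


@dataclass
class FrameInfo:
    """Hypothetical header fields: word counts for each variable-length part."""
    nonpop_regs: int
    pop_regs: int
    pop_dlocals: int
    nonpop_dlocals: int
    pop_onstack: int
    nonpop_onstack: int

    def parts(self):
        # Listed from the return address end of the frame, following the
        # diagram read from bottom to top.
        return [
            ("return address", 1),
            ("owner procedure address", 1),
            ("non-pop on-stack vars", self.nonpop_onstack),
            ("pop on-stack vars", self.pop_onstack),
            ("saved non-pop dynamic locals", self.nonpop_dlocals),
            ("saved pop dynamic locals", self.pop_dlocals),
            ("saved pop registers", self.pop_regs),
            ("saved non-pop registers", self.nonpop_regs),
        ]

    def frame_length(self):
        """Total frame length in words."""
        return sum(length for _, length in self.parts())

    def layout(self):
        """Map each part name to its starting word offset within the frame."""
        offsets, offset = {}, 0
        for name, length in self.parts():
            offsets[name] = offset
            offset += length
        return offsets
```

Given the address of one frame and the owner procedure found in it, the caller's frame starts `frame_length()` words further on, which is all that chaining through a frame pointer would otherwise provide.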

     Stack frames are created by the prologue code in POPLOG procedures,
this code itself being generated by POPC (for system procedures) and the
Run-time assembler (for user procedures).  If for any reason the  target
machine does not permit  the (efficient) creation of  stack  frames  in
the required format, additional modifications to the source code may  be
necessary.

2.1.5 The User Stack

The control stack and the user stack are the two most important stacks in POPLOG; while the host system will almost certainly provide directly for the former, it is unlikely to do so for the latter. Since fast pushing and popping of user stack items is critical to the efficiency of a POPLOG implementation, it is essential for the user stack pointer to be held in a global register or other fast storage location (and preferably one for which the machine supports push/pop instructions or autoincrement/autodecrement operations).
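As a rough picture of what the push and pop operations amount to, the sketch below models a downward-growing user stack with a pre-decrement push and a post-increment pop, as a machine with autodecrement/autoincrement addressing would perform them in one instruction each. The class is invented for illustration; overflow checking is omitted, and underflow shows up as an out-of-range access, loosely analogous to the access violation described in 2.1.2.

```python
class UserStack:
    """Toy downward-growing stack held at the top of a block of memory."""

    def __init__(self, memory_words):
        self.mem = [0] * memory_words
        self.sp = memory_words        # empty: pointer sits just past the area

    def push(self, item):
        self.sp -= 1                  # pre-decrement, then store
        self.mem[self.sp] = item      # (no overflow check in this sketch)

    def pop(self):
        item = self.mem[self.sp]      # IndexError here models underflow
        self.sp += 1                  # fetch, then post-increment
        return item
```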

2.1.6 Register Allocation in General

Assuming the target machine has registers (where by this we mean any kind of fast global storage locations), a decision must be made as to how these should best be employed. Since they will probably be in short supply, the priorities in this respect can be summarised as: provide first for the control/user stack pointers and for any implementation-dependent global usage; then allow sufficient registers for temporary working use; finally allocate any remaining registers as local variables for POPLOG procedures.

     By  'implementation-dependent  global  usage'  here  we  mean,  for
example,  the need to maintain one or more registers for base addressing
on machines that do not support PC-relative addressing. To  date  POPLOG
has  been  implemented  on only one such machine (the GEC Series 63): in
this implementation one register  is  designated  to  hold  the  current
procedure  address,  which  each  procedure sets up locally on entry and
thereafter uses as the base for accessing its literal  table,  executing
relative  branches,  etc. (In fact, this register is treated simply as a
'pop'-type local possessed by every procedure; it has to be, because its
saved  values  in  stack  frames  may  need  relocation  by  the garbage
collector.)

     Another possible example of such global usage concerns the constant
false, which, because all conditionals in POPLOG test equal/not equal to
it, is  generally  the  most  important  constant  in  the  system.  For
compatibility  with the representation of other data types, it has to be
a proper pointer to a data structure in memory; yet for efficiency,  one
would like its actual value to be zero, so that comparisons with it  are
reduced to a zero/non-zero test.  This is only possible if false can  be
located  at address 0, but many systems do not allow this. An acceptable
alternative therefore is to cache its value in a global register.

     As regards registers for temporary working use, the number of these
required will depend  on the code  sequences generated by  POPC and  the
run-time assembler,  and  on  the  needs  of  the  hand-coded  assembler
routines. Usually, this will be at least 4.

     Other registers remaining  can  be  allocated  as  procedure  local
variables,  either  as  'pop'-type or 'non-pop'-type. (As a rule, POPLOG
source code assumes that the first two 'pop' and the first three or four
'non-pop'  lvars  declared  in  a  procedure  will  reside in registers,
although there is no requirement for this other than  efficiency.)  This
allocation  must take into account the efficient saving and restoring of
register values in stack frames, in the order shown above.

2.2 Changes to the Source Code

This section overviews the actual source files that require changing to port POPLOG to a new machine, taking into account the considerations of section 2.1.

2.2.1 Hand-coded Assembler Files

The hand-coded assembler files contain a set of subroutines. These are either called explicitly from the sysPOP-11 source files, or implicitly from code generated by POPC or the Run-time assembler. They perform operations which are either too primitive or too machine/operating system dependent to be coded in sysPOP, or which are so central that they need to be as efficient as possible.

     Naturally, the routines in these files will require recoding for  a
new  CPU;  the detail of many of them also depends on the issues already
discussed, such as the  representation  of  POPLOG  items  and  register
allocation,  etc.  The  following  is  a  brief overview of the routines
contained in each file:

aarith.s (250 lines)
    Subroutines concerned with arithmetic and bitwise-logical operations on integers, e.g. multiplying and dividing simple integers in POPLOG representation, adding, subtracting, multiplying and dividing POPLOG big (arbitrary-precision) integers.

aextern.s (70 lines)
    Routines that provide an interface for POPLOG to call 'external' procedures, i.e. procedures that obey the standard machine/operating system calling protocol (normally encompassing operating system calls and procedures/functions written in conventional languages). A central part of this interface is moving argument values from the POPLOG user stack (which is where POPLOG procedures pass them) to the system control stack (where the standard protocol expects them).

afloat.s (300 lines)
    A set of routines to perform arithmetic and other operations on floating-point data. These depend both on the host system floating-point formats and the machinery for processing them it makes available; a few of them also depend on the chosen encoding for POPLOG single-floats discussed above.

amain.s (500 lines)
    The largest file, containing a mixture of routines for various purposes. These include: the general mechanisms for 'calling' an object in POPLOG (i.e. execute it if it is a procedure, or execute its class_apply procedure otherwise), control routines concerned with abnormal procedure exits, and a number of others. Also contains the start-up routine for the system (main in Unix).

amove.s (350 lines)
    This is concerned with move and compare operations on blocks of memory, implementing them in the most efficient way possible on the host machine. It also contains two subroutines used by the VED editor for efficient displaying of screen lines.

aprocess.s (200 lines)
    Contains core routines for the POPLOG process mechanism. The principal routine in this file is one that swaps a process control stack section in or out (and is by far and away the most complicated of the hand-coded assembler routines). Other routines are concerned with saving and restoring the POPLOG user stack.

aprolog.s (200 lines)
    Defines core subroutines used by POPLOG PROLOG, including simple unification operations (e.g. unify against a constant), and operations that save and restore the current state of the PROLOG global variables. Some of these routines are optional in the sense that they are just optimised versions of sysPOP-11 procedures (and so can be omitted in an initial version of a port).

arestore.s (30 lines)
    A few routines concerned with resetting the memory configuration of the system when restoring a POPLOG saved image.

asignals.s (200 lines)
    Routines for handling signals and exception conditions; the details of these will depend entirely on the host operating system. In Unix, this file defines the routine specified to the signal system call (which is then called to process interrupts, memory access violations, etc).

2.2.2 POPC Code Generation

As described earlier, POPC is a (standard) POP-11 program which takes the rerouted output of the VM compiler and produces from it a file of symbolic assembler; we shall describe here only those aspects of the process which are relevant to a port.

     POPC takes the  list of  VM instructions for  each procedure  being
compiled  and  transforms  it  into   a  list  of  instructions  in   an
intermediate representation, called M-code, which corresponds roughly to
a multiple-operand machine instruction  set with generalised  addressing
modes  (in  fact,  based  originally  on  the  VAX  architecture).   The
interpretation of the sysPOP-11  dialect is performed concurrently  with
this translation (so that,  for example, a VM  CALL instruction for  the
procedure _add is mapped directly onto the M-code instruction ADD).
Following a considerable amount of optimisation (special attention being
given to the elimination of unnecessary user stack pushes and pops), the
M-code list is handed to the back-end code generator for translation  to
target machine assembler code.

     The back-end code generator therefore has to be rewritten for a new
port.  It  is  defined  in  terms  of a set of procedures, each of which
handles the translation of instructions for a given M-opcode,  producing
appropriate   sequences   of  actual  machine  instructions.  The  final
assembler code list (possibly after further optimisation passes) is then
written to the output file.
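The shape of such a back end can be suggested by a table of procedures keyed on M-opcode. The sketch below is purely illustrative: only the opcode name ADD appears in the text above, while MOVE, the operand notation, the assembler mnemonics and all Python names are assumptions made for the example.

```python
def gen_move(src, dst):
    """Translate a hypothetical M-code MOVE into target assembler text."""
    return [f"mov {src}, {dst}"]


def gen_add(src, dst):
    """Translate an M-code ADD into target assembler text."""
    return [f"add {src}, {dst}"]


# One generator procedure per M-opcode; a port supplies its own versions.
GENERATORS = {"MOVE": gen_move, "ADD": gen_add}


def generate(mcode):
    """Translate a list of (opcode, operand, ...) tuples into assembler lines."""
    asm = []
    for opcode, *operands in mcode:
        asm.extend(GENERATORS[opcode](*operands))
    return asm
```

Because each generator returns a list of lines, a single M-code instruction can expand into several machine instructions on targets that lack a direct equivalent.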

     The POPC source  files  that  will/may  require  changing  are  as
follows:

asmout.p (150 lines)
    Defines procedures for outputting data (not instructions) in the appropriate symbolic assembler format (instruction output is dealt with solely by 'genproc.p'). It also defines the procedure which maps POPLOG identifier names onto assembler/linker symbols (the hand-coded assembler files must then follow the same conventions when referring to POPLOG identifiers). Used by all parts of POPC and POPLINK to generate structures other than procedures.

genfloat.p (100 lines)
    This file deals with the generation of floating-point constants, and may or may not need changing depending on the format of floats in the target system. Current versions handle IEEE, VAX and GEC Series-63 formats. (See also 'afloat.s' in 2.2.1.)

genproc.p (1100 lines)
    The back-end code generator, as described above. It performs the translation of M-code to target assembler, and writing of the resultant code to the output file. It must make various definitions required by 'm_trans.p' (the VM code to M-code translator), and in this respect is able to exercise some control over the exact definition of M-code instructions, particularly the registers and addressing modes used. The definitions also include the M-code meanings for sysPOP-11 operations like _int, _pint, _issimple, _iscompound, etc, which depend on the representation of POPLOG items discussed in 2.1.1.

sysdefs.p (100 lines)
    This file is unique to each POPLOG system, and defines (as POP-11 macros) various constants of the implementation. These include such things as machine word size and addressing units, virtual memory page sizes, etc, and conditional compilation switches for selecting alternate parts of the sysPOP source code where necessary.

2.2.3 Run-time Code Generation

There are currently three files concerned with the run-time generation of procedures containing executable code (as discussed in 2.1.3). These are:

ass.p (1100 lines)
    This is the largest and by far the most complicated of the three; it consists of a set of procedures that plant executable code for each VM instruction, as well as the code required for creating stack frames on procedure entry and unwinding them on exit.

         In normal run-time mode  the  VM  compiler  builds  a  list  of
    instructions  for each procedure being processed and, when a list is
    complete for a given  procedure,  hands  it  over  to  the  run-time
    assembler.  (At  this  level,  the VM instruction set is an expanded
    version of that which appears in REF VMCODE, since it differentiates
    the  basic  instructions  according  to their actual arguments, e.g.
    there are different versions of the  CALL  instruction  for  calling
    variable, procedure-type variable or constant procedures.)

         The  interface  to   run-time   assembly   is   the   procedure
    Consprocedure, which takes as input the code list and other relevant
    information (such as details of  the  procedure's  local  variables,
    etc)  and  produces  the  final  procedure  record  as  its  result.
    Consprocedure makes one or more passes on the code  list,  and  each
    time  it  does  so it calls the procedures in ass.p corresponding to
    the instructions in the list. These procedures assume that a  global
    pointer  to  a  region  of  memory has already been set up, and each
    plants its piece of executable code at that place, incrementing  the
    pointer  appropriately.  As  far as possible, the setup is such that
    ass.p  need  only  contain  those  instructions  which  are  machine
    specific; other details are dealt with by Consprocedure.
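
The division of labour described above can be pictured with a small model. The following is only a sketch in Python, not the actual sysPOP code: the opcode numbers, the instruction names and the procedure cons_procedure are all hypothetical, and real planted code is of course machine instructions rather than toy bytes. It shows per-instruction procedures planting code at a shared pointer, and a driver that makes more than one pass so that forward references can be resolved.

```python
# Toy model of a run-time assembler. Every opcode value and name below is
# hypothetical; only the overall shape follows the description in the text.

class CodeRegion:
    """The region of memory being filled, plus the labels seen so far."""

    def __init__(self, labels):
        self.code = bytearray()
        self.labels = labels          # label name -> offset, kept across passes

    @property
    def pointer(self):
        # The "global pointer": the offset at which the next code is planted.
        return len(self.code)

    def plant(self, *byte_values):
        # Appending both writes the code and advances the pointer.
        self.code.extend(byte_values)


def plant_push(region, value):
    region.plant(0x01, value)


def plant_call(region, procedure_id):
    region.plant(0x02, procedure_id)


def plant_label(region, name):
    # Labels occupy no space; they just record the current pointer.
    region.labels[name] = region.pointer


def plant_goto(region, name):
    # On the first pass a forward label is unknown, so 0 is planted instead.
    region.plant(0x03, region.labels.get(name, 0))


PLANTERS = {
    "PUSH": plant_push,
    "CALL": plant_call,
    "LABEL": plant_label,
    "GOTO": plant_goto,
}


def cons_procedure(code_list, passes=2):
    """Run the planting procedures over code_list the given number of times."""
    labels = {}
    region = CodeRegion(labels)
    for _ in range(passes):
        region = CodeRegion(labels)   # start again at the beginning of the region
        for opcode, argument in code_list:
            PLANTERS[opcode](region, argument)
    return bytes(region.code)
```

With a single pass, a GOTO to a label further down the list is planted with the placeholder target; the second pass plants the same instructions again with every label offset known. In this model only the plant_ procedures would need rewriting for a new machine.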

arrays.p (300 lines) This file deals with the construction of array procedures by the procedure newanyarray. An array procedure is called with N subscripts as arguments, where N is the dimensionality of the array; its job is to compute the total subscript value for accessing the desired value in the 1-dimensional vector underlying the array, and then supply both subscript and vector to the appropriate subscripting procedure (which actually performs the access or update).

         Although array procedures could be implemented by  other  means
    that  do  not  require  generation of executable code, this would be
    less efficient; the code generated to compute  the  total  subscript
    can  be  highly  optimised,  and  can  make  use  of special machine
    instructions where appropriate (e.g. the VAX "index" instruction).
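
As an illustration of what the generated code has to compute, here is a minimal Python sketch of the total-subscript calculation. The function names, the bounds format (a low/high pair per dimension), the storage order (last subscript varying fastest) and the 0-based underlying vector are all assumptions made for the example; they are not taken from arrays.p.

```python
def total_subscript(bounds, subscripts):
    """Map N subscripts onto a 0-based index into the underlying vector.

    bounds is a list of (low, high) pairs, one per dimension. The storage
    order assumed here has the last subscript varying fastest.
    """
    if len(subscripts) != len(bounds):
        raise TypeError("wrong number of subscripts")
    total = 0
    for subscript, (low, high) in zip(subscripts, bounds):
        if not low <= subscript <= high:
            raise IndexError("subscript out of range")
        # Scale what has been accumulated by the size of this dimension,
        # then add the offset of this subscript within the dimension.
        total = total * (high - low + 1) + (subscript - low)
    return total


def make_array_procedure(bounds, vector):
    """Return a procedure that takes N subscripts and fetches the element."""
    def array_procedure(*subscripts):
        return vector[total_subscript(bounds, subscripts)]
    return array_procedure
```

Because the bounds are fixed when the array is constructed, the multiplications and range checks in the loop are exactly the part that can be specialised into straight-line machine code for each array.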

partapply.p (100 lines) This defines the procedure partapply, which constructs a closure procedure from a given base procedure and a number of 'frozen-value' arguments. The executable code inside the closure pushes the frozen values onto the user stack and then calls the base procedure (its pdpart). (Note that closures do not alter the control stack, i.e. they don't create stack frames.)
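
The behaviour of a closure can be modelled in a few lines. This is a sketch only, written in Python with an explicit list standing in for the user stack; the names user_stack and subtract are hypothetical, and the real closure is a procedure record containing planted machine code.

```python
user_stack = []   # stand-in for the POPLOG user stack


def partapply(base_procedure, frozen_values):
    """Build a closure that pushes the frozen values, then runs the base."""
    frozen = tuple(frozen_values)

    def closure():
        user_stack.extend(frozen)   # push the frozen values, in order
        base_procedure()            # then call the base procedure (the pdpart)

    return closure


def subtract():
    """Example base procedure: takes two arguments from the user stack."""
    second = user_stack.pop()
    first = user_stack.pop()
    user_stack.append(first - second)
```

Since the frozen values are pushed after whatever the caller has already put on the stack, they act as the trailing arguments of the base procedure: pushing 5 and then calling partapply(subtract, [10]) computes 5 - 10.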

3. Operating System-Specific Tasks ----------------------------------

This section considers issues that arise in porting POPLOG to a new operating system, whether in combination with a new CPU port or not. In fact, some of the matters already dealt with in previous sections may depend more on the details of the host O/S than on the CPU. This applies particularly to memory organisation, where it is the O/S that will determine what address space is available, how it is paged or segmented, how it can be extended or contracted, etc; also important is the O/S handling of things like exception conditions and interrupts. Variation in these areas can mean that a substantial number of modifications are necessary even when transferring the system to a slightly different version of an existing O/S (Unix being a classic example).

     Otherwise, the majority of operating system dependencies in  POPLOG
are related to I/O facilities and file handling, with a number of others
in areas such  as  process  control,  O/S  utility  calls  and  external
procedure loading. We shall deal with each of these in turn.

I/O in POPLOG

Essentially, the I/O facilities in POPLOG are modelled on those of the Unix operating system, providing uniform byte-stream access to all kinds of files and devices. The lowest-level I/O procedures available in the system (sysopen, syscreate, sysread, syswrite, sysseek, sysclose) closely parallel the Unix system calls (open, creat, read, write, lseek, close) and take similar arguments, except that instead of Unix file descriptors, the procedures sysopen and syscreate return a device record, which holds all necessary information for performing I/O on the file or host system device opened. In particular, the device record contains a set of host device-dependent procedures, one for each of the allowable operations (reading, writing, flushing, seeking and closing); the action of each "sys-" procedure applied to a given device is then to call the corresponding device procedure for that operation.

     Thus for I/O in a new O/S environment, the basic work necessary  is
to  provide  suitable  procedure  definitions  for  each  of  the  above
operations on each different kind of host device. In a  non-Unix  system
where  I/O is not organised around the byte-stream paradigm, this can be
far from straightforward; even in Unix these procedures do not (as might
be expected) consist merely of calls to the corresponding Unix routines,
because they provide a further level of buffering on top.
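
The device-record arrangement can be sketched as follows. This is a Python model under stated assumptions, not the POPLOG interface: the Device fields, the simplified argument lists of the sys- functions and the constructor make_memory_device are all hypothetical, and the "host device" is just an in-memory byte store. It shows a record holding one procedure per operation, interface procedures that merely dispatch through the record, and a write buffer layered on top of the host-level store.

```python
import io
from dataclasses import dataclass
from typing import Callable


@dataclass
class Device:
    """Model of a device record: one host-dependent procedure per operation."""
    name: str
    read: Callable[[int], bytes]
    write: Callable[[bytes], int]
    flush: Callable[[], None]
    seek: Callable[[int], int]
    close: Callable[[], None]


# The interface procedures do nothing but call the procedure in the record.
def sysread(device, nbytes):
    return device.read(nbytes)


def syswrite(device, data):
    return device.write(data)


def sysflush(device):
    device.flush()


def sysseek(device, offset):
    return device.seek(offset)


def sysclose(device):
    device.close()


def make_memory_device(name, initial=b""):
    """Construct a device whose 'host device' is an in-memory byte store."""
    store = io.BytesIO(initial)
    pending = bytearray()            # the extra level of buffering on top

    def flush():
        if pending:
            store.write(bytes(pending))
            pending.clear()

    def read(nbytes):
        flush()                      # keep reads consistent with buffered writes
        return store.read(nbytes)

    def write(data):
        pending.extend(data)         # nothing reaches the store until a flush
        return len(data)

    def seek(offset):
        flush()
        return store.seek(offset)

    def close():
        flush()
        store.close()

    return Device(name, read, write, flush, seek, close)
```

Supporting a new kind of host device in this model means writing another constructor like make_memory_device; the sys- procedures themselves do not change.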

The relevant POPLOG source files in this area are:

devio.p (1000 lines) Defines device procedures for each combination of operation and host device type, as well as procedures that actually construct device records.

sysio.p (700 lines) Definitions of the sys- interface procedures for each operation, which take devices as arguments and call the appropriate 'devio.p' procedures stored in the devices.

Additional files for the VMS implementation of POPLOG (where the byte-stream interface has to be simulated) are:

blkio.p (100 lines) Device procedures that simulate full byte-stream and random access on non-text disk files.

rmsio.p (500 lines) A set of procedures that serve as an interface between POPLOG and the RMS (Record Management System) software layer in VMS.
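
To give an idea of what simulating byte-stream and random access involves, here is a minimal Python sketch for the read side. It assumes a host file that can only be read a whole block at a time; the class names, the block size of 512 bytes and the read_block operation are assumptions made for the example, not a description of blkio.p or of RMS.

```python
BLOCK_SIZE = 512   # assumed block size for the example


class BlockFile:
    """Stand-in for a host file that can only be read in whole blocks."""

    def __init__(self, data):
        self._data = bytes(data)

    def read_block(self, block_number):
        start = block_number * BLOCK_SIZE
        return self._data[start:start + BLOCK_SIZE]   # short or empty at the end


class ByteStream:
    """Byte-stream reading and seeking layered on top of a BlockFile."""

    def __init__(self, block_file):
        self._file = block_file
        self._position = 0

    def seek(self, offset):
        self._position = offset
        return offset

    def read(self, nbytes):
        result = bytearray()
        while len(result) < nbytes:
            # Work out which block holds the current byte and where in it.
            block_number, offset = divmod(self._position, BLOCK_SIZE)
            block = self._file.read_block(block_number)
            chunk = block[offset:offset + nbytes - len(result)]
            if not chunk:
                break                 # end of file reached
            result.extend(chunk)
            self._position += len(chunk)
        return bytes(result)
```

A practical version would keep the most recently read block in a buffer instead of fetching it again for every call, and would need the corresponding read-modify-write logic for writing.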

Other O/S Utilities

POPLOG makes available a number of other procedures that are closely linked to operating system utilities (again, usually based around Unix). Some of these are sufficiently general that they will probably have suitable counterparts in a given O/S (and will therefore be supportable), while others may not. Some are even specific to particular hosts (e.g. VMS, Berkeley 4.2 Unix, etc). The following source files and the procedures they contain will therefore need individual consideration:

systime.p (100 lines) Procedures concerned with dates and times, e.g. get the current date/time in Unix format (seconds since 1 Jan 1970), convert a time in this format to an ASCII string, return the CPU time since the start of the process, etc.

systimer.p (50 lines) Deals with setting and clearing interval-timer interrupts.

sysutil.p (500 lines) Miscellaneous procedures: POPLOG equivalents of Unix fork, wait and exec, etc, and corresponding routines in the VMS version of this file.
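
The three time facilities mentioned under 'systime.p' correspond to services that most hosts provide in some form. As a point of comparison only, the sketch below shows them using Python's standard time module; the function names are hypothetical and are not the names of the POPLOG procedures.

```python
import time


def current_time_seconds():
    """Current date/time in Unix format: whole seconds since 1 Jan 1970."""
    return int(time.time())


def time_to_string(seconds):
    """Convert a time in Unix format to an ASCII string (in UTC)."""
    return time.asctime(time.gmtime(seconds))


def cpu_time_seconds():
    """CPU time used by the current process so far, in seconds."""
    return time.process_time()
```

On a host whose native time format is not seconds since 1970, the first two procedures are where the conversion has to be done.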

External Procedures

The external procedure facility in POPLOG enables functions and procedures in non-POPLOG languages to be loaded into and called from within the system, all at run-time. Roughly speaking, this is achieved by calling the host system linker (as a separate O/S process) to link one or more user object modules with the POPLOG base image. This results in the creation of an image file in which the code and data for the external procedures are located at an address determined by POPLOG, which then loads it into memory at the appropriate place.

     Naturally, much of this process depends on the details of the  host
O/S,  including object module/symbol table formats and the functionality
and capabilities of the linker. The sysPOP source code for this area  of
the  system  attempts  to  be  as  general  as possible, however, and is
divided into two files, one being (at least, nominally) host-independent
and the other encapsulating the dependencies:

extern.p (600 lines) This file contains all the interface procedures (external_load, external_apply, etc) for using external procedures in POPLOG, and calls procedures in one of the files below to perform all operations which depend on the details of the host O/S.

unixextern.p (700 lines) vmsextern.p (400 lines) The host-dependent parts for the Unix and VMS implementations respectively. These deal with object module/symbol table formats, calling the O/S linker, and loading the resultant image sections into memory.

malloc.c (300 lines) The one and only C source file in the system. It redefines the O/S library routines for allocating and freeing blocks of dynamic memory so that they use areas set aside by POPLOG, and are thus under POPLOG's control. In Unix, these are the C library routines malloc, free, etc; in VMS they are the same plus lib$get_vm and lib$free_vm. May need modification for a new O/S.
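
The idea behind replacing the allocation routines can be shown with a toy allocator that hands out space only from a fixed region owned by the caller. The sketch is in Python for brevity, whereas the real file is C; the class Arena, its first-fit policy and its use of offsets in place of addresses are all assumptions for illustration and say nothing about how malloc.c is actually written.

```python
class Arena:
    """First-fit allocator over a fixed region set aside in advance."""

    def __init__(self, size):
        self.memory = bytearray(size)     # the area under the system's control
        self.free_list = [(0, size)]      # (offset, length), sorted by offset
        self.allocated = {}               # offset -> length of each live block

    def malloc(self, nbytes):
        """Return the offset of a block of nbytes, or None if none fits."""
        if nbytes <= 0:
            raise ValueError("block size must be positive")
        for index, (offset, length) in enumerate(self.free_list):
            if length >= nbytes:
                if length == nbytes:
                    del self.free_list[index]
                else:
                    # Take the front of the free block and keep the rest.
                    self.free_list[index] = (offset + nbytes, length - nbytes)
                self.allocated[offset] = nbytes
                return offset
        return None

    def free(self, offset):
        """Return a block to the free list, merging adjacent free blocks."""
        length = self.allocated.pop(offset)
        merged = []
        for block in sorted(self.free_list + [(offset, length)]):
            if merged and merged[-1][0] + merged[-1][1] == block[0]:
                merged[-1] = (merged[-1][0], merged[-1][1] + block[1])
            else:
                merged.append(block)
        self.free_list = merged
```

The point of the arrangement is that external code calling the allocation routines never obtains memory behind the system's back: every block comes out of the region that was reserved for the purpose.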

John Gibson July 86
