Description of the Anubis Compiler
This file contains a description of the Anubis compiler together with a discussion on
how the next version of the compiler should be designed.
-------------------------------- Table of Contents ------------------------------------
*** (1) Overall description.
*** (1.1) Modules of the compiler.
*** (1.2) How it works.
*** (1.2.1) Phase 1.
*** (1.2.2) Phase 2.
*** (1.3) Garbage collection.
*** (2) Tools.
*** (2.1) Lisp-like representation.
*** (2.2) Exceptions to the Lisp-like 'Expr' representation.
*** (2.3) Catalog of all internal data representations.
*** (2.3.1) Internal representation of types.
*** (2.3.2) Internal representation of terms.
*** (2.3.3) Internal representation of interpretations.
*** (2.3.4) Internal representation of type implementations.
*** (2.3.5) Internal representation of virtual machine instructions.
*** (2.3.5.1) Pseudo instructions.
*** (2.3.5.2) Computing instructions.
*** (2.3.5.3) Garbage collector instructions.
*** (2.3.5.4) Closure instructions.
*** (2.3.5.5) Equality instructions.
*** (2.3.5.6) Dynamic variables instructions.
*** (2.3.5.7) Serialising/unserialising.
*** (2.3.5.8) Starting a new virtual machine.
*** (2.3.5.9) Obsolete and soon obsolete instructions.
*** (2.4) Unification.
*** (2.5) Organization of the knowledge base.
*** (2.6) Indexation of symbols.
*** (3) The lexer.
*** (3.1) Generalities.
*** (3.2) Finding source files.
*** (3.3) The 'read' stack.
*** (3.4) Remembering already read files.
*** (3.5) Computing visibilities.
*** (4) The parser.
*** (5) Computing interpretations of terms.
*** (5.1) Overview.
*** (5.2) Interpreting applicative terms.
*** (5.3) Interpreting conditionals.
*** (5.4) Interpreting selective conditionals.
*** (6) Implementation of types.
*** (6.1) Implementation of primitive types.
*** (6.1.1) 'String'.
*** (6.1.2) 'ByteArray'.
*** (6.1.3) 'Int32'.
*** (6.1.4) 'Float'.
*** (6.1.5) 'Listener'.
*** (6.2) Implementation of defined types.
*** (6.3) Implementation of functional types.
*** (6.4) Implementation of address types.
*** (6.4.1) File and network address types.
*** (6.4.2) Dynamic variable address types.
*** (6.5) Implementation of C structure pointer types.
*** (7) Code generation.
*** (7.1) General conventions.
*** (7.2) Elimination of terminal recursion.
*** (7.3) Generating the code.
*** (7.3.1) Protect.
*** (7.3.2) Lock.
*** (7.3.3) Global symbols.
*** (7.3.4) Strings, Int32, small data and Floats.
*** (7.3.5) Local symbols.
*** (7.3.6) Microlocal symbols.
*** (7.3.7) Closures.
*** (7.3.8) Applicative terms.
*** (7.3.9) Conditionals.
*** (7.3.10) Selective conditionals.
*** (7.3.11) Local definitions ('with ...').
*** (7.3.12) Delegate.
*** (7.3.13) Serialize/Unserialize.
*** (7.4) Peephole optimisation.
*** (8) Garbage collector code.
*** (9) Equality code.
*** (10) Serialisation/unserialisation.
*** (10.1) How it works.
*** (10.2) Descriptions of small types.
*** (10.3) Pseudo-instructions set.
*** (10.4) Serialising.
*** (10.5) Unserialising.
*** (11) Outputting the code (making an *.adm file).
*** (12) How to make a system of macros?
---------------------------------------------------------------------------------------
*** (1) Overall description.
*** (1.1) Modules of the compiler.
The Anubis compiler is made of the following main parts:
- the lexer (lexer.y)
- the parser (grammar.y)
- a module for analysing type definitions (typedef.c)
- a module for analysing data definitions (opdef.c)
- the interpretations module (this is not an interpreter) (interp.c)
- the compiler proper (generation/optimization of symbolic code) (compile.c)
- the module for computing the implementations of data types (implem.c)
- the output module (making the '*.adm' files) (output.cpp)
It also contains secondary modules. Some of them are always used:
- a module for deciding if a type is recursive (rectype.c)
- tools for working on data types (unify.c, typetools.c, typewidth.c, typecmp.c, unknowns.c)
- computing the implicit destructors (destruct.c)
- computing the deletion code (delcode.c)
- computing the equality code (eqcode.c)
- a module containing various tools for manipulating expressions (expr.cpp)
- a system for managing the memory of the compiler itself (mallocz.c)
- a set of messages and functions for printing messages (msgtexts.c, show.c)
- a set of predefined paragraphs (predef.c, predef.anubis, predefined.anubis)
- tools for computing the size and binary format of instructions (vminstr.c)
Some others are used only on demand (options of the compiler):
- making an HTML index of all public symbols (index.c)
- outputting a C version of some Anubis types together with the constructors (dumpct.c)
- making a 'polish' translation (polish.c)
- producing a symbolic code dump (*.sc) (symcode.c)
*** (1.2) How it works.
The compiler works in two phases:
- phase 1: reading source files, checking paragraphs, computing the interpretations.
- phase 2: generating a module for each 'global' paragraph.
*** (1.2.1) Phase 1.
During phase 1, the compiler reads source files and does the following for each
paragraph:
1. reading the paragraph (using the parser and the lexer). This may produce a
syntax error.
2. depending on the kind of paragraph:
- 'read' paragraph: checking whether the file referred to has already been read. The
file is read only if it has not already been read; in either case, visibility is set
up correctly. See the section on the lexer below.
- 'replaced by' paragraph: this is the same as 'read' paragraphs, except that
visibility is computed differently. See the section on the lexer below.
- 'type' paragraph: the type definition is checked. See the section 'Checking a
type definition' below.
- 'define' paragraph: the definition is checked. See the section 'Checking a datum
definition' below.
- 'C constructor' paragraph. An Anubis type is translated into some program usable
from within a C program. (the output, for all those paragraphs, is written in the
files 'construc.h.out' and 'construc.c.out').
At most one error is reported per paragraph. If an error is found, the analysis restarts
at the next paragraph.
*** (1.2.2) Phase 2.
Phase 2 is performed only if there is no error after phase 1. Phase 2 performs the
following tasks:
- check that all type definitions are complete. Since a type may be defined using
several paragraphs, it is possible to define incomplete types (types for which not
all alternatives have been defined).
- determine which types are infinite (recursive).
- Dump C versions of types (if required).
- make the modules (*.adm).
- dump implementations of types (if required).
- print statistics (if required).
- print possible errors after phase 2.
*** (1.3) Garbage collection.
Data are implemented in two different ways:
- directly
- indirectly
'Directly' means that what we have in hand is the datum itself, and 'indirectly' means
that what we have in hand is a pointer to a segment of memory containing the datum.
Data may be 'created', 'duplicated' and 'destroyed'. When we duplicate (or 'copy') a
direct datum, we have no choice, we just copy all the bits of the datum. When we
duplicate an indirect datum, we have the choice between:
- allocating a new segment of memory, and copying the content of the original segment
into the new segment (this is called 'real' copy),
- just duplicating the pointer (this is called 'virtual' copy).
Notice that these methods are semantically equivalent for abstract data. For concrete
data (for example: dynamic variables, connections, etc...), indirect implementation and
virtual copy are required.
Of course, virtual copy is faster, but it creates 'shared data'. When a (necessarily
indirect) datum is shared, we have several pointers pointing to the same segment of
memory. Then, the problem is just to know when the segment is no longer used, so that we
can return it to the pool of memory. This problem is easily solved by 'reference
counting', which works as follows.
The first 32-bit word of any allocated segment is reserved for reference counting. It
contains an unsigned integer, called the 'counter' (for this segment). This counter is
either:
- zero, in which case, the segment is 'permanent' and will never be freed,
- non-zero, in which case its value is the number of pointers pointing to this
segment.
Permanent segments are created in general (always?) at compile time. For example, if
you write a character string in your source text, the compiler creates a permanent
segment for holding this string (this segment is within the program code itself). Of
course, the counter of a permanent datum is never changed and always remains 0.
Non-permanent data are created dynamically at run time. The function
'AllocateDataSegment' (see the virtual machine source code) receives the number of
32-bit words to allocate, and returns a pointer to the segment (actually cast to
'U32'). The first 32-bit word of the segment (the counter) is already initialized to
1.
When we need to make a virtual copy of a (non-permanent) indirect datum, we just have
to increment the counter of this segment (so that it always represents the number of
(significant) pointers to the segment). Notice that this will never overflow, because
an overflow would mean a number of pointers greater than the total memory could hold.
When we want to delete a (non-permanent) indirect datum, we perform a 'virtual
deletion'. This means that we decrement the counter. If the counter becomes 0, the
segment may be freed.
But freeing the segment is not as simple as this, because the segment itself may
contain pointers which are counted within other segments. Freeing the segment means
destroying these pointers, so that we must also decrement the corresponding
counters. This process is recursive, because destroying a pointer may decrement
another counter to zero, freeing another segment, and so on.
Hence, the process of freeing the segments must be coded as a (recursive) program. This
is what we call 'garbage collector code'. This code is generated by the compiler and
depends only on the definition of types. Actually, there is one garbage collector
function for each defined type which may contain indirect data (i.e. 'mixed' and
'large' types). Some other types (like functional types) require special garbage
collector functions.
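The reference counting scheme described in this section can be sketched in C as
follows. This is a minimal illustration only: the function names 'allocate_segment',
'virtual_copy' and 'virtual_delete' are assumptions, and the real virtual machine's
'AllocateDataSegment' and generated garbage collector code differ in detail (in
particular, the recursive deletion of inner pointers is generated per type by the
compiler).

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t U32;

/* Allocate a segment of 'words' 32-bit words plus the counter word.
   The counter (first word) is initialized to 1: one pointer refers
   to the new segment. */
U32 *allocate_segment(U32 words)
{
    U32 *seg = (U32 *)malloc((words + 1) * sizeof(U32));
    seg[0] = 1;
    return seg;
}

/* Virtual copy: a counter of 0 marks a permanent segment and is never
   touched; otherwise, count one more pointer to the segment. */
U32 *virtual_copy(U32 *seg)
{
    if (seg[0] != 0)
        seg[0]++;
    return seg;
}

/* Virtual deletion: decrement the counter; when it reaches 0, the
   segment may be freed. The generated garbage collector code would
   first recursively delete the pointers contained in the segment. */
void virtual_delete(U32 *seg)
{
    if (seg[0] == 0)          /* permanent: never freed */
        return;
    if (--seg[0] == 0) {
        /* ... recursively virtual_delete inner pointers here ... */
        free(seg);
    }
}
```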
*** (2) Tools.
*** (2.1) Lisp-like representation.
The Anubis compiler uses a particular style of programming inspired by Lisp. The
advantage is that this system is very flexible. The disadvantage is that, not being
typed, it is not very safe. I think I will never do that again. It works as follows.
The type 'Expr' represents all symbolic 'expressions' to be manipulated by the
compiler. This includes tokens generated by the lexer, syntactical trees generated by
the parser, interpretations generated by the interpretations module, contexts,
environments, symbolic code, ...
Actually, 'Expr' is just a clone of U32, the type of unsigned 32-bit integers. These
integers are 'dynamically' typed (using the low bits) as follows:
0 mod 2 ---------------------------------> (signed) integers
1 mod 2 ---> 1 mod 32 ------------------> tags
3 mod 32 ------------------> temporary pairs
5 mod 32 ------------------> strings
7 mod 32 ---> 7 mod 64 ---> user defined type variables
39 mod 64 ---> internal type variables (unknowns)
9 mod 32 ------------------> permanent pairs
In other words:
xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxx0 integer
xxxxxxxx xxxxxxxx xxxxxxxx xxx00001 tags
xxxxxxxx xxxxxxxx xxxxxxxx xxx00011 temporary pairs
xxxxxxxx xxxxxxxx xxxxxxxx xxx00101 strings
xxxxxxxx xxxxxxxx xxxxxxxx xx000111 user defined type variables (like $T)
xxxxxxxx xxxxxxxx xxxxxxxx xx010111 unknowns
xxxxxxxx xxxxxxxx xxxxxxxx xxx01001 permanent pairs
Notice, for example, that pairs (both temporary and permanent) have their 5 lowest
bits taken by the tag. Hence, the maximal number of temporary pairs is 2^27, and the
same for permanent pairs. This makes at most 134217728 pairs of each sort. For
temporary pairs, this is
not a problem, but for permanent pairs, this may be a problem. Nevertheless, since a
pair occupies 8 bytes, an array of permanent pairs containing 134217728 pairs would
contain 2^30 bytes, i.e. 1 gigabyte. So, with a machine of less than 1 gigabyte of
memory, no problem will happen (except that very big programs may not be compiled).
A good program should be protected against such problems independently of the size of
available memory. Examining the code, I see that this is not the case for Anubis
1. Indeed, the function 'pcons' (defined in 'expr.cpp'), which is used for making a new
permanent pair, has no protection against an enlargement of the array of permanent
pairs above the limit. This should be fixed as soon as possible. (Similarly, 'cons' for
temporary pairs is not well protected).
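For illustration, the tag tests implied by these bit patterns could be written as C
macros along the following lines. This is a sketch: the actual macros are defined in
'compil.h', and the names used here (apart from 'consp') are assumptions.

```c
typedef unsigned int Expr;   /* 'Expr' is a clone of U32 */

/* Dynamic type tests following the bit patterns above
   (sketch; the real macros in 'compil.h' may differ). */
#define is_integer(e)    (((e) & 1)  == 0)   /* 0 mod 2   */
#define is_tag(e)        (((e) & 31) == 1)   /* 1 mod 32  */
#define is_temp_pair(e)  (((e) & 31) == 3)   /* 3 mod 32  */
#define is_string(e)     (((e) & 31) == 5)   /* 5 mod 32  */
#define is_type_var(e)   (((e) & 63) == 7)   /* 7 mod 64  */
#define is_unknown(e)    (((e) & 63) == 39)  /* 39 mod 64 */
#define is_perm_pair(e)  (((e) & 31) == 9)   /* 9 mod 32  */

/* A pair is either a temporary or a permanent pair. */
#define consp(e)         (is_temp_pair(e) || is_perm_pair(e))
```

With this encoding, each test is a single mask-and-compare, which is why the low
bits (rather than a separate type field) carry the dynamic type.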
All 'Expr's except temporary and permanent pairs are called 'atoms'. Pairs, either
temporary or permanent, are just made of two 'Expr's, and represented (in the comments
of the source of the compiler) as follows:
(a . b)
where 'a' and 'b' are two 'Expr's respectively called the 'head' and 'tail' of the
pair. The macro 'cons' enables the construction of such pairs.
As in Lisp, lists are represented using pairs and the tag 'nil' (the empty list) as
follows (for a list made of 4 'Expr's):
(a b c d)
which is actually a shortcut for:
(a . (b . (c . (d . nil))))
Such lists may be easily constructed using the macros 'list1', 'list2', ... defined in
'compil.h'.
Now, an 'Expr' like (a . (b . (c . d))) is written '(a b c . d)', and called a
'multiple cons'. A set of macros, called 'mcons3', 'mcons4', ... enable a direct
construction of such data.
Examples of use and meaning of these macros (and some others):
The C term represents
---------------------------------------------------------------
nil ( )
cons(a,b) (a . b)
mcons3(a,b,c) (a b . c)
mcons4(a,b,c,d) (a b c . d)
etc...
list1(a) (a) same as (a . nil)
list2(a,b) (a b) same as (a b . nil)
etc...
If x = (a b c d e):
car(x) a
second(x) b
third(x) c
etc...
cdr(x) (b c d e)
cdr2(x) (c d e)
cdr3(x) (d e)
etc...
Notice that 'mcons2' does not exist because it is the same as 'cons'.
'consp(x)' is 1 if and only if x is a pair (i.e. not an atom). Otherwise, it is 0. This
predicate ('consp' is a traditional Lisp abbreviation for 'cons predicate') is very
often used as follows:
while(consp(x))
{
Expr head = car(x);
... do something with the head of x ...
x = cdr(x); // continue with the tail of x
}
// in most cases, when we arrive here, we have x == nil (the empty list).
Normally, pairs are created as temporary, and made permanent only if
needed. The point is that, while working on a paragraph, the compiler constructs
data which will not be part of the final result. When the analysis of a paragraph is
finished, only the data to be kept are made permanent (using the function 'save'), and
the array of temporary pairs is reset to zero for the next paragraph. This is how
memory is managed in the compiler (this has nothing to do with garbage collection in a
virtual machine).
Integers are coded on 31 bits (bits 1 to 31); bit number 0, whose value is 0,
indicates that the 'Expr' is an integer. The file 'compil.h' contains all the macros
needed to test the type of an 'Expr', or to construct 'Expr's.
Tags are used for representing simple atomic concepts. For example, the tag 'nil'
represents the empty list. They are also used for 'tagging' non atomic data. For
example, when the function
(Int32 i, String s) |-> a
is read from a source file, the parser constructs the following 'Expr':
(lambda <lc> (([Int32] . i) ([String] . s)) . [a])
where '[Int32]' is a representation of the type 'Int32' (similarly for 'String'), and
'[a]' is a representation of the term 'a'.
Here, the tag 'lambda' represents the arrow '|->'.
*** (2.2) Exceptions to the Lisp-like 'Expr' representation.
Actually, this system has exceptions. Normally, all tokens returned by the lexer have a
value of type 'Expr', i.e. a datum containing its own (dynamic) type. By examining such
a value, it is possible to know whether we have an integer, a string, a pair, etc...
However, the two tokens 'yy__integer' and 'yy__char' have a value of type 'integer'.
This means that when the lexer reads a number (regular expression: [0-9]+), it returns
the token 'yy__integer', with a value to be interpreted as a C integer, not as a
Lisp-like integer. We must be careful, because the integer 6, for example, is
represented as follows in the two systems:
...0110 (C integer: this is not an 'Expr')
...01100 (Lisp-like integer: this is an 'Expr')
^
|
+--------- this bit (0) means that this 'Expr' is an integer
The reason for these exceptions is that we want the compiler to be able to read 32 bits
integers, not only 31 bits integers from source files.
The parser contains the following grammar rule:
Term: yy__integer %prec prec_symbol { $$ = mcons3(integer,linecol(),$1); }
This means that when an integer has been read by the lexer, the parser constructs the
following multiple cons as a representation of this integer:
(integer <lc> . <n>)
where <n> is the value of the integer as a C integer, not as a Lisp-like integer.
As a consequence, we must always remember that a list whose head is the tag 'integer'
always has the form (integer <lc> . <n>), where <n> is not an 'Expr'. It would be
meaningless to apply a predicate like 'is_integer' to <n>, and the result would be
unpredictable.
The exceptions are the following (where only <Cint> is not an 'Expr'):
(integer <lc> . <Cint>)
(small_datum <type> . <Cint>)
(anb_int32 <lc> . <Cint>)
(load_int32 . <Cint>)
(I hope there are no other exceptions, but I'm not absolutely sure).
Of course, the existence of these exceptions is very bad. A better solution would be
to represent (say) the integer 746542 as the list (7 4 6 5 4 2), where the elements of
the list would be Lisp-like integers. In the compilation step, the compiler may
transform this representation into a packed integer of 32, 64 bits,... maybe with a
warning in the case of an overflow. Actually, this is what already happens at a
smaller scale. Indeed, the current version of the compiler finds 3 interpretations for
the integer 180 (of types 'Int8', 'Int32' and 'Float'), but only 2 for 4567, because
this number is greater than 255.
*** (2.3) Catalog of all internal data representations.
The compiler represents the following data (internally):
- 'types' (as produced by the parser)
- 'terms' (as produced by the parser)
- 'interpretations' (as produced by 'term_interpretations' and related functions)
- 'implementations' (as produced by 'type_implementation' and related functions)
- 'instructions' (as produced by 'compile_term' and related functions)
Note: All data are represented either by Lisp-like atoms (string, tags, ...), or by
lists. In the case of a list, the head of list is always a tag.
*** (2.3.1) Internal representation of types.
<type> :=
yy__Symbol just a type name in the form of a Lisp-like string
For example, the types 'One', 'Bool' etc ... are in this category. The file
'compil.h' contains the definitions of a lot of 'predefined strings'. Within the C
source, such symbols may be represented by the C symbols 'pdstr_One', 'pdstr_Bool'
etc... ('pdstr_' for 'predefined strings').
Equivalently, 'pdstr_Bool' may be replaced by 'new_string("Bool")'. This has exactly
the same effect, except that 'pdstr_Bool' is faster at execution (of the compiler).
Remark: 'new_string' (defined in 'expr.cpp') does not allocate a new segment for the
string if the string already exists. In this case, it returns the already existing
Lisp-like string.
All types (without parameters) whose names are created by the user are also
represented like this.
yy__utvar user type variable (type parameter)
Type parameters have a special Lisp-like representation as atoms (see above).
type_String a Lisp-like tag representing the type 'String'
type_ByteArray idem
type_Int32 idem
type_Float idem
type_Listener idem
Being more 'primitive', the above are represented by Lisp-like tags instead of
Lisp-like strings.
(type_RAddr . <type>) represents 'RAddr(T)'; type_RAddr is a Lisp-like tag
(type_WAddr . <type>) idem
(type_RWAddr . <type>) idem
(type_GAddr . <type>) idem (obsolete: global variable)
(type_Var . <type>) idem (dynamic variable)
(type_MVar . <type>) idem (multiple dynamic variable)
The above are called 'address types'.
(app_ts <name> . <types>) type scheme applied to type operands
'app_ts' is for 'apply type scheme'. For example, 'List(String)' is represented as:
(app_ts new_string("List") type_String)
and 'Result(One,String)' as:
(app_ts new_string("Result") pdstr_One type_String)
Notice that there is no dot in the above Exprs. '(app_ts new_string("List")
. type_String)' is erroneous, because the second tail ('cdr2') of the list must be a
list of types, not a single type.
(type_struct_ptr . <id>) encapsulated C structure type
<id> is a Lisp-like integer identifying the structure.
(functype <types> . <type>) functional type
<types> is the list of the representations of the arguments types, and <type> is the
representation of the target type.
There is still another sort of 'types' represented internally: the 'type
unknowns'. They are also a special sort of Lisp-like Exprs. They are not generated by
the parser, but in general by the interpretation functions. When a symbol is found,
representing a scheme, all the type variables in this scheme are replaced by fresh
unknowns. Of course, there is a generator of fresh unknowns: 'fresh_unknown'.
From this we may conclude that defined types are those which are represented either by
a string or by (app_ts ...).
*** (2.3.2) Internal representation of terms.
Note: <lc> (generated by the lexer) is a packed version of the triplet
(filename,line,column). There are macros for extracting the components:
file_in(<lc>) a C char *
line_in(<lc>) a C int
col_in(<lc>) a C int
As usual, <Cint> represents a C integer (i.e. not a Lisp-like integer).
<term> :=
(alert <lc> . <filename>) <filename> is a Lisp-like string
(protect <lc> . <term>)
(lock <lc> <term> . <term>)
(debug_avm <lc> . <term>)
(avm <lc> . AVM) (AVM is explained below)
(terminal <lc> . <term>)
(symbol <lc> . <Lisp-like string>)
(integer <lc> . <Cint>)
(fpnum <lc> <mantissa> . <exponent>) (<mantissa> and <exponent> are Lisp-like integers)
(of_type <lc> <type> . <term>) (explicit typing)
(string <lc> . <string>) (<string> is a Lisp-like string)
(with <lc> <symbol> <value> . <term>)
(delegate <lc> <term> . <term>)
(lambda <lc> <FArgs> . <term>) (|->)
(rec_lambda <lc> <name> <FArgs> . <term>) (|-f->)
(app <lc> <term> . <terms>) (applicative term)
(cond <lc> <term> . <cases>) (conditional)
(select_cond <lc> <term> <case> . <term>) (selective conditional)
(serialize <lc> . <term>)
(unserialize <lc> . <term>)
The following are still used, but should become obsolete (and replaced by system calls).
(anb_read ...)
(anb_write ...)
(anb_exchange ...)
(connect_to_file ...)
(connect_to_IP ...)
(wait_for ...)
*** (2.3.3) Internal representation of interpretations.
The function 'term_interpretations' (in 'interp.c') computes the list of all
interpretations of a term (in a given appropriate context). An interpretation is a
pair:
(head . env)
where 'head' is called an 'interpretation head', and where env is an 'environment'.
Here, we describe interpretation heads.
Unfortunately, I've used the same tags as for terms. It would be safer to use different
tags. Furthermore, tags for some sort of expressions should have a uniform prefix, for
example:
typ_ for types
ter_ for terms
int_ for interpretations
imp_ for implementations
ins_ for instructions
The tags should also be numbered. This will limit the risk of forgetting a tag in a
switch. Hence for example, we should have the following tags for terms:
ter_00_alert
ter_01_protect
ter_02_lock
etc...
and a dummy one for marking the end of the list:
ter_20_dummy
Interpretation heads are the following:
(protect <lc> . <head>)
(lock <lc> <term> . <term>)
(avm <lc> . <instructions>)
(debug_avm <lc> . <head>)
(terminal <lc> . <head>)
(operation <lc> <opid> <name> <parms> <type> . <types>)
(string <lc> . <string>)
(anb_int32 <lc> . <Cint>)
(small_datum <type> . <Cint>)
(fpnum <lc> <mantissa> . <exponent>)
(local <name> <i> . <type>)
(micro_local <name> <d> <i> . <type>)
(closure <lc> (f_micro_ctxt <fname> <ftype> (<sym> . <type>) ...) <args> . <body>)
(app <lc> <head> . <heads>)
(cond <lc> <head> ((<name> (<sym> . <type>) ...) <lc> . <head>) ...)
(select_cond_interp <lc> <test head> <index> <clause head> <head then> . <head else>)
(with <lc> <symbol> <head> . <head>)
(delegate <lc> <head> . <head>)
(serialize <lc> . <head>)
(unserialize <lc> <type> . <head>)
The following are still used but should become obsolete:
(anb_read ...)
(anb_write ...)
(anb_exchange ...)
(wait_for ...)
*** (2.3.4) Internal representation of type implementations.
Type implementations are computed in 'implem.c'. We never compute an implementation for
a scheme, but only for an actual type (i.e. with no type parameter). For example,
'Maybe(Int8)' has an implementation different from 'Maybe(String)', but 'Maybe($T)' has
no implementation at all.
In the future, it would be nice to have quantification 'à la Girard' on types:
forall $T ...
exists $T ...
In this case, we would have to implement for example the type:
forall $T Maybe($T)
which has no parameter, since $T is bound in this expression.
Internal representations of implementations are the following. Again the same tags are
used, which is quite unfortunate.
<implem> :=
type_String
type_ByteArray
type_Int32
type_Float
(functype <implem> . <implems>)
(type_RAddr . <implem>)
(type_WAddr . <implem>)
(type_RWAddr . <implem>)
(type_GAddr . <implem>) (obsolete)
(type_MVar . <implem>)
It is very unfortunate that dynamic variables are not implemented as '(type_Var
. <implem>)'. Instead, this type is implemented as a large type. I don't understand
how this can work properly (but it seems to work miraculously). This should be
changed of course.
(type_struct_ptr . <id>)
(small_type <nalt> <iw> <alt geom> ... <alt geom>)
(mixed_type <nalt> <iw> <alt geom> ... <alt geom>)
(large_type <nalt> <iw> <alt geom> ... <alt geom>)
where:
<alt geom> :=
(small_alt (<imp> <offset> . <width>) ... (<imp> <offset> . <width>))
(mixed_alt (<imp> <offset> . <width>) ... (<imp> <offset> . <width>))
(large_alt (<imp> <offset> . <width>) ... (<imp> <offset> . <width>))
*** (2.3.5) Internal representation of virtual machine instructions.
*** (2.3.5.1) Pseudo instructions.
These instructions are of size 0.
(label . <addr>) position within the code
(ret_point . <addr>) same as label
(header ...) used for decorating the .sc files
(comment ...) idem
(context ...) idem
*** (2.3.5.2) Computing instructions.
(glue_index . <index>) used for small types
(glue_mixed_index . <index>) used for mixed types
(store_index . <index>) used for large types
These instructions are used for constructing a datum of a defined type. They put the
right index (i.e. alternative number of the datum) into the datum.
For a small datum, 'glue_index' puts the index within the least significant bits (of
R). This is the first operation when constructing the datum, so that the content of R
is actually overwritten.
For a mixed datum, the index is put into the 2 least significant bits by
'glue_mixed_index'.
For a large datum, the index (a byte) is put at offset 4 (just after the counter) into
the segment.
(glue . <bit width>) used for small and mixed types
(store_0 . ?) used for mixed and large types
(store_1 . ?)
(store_2 . ?)
(store_4 . ?)
These instructions are used for putting the components into the datum under
construction (of some defined type).
The instruction '(glue . <bit width>)' ORs the register R with a left shift of the top
of stack. The number of bits of the shift is given by the operand.
'(store_? . <offset>)' stores a word of 0, 1, 2 or 4 bytes at the offset given by the
operand within the segment pointed to by R.
(index_direct . <bit width>)
index_indirect
These instructions are used by conditionals. A datum of a defined type is in the
register 'R', and they put its index (alternative number) into the index register
'I'. The first one is used for small and mixed data, the second one for large data.
(switch <addr> ... <addr>)
This instruction checks the content of the index register 'I', and jumps to the
corresponding address. It is used by conditionals for jumping to the right case.
(unglue <bit width> . <right shift>)
(unstore <offset> . <n>) where <n> is a Lisp-like integer equal to 0, 1, 2 or 4
(unstore_copy . <offset>)
(unstore_copy_mixed <offset> . <mask>)
(unstore_copy_ptr . <offset>)
(unstore_copy_function . <offset>)
These instructions are used by conditionals for pushing the values of resurgent symbols
onto the stack.
'(unglue <bit width> . <right shift>)' is used for extracting a component from a small
datum. The extracted component is pushed onto the stack.
'unstore' instructions are similar but for indirect data. Some of them need to make a
virtual copy of the datum to be extracted.
(jmp . <addr>)
This simple 'jump' instruction is used at the end of cases in conditionals for jumping
to the next code (which is common to all cases).
(apply . k)
(ret . k)
When '(apply . k)' is about to be executed, the stack contains (starting at the top of
stack) the arguments of a function and a return address. The function to be 'applied'
to these arguments is in 'R'.
This instruction acts in two different ways depending on the nature of the function,
which may be either 'top level' or a 'closure'. See the section on closures and
functions for more explanations.
'(ret . k)' is just an ordinary return instruction (except that it stops the virtual
machine if the return address is zero).
'k' is the number of arguments of the function.
(load_Int32 . <Cint>)
(load_float <mantissa> . <exponent>)
(string . <string index>)
These instructions are used for loading constants into 'R'.
(check_stack . <n>)
This instruction checks if the stack is big enough for pushing '<n>' 32 bits words. If
it is not, it puts the virtual machine in the state 'needs_bigger_stack' and gives up,
so that the scheduler may enlarge the stack of this machine.
(push_addr . <addr>)
This instruction pushes a return address on top of stack.
(address . <addr>)
The same but the address is put into 'R'.
push
This instruction pushes the content of 'R' onto the stack.
(select_index_direct <bit width> <index> . <addr>)
(select_index_indirect <index> . <addr>)
These instructions are generated for selective conditionals.
(peek <name> . <depth>)
(micro_peek <name> <depth> . <micro depth>)
The instruction 'peek' takes the 32 bits word at depth '<depth>' in the stack and puts
it into 'R'. The datum is not virtually copied. The compiler generates a 'copy'
instruction to follow this one, if needed, depending on the type of the datum.
'micro_peek' is similar, except that the datum to be put into 'R' is located in a
micro context. The micro context is at depth <depth> in the stack, and the datum is
at position <micro depth> in the micro context.
protect
(unprotect . <addr>)
unlock
lock
odd_align
This instruction is put in front of the code of a top level function if this code would
be at an even address.
pop1
pop2
pop3
These instructions just remove 1, 2 or 3 32 bits words from the top of stack.
invalid
This instruction should never be executed. It is generated in the garbage collector
code for small alternatives in mixed types. This means that the garbage collector code
must never be called for a datum belonging to a small alternative of a mixed type. Other
instructions with a <mask> operand must use this mask to avoid such a call.
give_up
This instruction makes the virtual machine give up as if its time slice were
finished.
start_debug_avm
stop_debug_avm
These instructions set and unset a flag. When this flag is set, the virtual machine
prints every instruction executed, along with other information.
finish
(program . <instructions>)
do_alert
*** (2.3.5.3) Garbage collector instructions.
The null 32 bits word '0' is always a valid datum (of any type) for the garbage
collector. Of course, this datum doesn't need to be collected, but the garbage
collector may be passed such a datum.
Allocation of memory:
(alloc . <number of bytes>)
The pointer to the resulting segment is put in R. The counter (the first 32 bits word
of the segment) always contains 1 after the allocation.
Making virtual copies. The following instructions are used for making a virtual copy of
a possibly indirect datum. This amounts to incrementing the counter which is at the
beginning of the segment. However, these instructions are generated based on the type
of the datum only, and this is not enough to know if the counter exists (some data are
direct). Hence the existence of several variants of these instructions. Also, these
instructions always check if the counter is zero. If it is, the datum is permanent,
and the counter is not incremented.
Copying virtually within the stack. The datum is at position <depth> in the stack (0 is
the top of stack).
(copy_stack_ptr . <depth>)
The counter always exists (except if the datum is the null word). The counter is
incremented if it is non zero.
(copy_stack_function . <depth>)
The datum represents a function, i.e. the address of some piece of code if it is odd,
and a pointer to a closure if it is even. The counter exists only in the second case.
(copy_stack_mixed <depth> . <mask>)
The <mask> has at most 4 bits. Each bit indicates if the corresponding alternative is
large or small. The counter exists only if the alternative is large.
Virtually copying in R. This is the same except that the datum to be copied is in R.
copy_ptr
copy_function
(copy_mixed . <mask>)
The next one should become obsolete:
copy
It is almost the same as 'copy_ptr', except that it does not check for permanent
data. It should disappear and be replaced by 'copy_ptr'.
Virtual deletion within the stack.
(collapse . <depth>)
This instruction deletes the datum at position <depth> in the stack. The datum is
direct, so the instruction only has to move the <depth> words at the top of the stack
one slot downwards, and adjust the stack pointer. For example, (collapse . 3) works as
follows:
the stack before the stack after
a b c d e f g .... a b c e f g ...
('d' has disappeared)
The next instructions virtually delete a datum which is in R. These instructions are
generated only for those types which have indirect data.
del_ptr
'del_ptr' expects in R a pointer to a segment not containing any indirect data. In this
case, the deletion amounts to decrementing the counter pointed to by R. However, R may
contain 0 (in this case the instruction does nothing), or the counter may be zero (in
this case, the datum is permanent and the instruction does nothing). Otherwise, if the
counter becomes zero, the segment is freed.
del_conn
This is similar to 'del_ptr' except that the segment pointed to represents a
'connection' (file or network). The instruction works as 'del_ptr', except that if the
counter becomes zero, it also closes the connection.
del_function
This is similar to 'del_ptr', except that functions are either addresses of code (if
odd) or pointers to closures (if even). In the first case nothing is done. In the
second case, 'del_function' works like 'del_ptr', except that the address of the
deletion code is found within the closure itself instead of as an operand of the
instruction. This is because the types of the data in the micro context cannot be
known from the type of the function.
(del . <addr>)
'(del . <addr>)' expects a datum belonging to a large type in R. '<addr>' is the
address of the deletion code for this type. The instruction checks if R is zero; in
that case it does nothing. Otherwise, R contains a pointer to a valid segment. If the
counter is zero (permanent datum) the instruction does nothing. Otherwise, the counter
is decremented. If the counter becomes zero, a return address and the datum (in R) are
pushed onto the stack, and the deletion code at '<addr>' is jumped to.
(del_mixed <mask> . <addr>)
This is the same as '(del . <addr>)' except that the datum in R belongs to a mixed
type. The mask is used to check if the datum is direct or indirect. If it is direct,
nothing is done. If it is indirect, the same as for '(del . <addr>)' is performed.
(del_struct_ptr . <id>)
This is the same instruction, but for C structure encapsulation pointers. Somewhat
curiously, it is coded in the compiler ('compile.c'), but has never been generated (the
proof: it is not coded in 'vminstr.c'). This is probably due to the fact that
encapsulated C structures are manipulated only from within 'predefined.anubis'. Hence,
this is not a big problem, because only a modification of 'predefined.anubis' can lead
to the generation of this instruction.
(del_mvar . <addr>)
This instruction is even more phantomatic than the previous one. This is more
problematic, because multiple variables may be manipulated by anyone. Hence, this
instruction should be coded as soon as possible.
Note: We should also have a '(del_var . <addr>)'. The type 'Var' is very badly
implemented as a large type (one of my worst ideas). Everything concerning this type
should be examined and corrected. I suspect that this creates at least memory leaks,
but maybe also more serious problems.
(del_stack_ptr . <depth>)
(del_stack_conn . <depth>)
(del_stack_function . <depth>)
(del_stack <depth> . <addr>)
(del_stack_mixed <depth> <mask> . <addr>)
(del_stack_struct_ptr <id> . <depth>)
(del_stack_var <depth> . <addr>) (phantom)
(del_stack_mvar <depth> . <addr>)
The above instructions are the same as the previous ones, except that the datum to be
virtually deleted is in the stack at depth <depth> instead of being in R.
mvar_slots_del_ptr
mvar_slots_del_conn
mvar_slots_del_function
(mvar_slots_del . <addr>)
(mvar_slots_del_mixed <mask> . <addr>)
(mvar_slots_del_struct_ptr . <id>)
(mvar_slots_del_var . <addr>)
(mvar_slots_del_mvar . <addr>)
Again, these instructions are the same as above, except that they have to virtually
delete the data in all the slots of the multiple variable.
indirect_del_ptr
indirect_del_conn
indirect_del_function
(indirect_del . <addr>)
(indirect_del_var . <addr>) (phantom)
(indirect_del_mvar . <addr>)
(indirect_del_mixed <mask> . <addr>)
(indirect_del_struct_ptr . <id>)
Again, these instructions work like the above, except that the datum to be virtually
deleted is pointed to by the top of stack.
(increment_del . <byte width>)
This instruction is used when we are really deleting an indirect datum. A pointer to
some position within the segment to be freed is on top of stack. This instruction just
increments this pointer by the number of bytes indicated by its operand.
del_index_direct
This instruction expects a datum of a mixed type on top of stack, belonging to a large
alternative. This datum is a 32 bits word which may be decomposed as follows:
pppppppp pppppppp pppppppp ppppppii
i.e. into a field of 30 bits and a field of 2 bits. The two bits 'ii' give the number of
the alternative the datum belongs to, and
pppppppp pppppppp pppppppp pppppp00
is the pointer to the segment. The role of the instruction is:
- put 'ii' into the index register I
- replace 'ii' by '00' on top of stack
- duplicate the top of stack.
The reason why we duplicate the top of stack is that one pointer will be incremented
for walking into the segment, and the other one will be used for freeing the segment.
del_index_indirect
This instruction is similar to 'del_index_direct', except that it concerns a large
datum. The index is read from the data segment, not from the manipulation word. The
pointer is also duplicated on top of stack.
free_seg_0
free_seg_1
These two instructions are just calling the 'FreeDataSegment' function of the memory
allocator. The first one transmits the word at *(SP-1) (top of stack: depth 0 in the
stack), whilst the second one transmits the word at *(SP-2) (depth 1 in the stack).
*** (2.3.5.4) Closure instructions.
These instructions are used for constructing new closures.
(put_copy_direct <depth> . <position>)
(put_copy_indirect <depth> . <position>)
(put_copy_function <depth> . <position>)
(put_copy_mixed <mask> <depth> . <position>)
A memory segment (the closure under construction) is pointed to by 'R'. The <depth>
operand is the stack depth of the datum to be put. The <position> operand is the
position where the datum must be put in the micro context of the closure. The offset
(in bytes) within the closure where the datum must be put is 4*(3+<position>), i.e.
<position> is 0 for the first datum of the micro context, 1 for the next one, etc...
For indirect data, the counter is incremented, because this is a duplication.
(put_micro_copy_direct <depth> <micro depth> . <position>)
(put_micro_copy_indirect <depth> <micro depth> . <position>)
(put_micro_copy_function <depth> <micro depth> . <position>)
(put_micro_copy_mixed <mask> <depth> <micro depth> . <position>)
Same as above, except that the datum is located in another micro context which is at
depth <depth> in the stack. The datum itself is at position <micro depth> in this other
micro context. Again, <micro depth> is 0 for the first datum in the micro context, 1
for the next one, etc...
(put_closure_labels <code> . <del code>)
This instruction puts its two operands at offsets 4 and 8 into the closure (pointed to
by 'R').
Below is an example of how a closure is constructed. First of all, here is the source
text (copy-pasted from the sources of our web site):
(DBRow(MTxt) dbrt) |-> if dbrt is dbrow(_,_,t) then
if t is mtxt(_,v,time) then
success(mtxt(new_name,v,time))
and below is the code generated for the construction of the closure (copy-pasted from
the .sc file):
| ; |-> in '/home/alp/www_dev/anubis-language/_1_6/translate_item.anubis' at line 451
| ; function code at label 31092
146195 | alloc 16
| ;--- stack ------------------------
| ; 0 (return address)
| ; 1 i LangInfo
| ; 2.0 tdb DB(TDB)
| ; 2.1 old_name String
| ; 2.2 new_name String
| ; 3 (return address)
| ;----------------------------------
146197 | put_micro_copy_indirect 2 2 0
146201 | put_closure_labels 31092 31093
'alloc 16' allocates 4 32 bits words for the closure: counter, code, del and just one
datum in the micro context (the value of 'new_name').
Remark that 'new_name' is the sole local symbol occurring free in the above term, and
that its value is within an older micro context located at depth 2 in the stack. The
value of 'new_name' is at position 2 in this older micro context, so that the right
instruction for constructing the new micro context is:
146197 | put_micro_copy_indirect 2 2 0
which means: 'take the datum at position 2 in the micro context at depth 2, and put it
at position 0 in the new micro context' (of course, this counts as a virtual copy). The
'indirect' variant is used because the type is 'String'. The last instruction:
146201 | put_closure_labels 31092 31093
puts the address (label 31092) of the code for executing the function and the address
(label 31093) of the garbage collector code for deleting the closure itself.
*** (2.3.5.5) Equality instructions.
In the next version of Anubis, equality code should be generated as generic code (like
for implicit destructors), so that all the instructions below will become obsolete.
Furthermore, equality (with Boolean value) should be generated only for 'separable'
types, i.e. types for which this boolean equality, in the mathematical sense, is
computable. From a topos-theoretic point of view, this means that it should be generated
only for types 'T' such that the internal equality '(T,T) -> Omega' factors through the
canonical arrow 'Bool -> Omega' (sending 'true' onto 'true' and 'false' onto 'false').
However, I don't know if there is an algorithm able to decide this property. I know
only a recursive class of types having this property. Of course, internal equality
exists for all (abstract) types and may be defined by the Leibniz rule.
In Anubis 1, equality code is generated for all types (except 'Float', but this may be
the result of some misunderstanding). For concrete types (address types and several
primitive types, structure pointer types) equality is physical (compare the addresses
of the data). For functional types it is also physical, despite the fact that in some
cases, we could be more clever (when we know that the product of the types of the
arguments is finite). Theoretically, boolean equality should not be generated for
functional types in most cases.
So, we have something special to generate only for defined types. For 'x' and 'y' to
be equal, the principle of comparison is as follows:
1. 'x' and 'y' must belong to the same alternative,
2. if so, the components must be equal two by two.
Of course, the second condition shows that this is a recursive process. Hence, an
'equality code' is generated for each defined type. This code uses special
instructions, which is quite stupid, because ordinary instructions would have done the
job (maybe somewhat more slowly).
(false_jmp . <addr>)
(true_jmp . <addr>)
Put either 'false' (i.e. 0) or 'true' (i.e. 1) in 'R' and jump (unconditionally) to the
operand address.
(jmp_false . <addr>)
Jump to address if the content of 'R' is 'false'.
(jmp_eq_stack . <addr>)
This instruction compares the two words on top of stack, puts 1 (true) in R if they are
equal and performs a jump to 'addr'. Otherwise the next instruction is executed, and R
remains empty.
(jmp_neq_indexes_large . <addr>)
(jmp_neq_indexes_mixed <bit width> <mask> . <addr>)
(jmp_neq_string . <addr>)
(jmp_neq_byte_array . <addr>)
(jmp_neq 0 . <addr>)
(jmp_neq 1 . <addr>)
(jmp_neq 2 . <addr>)
(jmp_neq 4 . <addr>)
(eq . <depth>)
This instruction tests the equality of the two words at depths '<depth>' and '<depth>+1'
in the stack. The register R is set to 1 (true) if they are equal, and to 0 (false)
otherwise.
eq_string
eq_byte_array
Tests the equality of the two strings or byte arrays on top of stack (depths 0 and 1),
and puts either 'true' (if equal) or 'false' (if not equal) in R. Of course, the
strings or byte arrays are walked through.
push_eq_data
(increment_eq . <byte width>)
*** (2.3.5.6) Dynamic variables instructions.
(get_var_handler . <addr>)
(get_mvar_handler . <addr>)
get_var_monitors
get_mvar_monitors
push_mvar_length
remove_monitor
ret_if_zero
get_vv
xchg_vv
get_mvv
xchg_mvv
create_var
free_var_seg
free_mvar_seg
*** (2.3.5.7) Serialising/unserialising.
(serialize . <addr>)
(unserialize . <addr>)
*** (2.3.5.8) Starting a new virtual machine.
(start <depth> . <addr>)
This instruction is generated by the keyword 'delegate' and starts a new virtual
machine. '<depth>' is the number of slots on top of the stack of the old machine which
should be copied into the stack of the new machine. No virtual copies are needed,
because they are already performed (by 'copy_stack_...' instructions). The new machine
starts execution at the instruction following this one, whilst the old machine jumps to
the operand address.
*** (2.3.5.9) Obsolete and soon obsolete instructions.
The following are still used but should become obsolete (used for global variables):
(create_vars ...)
(gv_address ...)
(init_gv ...)
(del_gv ...)
get_gvv
xchg_gvv
initialization_address
variables_deletion_address
The following have never been used (experimental?):
(call ...)
location
(code_for ...)
(type_list ...)
push_before_eq
(dec3 . <addr>)
*** (2.4) Unification.
Since the Anubis language allows the use of type parameters (parametric polymorphism),
we need to be able to unify types.
'Environments' are lists of the form:
((v_1 . T_1) ... (v_k . T_k))
where each 'v_i' is an 'unknown', and where each 'T_i' is a type (which may also
contain unknowns). Such an environment just says that:
v_1 has value T_1
...
v_k has value T_k
If a pair is of the form '(v_i . v_j)', i.e. if the value of 'v_i' is the unknown
'v_j', we assume that 'v_i > v_j' (recall that they are 'Expr's, hence integers). This
is to avoid some kind of circularity.
Because of the Lisp-like representation, unification is fairly simple. It works as
follows, where 'u(A,B)/E' is the result of unifying 'A' with 'B' relatively to
environment 'E'. The result is a new environment. '$1' and '$2' represent two arbitrary
unknowns.
u($1,$1)/E = E
u($1,$2)/E = (($1 . $2) . E) (if $1 > $2)
u($1,$2)/E = (($2 . $1) . E) (otherwise)
u($1,T)/E = (($1 . T) . E)
u(A,A)/E = E (for any atomic A)
| not_unifiable if u(A1,B1)/E = not_unifiable
u((A1 . B1),(A2 . B2))/E = |
| u(B1,B2)/(u(A1,A2)/E) otherwise
u(A,B)/E = not_unifiable in all other situations
We have no problem with the fake 'Expr' <Cint> because we unify only types, and types
never contain a <Cint>.
It may be the case that the two expressions to be unified are relative to different
environments. In that case we just have to merge the two environments before unifying
the two expressions.
We also want to avoid circularities, and to that end we use a non circularity test
(like some Prolog systems). The problem is that the above rules would give the
following for example:
u($1,Maybe($1))/E = (($1 . Maybe($1)) . E)
But this is not correct because, a type 'T' cannot be equal to 'Maybe(T)'. Hence, when
we have to compute 'u($1,T)/E', we use the function 'depends_on' to test if 'T' depends
on '$1' directly or indirectly. If it is the case, the unification fails.
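The rules above, together with the 'depends_on' occurs check, can be sketched in C. This
is an illustrative stand-in, not the compiler's code: the real unifier works on
integer-encoded 'Expr's, and the '$1 > $2' ordering rule is omitted here for brevity.
All names below are ours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Terms are unknowns, atoms, or pairs (cons cells), mirroring the
   Lisp-like 'Expr' representation with a small tagged struct. */
typedef enum { T_UNKNOWN, T_ATOM, T_PAIR } Tag;

typedef struct Term {
    Tag tag;
    int id;                      /* unknown number or atom id */
    struct Term *car, *cdr;      /* used when tag == T_PAIR */
} Term;

static Term *mk(Tag tag, int id, Term *car, Term *cdr) {
    Term *t = malloc(sizeof *t);
    t->tag = tag; t->id = id; t->car = car; t->cdr = cdr;
    return t;
}

/* Environment: association list ((v . T) ...) */
typedef struct Env {
    int var;
    Term *val;
    struct Env *next;
} Env;

static Term *env_lookup(Env *e, int var) {
    for (; e; e = e->next) if (e->var == var) return e->val;
    return NULL;
}

static Env *env_bind(int var, Term *val, Env *e) {
    Env *n = malloc(sizeof *n);
    n->var = var; n->val = val; n->next = e;
    return n;
}

/* Occurs check: does 't' depend on unknown 'var', directly or
   through the environment ('depends_on' in the real compiler)? */
static int depends_on(Term *t, int var, Env *e) {
    if (t->tag == T_UNKNOWN) {
        Term *v;
        if (t->id == var) return 1;
        v = env_lookup(e, t->id);
        return v ? depends_on(v, var, e) : 0;
    }
    if (t->tag == T_PAIR)
        return depends_on(t->car, var, e) || depends_on(t->cdr, var, e);
    return 0;
}

/* Unify 'a' with 'b' in environment '*e'; returns 1 on success,
   0 for 'not_unifiable'. */
static int unify(Term *a, Term *b, Env **e) {
    if (a->tag == T_UNKNOWN) {
        Term *v = env_lookup(*e, a->id);
        if (v) return unify(v, b, e);
        if (b->tag == T_UNKNOWN && a->id == b->id) return 1; /* u($1,$1) */
        if (depends_on(b, a->id, *e)) return 0;              /* circularity */
        *e = env_bind(a->id, b, *e);
        return 1;
    }
    if (b->tag == T_UNKNOWN) return unify(b, a, e);
    if (a->tag == T_ATOM) return b->tag == T_ATOM && a->id == b->id;
    if (b->tag != T_PAIR) return 0;
    return unify(a->car, b->car, e) && unify(a->cdr, b->cdr, e);
}
```

So unifying '$1' with an atom succeeds and binds '$1', while unifying '$1' with a pair
containing '$1' (the 'Maybe($1)' situation) is rejected by the occurs check.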
Important note: Unification is used each time we have to compare types. Indeed, a type
containing unknowns is not yet completely known. The comparison may succeed if some
unknowns are instantiated. This is why comparing types without unifying is generally an
error.
Also, all the computations of unification and interpretations of terms are purely
deterministic (no side effects at all). This guarantees that when we try several
possibilities, there is no clash between them.
It is often the case that we want to dereference a type relatively to its
environment. This means replacing (recursively) all type unknowns by their values (when
they have one). This is performed by the command 'dereference_type'.
*** (2.5) Organization of the knowledge base.
The compiler stores the result of checking the paragraphs into several arrays:
struct Type_struct *types; (array of type definitions)
struct Operation_struct *operations; (array of definitions, formerly called 'operations')
struct Variable_struct *variables; (array of global variables (obsolete))
See the definitions of these structures in 'compil.h'.
Checked paragraphs are stored into these arrays, which work like stacks. The stack
pointers are called 'next_type' and 'next_operation'.
*** (2.6) Indexation of symbols.
In a previous version of Anubis, when the interpretations of a symbol were searched
for, the arrays of types and operations were scanned sequentially. During the year
2003 (as far as I can remember), it appeared that this was too slow (especially with
Olivier's program, containing thousands of types and symbols). I decided to index
these arrays by a hashing method. The result was that the compiler became twice as
fast.
The 'symbol index' is just an array of 256 arrays of 'SymbolIndexEntry'. When a symbol
is searched for, its 'hash' is first computed. This hash is a U8 (computed by
'u8_symbol_hash' in 'interp.c'). With this hash we may go directly to the right array
of 'SymbolIndexEntry'.
This secondary array is searched for an entry corresponding to our symbol (each entry
contains the name of a symbol). When such an entry is found, we get the index of the
actual definition in the array of types or operations.
In the future this system should be replaced by a B-tree, indexed on names of symbols.
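The lookup path described above can be sketched as follows. The bucket sizes, the
'insert'/'lookup' helpers and the toy hash function are ours; the real hash is
'u8_symbol_hash' in 'interp.c'.

```c
#include <assert.h>
#include <string.h>

typedef unsigned char U8;

/* Illustrative stand-in for 'u8_symbol_hash': reduce a name to
   8 bits, selecting one of 256 buckets. */
static U8 u8_hash(const char *name) {
    unsigned h = 0;
    while (*name) h += (unsigned char)*name++;
    return (U8)h;
}

/* One entry of a secondary array: symbol name plus the index of
   the actual definition in the array of types or operations. */
typedef struct {
    const char *name;
    int definition_index;
} SymbolIndexEntry;

/* The symbol index: 256 buckets of entries (fixed sizes here are
   only for the sketch; the compiler's arrays grow dynamically). */
typedef struct {
    SymbolIndexEntry entries[32];
    int count;
} Bucket;

static Bucket symbol_index[256];

static void insert(Bucket index[256], const char *name, int def) {
    Bucket *b = &index[u8_hash(name)];
    b->entries[b->count].name = name;
    b->entries[b->count].definition_index = def;
    b->count++;
}

static int lookup(Bucket index[256], const char *name) {
    Bucket *b = &index[u8_hash(name)];
    for (int i = 0; i < b->count; i++)
        if (strcmp(b->entries[i].name, name) == 0)
            return b->entries[i].definition_index;
    return -1;  /* not found */
}
```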
*** (3) The lexer.
*** (3.1) Generalities.
The role of the lexer is essentially to cut the input into 'tokens'. 'lexer.y' is a
FLEX file, which must be compiled by 'flex'. This produces the file 'lex.yy.c' which
becomes part of the C source.
However, the lexer performs several other tasks:
- handling 'read' and 'replaced by' paragraphs, computing visibilities,
- counting lines and columns, and level of nested comments,
- working on several special tokens (strings, floats, named arrows,...).
The lexer has several 'states' (in the sense of FLEX). They are the following:
- INITIAL (used when reading comments outside paragraphs)
- COM (used for reading /* ... */ comments)
- PAR (used for reading paragraphs)
- INCL (used for reading file names after either 'read' or 'replaced by')
- STR (used for reading strings)
- STRTL (used when a string is too long)
- CONF (used for reading the configuration file)
- CONFCOM (used for reading comments within the configuration file)
The names of all the tokens returned by the lexer are prefixed by 'yy__'. They are
declared in 'grammar.y'.
*** (3.2) Finding source files.
The compiler must identify files clearly, not only by the name, because different files
in different directories may have the same name. Notice that the path which is given
after the keyword 'read' is not enough for identifying the file. This is because this
path is itself relative.
The compiler keeps in memory the absolute path of a 'current' directory. Notice that
this current directory is a notion internal to Anubis. It has nothing to do with the
current directory in the sense of the host operating system. Anyway, the Anubis
compiler never changes the current directory in the sense of the host system.
When the compiler starts, the 'current' directory is the directory from which the
Anubis compiler has been launched.
When the compiler needs to find the source file corresponding to a path, either the
path given on the command line, or a path found after a keyword 'read' or 'replaced
by', it looks for the file using the following absolute paths successively:
<current directory>/<given path>
<my_anubis directory>/<given path>
<anubis directory>/<given path>
except if <given path> is absolute. In this case only this path is used.
When the file has been found, the directory of this file becomes the new 'current'
directory.
*** (3.3) The 'read' stack.
(also called 'include stack' in the source of the compiler).
Of course, when a 'read' or 'replaced by' keyword is encountered, the compiler must
read another file, but must be able to come back for reading the rest of the current
file. This is why it needs a stack.
Each 'frame' in this stack contains the following information:
- a datum of type YY_BUFFER_STATE, needed by FLEX/BISON for each file,
- the absolute path of the file currently read
- the current line number in the file currently read,
- the absolute path of the 'current' directory for the file currently read.
Note: at the time of writing this, this stack is actually made of 4 distinct stacks,
one for each item. These stacks grow in parallel, and hence share a unique stack
pointer. It would be a good idea to replace this by a single stack whose slots would be
a C structure.
When the end of a file is reached, a frame is popped from the stack, and the reading of
the 'calling' file may be continued.
At the time of writing this text, this stack may contain at most 100 frames.
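The single-structure stack suggested in the note above could look like this. This is a
sketch: the field names and the 'void *' stand-in for YY_BUFFER_STATE are ours.

```c
#include <assert.h>
#include <string.h>

enum { MAX_INCLUDE_DEPTH = 100 };   /* current limit: 100 frames */

/* One frame per file being read, instead of four parallel stacks. */
typedef struct {
    void *flex_buffer;              /* YY_BUFFER_STATE in the compiler */
    char file_path[1024];           /* absolute path of the file read */
    int  line_number;               /* current line in that file */
    char current_dir[1024];         /* 'current' directory for it */
} ReadFrame;

static ReadFrame read_stack[MAX_INCLUDE_DEPTH];
static int read_sp = 0;             /* single shared stack pointer */

/* Push a frame when a 'read' or 'replaced by' is encountered. */
static int push_frame(const char *path, const char *dir, int line) {
    if (read_sp >= MAX_INCLUDE_DEPTH) return 0;   /* too deep */
    ReadFrame *f = &read_stack[read_sp++];
    f->flex_buffer = NULL;
    strncpy(f->file_path, path, sizeof f->file_path - 1);
    f->file_path[sizeof f->file_path - 1] = 0;
    strncpy(f->current_dir, dir, sizeof f->current_dir - 1);
    f->current_dir[sizeof f->current_dir - 1] = 0;
    f->line_number = line;
    return 1;
}

/* At end of file, pop a frame and resume the 'calling' file. */
static ReadFrame *pop_frame(void) {
    return read_sp > 0 ? &read_stack[--read_sp] : NULL;
}
```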
*** (3.4) Remembering already read files.
The compiler remembers which files have been read. Indeed, during the same compilation,
it may encounter several times a request for reading the same file. Of course, the file
is read only once, and the compiler has just to adjust visibility.
The absolute paths of the files already read are stored into the array
'already_included' (of at most 'max_already_included' = 1024 paths, at the time of
writing this text).
*** (3.5) Computing visibilities.
A file B is visible from a file A only if the file A contains one of the paragraphs:
read B
replaced by B
Normally, a file containing a 'replaced by' paragraph does not contain any other
paragraph (but it may contain off paragraph comments). This kind of file is called a
'relay' file, and is used when the location of a file of the distribution is changed.
'read' is not transitive. In other words, if we have
read B
in file A, and
read C
in file B, this does not make file C visible from file A. However, if we have:
read B
in file A and
replaced by C
in file B, C is visible from file A.
The visibility information is stored in several arrays. First of all, notice that
each file is identified by an 'id' which is the position of its path in the array
'already_included'. Notice that 'predefined.anubis' (which is 'pre-read') has id 0.
- the array 'number_of_visible' gives for each file id, the number (of type U16) of
files which are visible from this one.
- the array 'is_relay_file' is an array of booleans (actually an array of U8). For
each file it indicates if the file is a relay file or not.
- the two dimensional array 'visibility'. This array provides some kind of
'visibility stack' for each file id. For example, if the number of files visible from
the file whose file id is idA is 3, and if the ids of these files are idA, idB and
idC, we have:
visibility[idA][0] = idA
visibility[idA][1] = idB
visibility[idA][2] = idC
Notice that since a file is always visible from itself, the first entry into this stack
always contains the id of the file to which this stack belongs.
When file B becomes visible from file A, idB is pushed into the visibility stack of
file A (if not already present). Such stacks are never popped, because if B is
visible from A, it remains visible indefinitely. Notice that such stacks never need to
be enlarged, because the number of files read cannot be greater than
'max_already_included', which is the size of these stacks.
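A minimal sketch of these arrays and of the push-if-absent discipline. The array names
and sizes follow the text, but the code itself is ours, not the compiler's.

```c
#include <assert.h>

typedef unsigned short U16;
typedef unsigned char U8;

enum { MAX_FILES = 1024 };   /* 'max_already_included' */

/* Number of files visible from each file id. */
static U16 number_of_visible[MAX_FILES];
/* Whether each file is a relay file ('replaced by' only). */
static U8  is_relay_file[MAX_FILES];
/* Visibility stack of each file id. */
static U16 visibility[MAX_FILES][MAX_FILES];

/* A file is always visible from itself, so its own id is the
   first entry of its visibility stack. */
static void register_file(U16 id) {
    visibility[id][0] = id;
    number_of_visible[id] = 1;
}

static int is_visible(U16 a, U16 b) {
    for (U16 i = 0; i < number_of_visible[a]; i++)
        if (visibility[a][i] == b) return 1;
    return 0;
}

/* Make file 'b' visible from file 'a' (push if not already there).
   The stack is never popped: visibility is permanent. */
static void make_visible(U16 a, U16 b) {
    if (!is_visible(a, b))
        visibility[a][number_of_visible[a]++] = b;
}
```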
So, the question is just to know when a file becomes visible from another one. First of
all, each file must be visible from itself. Hence this is registered the first time the
file is encountered. As a particular case, 'predefined.anubis' is encountered before
any other one. Furthermore, it must be visible from all other files.
When a relay file becomes visible from some file, it must transmit its visibility to
the file it is pointing to.
*** (4) The parser.
The parser (grammar.y) is a BISON file. It provides the function 'yy_parse' used in
'main.cpp'. It contains essentially the following parts:
- Declaration of tokens and non terminals with their types.
- Precedence and association rules.
- Grammar rules.
The rules of precedence and association should not be changed (and their order is
important) because if they are changed, files which are currently compiling ok could
have syntax errors.
Each grammar rule has an 'action' part (between { and }) which constructs the
syntactical tree corresponding to the rule. These actions are the official source of
information for the Lisp-like format of terms.
*** (5) Computing interpretations of terms.
*** (5.1) Overview.
The Anubis language allows overloading of symbols and type parameters. These features
introduce many ambiguities which must be resolved.
A term may have several distinct interpretations. For example, when you write the
number:
2
it has at least 3 interpretations, one as an 'Int8', one as an 'Int32' and one as a
'Float'. Similarly, a symbol or a binary operator like '+', may have several
definitions, hence several interpretations.
When atomic terms (like numbers and symbols) are combined together to form compound
terms, like applicative terms, conditionals, etc., the compound term may itself have
several interpretations depending on which interpretation we choose for the atomic
subterms of this term.
Hence, what we need is a (necessarily recursive) function for computing the list of all
interpretations of a term. This function is called 'term_interpretations' and has
several auxiliary functions. All these functions are defined in 'interp.c'.
Also, terms must be interpreted relative to a 'context'. A context is normally just a
set of declarations. But in the case of the Anubis compiler, a context is more
precisely an accurate representation of what will be the top of stack at run time. Such
a context is represented as a list, with the head of list being the most recently
pushed item.
For example, if we want to interpret a term like this one in some context C:
with x = a, t
we must first get the list of all interpretations of 'a' relative to C. For each
interpretation of 'a', we may compute the list of interpretations of 't', but it must
be relative to '((x . T) . C)', where 'T' is the type found for this interpretation of
'a', because after 'a' is computed (at run time) its value is pushed into the stack.
(Actually, in the case of this example, the compiler requires that 'a' has only one
interpretation. This is for making the source files more explicit and easier to read
for humans.)
*** (5.2) Interpreting applicative terms.
An applicative term is just made of a function applied to arguments. For example, it
may be:
f(a,b,c)
(or 'a + b'). Such a term is represented internally as:
(app <lc> [f] . ([a] [b] [c]))
where <lc> is a Lisp integer ('Expr') from which the name of the source file, together
with the line and the column of the term may be recovered.
In order to interpret this term, the compiler must first compute all interpretations of
the function itself. It also computes all the interpretations of the tuple of
arguments. This may lead to a very big number of interpretations. For example, if 'a',
'b' and 'c' have respectively 2, 4 and 7 interpretations, this yields 56
interpretations for the tuple '(a,b,c)'. If the number of interpretations of a tuple
exceeds, say, one million, the compiler may become very slow. The remedy (coming from
the user) is to type each argument explicitly in order to reduce this number of
interpretations.
I've tried to proceed another way. Since for each interpretation of the function, we
get types for the arguments, I've tried to recompute the interpretations of the typed
arguments in each case. But experiments showed that this is even slower. For the
future, we need some more clever mechanism. This problem seems to be similar to logical
problems encountered in optimisations of SQL queries.
Now, for each interpretation of the function and, each interpretation of the tuple of
arguments, we must unify the signature of the function (type of the tuple of arguments)
with the type of the actual tuple of arguments. If this unification fails, this
combination is impossible. This is how the number of interpretations may stay at a
reasonable level.
*** (5.3) Interpreting conditionals.
The test is first interpreted. The compiler requires (again for making the source files
more explicit and easier to read for humans) that the test has one and only one
interpretation. For each interpretation of the test, the type of the test is known. Of
course all interpretations whose type is not a type with alternatives are
rejected. However, components in this type may still be unknown, what matters being
that the alternatives are known.
Next, the compiler analyses the cases. We must have one case per alternative, and in
the same order.
For each alternative, the compiler checks that the name of the case is the same as the
name of the alternative, and that the number of resurgent symbols is the same as the
number of components. If a resurgent symbol is explicitly typed, its type is unified
with the type of the corresponding component. If this unification fails, an error
message is sent.
For each case, the compiler constructs a new context for interpreting the body of the
case. Indeed, resurgent symbols are declarations which enrich the context. This may
yield several interpretations for each body of case. The compiler constructs the list
of all interpretations of the tuple of bodies. Again (as for applicative terms) this
may lead to a big number of interpretations and slow down the compiler or even exhaust
the memory.
For each such interpretation of the tuple of bodies, the types of the bodies are
unified. This is because in a conditional, the bodies must all have the same type
(which will be the type of the conditional). When the unification fails, the tuple is
rejected. This leads in general to a reasonable number of interpretations of the tuple
of bodies. Here we could easily be more clever by unifying interpretations of bodies as
soon as they are computed, i.e. unifying the types of the first and second bodies, then
the type of the third body with the type of the first two and so on.
*** (5.4) Interpreting selective conditionals.
The selected alternative is found using the name of the head of case. If there are
several alternatives with this name, the compiler checks the number of resurgent
symbols. If there are several alternatives with the same name and same number of
resurgent symbols, the compiler uses the declarations (if any) of the types of the
resurgent symbols. These types are unified with the actual types taken from the
definition of the type of the test.
Of course, if there is an 'else' case, the type of this case must be the same as the
type of the body of the selected case. This is checked by unification.
*** (6) Implementation of types.
Notice that we implement only 'actual types', not type schemes (for the time being; see
the discussion on quantified types). An actual type is a type containing no unknowns
and no parameters.
Types in Anubis 1 are of the following sorts:
- primitive types: String, Int32, Listener, ...
- defined types: types which are defined by a paragraph, even if this paragraph is in
predef.c or predefined.anubis.
- functional types: (T_1,...,T_k) -> U
- address types: RAddr(T), WAddr(T), RWAddr(T), GAddr(T) (obsolete), Var(T), MVar(T).
- structure pointer types: These types are used for encapsulating C structures
coming from linked libraries (like SSL or SQLite3).
*** (6.1) Implementation of primitive types.
There is a particular implementation for each primitive type.
*** (6.1.1) 'String'.
A string in Anubis 1 is implemented as a pointer:
*
|
V
+-------+-------------------------------+---+
| cnt | a b c d .... | 0 |
+-------+-------------------------------+---+
0 4
The first 4 bytes are the counter for the garbage collector. Notice that strings which
are defined statically in the source files have a 0 at this place, meaning that the
string is permanent (never collected).
Starting at offset 4 are the characters of the string. A trailing 0 is added at the
end, like in C.
*** (6.1.2) 'ByteArray'.
A byte array is a pointer as follows:
*
|
V
+-------+-------+-------------------------------+
| cnt | n | a b c d .... |
+-------+-------+-------------------------------+
0 4 8
where again 'cnt' is the counter, and 'n' the number of bytes in the byte array (not
including 'cnt' and 'n'). The bytes are stored starting at offset 8. There is no
trailing 0.
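The two layouts above can be illustrated with small allocation helpers. The helper
names are ours; only the offsets (counter at 0, data at 4 or 8) come from the text.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned int U32;

/* Allocate an Anubis-style string: 4-byte counter, then the
   characters, then a trailing 0 like in C. */
static char *anubis_string(const char *s, U32 cnt) {
    size_t n = strlen(s);
    char *p = malloc(4 + n + 1);
    memcpy(p, &cnt, 4);            /* offset 0: collector counter */
    memcpy(p + 4, s, n + 1);       /* offset 4: chars + trailing 0 */
    return p;
}

static const char *string_chars(const char *p) { return p + 4; }

/* Allocate an Anubis-style byte array: 4-byte counter, 4-byte
   length, then the bytes (no trailing 0). */
static unsigned char *anubis_byte_array(const unsigned char *b,
                                        U32 n, U32 cnt) {
    unsigned char *p = malloc(8 + n);
    memcpy(p, &cnt, 4);            /* offset 0: collector counter */
    memcpy(p + 4, &n, 4);          /* offset 4: number of bytes   */
    memcpy(p + 8, b, n);           /* offset 8: the bytes         */
    return p;
}

static U32 byte_array_length(const unsigned char *p) {
    U32 n; memcpy(&n, p + 4, 4); return n;
}
```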
*** (6.1.3) 'Int32'.
An 'Int32' is just a C 'int' (of 32 bits of course), i.e. a 32 bits word containing
the number itself.
*** (6.1.4) 'Float'.
Because data in Anubis 1 cannot have more than 32 bits, double precision floating point
numbers are implemented indirectly (which is very bad for performance):
*
|
V
+--------+-----------------+
| cnt | IEEE754 number |
+--------+-----------------+
0 4 12
where again 'cnt' is the counter for the garbage collector.
*** (6.1.5) 'Listener'.
Despite the fact that 'Listener' is a primitive type, it is implemented exactly as
network address types. See below.
*** (6.2) Implementation of defined types.
Defined types (types with alternatives, defined by paragraphs beginning by 'type' or
'public type') are implemented according to 3 different methods: 'small', 'mixed' and
'large'.
First of all, the width of the type is computed. The width is the number of bits
required for accommodating all data of this type. It is computed as follows.
Some kind of base 2 logarithm of the number of alternatives is computed. This is the
smallest number of bits required for accommodating all alternative numbers (also called
'indexes'). It is:
0 for at most 1 alternative (example: 'Empty' or 'One')
1 for 2 alternatives (example: 'Bit' or 'Bool')
2 for 3 to 4 alternatives
3 for 5 to 8 alternatives
4 for 9 to 16 alternatives
etc...
Since the maximal number of alternatives is limited to 256, the width of an index field
is at most 8.
This logarithm of the number of alternatives is called the 'index width' for this type.
Now, each alternative has its own width. It is the sum of the index width of the type,
and the widths of all components of the alternative. For example, the type
'Maybe(Int8)' has 2 alternatives:
failure
success(Int8)
the index width is 1 (2 alternatives). The width of the first alternative is 1 (= 1 +
0; no component), and the width of the second alternative is 9 (= 1 + 8), because
'Int8' is of width 8 (as expected).
Now, the width of the type is the supremum of the width of its alternatives (0 for
empty types). So, it is 9 for 'Maybe(Int8)' for example.
These computations are performed in 'typewidth.c'.
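The computation described above can be sketched as follows. The helper names are ours;
the real computation lives in 'typewidth.c'.

```c
#include <assert.h>

/* Index width: the smallest number of bits accommodating all
   alternative indexes (a kind of base 2 logarithm). */
static int index_width(int alternatives) {
    int w = 0;
    while ((1 << w) < alternatives) w++;
    return w;
}

/* Width of the type: the supremum of the widths of its
   alternatives, each being index width + sum of the widths of the
   alternative's components. 'component_sums[i]' is that sum for
   alternative i. Returns 0 for empty types. */
static int type_width(int alternatives, const int *component_sums) {
    int iw = index_width(alternatives), sup = 0;
    for (int i = 0; i < alternatives; i++) {
        int w = iw + component_sums[i];
        if (w > sup) sup = w;
    }
    return sup;
}
```

For 'Maybe(Int8)' this gives an index width of 1 and a type width of 9, matching the
worked example above.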
Notice that this width is not well defined in the case of a recursive type. For a
recursive type (or a cross recursive type), the width is infinite (which corresponds to
the fact that the type itself is infinite (if it is not empty)). This is why we do not
compute the width of recursive types. Hence, we need a tool for detecting if a type is
recursive (maybe cross recursive through other types). This tool is defined in
'rectype.c'. Notice that a recursive type which is empty (see may happen: see the
discussion on 'NonEmptyList' in 'predefined.anubis') is still considered as recursive
by Anubis 1, and its width is still infinite.
The width of a type is used to determine the number of bits needed for holding a datum
of this type. Above a certain limit, and in particular for recursive types, the type
must be implemented indirectly, i.e. through a pointer to a segment.
The width discussed above will be called the 'theoretical width'.
Hence, we have another notion: that of 'real width'. The real width, for Anubis 1, is
an integer between 0 and 33 inclusive. The value 33 means that the theoretical width is
at least 33, and that the type will be implemented indirectly. Values from 0 to 32 mean
that the type will be implemented directly within that number of bits.
Of course, allowing 64 bits or 128 bits direct data in the future, means updating all
these programs.
If the real width of a type is at most 32, the type is 'small'. It cannot be recursive,
and all its components, and the components of its components, and so on, are small. Any
datum of this type may be represented on 'width' bits.
If the real width is 33, there are two cases:
- the number of alternatives is at most 4: the type is 'mixed'.
- the number of alternatives is 5 or more: the type is 'large'.
If a type is large, all data of this type are represented indirectly, i.e. as a
pointer to a memory segment:
*
|
V
+----------+---+--------------+--------+---------------+
| cnt | i | c1 | c2 | c3 |
+----------+---+--------------+--------+---------------+
0 4 5
The first 4 bytes of the segment are always occupied by the counter ('cnt') of the
garbage collector. The next byte (at offset 4) is the alternative index for this
datum. It indicates to which alternative the datum belongs. Starting at offset 5 are
the representations of the components of the datum.
Since all pointers returned by the memory allocator are multiples of 4 (the two least
significant bits are always 0), these two bits are available for any purpose. I've
decided to use them for storing the alternative index (this is why this method cannot
be applied if there are more than 4 alternatives). This has the advantage of avoiding a
pointer to a segment for some alternatives. For example, the type 'List(T)' (for any
type 'T') is mixed. The first alternative has width 1, the second alternative has an
infinite width. Hence, it is tempting to avoid the use of an indirection for the first
alternative. This is what the 'mixed' implementation realizes.
Hence, the empty list (for any type 'T') is just the 32 bits word 0, while a non empty
list looks like this:
pppppppp pppppppp pppppppp pppppp01
where the least significant two bits (01) indicate alternative number 1 (second
alternative), and where:
pppppppp pppppppp pppppppp pppppp00
is the pointer:
*
|
V
+----------+-------+---------------+
| cnt | head | tail |
+----------+-------+---------------+
0 4
Notice that there is no alternative number at offset 4 (since it is already in the
least significant bits of the datum itself), and that components are stored starting at
offset 4. 'tail' always occupies 4 bytes, but 'head' may occupy 0, 1, 2 or 4 bytes.
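The tagging trick for mixed types can be sketched with a few macros. The macro names
are ours; only the use of the two low pointer bits comes from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Pointers from the allocator are multiples of 4, so the two
   least significant bits are free to hold the alternative index
   (which is why 'mixed' needs at most 4 alternatives). */
#define MIXED_INDEX(w)   ((uint32_t)(w) & 3u)    /* alternative number */
#define MIXED_POINTER(w) ((uint32_t)(w) & ~3u)   /* segment pointer    */
#define MIXED_MAKE(p, i) ((uint32_t)(p) | (uint32_t)(i))
```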
*** (6.3) Implementation of functional types.
Functions are implemented in two different ways depending on how they are defined.
1. If the function is defined by a paragraph, it is just represented by the address
of its code.
2. If the function is defined by the arrow '|->' (or by the recursive arrow '|-f->'),
it is represented as a 'closure'. A closure is a pointer to a segment like this:
*
|
V
+----------+----------+----------+----------+----------+----------+...
| cnt | code | del | x1 | x2 | x3 |
+----------+----------+----------+----------+----------+----------+... etc
0 4 8 12 16 20 24
where 'cnt' is the counter of the garbage collector as usual. All slots in a closure
occupy 4 bytes. The slot 'code' is the address of the code of the function. The slots
'x1', 'x2', etc... contain the values of the symbols which were present in the context
at the time the function (i.e. the closure) was created, and which are referred to from
within the body of the function definition. This part of the closure is called the
'micro context'. The slot 'del' contains the address of the garbage collector code able
to properly destroy the closure itself.
When we have a function at hand (i.e. a 32 bits word representing a datum whose type is
functional), we must be able to decide if the function is implemented as a simple code
address or as a closure. Again we take advantage of the fact that pointers to memory
segments are divisible by 4. If the pointer is even (least significant bit = 0), the
function is a closure. If it is odd (least significant bit = 1) it is a simple code
address. For this reason the code of a function defined at top level (by a paragraph)
is always positioned at an odd address. This is the reason for the existence of the
instruction 'odd_align' (which is of course of size 1 and never executed).
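The parity test can be written as follows (the function names are ours):

```c
#include <assert.h>
#include <stdint.h>

/* An even word is a pointer to a closure segment (allocator
   pointers are multiples of 4); an odd word is the address of
   odd-aligned top-level code. */
static int is_closure(uint32_t fn)      { return (fn & 1u) == 0; }
static int is_code_address(uint32_t fn) { return (fn & 1u) == 1; }
```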
*** (6.4) Implementation of address types.
There are two sorts of address types:
1. File and network address types: 'RAddr(T)', 'WAddr(T)', 'RWAddr(T)', 'Listener'
2. Dynamic variable address types: 'Var(T)', 'MVar(T)'.
We ignore GAddr(T), which is obsolete.
*** (6.4.1) File and network address types.
A datum of such a type is a pointer:
*
|
V
+----------+---+---+---+---+---------------------------------+
| cnt | t | m | _ | _ | 12 bytes |
+----------+---+---+---+---+---------------------------------+
0 4 5 6 7 8 20
As usual the first 4 bytes are for the counter of the garbage collector. The next four
bytes are:
t type indicator
m mode indicator
_ not used
_ not used
The next 12 bytes starting at offset 8 are used depending on the value of the type
indicator.
The type indicator may take the following values ('conn' is for 'connection', a synonym
(!?) of 'address'):
conn_invalid invalid connection (should never happen)
conn_file file on disk
conn_listener listening TCP/IP connection (server: type 'Listener')
conn_network client TCP/IP connection
conn_ssl_network client SSL connection
conn_local (obsolete: local variable)
conn_global (obsolete: global variable)
conn_stdin standard input
conn_stdout standard output
conn_stderr standard error output
The values of the above are defined in 'vm.h'.
The mode is an OR of the following flags:
conn_read readable connection
conn_write writable connection
conn_read_pw (experimental: connection with password access)
conn_write_pw (idem)
in_progress (connection not yet up)
The 12 bytes after offset 8 are used as follows:
Files (conn_file):
+---+---+---+---+---+---+---+---+---+---+---+---+
| * | | | | | | | | |
+---+---|---+---+---+---+---+---+---+---+---+---+
V
FILE (in the sens of C)
Network (conn_network, conn_listener):
+---+---+---+---+---+---+---+---+---+---+---+---+
| socket handle | | | | | | | | |
+---+---+---+---+---+---+---+---+---+---+---+---+
SSL connection (conn_ssl_network):
+---+---+---+---+---+---+---+---+---+---+---+---+
| (SSL *) | TCP/IP handle | (char *) |
+---+---|---+---+---+---+---+---+---+---|---+---+
V V
SSL struct of OpenSSL server common name
Standard connections (conn_stdin, conn_stdout, conn_stderr):
+---+---+---+---+---+---+---+---+---+---+---+---+
| | | | | | | | | | | | | (not used)
+---+---+---+---+---+---+---+---+---+---+---+---+
*** (6.4.2) Dynamic variable address types.
A simple dynamic variable (type 'Var(T)') is a pointer:
*
|
V
+-----------+-----------+-----------+-----------+
| cnt | value | n | * |
+-----------+-----------+-----------+-----|-----+
0 4 8 12 | 16
V
monitors table
The segment contains four 32 bits words:
- at offset 0: the counter
- at offset 4: the current value of the variable
- at offset 8: size of monitors table (number of slots)
- at offset 12: pointer to monitors table
'n' should be 0 exactly when the pointer to monitors table is NULL. When the variable
is created (by instruction 'create_var'), n is 0 and the pointer to monitors table is
NULL.
A monitor is just a datum of type 'One -> One' (most often it is a closure).
Multiple dynamic variables (type 'MVar(T)'). This is again a pointer:
*
|
V
+----------+----------+----------+----------+----------+-----------------------
| cnt | ns | nm | * | nbw | data ...
+----------+----------+----------+----|-----+----------+-----------------------
0 4 8 12 V 16 20
monitors table
In this segment, we have:
- at offset 0: garbage collector counter
- at offset 4: number of slots for data in the variable
- at offset 8: number of slots for monitors
- at offset 12: pointer to the table of monitors
- at offset 16: normalized bit width (see below)
- at offset 20: data
A multiple dynamic variable is nothing else than a 'mutable array'. This array has a
size (number of data of type T that we can put in the array).
Normally, we should arrange things so that the data slots have just the size required
for type T. For example, if we have a multiple variable of type 'MVar(Bool)', each slot
should occupy just 1 bit, not 32 bits. If we have a multiple variable of type
'MVar(Maybe(Int8))', each data slot should occupy 9 bits.
Because of byte boundaries, we use only powers of 2 for the number of bits of a
slot. Hence, for example, in a multiple variable of type 'MVar(Maybe(Int8))', each data
slot will be of 16 bits, the least number of bits which is a power of 2 and at least
equal to 9.
This number of bits is called the 'normalized bit width'. It may be computed as follows
in C:
if (bw <= 0) nbw = 0; else
if (bw <= 1) nbw = 1; else
if (bw <= 2) nbw = 2; else
if (bw <= 4) nbw = 4; else
if (bw <= 8) nbw = 8; else
if (bw <= 16) nbw = 16; else
nbw = 32;
where 'bw' is the real bit width, and 'nbw' the normalized bit width.
However, for the time being, Anubis 1 sets nbw = 32. This should be worked out, and
maybe we should also consider 24 as an acceptable normalized bit width.
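For testing purposes, the same chain of ifs can be wrapped in a function (the wrapping
is ours; the logic is exactly the one above):

```c
#include <assert.h>

/* Round a real bit width 'bw' up to the next power of 2 (capped
   at 32), as described in the text. */
static int normalized_bit_width(int bw) {
    if (bw <= 0)  return 0;
    if (bw <= 1)  return 1;
    if (bw <= 2)  return 2;
    if (bw <= 4)  return 4;
    if (bw <= 8)  return 8;
    if (bw <= 16) return 16;
    return 32;
}
```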
*** (6.5) Implementation of C structure pointer types.
A datum of type '£StructPtr(Name)' is as follows:
*
|
V
+--------+--------+
| cnt | * |
+--------+----|---+
0 4 | 8
V
C structure
i.e. it is a pointer to a segment of 8 bytes containing only a counter for the
garbage collector, and a pointer to the actual C structure of name 'Name'. Names of C
structures are declared in 'bytecode.h'.
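In C, this 8-byte segment can be sketched as follows (field names are our own; the second word is a 32-bit pointer since the VM targets a 32-bit machine):

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical C view of a '£StructPtr(Name)' datum. */
typedef struct {
    uint32_t cnt;     /* offset 0: garbage collector counter           */
    uint32_t cstruct; /* offset 4: pointer to the actual C structure   */
} StructPtrSegment;
```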
*** (7) Code generation.
The function 'compile_term' (in 'compile.c') generates byte code from
interpretations. Below, we describe how this code is generated.
*** (7.1) General conventions.
Code generation is a recursive process (just because interpretations themselves are of
a recursive nature). The code resulting from the compilation of an interpretation
should always behave the same way with respect to the stack and the register R. Our
conventions are the following.
First of all, consider an interpretation '(head . env)'. We may of course replace all
occurrences of a type unknown in 'head' by its value as given by 'env'. Hence, we
consider only interpretation heads, and assume that they contain no unknowns.
An interpretation head must be compiled relative to a context. This context is a
precise description of what the stack will contain at run time. We represent contexts
by lists, the head of the list being the top of the stack. Each element in the list
corresponds to a slot in the stack. Furthermore, the stack in Anubis 1 is homogeneous:
each slot is a 32 bits word, able to contain a datum of any type.
The Exprs which are listed in a context are the following:
(<symbol> . <type>)
This means that this slot contains a datum of type <type>, and that its name is
<symbol>.
(ret . <arity>)
This means that this slot contains a return address, corresponding to a call to a
function of <arity> arguments.
(f_micro_ctxt <fname> <ftype> (<symbol> . <type>) ...)
This slot contains a 'closure' (i.e. function defined by '|->'). '<fname>' is either
'nil' (in the case of '|->'), or "f" (in the case of '|-f->'). '<ftype>' is the type of
the function. The list of '(<symbol> . <type>)' is the micro context proper. It
indicates in which order the data remembered by the function are placed in the memory
segment of the closure.
(argument . <type>)
This means that the slot is occupied by a datum which has been pushed into the stack
for being an argument to a function call. This datum is anonymous, but its type is
known.
For example, you may see below how the compiler prints the context:
(
(argument . "SSL_Connection")
(argument . "Bool")
(argument . "DenialOfService")
(ret . ...)
("connection" . "SSL_Connection")
(f_micro_ctxt nil ... ("sites" app_ts "List" "Web_Site_Description")
("dos" . "DenialOfService"))
(ret . ...)
)
into a '.sc' file:
| ;--- stack ------------------------
| ; 0 SSL_Connection
| ; 1 Bool
| ; 2 DenialOfService
| ; 3 (return address)
| ; 4 connection SSL_Connection
| ; 5.0 sites List(Web_Site_Description)
| ; 5.1 dos DenialOfService
| ; 6 (return address)
| ;----------------------------------
If '<head>' is an interpretation head, and 'Ctxt' is a context, we denote by:
[<head>]Ctxt
the sequence of instructions resulting from compiling '<head>' relative to the context
'Ctxt'.
Convention 1: The stack after execution of the sequence of instructions in
'[<head>]Ctxt' is the same as before this execution.
Convention 2: After the execution of '[<head>]Ctxt', R contains the value of '<head>'.
*** (7.2) Elimination of terminal recursion.
Since the language has no notion of loop, it is of fundamental importance to eliminate
terminal recursion. Roughly speaking, this amounts to replacing the sequence:
(apply . k)
(ret . 0)
by:
(collapse . k)
(apply . k)
Indeed, when the first sequence is about to be executed, the situation is as follows:
stack R
--------------------------------------------
a_1 ... a_k r s ... f
where 'a_1,...,a_k' are the arguments of the function, 'r' the return address and 'f'
the function itself. 's' is the return address of the previous call.
'(apply . k)' will perform a jump to the code of the function (found in 'R'), and after
the return from this function call, 'a_1 ... a_k r' have been removed from the stack
which becomes:
s ...
Then the instruction '(ret . 0)' removes 's' from the stack and jumps to address 's',
so that the stack becomes:
...
The other sequence of instructions works as follows. '(collapse . k)' removes 'r' from
the stack, which becomes:
a_1 ... a_k s ...
Then '(apply . k)' calls the function 'f', and after the return from this call, the
stack is:
...
as expected.
Actually, all this does is avoid performing the instruction '(ret . ?)' twice in a row
(once with address 'r', and once with address 's').
Actually, the situation is more complex because, in general we have the arguments of
a previous call between 'r' and 's', so that the stack looks like this:
a_1 ... a_k r b_1 ... b_i s ...
When we return from the call to 'f', the stack becomes:
b_1 ... b_i s ...
and the next instruction cannot be '(ret . i)' (unless i == 0), because the top of
stack 'b_1' is not a return address. In this situation, terminal recursion can still
be eliminated. If the recursion is terminal, the next instructions are:
(del_stack ...)
...
(del_stack ...)
(ret . i)
(as many 'del_stack' or 'collapse' instructions as there are arguments, i.e. 'i'). At
that point the machine deletes the arguments from the stack and performs the 'ret'.
So the general situation is the following sequence of instructions:
(apply . k)
(del_stack_? . <depth>)
(collapse . <depth>)
...
(ret . i)
and the stack is the following:
stack R
--------------------------------------------
a_1 ... a_k r b_1 ... b_i s ... f
We replace this sequence by the following one:
(collapse . k)
(del_stack_? . k+<depth>)
(collapse . k+<depth>)
...
(apply . k)
Here is how it works:
Instructions Stack
----------------------------------------------------------------------------------
a_1 ... a_k r b_1 b_2 b_3 ... b_i s ...
(collapse . k) a_1 ... a_k b_1 b_2 b_3 ... b_i s ...
(del_stack_? . k+<depth>) a_1 ... a_k b_2 b_3 ... b_i s ...
(collapse . k+<depth>) a_1 ... a_k b_3 ... b_i s ...
...
a_1 ... a_k s ...
(apply . k) ...
The main difference with the original code is just that 'b_1 ... b_i' are deleted
before the call to 'f'.
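The simplest case of this transformation, '(apply . k) (ret . 0)' becoming '(collapse . k) (apply . k)', can be sketched as a peephole pass over an instruction list. The opcode names and the 'Instr' representation below are our own, not those of 'bytecode.h'; the general case with intervening 'del_stack'/'collapse' instructions is more involved.

```c
/* Our own mock opcodes; the real ones live in bytecode.h. */
enum { OP_APPLY, OP_RET, OP_COLLAPSE };

typedef struct { int op; int arg; } Instr;

/* Minimal sketch of the simplest case of terminal-recursion
   elimination:
       (apply . k) (ret . 0)   ==>   (collapse . k) (apply . k)   */
static void eliminate_tail_call(Instr *code, int n) {
    for (int i = 0; i + 1 < n; i++) {
        if (code[i].op == OP_APPLY
            && code[i + 1].op == OP_RET && code[i + 1].arg == 0) {
            int k = code[i].arg;
            code[i]     = (Instr){ OP_COLLAPSE, k };
            code[i + 1] = (Instr){ OP_APPLY, k };
        }
    }
}
```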
*** (7.3) Generating the code.
*** (7.3.1) Protect.
The interpretation head has the following form:
(protect <lc> . <head>)
The code generated is:
[(protect <lc> . <head>)]Ctxt =
label a:
protect
[<head>]Ctxt
unprotect a
<end code>
The 'protect' instruction is of size 2 (2 bytes). The first byte is the instruction
name 'protect', and the second byte is the 'lock'. This byte is zero when the code is
generated.
At run time, the instruction 'protect' checks the value of the lock (which may be 0 or
1). If it is 1, the subsequent code is locked, and the instruction gives up. If it is
0, the piece of code is not locked. So the instruction puts 1 in the lock so that the
subsequent code is now locked, and IP (Instruction Pointer) is incremented by 2.
The subsequent code '[<head>]Ctxt' is executed. Then the instruction 'unprotect' is
reached. This instruction puts 0 into the byte at offset 'a+1' (i.e. into the lock) so
that the code is now unlocked.
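The two instructions can be sketched as interpreter-loop handlers. The function names and the "return NULL to give up" convention are our own simplifications:

```c
#include <stdint.h>

/* protect: 'ip' points at the instruction byte; the lock is the byte
   just after it.  Returns the new instruction pointer, or NULL to
   signal "give up" (the code is already locked). */
static uint8_t *exec_protect(uint8_t *ip) {
    if (ip[1] == 1) return 0; /* lock is 1: code is locked, give up */
    ip[1] = 1;                /* lock the subsequent code           */
    return ip + 2;            /* the instruction is 2 bytes         */
}

/* unprotect: clears the lock byte at offset a+1. */
static void exec_unprotect(uint8_t *a) {
    a[1] = 0;                 /* the code is now unlocked           */
}
```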
*** (7.3.2) Lock.
*** (7.3.3) Global symbols.
Global symbols are those which are defined by paragraphs.
*** (7.3.4) Strings, Int32, small data and Floats.
*** (7.3.5) Local symbols.
*** (7.3.6) Microlocal symbols.
*** (7.3.7) Closures.
The code generated must construct a closure. Hence, it begins with the instruction 'alloc'.
[(closure <lc> (f_micro_ctxt fname ftype (sym . type)...) <args> . <body>)]Ctxt =
alloc 3+n 32 bits words (n = size of micro context)
put_copy y_0 in closure (at 3+0 word)
...
put_copy y_n-1 in closure (at 3+n-1 word)
put_closure_labels a d
<end code>
'n' is the number of values to be put in the micro context. Each value is taken from
the stack, virtually copied (if needed), and put at its place in the micro context.
This is what the instructions 'put_copy ...' are doing. However these instructions
are specialised depending on the type of the datum to be put. The function
'get_closure_copy' returns the list of these instructions. It receives the description
of the micro context of the closure to be constructed, and the current context
(description of the current stack).
When this is done, the closure is almost ready. Only the code of the function and the
code for garbage collection of the closure are missing. These codes are put into the
closure by the instruction 'put_closure_labels'.
Of course, these two codes must also be generated by the compiler. The code of the
function is generated as follows:
label a:
[<body>]<args>+(<micro_context>)+(ret . k)
del args
del_closure
ret k
When a function represented by a closure is called, the stack is prepared as follows:
a_1 a_2 ... a_k c r ...
(the top of stack on the left side), where 'a_1' ... 'a_k' are the arguments (first
argument on top of stack), 'c' the closure itself, and 'r' the return address. Just
before the function returns the stack must become:
r ...
So the body of the function is compiled relative to a context (denoted
<args>+(<micro_context>)+(ret . k) above) representing the above stack. When this code
has finished, the value to be returned by the function is in R. At that point, we just
have to delete (virtually) the arguments and the closure from the stack, and return.
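The '3+n words' closure segment can be sketched as a C struct. The exact meaning and order of the three header words is an assumption on our part: the text only says that the segment has 3+n words and that 'put_closure_labels' fills in the addresses of the function code and of the GC code.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical layout of a closure segment: 3 header words followed
   by the n captured micro-context values (32-bit VM words). */
typedef struct {
    uint32_t cnt;     /* garbage collector counter (assumed word 0) */
    uint32_t code;    /* address of the function's code             */
    uint32_t gc_code; /* address of the closure's GC code           */
    uint32_t env[];   /* the n captured micro-context values        */
} ClosureSegment;
```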
*** (7.3.8) Applicative terms.
*** (7.3.9) Conditionals.
*** (7.3.10) Selective conditionals.
*** (7.3.11) Local definitions ('with ...').
*** (7.3.12) Delegate.
*** (7.3.13) Serialize/Unserialize.
*** (7.4) Peephole optimisation.
*** (8) Garbage collector code.
*** (9) Equality code.
*** (10) Serialisation/unserialisation.
Serialisation is also called 'marshalling'. Anubis 1 has a built-in
serialisation/unserialisation mechanism. In the next version of Anubis, this mechanism
should be written in Anubis using macros. It will be much simpler (and safer).
In principle any abstract datum could be serialised. In the current system, only the
data of the following types are serialisable:
- the primitive types: String, ByteArray, Int32, Float,
- defined types, provided that all their components are serialisable.
Functional types should in principle be serialisable, but they are not in Anubis 1.
This would require a more sophisticated mechanism, which also has something to do with
dynamically loadable libraries.
Serialisation is not performed by a piece of code, but by a piece of 'pseudo-code'.
The same 'pseudo-code' is used for serialising and for unserialising. The
'pseudo-instructions' have just a different execution. Hence the compiler produces (if
needed) only one such pseudo-code per serialisable type.
Actually the pseudo-code is essentially a description of the type of the data to be
serialised/unserialised.
Note (2006/08/16): I just noticed, looking at the '.sc' file of the web site, that the
compiler generates pseudo code paragraphs like this one:
---------|--------------------------------------------------------
|
| * * * type implementation pseudo code * * *
|
42901 | label 590
| ;--- stack ------------------------
| ; 0 Var(List(Int32))
| ; 1 (return address)
| ;----------------------------------
42901 | type_large_switch 1336
42907 | label 1336
42907 | large_alt_begin 9
42909 | indirect_type_mixed 2 591
42915 | pop1
42916 | large_alt_end 0
42918 | jmp 1335
42923 | label 1335
42923 | swap
42924 | ret 1
---------|--------------------------------------------------------
Strange, isn't it? 'Var(...)' is not serialisable! However, I also checked that such
pieces of code are never called (by 'serialize/unserialize' instructions). I also
checked with the following example:
global define One
truc
(
List(String) args
) =
forget(serialize(var(args))).
that the compiler doesn't see the problem. This is probably due to the very bad
implementation of 'Var(...)' which is implemented as a large type, not as a 'Var'
type. This must be fixed as soon as possible. The best way is to implement 'Var(...)'
correctly.
*** (10.1) How it works.
Serialising means 'transforming into a sequence of bytes'. This sequence of bytes (a
byte array) may be saved on disk, sent over a network connection etc... Of course,
serialisation is interesting because it doesn't lose information. Serialised data may
be unserialised.
Now, when we are unserialising a byte array, expecting a datum of a certain type, we
are not always sure that the byte array is the result of the serialisation of a datum
of this type. Hence, unserialisation may produce errors. This is why the primitive
'unserialize' returns a datum of type 'Maybe($T)'.
Notice that when a datum is serialised, its type is not written into the serialisation.
Hence, when unserialising a datum we need to know the type of the datum. In other
words, the type is the 'key' (or the 'grammar') for unserialising.
Serialisation is fairly simple. It works as follows:
- String: the characters of the string, followed by a zero byte are written.
- ByteArray: a 4 bytes word is written in little-endian order, representing the
number of bytes in the byte array. Then the bytes of the byte array are written.
- Int32: The integer is written as a little endian 4 bytes number.
- Float: The floating point number is written in the IEEE754 8 bytes format.
- small type: The datum is written either as a 0 byte, 1 byte, 2 bytes or 4 bytes
word.
- mixed type: The index of the alternative is first written in the form of a
byte. Then the components are serialised. But in the case of a small alternative, the
components are serialized as a 'small datum', i.e. a 32 bits word containing all the
datum of the alternative (index included). As a consequence, in this case the index
is redundantly serialized. See the pseudo-instruction 'indirect_type_mixed' for more
information.
- large type: Same as for mixed types.
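The primitive cases of this wire format can be sketched in C. The function names are our own, 'buf' stands in for 'serial_buf' and '*next' for 'serial_next', and bounds checking is omitted:

```c
#include <stdint.h>
#include <string.h>

/* Int32: written as a little-endian 4 bytes number. */
static void put_Int32(uint8_t *buf, uint32_t *next, int32_t v) {
    for (int i = 0; i < 4; i++)
        buf[(*next)++] = (uint8_t)(((uint32_t)v >> (8 * i)) & 0xff);
}

/* String: the characters of the string, followed by a zero byte. */
static void put_String(uint8_t *buf, uint32_t *next, const char *s) {
    size_t len = strlen(s);
    memcpy(buf + *next, s, len + 1);  /* len + 1 includes the 0 byte */
    *next += (uint32_t)(len + 1);
}
```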
The result of serialisation is a byte array which needs to be allocated before
serialisation begins. The virtual machine has special registers for serialisation:
U8 *serial_buf; buffer for serialisation
U32 serial_size; size of this buffer
U32 serial_next; next free position in the buffer
Bool unserial_failed; flag for unserialisation
When serialisation begins (i.e. when the instruction 'serialize' is executed), the
datum to be serialised is on top of the stack. The instruction 'serialize' has an
address operand pointing to the pseudo code for the type of this datum. This
instruction works like a call (it calls the pseudo code), hence must push a return
address. However, the return address is inserted just below the datum to be
serialized. The instruction also changes the 'work sort' of the virtual machine from
'computing' to 'serializing', and allocates a data segment of some default size whose
address is put in 'serial_buf'. 'serial_size' is initialized to the size of this
segment, and 'serial_next' is initialized to 8 = 4+4, so as to accommodate room for the
garbage collector counter and the word for the size of the resulting byte array. Of
course, if the segment cannot be allocated, the instruction gives up, waiting for next
time.
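The register setup performed by 'serialize' can be sketched as follows. The default buffer size and the function name are our own guesses; the real instruction also pushes a return address and jumps to the pseudo code, which this sketch leaves out:

```c
#include <stdint.h>
#include <stdlib.h>

/* The serialisation registers named in the text. */
static uint8_t *serial_buf;
static uint32_t serial_size, serial_next;

#define DEFAULT_SERIAL_SIZE 1024  /* "some default size" (our guess) */

/* Returns 0 when the segment cannot be allocated, in which case the
   instruction gives up and will be retried later. */
static int begin_serialize(void) {
    serial_buf = malloc(DEFAULT_SERIAL_SIZE);
    if (!serial_buf) return 0;
    serial_size = DEFAULT_SERIAL_SIZE;
    serial_next = 8;  /* 4 bytes for the GC counter + 4 for the size */
    return 1;
}
```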
From now on, we are serializing, executing some pseudo-code. This pseudo-code ends with
the instructions 'revert_to_computing' and 'ret'.
'revert_to_computing' changes the work sort from 'serializing' to 'computing',
reallocates the resulting byte array at the right size, puts the size at the right
place in the byte array, puts the new byte array in R, and a null pointer in
'serial_buf'.
'ret' returns from the pseudo-code to the normal code.
Unserializing works as follows.
The byte array 'b' to be unserialized is in R. The 'unserialize' instruction has an
address as its unique operand:
unserialize <addr>
It is the address of a piece of pseudo-code, namely the pseudo-code for the type 'T' of
the datum to be extracted from the byte array.
What 'unserialize' does is just:
1. put (the address of) 'b' into the 'serial_buf' register, and
initialize related registers ('serial_size' and 'serial_next'),
2. put 0 in the flag 'unserial_failed'
3. push a return address 'r',
4. put the machine in the state 'unserializing',
5. jump to <addr>
Hence, just after 'unserialize' has executed, we have:
R stack serial_buf unserial_failed
(empty) r ... b 0
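The five steps of 'unserialize' can be sketched in C. The explicit stack, the returned jump address, and the assumption that the byte array stores its size word at offset 4 with the data starting at offset 8 (matching the layout set up by 'serialize') are our own simplifications:

```c
#include <stdint.h>
#include <string.h>

enum WorkSort { COMPUTING, SERIALIZING, UNSERIALIZING };

/* Machine registers named in the text. */
static uint8_t      *serial_buf;
static uint32_t      serial_size, serial_next;
static int           unserial_failed;
static enum WorkSort work_sort = COMPUTING;

/* Sketch of the five steps.  'b' is the byte array, 'r' the return
   address to push, 'addr' the address of the pseudo code. */
static uint32_t exec_unserialize(uint8_t *b, uint32_t *stack, int *top,
                                 uint32_t r, uint32_t addr) {
    serial_buf = b;                    /* step 1: related registers */
    memcpy(&serial_size, b + 4, 4);    /*   (size word at offset 4) */
    serial_next = 8;                   /*   data starts at offset 8 */
    unserial_failed = 0;               /* step 2: clear the flag    */
    stack[++*top] = r;                 /* step 3: push return addr  */
    work_sort = UNSERIALIZING;         /* step 4: change work sort  */
    return addr;                       /* step 5: jump to <addr>    */
}
```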
The execution of the pseudo code yields one of the following two situations:
case 1: unserialization succeeded
R stack serial_buf unserial_failed
d ... b 0
case 2: unserialization did not succeed
R stack serial_buf unserial_failed
d ... b 1
In both cases, we get a datum 'd' of type 'T'. However,
- if 'unserial_failed' is 0, d is the datum successfully extracted from the byte
array,
- if 'unserial_failed' is 1, d is a valid datum of type T, at least from the point of
view of the garbage collector. Nevertheless, it is not significant.
The return address 'r' has been removed from the stack. The byte array 'b' is still in
the 'serial_buf' register. At that time, the machine is still in the 'unserializing'
state. The next instruction:
revert_to_computing <width>
is responsible for testing 'unserial_failed', virtually deleting 'b', putting the
machine in the state 'computing', and:
case 1 (unserial_failed == 0):
- put 'd' into a 'success', so as to construct a datum of type 'Maybe(T)', and push a
zero on the stack. Hence, the situation becomes:
R stack serial_buf unserial_failed
success(d) 0 ... (empty) (empty)
case 2 (unserial_failed == 1):
- push 'd' on the stack, and put a zero ('failure' of type 'Maybe(T)') in R. Hence,
the situation becomes:
R stack serial_buf unserial_failed
failure d ... (empty) (empty)
The next instruction is:
del_stack_? 0
which will correctly delete the top of stack, hence yielding in any case the following
situation:
R stack serial_buf unserial_failed
result ... (empty) (empty)
where result is of type 'Maybe(T)'.
So, if the initial situation is:
R stack
(empty) ...
after execution of:
[term]
unserialize <addr>
revert_to_computing <width>
del_stack_? 0
it becomes:
R stack
d ...
where 'd' is the result of the unserialization (of type 'Maybe(T)').
Notice that the <width> operand is used by 'revert_to_computing' in order to construct
'success(d)'. Indeed,
- if <width> <= 31, success(d) is just (d<<1)|1
- if <width> >= 32, success(d) requires a new data segment of two words, with:
- at offset 0: counter (= 1)
- at offset 4: d
and the pointer to this data segment must be ORed with 1 (mixed index for type
'Maybe').
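The two cases can be sketched as a C function. The function name is our own, allocation-failure handling is omitted, and we use 'uintptr_t' so that the sketch also runs on 64-bit hosts (on the real 32-bit VM these are plain 32-bit words):

```c
#include <stdint.h>
#include <stdlib.h>

/* Sketch of how 'revert_to_computing' builds success(d).
   Returns the new value of R. */
static uintptr_t make_success(uint32_t d, int width) {
    if (width <= 31)
        return ((uintptr_t)d << 1) | 1;  /* small case: (d<<1)|1     */
    uint32_t *seg = malloc(2 * sizeof *seg);
    seg[0] = 1;                          /* offset 0: counter (= 1)  */
    seg[1] = d;                          /* offset 4: d              */
    return (uintptr_t)seg | 1;           /* pointer ORed with mixed  */
}                                        /* index 1 of 'Maybe'       */
```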
To summarize, the code for (unserialize <lc> <type> . <term>) is:
[<term>]
unserialize <addr>
revert_to_computing <width>
<end code>
except if the declared type is ByteArray. In that case, we have:
[<term>]
success 32 needed to put the byte array into a 'success'
<end code>
*** (10.2) Descriptions of small types.
They are lists of integers defined by the following grammar:
<na> := 8 bits integer (number of alternatives)
<iw> := 8 bits integer (index width)
<nc> := 8 bits integer (number of components)
<small type descr> := <iw> <na> <alt descr> ... (as many <alt descr> as <na>)
<alt descr> := <nc> <small type descr> ... (as many as <nc>)
Hence a small type description is a list of 8 bits integers. These integers become part
of serialization/unserialization instructions like 'type_8', 'type_16' etc... (see
'vminstr.c'). One question is: how long may this description be? The answer is not
so obvious! Remember that the number of alternatives is limited to 256, and likewise
for the number of components in an alternative. Now, if 'iw' is the number of bits
required for the index, the total width of the components is at most 'bw - iw', where
'bw' is the bit width of our type. So, let 'dl(iw,bw)' be the maximal length for a
description of a type of index width 'iw' and bit width 'bw'. We have,
dl(iw1,bw1) <= 2 + (2^iw1 * (1 + dl(iw2,bw1 - iw1)))
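Two worked examples, derived mechanically from the grammar above. We assume Bool is a small type with two alternatives and no components, and Maybe(Bool) has a 'failure' alternative (no components) and a 'success' alternative with one Bool component; the index widths (iw = 1) are our assumption.

```c
#include <stdint.h>

/* Bool:  <iw>=1  <na>=2  then two <alt descr>, each with <nc>=0. */
static const uint8_t descr_Bool[] = { 1, 2, 0, 0 };

/* Maybe(Bool):  <iw>=1  <na>=2
     alt 'failure': <nc>=0
     alt 'success': <nc>=1, one component = the description of Bool. */
static const uint8_t descr_Maybe_Bool[] = { 1, 2, 0, 1, 1, 2, 0, 0 };
```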
*** (10.3) Pseudo-instructions set.
The official list of pseudo-instructions is defined in 'bytecode.h'. Recall that these
instructions have two meanings: one for serialisation and one for unserialisation.
action while action while
serialising unserialising
---------------------------------------------------------------------------------------
swap swaps the top of stack nothing
type_0 nothing nothing
type_8
type_16
type_32
type_String
type_ByteArray
type_Int32
type_Float
indirect_type_0
indirect_type_8
indirect_type_16
indirect_type_32
indirect_type_String
indirect_type_ByteArray
indirect_type_Int32
indirect_type_Float
indirect_type_mixed
indirect_type_large
type_mixed_switch
type_large_switch
mixed_alt_begin
large_alt_begin
mixed_alt_end
large_alt_end
type_small_alt
revert_to_computing
type_mixed (obsolete; never used)
type_large (obsolete; never used)
small_mixed_alt (obsolete; never used)
*** (10.4) Serialising.
*** (10.5) Unserialising.
The machine is now doing unserialization. When unserialization begins (just after
the instruction 'unserialize'), the machine is in the following state:
R stack serial_buf unserial_failed
(empty) r ... b 0
where 'r' is a return address (to be removed by the 'ret' at the end of the pseudo
code), and where 'b' is a byte array.
During unserialization, the registers serial_buf and serial_size remain constant. On
the contrary, serial_next is incremented while the byte array is read. It always
gives the offset (relative to serial_buf) of the next byte to be read.
Below, we simulate a successful unserialization for various types. The instruction
'swap' does nothing during unserialization.
Below, 'p' is a pointer to a newly allocated segment, whose size (in bytes) is given
by the instruction ?_alt_begin. 'e' is the result of unserialization when
instructions call another pseudo code. 'pop1' does not pop anything, but puts the
content of R at the address pointed to by the top of stack, and increments that
pointer. 'indirect' instructions which do not call another pseudo code put their
result at the address pointed to by the top of stack.
small type: Bool
R stack
50 | label 20 (empty) r ...
50 | type_8 d r ...
51 | swap
52 | (ret . 1) d ...
primitive type: String
R stack
50 | label 4 (empty) r ...
50 | type_String d r ...
51 | swap
52 | (ret . 1) d ...
mixed type: List(String)
R stack
50 | label 15 (empty) r ...
50 | (type_mixed_switch 2 30 29)
61 | label 30 (empty) r ...
61 | type_32 d r ...
62 | (jmp . 28)
67 | label 29
67 | (mixed_alt_begin . 12) (empty) p+4 p r ...
69 | indirect_type_String (empty) p+8 p r ...
70 | (indirect_type_mixed 2 . 15) e p+8 p r ...
76 | pop1 (empty) p+12 p r ...
77 | (mixed_alt_end . 1) d = p|1 r ...
79 | (jmp . 28)
84 | label 28
84 | swap d r ...
85 | (ret . 1) d ...
large type: Printable_tree (defined in basis.anubis)
R stack
130 | label 20 (empty) r ...
130 | (type_large_switch 5 42 41 40 39 38)
152 | label 42
152 | (large_alt_begin . 5) (empty) p+5 p r ...
154 | (large_alt_end . 0) d = p|0 r ...
156 | (jmp . 37)
161 | label 41
161 | (large_alt_begin . 13) (empty) p+5 p r ...
163 | indirect_type_String (empty) p+9 p r ...
164 | (indirect_type_large . 20) e p+9 p r ...
169 | pop1 (empty) p+13 p r ...
170 | (large_alt_end . 1) d = p|1 r ...
172 | (jmp . 37)
177 | label 40
177 | (large_alt_begin . 10) (empty) p+5 p r ...
179 | indirect_type_8 (empty) p+6 p r ...
180 | (indirect_type_large . 20) e p+6 p r ...
185 | pop1 (empty) p+10 p r ...
186 | (large_alt_end . 2) d = p|2 r ...
188 | (jmp . 37)
193 | label 39
193 | (large_alt_begin . 13) (empty) p+5 p r ...
195 | indirect_type_Int32 (empty) p+9 p r ...
196 | (indirect_type_large . 20) e p+9 p r ...
201 | pop1 (empty) p+13 p r ...
202 | (large_alt_end . 3) d = p|3 r ...
204 | (jmp . 37)
209 | label 38
209 | (large_alt_begin . 13) (empty) p+5 p r ...
211 | (indirect_type_large . 20) e p+5 p r ...
216 | pop1 (empty) p+9 p r ...
217 | (indirect_type_large . 20) e p+9 p r ...
222 | pop1 (empty) p+13 p r ...
223 | (large_alt_end . 4) d = p|4 r ...
225 | (jmp . 37)
230 | label 37
230 | swap d r ...
231 | (ret . 1) d ...
Hence, instructions work as follows during unserialization (assuming no error occurs).
'direct type' instructions:
type_0
type_8
type_16
type_32
type_Int32
type_String
type_ByteArray
type_Float
--> unserialize a datum (by reading the byte array) and put it in R
'switch' instructions:
type_mixed_switch
type_large_switch
--> unserialize an index (always a byte), check it using the second operand (number of
alternatives), and jump to the right label.
'alt_begin' instructions:
mixed_alt_begin
large_alt_begin
--> allocate a segment, push the pointer, duplicate it, and add either 4 or 5 to the
copy.
'indirect non calling type' instructions:
indirect_type_0
indirect_type_8
indirect_type_16
indirect_type_32
indirect_type_Int32
indirect_type_String
indirect_type_ByteArray
indirect_type_Float
--> unserialize a datum and put it at the address pointed to by pointer on top of
stack. Increment this pointer by the size of the datum.
'indirect calling type' instructions:
indirect_type_mixed
indirect_type_large
--> Call a pseudo code. This has the effect of putting an unserialized datum in R
upon return.
'pop1'
--> put the content of R at the address pointed to by the top of stack, and increment
that pointer (always by 4).
'alt_end' instructions:
mixed_alt_end
large_alt_end
--> remove the pointer on top of stack, put the next one in R, and OR it with the
index (given by the instruction itself).
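On the successful path, 'pop1' and the 'alt_end' instructions can be sketched in C. The explicit stack representation is ours, and we use 'uintptr_t' so the sketch runs on 64-bit hosts (the real VM uses 32-bit words):

```c
#include <stdint.h>

/* pop1: write R at the address on top of stack, then advance that
   pointer by 4 (one word).  It does not pop anything. */
static void pop1(uintptr_t *tos, uint32_t R) {
    *(uint32_t *)*tos = R;
    *tos += 4;
}

/* alt_end: remove the write pointer on top of stack, put the base
   pointer below it in R, and OR it with the alternative's index. */
static uintptr_t alt_end(uintptr_t *stack, int *top, unsigned index) {
    --*top;                         /* drop the write pointer        */
    return stack[(*top)--] | index; /* base pointer ORed with index  */
}
```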
Now, unserialization may fail, because the bytes in the byte array are arbitrary. As
soon as such a failure is detected, the flag 'unserial_failed' is set to 1. In that
case, we must continue a 'fake' unserialization, because we must complete partially
constructed data. This is the reason why all instructions must first test the flag.
If the flag is 0, they operate as described above. Nevertheless, since they may detect
a problem, they may also put 1 in the flag.
'Fake execution' of the above instructions is as follows:
'direct type' instructions: put a value of 0 in R as the result of the unserialization
'switch instructions': jump to the first alternative, which always exists
'alt_begin' instructions: do not allocate a segment, but do as if, by pushing a NULL
pointer and duplicating it (hence we have two NULL pointers
on top of the stack)
'indirect non calling type' instructions: there are two cases:
- the two pointers on top of stack are NULL:
do not do anything
- the two pointers on top of stack are non NULL:
put a value of zero as the result of unserialization, and increment the topmost
pointer by the right value.
'indirect calling type' instructions: again there are two cases:
- the two pointers on top of stack are NULL:
do not do anything
- the two pointers on top of stack are non NULL:
do not call any pseudo code, and put 0 in R.
'pop1': two cases:
- the two pointers on top of stack are NULL:
do not do anything
- the two pointers on top of stack are non NULL:
put 0 at the address pointed to by the top of stack, increment the pointer.
'alt_end' instructions: two cases:
- the two pointers on top of stack are NULL:
remove the two NULL pointers and put a NULL pointer in R,
- the two pointers on top of stack are non NULL:
remove the topmost pointer, and put the other one in R, and OR it with the index.
Notice that any instruction which detects a failure, and puts 1 in the flag, should do
as if the flag were already set.
Finally, we must also discuss 'revert_to_computing'. After the unserialization, the state of
the machine is:
R stack serial_buf unserial_failed
d ... b 0/1
where 'd' is a datum, which is significant only if unserial_failed is 0.
In any case, the byte array in serial_buf must be virtually deleted, and a zero must
be put in serial_buf (this will be checked by 'unserialize'/'serialize').
If unserial_failed is:
0 then construct 'success(d)' in R (use the <width> operand to know the
bit width of the type of d), and push a zero on the stack, so that
the next instruction (del_stack_?) may operate.
1 then push R on the stack (it will be deleted by the next instruction),
and put 0 (failure of type 'Maybe(T)') in R.
Note: the sequence of instructions:
unserialize <addr>
revert_to_computing <width>
is always followed by:
del_stack_? 0 (for the type of the unserialized datum)
*** (11) Outputting the code (making an *.adm file).
*** (12) How to make a system of macros?
*** (12.1) Overview.
Anubis has no system of macros, but should have one. This would (as already remarked)
induce many simplifications in the system, and at the same time provide powerful
programming tools for the user. The best system of macros ever created is that of the
Lisp language. This system was natural in Lisp because of the particularly simple form
of its syntax. Of course, the syntax of Anubis is much more complex, but as we shall
see, constructing a system 'à la Lisp' is still possible.
What is a language good for? It is good for representing 'things' (belonging to some
'target' world) by 'expressions'. For example, the natural language represents things
of the real world by words, sentences, etc... but also abstract things like statements,
etc... Of course, each language has a syntax (grammar) but also a semantics. The
'semantics' is the correspondence between the expressions (the 'signifiers') and the
things they represent (the 'signified').
the semantics
|
symbolic side (language) | semantic side (target world)
-------------------------------------+----------------------------------------
|
|
expression |----------------------------> thing
(signifier) | (signified)
|
Note that in most cases (programming languages, mathematics, but not the natural
language), the symbolic side is more concrete (less abstract) than the semantic
side. Indeed, for example in mathematics, we have many expressions for representing the
number (say) 2, like 2, 1+1, 5-3, etc... and these expressions may be manipulated using
computation rules. So, this is in some sense very concrete. On the contrary, the
signified, i.e. the mathematical object 2, is a complete abstraction by its very
nature.
So, we use the language for manipulating the things (or at least we give ourselves the
illusion that we manipulate the things, because what we actually manipulate are the
expressions, for example when we are computing). This leads us in some circumstances to
write down the same expressions, or at least similar expressions, several times. So, we
may want to make this writing automatic. In order to do that, we need to consider the
expressions themselves as a new sort of things (the 'meta-things') and to create a new
language (the 'meta-language') for manipulating the meta-things:
meta-semantics semantics
meta-language | language | target world
-------------------------+---------------------------+--------------------------
| |
meta-expression |-----------> expression |--------------> thing
| |
| |
In most cases, the meta-language is different from the language and simpler. This
is the case for the C language, where the expressions of the meta-language are all
those which begin with '#define', '#if', etc... (a very poor language actually).
However in Lisp, the meta-language is Lisp itself. The creators of Lisp were truly
inspired when they did that. This has several consequences:
1. The meta-language is as powerful as the language itself.
2. The language is also the meta-meta-language, the meta-meta-meta-language, etc...
Fans of Lisp (like me) know how powerful this feature is.
*** (12.2) The trick.
In order to have the language be its own meta-language, it is enough to embed the
language into its own target universe. In other words, it should be possible to
consider all expressions of the language as data of the target universe:
+------------------------------------------------+
| |
| target universe (all sorts of data) |
| |
| |
| |
| |
| |
| +--------------------------+ |
| | | |
| | the language | |
| | | |
| | (a subset of the | |
| | universe) | |
| | | |
| | | |
| +--------------------------+ |
| |
| |
| |
+------------------------------------------------+
Hence the language, being able to represent all data, may also represent the
expressions of the language itself. It may also be able to represent expressions which
themselves represent expressions and so on, so that the language is at the same time
the meta-language, the meta-meta-language, and so on... This is the big trick.
*** (12.3) How it works.
Basically, we need only a set of types (called 'syntactical types') for representing
expressions, and an operator (named the 'expansion' operator) for forcing the compiler
to evaluate data of these types. We may also have a 'quoting' operator for simplifying
the writing of data of these types.
One of the main types for representing expressions in Anubis should be a type of
'terms':
public type Term:
symbol(String name),
applicative(Term function, List(Term) arguments),
function(List(Declaration) arguments, Term body),
conditional(Term test, List(Case) cases),
...
The quoting operator (denoted: ' ) should simplify the writing of data of type
'Term'. For example, instead of writing:
conditional(symbol("x"),
[
case([("failure",[])],symbol("false")),
case([("success",[symbol("u")])],
applicative(symbol("f"),[symbol("u")]))
]
)
we would have just to write this:
'if x is
{
failure then false,
success(u) then f(u)
}
(Note the quoting mark in front of this term.) The above term is not of type 'Bool'
but just of type 'Term'. Thus, the quoting mark just means: 'don't interpret this
term, just consider it as a term'.
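To make this representation concrete, here is a minimal sketch in Python (not Anubis;
the class names merely mirror the alternatives of 'Term' above, with dataclasses
standing in for the sum type) of the quoted conditional written out as a datum:

```python
from dataclasses import dataclass
from typing import List, Tuple

class Term:
    """Base class playing the role of the Anubis sum type 'Term'."""

@dataclass
class Symbol(Term):
    name: str

@dataclass
class Applicative(Term):
    function: Term
    arguments: List[Term]

@dataclass
class Case:
    # a pattern is an alternative name together with its bound variables
    pattern: Tuple[str, List[Term]]
    body: Term

@dataclass
class Conditional(Term):
    test: Term
    cases: List[Case]

# the datum that the quoted term
#   'if x is { failure then false, success(u) then f(u) }
# abbreviates:
quoted = Conditional(
    Symbol("x"),
    [
        Case(("failure", []), Symbol("false")),
        Case(("success", [Symbol("u")]),
             Applicative(Symbol("f"), [Symbol("u")])),
    ],
)
```

The point of the quoting mark is precisely to spare the programmer this verbose
constructor syntax.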
Now, we may write down functions for producing terms (of type 'Term'), type definitions
and all sorts of 'syntactical concepts', considered as data of our semantical
universe. From now on, the expressions of the language may be manipulated as any other
kind of data.
Manipulating expressions as data is called 'meta-programming'. But this is not the
whole story, because the destiny of a datum of type 'Term' (or of some similar type)
is in the end to be interpreted (evaluated).
When we want to evaluate a datum (say 't') of type 'Term', we just have to write:
@t
'@' is the 'expansion' (or 'evaluation') operator. In some sense it is the converse of
quoting. It transforms a datum of type (say) 'Term' into what this datum represents
when we consider it as an expression.
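Python's standard 'ast' module offers a rough analogue of this quote/expand pair,
which may help fix ideas (a sketch only, not Anubis: 'ast.parse' plays the role of
quoting, turning source text into a datum of a syntactical type, and 'compile' plus
'eval' together play the role of '@'):

```python
import ast

# 'quoting': the expression becomes a datum of a syntactical type
# (ast.Expression), not the number it denotes
quoted = ast.parse("1 + 2 * 3", mode="eval")
assert isinstance(quoted, ast.Expression)

# 'expansion': force the evaluation of the datum, as '@' would
value = eval(compile(quoted, "<meta>", "eval"))
print(value)  # -> 7
```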
*** (12.4) An example.
The most difficult part is designing the types: 'TypeDefinition', 'Definition', 'Term',
'Case', etc. They should be a perfect image of the whole language. As an example, we
consider only a small fragment of this family.
public type Type:
scheme (String name, List(Type) operands),
functional (List(Type) source_types, Type target_type),
...
public type Component:
anonymous (Type type),
named (Type type, String name).
public type Alternative:
singleton (String name),
usual (String name, List(Component) components).
public type TypeDefinition:
public (String name,
List(String) parameters,
List(Alternative) alternatives),
private (String name,
List(String) parameters,
List(Alternative) alternatives).
public type Paragraph:
type_definition(TypeDefinition),
...
Now, we may want to define this:
define TypeDefinition
enumerated_type
(
String type_name
List(String) alternative_names
) =
private(type_name,
[],
map((String n) |-> singleton(n), alternative_names)).
define List(String) my_names = ["ga","bu","zo", "meu"].
@enumerated_type("Shadok",my_names).
In this last paragraph (a 'macro-paragraph', beginning with '@'), the term:
enumerated_type("Shadok",my_names)
is of type 'TypeDefinition'. Because of the presence of '@', the compiler will expand
it into an actual paragraph, namely this one:
type Shadok:
ga,
bu,
zo,
meu.
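The two phases of such a macro can be sketched in Python (hypothetical helpers; plain
tuples stand in for the 'TypeDefinition' constructors): 'enumerated_type' builds the
datum, and an 'expand' function renders it back into concrete syntax, which is the
job of '@':

```python
def enumerated_type(type_name, alternative_names):
    # builds the TypeDefinition datum:
    # private(name, [], [singleton(n) for each name n])
    return ("private", type_name, [],
            [("singleton", n) for n in alternative_names])

def expand(type_def):
    # renders the datum as an actual type-definition paragraph (the job of '@')
    _tag, name, _params, alternatives = type_def
    body = ",\n".join("  " + n for (_kind, n) in alternatives)
    return "type " + name + ":\n" + body + "."

print(expand(enumerated_type("Shadok", ["ga", "bu", "zo", "meu"])))
```

Printed, this yields the 'type Shadok: ...' paragraph shown above.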
The syntax of 'macro-paragraphs' is:
@t
(with '@' in the leftmost column) where t is a term of type 'Paragraph' or
'List(Paragraph)'. In the case of a list of paragraphs, the compiler expands the term
into a sequence of paragraphs.
Of course, the expansion operator should also be usable for the expansion of terms (and
any other sort of syntactical entity), not only for paragraphs. For example, we may
write something like this:
define Term
my_cond
(
Term test
List(String) alts
) =
conditional(test,
map((String n) |-> singleton_case(n,length(n)),
alts)).
Now, we may write (within a paragraph):
@my_cond(_x,my_names)
which will be expanded into the term:
if x is
{
ga then 2,
bu then 2,
zo then 2,
meu then 3
}
where 'x' is the expansion of '_x'.
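The same two-phase reading applies to 'my_cond'. A Python sketch (tuples again stand
in for the term constructors, and 'singleton_case(n, length(n))' becomes a pair
holding the literal length of the alternative name):

```python
def my_cond(test, alts):
    # builds a conditional term: one case per alternative name, whose body
    # is the literal length of that name (cf. singleton_case(n, length(n)))
    return ("conditional", test,
            [((n, []), ("literal", len(n))) for n in alts])

term = my_cond(("symbol", "x"), ["ga", "bu", "zo", "meu"])
print(term[2][-1])  # -> (('meu', []), ('literal', 3))
```

Expanding this term with '@' would then produce the 'if x is { ... }' conditional
shown above.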
The expansion may also be implicit. For example, if a term has an interpretation whose
type is a syntactical type, the compiler may also try to interpret its expansion.
Conversely, the quoting may also be implicit. However, this may lead to problems and
should be experimented with.