[smlnj] View of /sml/trunk/src/ml-nlffi-lib/Doc/mini-tutorial.txt
Revision 831
Tue May 22 20:29:39 2001 UTC (19 years, 6 months ago) by blume
File size: 32297 byte(s)
ml-nlffigen bug fix; mini-tutorial

ML-NLFFI library and ML-NLFFIGEN glue code generator
====================================================

A very incomplete introduction
(by Matthias Blume (blume@research.bell-labs.com))


The new NLFFI ("no-longer foreign function interface") is based on the
idea of data-level interoperability:  ML code (a mixture of
pre-defined code imported from $/c.cm, code generated by ml-nlffigen,
and code the user writes) operates directly on C data structures
without any marshalling/unmarshalling.  There are no C stub routines
(no C glue code at all), and very little code on the ML side, just
enough to deal with "new" types (struct/union), with generating code
for C function calls, and with dynamic linking.

There are three libraries that are part of ml-nlffi-lib, accessible
from CM as $/c.cm, $/c-int.cm, and $/memory.cm, but a user of this FFI
only needs one: $/c.cm.

Library $/c.cm implements an encoding of the C type system in ML
types.  This is exported as structure C.  Moreover, there is a
structure DynLinkage that handles dynamic linking.
For details on structure C, see src/ml-nlffi-lib/c.sig.

Thanks to ML's type inference, it is usually not necessary to spell out
many (if any) of the (rather complicated!) types exported by structure
C to be able to use this FFI.

Conversely, at least in theory, if you are a competent ML programmer
but don't know C, then you could simply run the C code through
ml-nlffigen and read the signatures it produces...

--------------------------------------------------------------------

An example:

Suppose you have a shared library nodelist.so that exports a global
function to generate lists of nodes.  A C header file node.h explains
the interface:

   struct node {
     int i;
     struct node *next;
   };

   /* produce n-element node list where first node's i is first,
    * and where (x->next->i - x->i == incr) for all nodes except the
    * last: */
   struct node *gen (int n, int first, int incr);
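
The tutorial does not show the C side of nodelist.so.  For readers who
want to try the example, here is a minimal sketch of what gen might
look like.  It assumes malloc-based allocation; the helper free_list
is hypothetical and not part of node.h.

```c
#include <stdlib.h>

struct node {
  int i;
  struct node *next;
};

/* Hypothetical helper (not declared in node.h): release a whole list. */
void free_list (struct node *l)
{
  while (l != NULL) {
    struct node *next = l->next;
    free (l);
    l = next;
  }
}

/* Build an n-element list whose first node holds `first` and where
 * each successor holds its predecessor's value plus `incr`.
 * Returns NULL for n <= 0 or when an allocation fails. */
struct node *gen (int n, int first, int incr)
{
  struct node *head = NULL;
  struct node **tail = &head;   /* where the next node gets linked in */
  int k;

  for (k = 0; k < n; k++) {
    struct node *x = malloc (sizeof *x);
    if (x == NULL) {
      free_list (head);         /* do not leak a partial list */
      return NULL;
    }
    x->i = first + k * incr;
    x->next = NULL;
    *tail = x;
    tail = &x->next;
  }
  return head;
}
```

Compiled as a shared object (for example with a compiler's -shared
option), this would provide the symbol that the generated ML code
looks up through the library handle.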

We run this header file through our FFI generator:

   $ ml-nlffigen node.h

The result is a new CM library described by node.h.cm (which, in turn,
is implemented by node.h.sig and node.h.sml).  The library exports a
structure Node which contains a functor NodeFn.  We need to write some
ML glue code to instantiate this functor.  In simple cases like ours,
the only argument the functor needs is a handle on the dynamic library
(nodelist.so).  So we make a file node-glue.sml and write:

  structure Node =
    Node.NodeFn (val library = DynLinkage.open_lib { name = "./nodelist.so",
                                                     global = true,
                                                     lazy = true })

[Structure DynLinkage is an interface to dlopen/dlsym.  To get access
to symbols that are already linked into the main program (i.e., SML/NJ's
runtime system), use DynLinkage.main_lib.]

With this preparation we can now write a "client" module
(node-client.sml) that contains code to inspect results from calling
function gen.  As an example, let us write two functions "len" and
"sum" which calculate the length of a list and the sum of a list's
elements, respectively, as well as a procedure "incall" which
traverses a list and increments every "i":

structure NodeClient = struct
    fun len l =
        if C.Ptr.isNull l then 0
        else 1 + len (C.Get.ptr (Node.S_node.f_next (C.Ptr.|*| l)))

    fun sum l =
        if C.Ptr.isNull l then 0
        else let val n = C.Ptr.|*| l
                 val i = C.Cvt.ml_sint (C.Get.sint (Node.S_node.f_i n))
                 val next = C.Get.ptr (Node.S_node.f_next n)
             in
                 i + sum next
             end

    fun incall l =
        if C.Ptr.isNull l then ()
        else let val n = C.Ptr.|*| l
                 val iobj = Node.S_node.f_i n
                 val i = C.Cvt.ml_sint (C.Get.sint iobj)
                 val next = C.Get.ptr (Node.S_node.f_next n)
             in
                 C.Set.sint (iobj, C.Cvt.c_sint (i + 1));
                 incall next
             end
end

Notice how a combination of operators from the predefined structure C
(exported from $/c.cm) and operations from structure Node (resulting
from our instantiation of the ML-NLFFIGEN-generated functor
Node.NodeFn) was sufficient to traverse a C data structure, to inspect
its every detail, and to even modify it.

Here is the key to this code:

   ML                             C

   C.Ptr.isNull <ptr>             <ptr> == NULL
   C.Ptr.|*| <ptr>                *<ptr>
   Node.S_node.f_next <struct>    <struct>.next
   C.Get.<foo> <obj>              (lvalue in an rvalue context;
                                   this is a fetch from memory which in C
                                   happens implicitly when an lvalue turns
                                   into an rvalue)
   C.Set.<foo> (<obj>, <value>)   <lvalue> = <rvalue>;
   C.Cvt.ml_<foo> <value>         abstract C value -> concrete ML value
   C.Cvt.c_<foo> <value>          concrete ML value -> abstract C value
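
For readers more at home in C, the key can be applied in reverse: the
following sketch is a hand translation of len, sum, and incall back
into C.  It is written recursively to mirror the ML code; the function
names are those of the ML client, not symbols exported by nodelist.so.

```c
#include <stddef.h>

struct node {
  int i;
  struct node *next;
};

/* C.Ptr.isNull l                       ~  l == NULL
 * C.Get.ptr (f_next (C.Ptr.|*| l))     ~  (*l).next   */
int len (const struct node *l)
{
  if (l == NULL) return 0;
  else return 1 + len ((*l).next);
}

int sum (const struct node *l)
{
  if (l == NULL) return 0;
  else {
    int i = (*l).i;                        /* C.Get.sint (f_i n) */
    const struct node *next = (*l).next;   /* C.Get.ptr (f_next n) */
    return i + sum (next);
  }
}

void incall (struct node *l)
{
  if (l == NULL) return;
  else {
    int *iobj = &(*l).i;             /* the object (lvalue) for field i */
    int i = *iobj;                   /* fetch: C.Get.sint iobj */
    struct node *next = (*l).next;
    *iobj = i + 1;                   /* store: C.Set.sint (iobj, ...) */
    incall (next);
  }
}
```

Note that the conversions C.Cvt.ml_sint and C.Cvt.c_sint have no
counterpart here, because C has no distinction between an abstract
"sint" and a concrete integer type.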

We can wrap all this up and make it into a CM library (node.cm):

  Library
	structure Node
	structure NodeClient
  is
	$/basis.cm
	$/c.cm
	node.h.cm
	node-glue.sml
	node-client.sml

A better way of doing this -- automating the task of invoking
ml-nlffigen -- would be:

  Library
	structure Node
	structure NodeClient
  is
	$/basis.cm
	$/c.cm
	node.h : shell (target:node.h.cm
                        ml-nlffigen %s)
	node-glue.sml
	node-client.sml

-------------------------------------------------------------------------
Despite the fact that one usually does not need to deal with types
very much (thanks to ML's type inference), I will now briefly describe
the main ideas behind the types of the C module.  I will generally
omit the "C." prefix, assuming a global "open C" to be in effect.

   1. Objects:

   Objects describe locations in memory that hold values of some C
   type.  (This roughly corresponds to C's notion of lvalues, although
   not every object can appear on the left-hand side of an assignment
   operator.  For example, array objects cannot.)

   1.1 Object types:

   The ML type of objects is

        type ('t, 'f, 'c) obj

   Here, 't is a "phantom type" that describes the type of the value
   stored in the object, 'c is the "constness" of the object (i.e.,
   "ro" or "rw" --- depending on whether there was a "const" qualifier
   in the C declaration or not), and 'f is a typing artifact having to
   do with the treatment of function pointers.  (For objects where the
   instantiation of 't is not somehow based on a function pointer
   type, 'f will always be "unit".  For instances of 't that contain
   the type phrase F fptr, 'f is going to be instantiated to F.)

   1.2. Fetching and storing:

   For certain types 't, there are fetch and store operations for the
   corresponding objects.  See substructures "Get" and "Set".

   If a type T has fetch/store operations for (T, ?, ?) obj, then we
   call values of type T "first-class C values".  For first-class
   values, the phantom type coincides with the type of the value.  (For
   other (second-class) values, the phantom type is a true phantom
   type because there are no constructable values.  Second-class C
   values do not exist outside of their corresponding objects.)

   2. Base types:

   Base types to be substituted for 't and their corresponding C types
   are given below:

      ML         C

      schar      signed char
      uchar      unsigned char
      sint       signed int
      uint       unsigned int
      sshort     signed short
      ushort     unsigned short
      slong      signed long
      ulong      unsigned long
      float      float
      double     double
      voidptr    void *

   Notice that there is no equivalent for "void" since it is not a
   "true" type in C either but has many different meanings depending
   on the context where it is used.

   All types given above are abstract.  To convert to or from concrete
   ML types, use Cvt.ml_<foo> and Cvt.c_<foo>.  These routines exist
   for all of the above types except voidptr.  They convert to and
   from certain INTEGER, WORD, and REAL types which are collectively
   defined in structure MLRep.  For example, in the x86 version
   structure MLRep.SInt is the same as Int32, while MLRep.Float and
   MLRep.Double are both the same as Real64.  (Notice that the ML
   representation type for different C types can be the same, but the
   C types themselves are kept distinct to enforce a typing discipline
   that is equivalent to what a C compiler would do.)

   3. Pointers:

   Pointers are first-class C types.  Their ML type is

      type ('t, 'f, 'c) ptr

   A pointer of type (T, F, C) ptr points to an object of type
   (T, F, C) obj.  One can obtain the object by applying the Ptr.|*|
   operator.  Ptr.|&| goes the other way around.

   Pointers permit pointer arithmetic just like in C using Ptr.|+|
   (for adding an integer to a pointer) and Ptr.|-| (for subtracting
   two pointers).  A pointer can be injected into the voidptr domain
   using Ptr.inject.  (It can also be recovered (projected) from the
   voidptr domain using Ptr.project, but this requires run-time type
   information. See below.)

   Since they are first-class, pointers can be fetched from and stored
   into pointer objects (of type (('t, 'f, 'pc) ptr, 'f, 'c) obj,
   where 'pc is the constness of the object pointed to by the pointer
   and 'c is the constness of the object containing the pointer).

   The Ptr.sub operation is a shorthand for a combination of Ptr.|+|
   and Ptr.|*|.  (Or, alternatively, Ptr.|*| is the same as
   fn p => Ptr.sub (p, 0).)
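
   The C operations that these pointer operators model can be
   sketched as follows.  The function names ptr_sub and ptr_diff are
   made up for this illustration, and the element type is fixed to
   int for brevity.

```c
#include <stddef.h>

/* Models Ptr.sub (p, i): the object i elements past p, i.e. *(p + i).
 * A pointer is returned because a C function cannot return an lvalue;
 * the caller dereferences it. */
int *ptr_sub (int *p, ptrdiff_t i)
{
  return p + i;               /* same address as &p[i] */
}

/* Models Ptr.|-|: the distance between two pointers, in elements. */
ptrdiff_t ptr_diff (const int *p, const int *q)
{
  return p - q;
}
```

   With i = 0, ptr_sub returns p itself, which is the C counterpart
   of the remark that Ptr.|*| equals fn p => Ptr.sub (p, 0).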

   4. Arrays:

   Arrays are second-class values.  Their (phantom) type is

     type ('t, 'f, 'n) arr

   Here, 't is the type of the values stored in the array's individual
   elements, 'f here is the same as the 'f in the case of obj or ptr,
   and 'n is a type describing the size of the array.

   4.1.  Array dimensions:

   The Dim substructure defines an infinite family of types in such a
   way that there is a 1-1 correspondence between natural numbers and
   this family.  In particular, if a positive natural number is
   written in decimal and without leading zeros as <dn>...<d1><d0>,
   where <di> are decimal digits, then the corresponding Dim type is

      dec dg<dn> ... dg<d1> dg<d0> dim

   which happens to be an abbreviation for
      
      (dec dg<dn> ... dg<d1> dg<d0>, nonzero) dim0

   (In case you wonder: The type corresponding to 0 is (dec, zero) dim0.)

   The connection to array types is this:  An array of size N has type

      ('t, 'f, [N]) arr

   iff "[N] dim" is the type assigned to N by our Dim construction.

   Example (assume "open Dim"):

   The C type (int[312]) is encoded as

      (sint, unit, dec dg3 dg1 dg2) arr

   In other words, if you "squint away" the "dec", the "dg"s, and the
   spaces, then the array dimension gets spelled out in decimal.
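
   The encoding is mechanical enough to be spelled out as a small
   program.  The following sketch prints the Dim type for a given
   number; the function dim_type is hypothetical and exists only to
   illustrate the digit-by-digit construction described above.

```c
#include <stdio.h>

/* Write the Dim type that encodes n into buf (capacity cap).
 * Returns 0 on success and -1 if buf is too small. */
int dim_type (unsigned n, char *buf, size_t cap)
{
  char digits[32];            /* decimal digits of n, most significant first */
  size_t pos;
  int ndigits, k, w;

  if (n == 0) {               /* zero has its own special type */
    w = snprintf (buf, cap, "(dec, zero) dim0");
    return (w < 0 || (size_t) w >= cap) ? -1 : 0;
  }

  ndigits = snprintf (digits, sizeof digits, "%u", n);  /* no leading zeros */

  w = snprintf (buf, cap, "dec");
  if (w < 0 || (size_t) w >= cap) return -1;
  pos = (size_t) w;

  for (k = 0; k < ndigits; k++) {       /* one dg<d> per decimal digit */
    w = snprintf (buf + pos, cap - pos, " dg%c", digits[k]);
    if (w < 0 || (size_t) w >= cap - pos) return -1;
    pos += (size_t) w;
  }

  w = snprintf (buf + pos, cap - pos, " dim");
  if (w < 0 || (size_t) w >= cap - pos) return -1;
  return 0;
}
```

   For n = 312 the buffer receives "dec dg3 dg1 dg2 dim"; dropping the
   trailing "dim" gives the third type argument of the arr type shown
   in the example.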

   4.2. Operations over arrays:

   Since array types are second-class, there are no operations that
   produce or consume values of type (?, ?, ?) arr.  Instead, we use
   array objects of type ((?, ?, ?) arr, ?, ?) obj.

   Most operations related to array objects are in substructure Arr.

   Array subscript takes an array object and an integer i and produces
   the object describing the i-th element of the array.  It is
   implemented in such a way that it performs bounds-checking: if i<0
   or i>=N where N is the array's size, then General.Subscript will be
   raised.

   To get C's behavior (no bounds checks), one can use pointer
   subscript instead.  This requires first letting the array "decay"
   into a pointer to its first element.  In C this happens implicitly
   in many situations, but in ML one must ask for it explicitly by
   invoking Arr.decay.

   Given a value of type 'n Dim.dim one can reconstruct the array from
   the pointer to its first element.
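
   In C terms, the three array operations behave roughly like the
   sketch below.  All names (arr5, checked_sub, decay, reconstruct)
   are invented for this illustration, the array size is fixed at 5,
   and a NULL result stands in for the General.Subscript exception.

```c
#include <stddef.h>

#define ARR_LEN 5
typedef int arr5[ARR_LEN];

/* Bounds-checked subscript on an array object: yields the element
 * object, or NULL where the ML version would raise an exception. */
int *checked_sub (arr5 *arr, int i)
{
  if (i < 0 || i >= ARR_LEN) return NULL;
  return &(*arr)[i];
}

/* Explicit decay: array object -> pointer to its first element.
 * Subscripting the result performs no bounds check, as in C. */
int *decay (arr5 *arr)
{
  return &(*arr)[0];
}

/* The reverse direction.  The size is baked into the type arr5 here;
 * in ML it is supplied as a value of type 'n Dim.dim. */
arr5 *reconstruct (int *first)
{
  return (arr5 *) first;
}
```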

   5. Function pointers:

   Function pointers have type 'f fptr where 'f is always instantiated
   to (A -> B) for some A and B.  This instantiation for 'f propagates
   through all those 'f components of obj-, ptr-, or arr-types whose
   't component somehow involves the fptr-type.

   A function pointer of type (A -> B) fptr can be invoked with an
   argument of type A and yields a result of type B by invoking the
   "call" operator:

       val call: ('a -> 'b) fptr * 'a -> 'b

   Function pointers are first-class C values and can be stored in
   function-pointer-objects as usual.

   The ML-NLFFIGEN program generator tool will arrange for every C
   function prototype that occurs in a given piece of C code to define
   a corresponding (A->B) fptr type.  Here, A is derived from the
   argument list of the C function and B describes the result type.
   In particular, here is what happens:

       0. Vararg functions are not handled.

       1. If the argument list is (void) and the result type is not a
          struct or union type, then A is unit.

       2. For the case of non-empty argument lists where the types of
          the arguments are C types t1 ... tk, we form a "preliminary
          ML argument list" [t1] ... [tk] as follows:
          - If ti is a first-class C type, then [ti] is the
            (light-weight version (see below) of the) corresponding ML
            type describing ti.
          - Otherwise, ti must be a struct or union type.  For each
            struct or union type, the ML-NLFFIGEN tool will generate a
            fresh phantom type X (as described later).  A function
            argument of such a type will be (X, unit, ro) obj'.  That
            is, on the ML side the function will expect a read-only
            struct or union object.
            (Notice the primed type "obj'"! We pass structs in
             light-weight form.  For an explanation of "light-weight",
             see the discussion below.)

       3. If the result is of struct or union type Y, then an additional
          argument of type (Y, unit, rw) obj' is prepended to the
          preliminary argument list.  This means that on the ML side
          functions "returning" a struct or union must be passed a
	  corresponding writable struct or union object. 

       4. Let the final argument type list (formed in step 2. or 3.) be
          x1 ... xn.  Type A will be the tuple x1 * ... * xn.  In
          particular, if there is only one type x1, then A = x1.

   The result type B is formed as follows:

       1. If the C return type is "void", then B is "unit".
       2. If the C return type is a struct or union, then B
          coincides with the type of the first argument, i.e.,
          it is the same as the first element of the tuple that is A.
          (On the ML side, the function, when called, will return its
           first argument after having stored the struct or union
           that was returned by the C function into it.)
       3. Otherwise the return type must be a first-class C type and
          B will be that type's (light-weight) ML-side representation.
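
   As an illustration, consider the hypothetical C functions below
   (struct point, norm1, mk, and mk_into are not part of the tutorial's
   example).  The comments give the types A and B as I read the rules
   above; they are a worked guess, not tool output.  The function
   mk_into shows in plain C the calling convention that the ML side
   uses for struct results.

```c
struct point {
  int x;
  int y;
};

/* int norm1 (struct point):
 *   A = (s_point su, unit, ro) obj'      (argument rule 2, struct case)
 *   B = sint                             (result rule 3)            */
int norm1 (struct point p)
{
  return (p.x < 0 ? -p.x : p.x) + (p.y < 0 ? -p.y : p.y);
}

/* struct point mk (int, int):
 *   A = (s_point su, unit, rw) obj' * sint * sint   (argument rule 3)
 *   B = (s_point su, unit, rw) obj'                 (result rule 2)  */
struct point mk (int x, int y)
{
  struct point p;
  p.x = x;
  p.y = y;
  return p;
}

/* The shape that mk takes on the ML side: the caller supplies a
 * writable struct object, the result is stored into it, and the same
 * object is handed back. */
struct point *mk_into (struct point *result, int x, int y)
{
  *result = mk (x, y);
  return result;
}
```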

   6. Run-time type information:

   For every object of type ('t, 'f, 'c) obj there is corresponding
   run-time type information that describes values of type 't.  RTI is
   used mainly to keep track of size information (needed for pointer
   arithmetic), but it also facilitates array bounds checking.

   Most of the time this information is kept completely behind the
   scenes, but in some situations the programmer might want to use it
   directly.

   In the part of the interface that has been described up to here,
   there is really only one place that requires run-time type
   information: Ptr.project.  A voidptr together with type information
   describing a non-void pointer's target type can be used to "cast" the
   voidptr to that pointer type.

   RTI is used extensively in the other "light-weight" part of the
   interface.  (See below.)  It can be extracted from existing objects
   using T.typeof or can be constructed directly using the value
   constructors of substructure(s) T (and Dim).

   Example, RTI for a 12-element array of pointers to constant ints:

              let open C open Dim in
                  T.arr (T.ro (T.ptr T.sint), dec dg1 dg2 dim)
              end

   (Note: The "dec dg1 dg2 dim" in the example above is an
    _expression_ that returns a Dim.dim value.  And, by construction,
    the type of that expression also happens to be "dec dg1 dg2 dim".)

   7. Light-weight interface:

   The concrete representation for values of obj-, ptr-, and fptr-type
   carries run-time type information.  This makes the interface
   convenient to use, because RTI is hidden behind the scenes.  It is
   also somewhat inefficient because RTI must be tracked (and operated
   upon) for most operations.

   Light-weight versions of these types (their constructors carry a
   prime in their names: "obj'", "ptr'", "fptr'") do not use RTI in
   their concrete representations.  This is more efficient for all
   operations that don't need access to RTI.  On the downside, it
   means that RTI must be passed in explicitly by the programmer for
   operations that do.

   To make passing of type information statically safe (i.e., to
   disallow mixing a C value of one type with type information
   corresponding to a different type), RTI itself has a static ML
   type.  In particular, the RTI for a value stored in a "('t, 'f, 'c)
   obj" object will have type "('t, 'f) T.typ".

   Array subscript, to name one example, on light-weight array objects
   enforces correct usage of RTI using ML's static typing:

	Arr.sub' : (('t, 'f, 'n) arr, 'f) T.typ ->
		   (('t, 'f, 'n) arr, 'f, 'c) obj' * int -> ('t, 'f, 'c) obj'

   7.1 Light vs. heavy:

   One can convert between light and heavy versions by using the
   functions in substructures Light and Heavy.

   7.2 Slimmed-down RTI: Run-time size information

   Our RTI contains a lot of information that is not needed in many
   situations.  For example, we can extract RTI for a pointer's
   element type from the RTI for the pointer type.  In many cases all
   we need is _size_ information (which, internally, is just a number).
   Definitions pertaining to run-time size information are collected
   in substructure S.  Like RTI itself, we give static types to sizes:
    
         type 't size

   Size information can be obtained from RTI (but not vice versa):

         T.sizeof : ('t, 'f) T.typ -> 't S.size

   Light-weight pointer arithmetic uses size information for the
   element type:

         Ptr.|+! : 't S.size -> ('t, 'f, 'c) ptr' * int -> ('t, 'f, 'c) ptr'

   (NB: C types are monomorphic. In ML programs we can precompute size
    info for any monomorphic type, so with a bit of help from a
    cross-module inliner and the compiler's value-propagation- and
    constant-folding phases we should see machine code very similar to
    what a C compiler would produce.)
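
   Why light-weight pointer arithmetic needs a size argument can be
   seen from the C analogue: once the element type is no longer
   attached to the pointer, the element size must be supplied
   separately.  The function ptr_plus below is a hypothetical sketch
   of that idea, not the FFI's implementation.

```c
#include <stddef.h>

/* Advance an untyped pointer by i elements of the given size.
 * The size argument plays the role of the 't S.size value. */
void *ptr_plus (size_t elem_size, void *p, ptrdiff_t i)
{
  return (char *) p + i * (ptrdiff_t) elem_size;
}
```

   A call such as ptr_plus (sizeof (int), a, 2) yields the address of
   a[2] for an int array a.  Since sizeof (int) is a compile-time
   constant, a C compiler folds the multiplication, which is the
   effect the note above hopes to obtain from inlining and constant
   folding on the ML side.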

   8. Struct- and union-types:

   A struct- or union-declaration in C declares a brand-new type.  In
   C, struct- and union-types are of class "one-and-a-half", so to
   speak.  They are not truly first-class because the only operations
   on values of these types end up being what amounts to "copy"
   operations from objects to other objects.  Struct/union assignment
   is clearly in this category and passing structs/unions as function
   arguments is essentially the same. (Passing the argument amounts to
   copying the struct/union into the object that gets allocated for
   the corresponding formal parameter.)  The only exception seems to
   come from struct/union return values, but C compilers tend to
   implement this by allocating a new (unnamed) struct object for
   holding the return value, so that struct/union return also amounts
   to copying into struct/union objects.

   For these reasons (and to avoid having to implement a struct/union
   value type), this FFI treats struct/union types as second-class
   types and provides copy operations separately.  The treatment of
   function calls involving struct/union types has already been
   described above.

   On the ML side, each struct/union type is implemented as an
   abstract data type.  The type definition as well as operations over
   objects involving this type are generated by the ML-NLFFIGEN tool.

   Consider once again our introductory example:

             struct node {
                int i;
                struct node * next;
             };

   The ML-side equivalent to this is an abstract type "s_node su"
   (which will be the phantom type for struct node) and a
   corresponding structure "S_node" that contains operations for this
   type.  (For a union, replace "s_" with "u_" and "S_" with "U_".)

   The signature for S_node generated by ML-NLFFIGEN will be the
   following (note that it makes use of several type abbreviations
   such as su_obj, sint_obj, etc. that are provided by the FFI):

    structure S_node : sig (* struct node *)
        type tag = s_node
        
        (* size for this struct *)
        val size : s_node su S.size
        
        (* RTI for this struct *)
        val typ : s_node T.su_typ
        
        (* witness types for fields *)
        type t_f_i = sint
        type t_f_next = (s_node su, unit, rw) ptr
        
        (* RTI for fields *)
        val typ_f_i : T.sint_typ
        val typ_f_next : ((s_node su, unit, rw) ptr, unit) T.typ
        
        (* field accessors *)
        val f_i : (s_node, 'c) su_obj -> 'c sint_obj
        val f_next :
            (s_node, 'c) su_obj ->
                ((s_node su, unit, rw) ptr, unit, 'c) obj
        
        (* field accessors (lightweight variety) *)
        val f_i' : (s_node, 'c) su_obj' -> 'c sint_obj'
        val f_next' :
            (s_node, 'c) su_obj' ->
                ((s_node su, unit, rw) ptr, unit, 'c) obj'
    end (* structure S_node *)

   We find RTI and size info for the new type, RTI for all the
   fields' types, and access methods that map struct objects to
   corresponding field objects.  Access methods are provided both in
   normal and in light-weight form.

   The access method for a field declared "const" maps struct objects
   of arbitrary constness to field objects where 'c is instantiated
   with "ro".  The access method for other fields maps the constness
   for the whole struct object to the constness of the field object.
   The name of an access method is the name of the field prepended
   with "f_" (and followed by "'" in case of the light-weight version).
   The reader can probably infer the other naming conventions from the
   example.

   Bitfields (not shown here) are special because they are not
   first-class values and there are no ordinary objects that hold
   bitfields. This FFI provides separate abstract types for signed and
   unsigned bitfields, and access methods for C bitfields map the
   struct object to such (ML-) bitfields.

   8.1 Equivalence of struct/union types:

   It is not literally true that ML-NLFFIGEN will generate a brand-new
   type for every struct or union it sees.  Instead, it draws from
   another infinite family of abstract "tag types" which has been
   predefined.  (This works in a way similar to Dim.dim.)

   As a result, two separate mentions of struct foo in different C
   source files that belong to the same program will produce ML code
   which still identifies these two struct foos.

   9. Global exports and their types:

   9.1 Global variables:

   Global variables will be represented by a corresponding thunkified
   object.  The thunk's name is the same as the variable's name
   prepended with "g_".
   Examples:

      C                   ML

      int i;              val g_i : unit -> (sint, unit, rw) obj
      const unsigned j;   val g_j : unit -> (uint, unit, ro) obj
      int (**f)(void);    val g_f : unit ->
                             (((unit -> sint) fptr, unit->sint, rw) ptr,
                              unit->sint, rw) obj

   (Fortunately, the types will all be generated by ML-NLFFIGEN, so
   the programmer will not have to write down ugly things like the
   type for f.)

   9.2 Global functions:

   Exported C functions will be represented by three distinct ML
   values:

      1. A thunkified fptr value of corresponding type.  The name of
         the thunk is "fptr_fn_" concatenated with the name of the
         function.
      2. An ML function that takes an argument list similar to the
         fptr in 1., but where those arguments/results that have a
         corresponding concrete ML representation (in MLRep, via
         substructure Cvt) have already been translated and
         light-weight struct/union objects (for passing/returning
         structs and unions) have been translated to their heavy
         versions.  The name of the ML function is the name of the C
         function prepended with "fn_".
      3. An ML function like in 2., but with all arguments/results
         that have a light-weight version having been translated to
	 that.  The name of the ML function is the same as that in
         2. but with a trailing apostrophe ("'") added.

   To see the difference between 1. and 2./3., consider a C function
   from int to int.  The ML fptr type would be

         (sint -> sint) fptr

   and calling it via "call" requires an abstract "sint" argument.
   Type "sint" is not equal to its ML representation type
   (MLRep.SInt.int = Int32.int), so in order to pass an ML Int32.int
   value one must apply Cvt.c_sint "by hand".
   (The reason for "sint" not being equal to Int32.int is that the
   representation types for other abstract C types might also be
   Int32.int.  For example, the current implementation uses
   MLRep.SShort.int = Int32.int which would force sint = sshort had the
   C types not been abstract.  But we definitely want to have types
   sint and sshort be distinct!)

   9.3 Persistence of C values:

   C values are transient in that they do not stay valid across
   SMLofNJ.export{ML,Fn} and a restart using the resulting heap
   image.  The only things that stay valid are the thunks for global
   variables and global function pointers.  (Since the generated
   global ML functions that represent global C functions re-invoke
   the function-pointer thunk every time they are called, they also
   stay valid.)

   10. Functorization:

   The ML-NLFFIGEN tool produces a structure containing a functor for
   every C source file it is presented with.  The functor will at least
   take the library argument shown in the example.  However, there are
   cases where it requires additional arguments.

   Extra functor arguments are required every time the C source file
   refers to "incomplete pointer types" -- pointers to structs that
   are not declared.

   For example, if the source file mentions "struct foo*" without
   spelling out what "struct foo" is, then the resulting functor will
   take an argument of the form:

      structure I_S_foo : POINTER_TO_INCOMPLETE_TYPE

   That is, the functor argument must be a structure satisfying
   signature POINTER_TO_INCOMPLETE_TYPE.

   There are two ways of obtaining a matching structure for the
   purpose of passing it to the functor:

     1. If the type is to be treated as "abstract", then a fresh
        incomplete pointer type can be obtained by invoking functor
        PointerToIncompleteType (without arguments).
        If the same incomplete type is mentioned in more than one place,
        make sure you generate only one fresh instantiation for it,
        i.e., invoke PointerToIncompleteType only once and pass the
        result to all functors that require it.

     2. If the type is incomplete in one file but gets spelled out in
        another, then one can produce the matching structure from
        that by applying functor PointerToCompleteType to the
        structure S_foo that describes "struct foo".

        Suppose module Bar defines struct foo.  Then we have a
        structure Bar and a functor Bar.BarFn which, when applied,
        would define structure S_foo.  PointerToCompleteType could be
        applied to this structure. However, there is a partial version
        of the same structure (only containing a type definition and
        some RTI) known as Bar.S_foo.  The partial version is
        sufficient for invoking PointerToCompleteType -- which is
        important for breaking dependency cycles and avoiding the
        chicken-and-egg problem in the case of mutually recursive
        types involving incomplete pointers.

        The main point of using PointerToCompleteType is to let
        client code "see" that 'c I_S_foo.iptr is the same as
        (s_foo su, unit, 'c) ptr.

        Client code that must be written without access to the real
        definition of "struct foo", but which must leave open the
        possibility of interacting with other code that does have
        access, must itself be functorized (leaving the instantiation
        of I_S_foo to _its_ clients).
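
   On the C side, an incomplete pointer type is the familiar opaque
   pointer idiom.  The sketch below puts both views into one file for
   the sake of a compilable example; the names pick and foo_value and
   the fields of struct foo are made up.  The first part is the kind
   of code for which the generated functor would ask for an I_S_foo
   argument; the second part is what a module like Bar would add.

```c
#include <stddef.h>

/* Part 1: struct foo is only mentioned, never spelled out. */
struct foo;

/* Legal with an incomplete type: only pointers are copied around,
 * no field is ever touched. */
struct foo *pick (struct foo *a, struct foo *b, int use_first)
{
  return use_first ? a : b;
}

/* Part 2: the definition, as another header might provide it. */
struct foo {
  int value;
  struct foo *next;
};

/* Field access is possible only where the definition is visible. */
int foo_value (const struct foo *p)
{
  return p->value;
}
```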

----------------------------------------------------------------

Invoking ml-nlffigen:

The ML-NLFFIGEN tool is a stand-alone program ml-nlffigen which can be
invoked from the shell command line. It takes one mandatory argument
<cfile> which is the file name of the C code that describes the
interface to be implemented.

The mandatory argument can be preceded by any combination of the
following options:

  -sigfile <file>                 name of the signature file to be generated
                                  (default: <cfile>.sig)
  -strfile <file>                 name of the structure file to be generated
                                  (default: <cfile>.sml)
  -cmfile  <file>                 name of the .cm-file to be generated
                                  (default: <cfile>.cm;
                                   This is the file that needs to be mentioned
                                   in the client .cm-file.  See node.h.cm
                                   vs. node.cm in our example.)
  -signame <name>                 name of the signature to be generated
                                  (The default is obtained by taking <cfile>,
                                   stripping the extension, capitalizing
                                   all letters, and turning embedded dots
                                   and dashes into underscores.
                                   Example: f.oo-bar.h --> F_OO_BAR)
  -strname <name>                 name of the structure to be generated
                                  If the structure's name is <foo>, then
                                  the name of the functor contained therein
                                  will be <foo>Fn.
                                  (The default is obtained by taking <cfile>,
                                   stripping the extension, dividing the
                                   remainder into sections at dot- and 
                                   dash-boundaries, capitalizing the first
                                   letter of each section, and then joining
                                   them.
                                   Example: foo-bar.h --> FooBar)
  -allSU                          Normally the tool will treat all
                                  struct or union definitions that are
                                  not spelled out in <cfile> as
                                  incomplete (even if <cfile> includes
                                  a header file that spells them out).
                                  This flag will force ml-nlffigen to
                                  treat included header files the same
                                  as <cfile>.
                                  (Structs and unions whose tags start with
                                   an underscore are _always_ treated as
                                   incomplete.)
  -width                          Target text width for pretty-printing
                                  ML code.  The pretty-printer occasionally
                                  overruns this limit, though.
  -lambdasplit <arg>              places "(lambdasplit:<arg>)" after
                                  the names of ML source files in the
                                  generated .cm file. (This controls
                                  the cross-module inlining machinery
                                  of the SML/NJ compiler.)
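
The default naming rules for -signame and -strname can be written down
as two small string functions.  This is a sketch of the rules as
stated in the option descriptions, not the tool's own code.  It assumes
that <cfile> is a bare file name without directory components, that
"the extension" means everything from the last dot onward, and that
the output buffer has room for strlen (cfile) + 1 bytes.  The function
names are hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Length of cfile without its extension (text from the last dot on). */
static size_t stem_length (const char *cfile)
{
  const char *dot = strrchr (cfile, '.');
  return dot != NULL ? (size_t) (dot - cfile) : strlen (cfile);
}

/* Default signature name: upper-case every letter and turn embedded
 * dots and dashes into underscores. */
void default_signame (const char *cfile, char *out)
{
  size_t n = stem_length (cfile), k;

  for (k = 0; k < n; k++) {
    unsigned char c = (unsigned char) cfile[k];
    out[k] = (c == '.' || c == '-') ? '_' : (char) toupper (c);
  }
  out[n] = '\0';
}

/* Default structure name: split at dots and dashes, capitalize the
 * first letter of each section, and join the sections. */
void default_strname (const char *cfile, char *out)
{
  size_t n = stem_length (cfile), k, j = 0;
  int start = 1;              /* next character begins a section */

  for (k = 0; k < n; k++) {
    unsigned char c = (unsigned char) cfile[k];
    if (c == '.' || c == '-') {
      start = 1;              /* separator: drop it, start a new section */
      continue;
    }
    out[j++] = start ? (char) toupper (c) : (char) c;
    start = 0;
  }
  out[j] = '\0';
}
```

Applied to node.h, these rules give the signature name NODE and the
structure name Node, the latter matching the structure used in the
introductory example.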
