Home My Page Projects Code Snippets Project Openings SML/NJ
Summary Activity Forums Tracker Lists Tasks Docs Surveys News SCM Files

SCM Repository

[smlnj] View of /sml/trunk/src/system/README
ViewVC logotype

View of /sml/trunk/src/system/README

Parent Directory Parent Directory | Revision Log Revision Log


Revision 645 - (download) (annotate)
Mon May 15 07:17:30 2000 UTC (19 years, 3 months ago) by blume
File size: 30901 byte(s)
merging changes from devel branch; see HISTORY
Compiler Hacker's Guide to the new CM...
========================================

Last change: 12/5/2000

* Libraries
-----------

The new way of building the compiler is heavily library-oriented.
Aside from a tiny portion of code that is responsible for defining the
pervasive environment, _everything_ lives in libraries.  Building the
compiler means compiling and stabilizing these libraries first.  Some
of the libraries exist just for reasons of organizing the code, the
other ones are potentially useful in their own right.  Therefore, as a
beneficial side-effect of compiling the compiler, you will end up with
stable versions of these libraries.

At the moment, the following libraries are constructed when compiling
the compiler ("*" means that I consider the library potentially useful
in its own right):

  * $/basis.cm				SML'97 Basis Library (pre-loaded)
  * $/smlnj-lib.cm			SML/NJ Utility Library
  * $/html-lib.cm			SML/NJ HTML Library
  * $/pp-lib.cm				SML/NJ Pretty-print Library

  * $/ml-yacc-lib.cm			SML/NJ ML-Yacc runtime library

  * $smlnj/compiler/{alpha32,hppa,ppc,sparc,x86}.cm
					cross-compiler libraries, exporting
					structure <Arch>Compiler
  * $smlnj/compiler/current.cm		structure Compiler (current arch)
  * $smlnj/compiler/all.cm		all cross-compilers and all cross-CMBs

  * $smlnj/cm/minimal.cm		minimal CM (pre-loaded)
  * $smlnj/cm/full.cm			full structure CM (see manual)
  * $smlnj/cm/tools.cm			CM tools library

  * $smlnj/cmb/{alpha32,hppa,ppc,sparc,x86}-unix.cm
					cross-bootstrap-compilers for Unix
					(structure <Arch>UnixCMB)
  * $smlnj/cmb/ppc-macos.cm		...for Mac (structure PPCMacosCMB)
  * $smlnj/cmb/x86-win32.cm		...for Windoze (structure X86Win32CMB)
  * $smlnj/cmb/current.cm		structure CMB (current arch/os)

  * $smlnj/compiler.cm			abbrev. for $smlnj/compiler/current.cm
  * $smlnj/cm.cm			abbrev. for $smlnj/cm/full.cm
  * $smlnj/cmb.cm			abbrev. for $smlnj/cmb/current.cm

  * $comp-lib.cm			Utility library for compiler

  - $smlnj/viscomp/core.cm		Compiler core functionality
  - $smlnj/viscomp/{alpha32,hppa,ppc,sparc,x86}.cm
					Machine-specific parts of compiler

  - $smlnj/internal/{intsys,cm-lib,cm-hook,host-compiler-0}.cm
					Glue that holds the interactive system
					together

  * $MLRISC/{MLRISC,Control,Lib,ALPHA,HPPA,PPC,SPARC,IA32}.cm
					Various MLRISC bits
					(Other MLRISC libraries such as
					 Graph, Visual, etc. do not currently
					 take part in the SML/NJ build.)

  * ${mlyacc,mllex,mlburg}-tool.cm	CM plug-in libraries for common tools
  * ${grm,lex,burg}-ext.cm		CM plug-in libraries for common file
					extensions

Paths of the form $/foo/... are shorthands for $foo/foo/..., paths of
the form $/singlearc can also be written as $singlearc (e.g.,
$basis.cm instead of $/basis.cm).  The usefulness of the latter
shorthand is controversial, so you should probably try to avoid it.

A more complete explanation of the $-notation can be found later in
this document or in the CM manual.

To learn about the definitions of the $-anchors (and, thus, where in
the source tree the above libraries are defined), consult the
"pathconfig" file here in this directory.

* Before you can use the bootstrap compiler (CMB)...
----------------------------------------------------

To be able to use CMB at all, you must first say

	CM.autoload "$smlnj/cmb.cm";

after you start sml.  Alternatively -- and perhaps more conveniently --
you can provide "$smlnj/cmb.cm" as a command-line argument to sml:

      $ sml '$smlnj/cmb.cm'

(Be sure to protect the dollar symbol which usually has its own
special meaning to the shell.)

* Compiling the compiler
------------------------

We are now back to the old scheme where a call to CMB.make() suffices to
build a bootable set of files (libraries in our case).  CMB.make maintains
two parallel hierarchies of derived files:

  1. the binfile hierarchy ("binfiles"), containing compiled objects for
     each individual ML source file; this hierarchy is rooted at
      <prefix>.bin.<arch>-<opsys>
  2. the stable library hierarchy ("boot files"), containing library files
     for each library that participates in building SML/NJ; this hierarchy
     is rooted at
      <prefix>.boot.<arch>-<opsys>

The default for <prefix> is "sml".  It can be changed by using
CMB.make' with the new <prefix> as the optional string argument.

CMB.make reuses existing bootfiles after it has verified that they are
consistent with their corresponding binfiles.  Bootfiles do not need
to be deleted in order for CMB.make to work correctly.

To bootstrap a new system (using the runtime system boot loader), the
bootfiles _must_ be present, the binfiles need not be present (but
their presence does not hurt either).

You can reduce the number of extra files compiled and stabilized
during CMB.make at the expense of not building any cross-compilers.
For that, say
	#set (CMB.symval "LIGHT") (SOME 1);
before running CMB.make.

* Making the heap image
-----------------------

The heap image is made by running the "makeml" script that you find
here in this directory.  By default it will try to refer to the
sml.boot.<arch>-<os> directory.  You can change this using the -boot
argument (which takes the full name of the boot directory to be used).

The "feel" of using makeml should be mostly as it used to.  However,
internally, there are some changes that you should be aware of:

1. The script will make a heap image and build a separate library directory
   that contains (hard) links to the library files in the bootfile directory.

2. There is no "-full" option anymore.  This functionality should
   eventually be provided by a library with a sufficiently rich export
   interface.

3. No image will be generated if you use the -rebuild option.
   Instead, the script quits after making new bin and new boot
   directories.  You must re-invoke makeml with a suitable "-boot"
   option to actually make the image.  The argument to "-rebuild"
   is the <prefix> for the new bin and boot directories (see above).

   [Note: When the -rebuild option is specified, then the boot procedure
    will not read static environments from the boot directory.  Instead,
    after the ML code has been loaded and linked, the system will invoke
    CMB.make' with the argument that was given to -rebuild.  After
    CMB.make' is done, the system quits.  In essence, makeml with -rebuild
    acts as a bootstrap compiler that is not dependent on any usable
    static environments.]

Makeml will not destroy the bootfile directory.

* Testing a newly generated heap image
--------------------------------------

If you use a new heap image by saying "sml @SMLload=..." then things
will not go as you may expect because along with the new heap image
should go those new stable libraries, but unless you do something
about it, the newly booted system will look for its stable libraries
in places where you stored your _old_ stable libraries.  (After just
having done "makeml", these "places" would be within the boot file
hierarchy under <prefix>.boot.<arch>-<os>.)

After you have made the new heap image, the new libraries are in a
separate directory whose name is derived from the name of the heap
image.  (Actually, only the directory hierachy is separate, the
library files themselves are hard links.)  The "testml" script that
you also find here will run the heap image and instruct it to look for
its libraries in that new library directory by setting the
CM_PATHCONFIG environment variable to point to a different pathconfig
file under <prefix>.lib.

"testml" takes the <prefix> of the heap image as its first
argument. All other arguments are passed verbatim to the ML process.

The <prefix> is the same as the one used when you did "makeml".  If
you run "testml" without arguments, <prefix> defaults to "sml".
Thus, if you just said "makeml" without arguments you can also say
"testml" without arguments.  (Note that you _must_ supply the <prefix>
argument if you intend to pass any additional arguments.)

* Installing a heap image for more permanent use
------------------------------------------------

You can "install" a newly generated heap image by replacing the old
image with the new one AND AT THE SAME TIME replacing the old stable
libaries with the new ones.  To do this, run the "installml" script.

Like "testml", "installml" also expects the <prefix> as its first
argument.  <prefix> defaults to "sml" if no argument is specified.

"installml" patches the ../../lib/pathconfig file to reflect any
changes or additions to the path name mapping.  (I say "patches"
because entries unrelated to the SML/NJ build process are retained in
their original form.)  If you want to use a destination directory that
is different from ../../lib, then you must do this by hand (i.e.,
installml does not have an option for that).

Thus, after a successful CMB.make, you should say

	./makeml

to make the new heap image + libraries, then

	./testml

to make sure everything works, and finally

	./installml

to replace your existing compiler with the one you just built and tested.

* Cross-compiling
-----------------

All cross-compilers live in the "$smlnj/compiler/all.cm" library.
(The source tree for the "$smlnj" anchor -- see "pathconfig" -- is
src/system/smlnj, but this should normally not concern you.)
You must first say

	CM.autoload "$smlnj/compiler/all.cm";

before you can access them.  (This step corresponds to the old
CMB.retarget call.)  After that, _all_ cross-compilers are available
at the same time.  However, the ones that you are not using don't take
up any undue space because they only get loaded once you actually
mention them at top level.  The names of the structures currently
exported by $smlnj/compiler/all.cm are:

	structure Alpha32UnixCMB
	structure HppaUnixCMB
	structure PPCMacOSCMB
	structure PPCUnixCMB
	structure SparcUnixCMB
	structure X86UnixCMB
	structure X86Win32CMB

	structure Alpha32Compiler
	structure HppaCompiler
	structure PPCCompiler
	structure SparcCompiler
	structure X86Compiler

(PPCMacOSCMB is not very useful at the moment because there is no
implementation of the basis library for the MacOS.)

Alternatively, you can select just the one single structure that you
are interested in by auto-loading $smlnj/compiler/<arch>.cm or
$smlnj/cmb/<arch>-<os>.cm.
<arch> currently ranges over "alpha32", "hppa", "ppc", "sparc", and "x86.
<os> can be either "unix" or "macos" or "win32".
(Obviously, not all combinations are valid.)

Again, as with $smlnj/cmb.cm, you can specify the .cm file as an
argument to the sml command:

    $ sml '$smlnj/compiler/all.cm'

or

    $ sml '$smlnj/cmb/alpha32-unix.cm'

[Note: The command line for the "sml" command accepts configuration
parameters of the form "@SMLxxx...", mode switches of the form "-m"
and "-a", names of ML files -- which are passed to "use" -- and
arguments suitable for CM.make or CM.autoload.  CM.autoload is the
default; the "-m" and "-a" mode switches can be used to change the
default -- even several times within the same command line.
A single argument "@CMslave" is also accepted, but it should not be
used directly as it is intended for use by the parallel compilation
facility within CM.]

* Path configuration
--------------------

+ Basics:

One of the new features of CM is its handling of path names.  In the
old CM, one particular point of trouble was the autoloader.  It
analyzes a group or library and remembers the locations of associated
files.  Later, when the necessity arises, those files will be read.
Therefore, one was asking for trouble if the current working directory
was changed between analysis- and load-time, or, worse, if files
actually moved about (as is often the case if build- and
installation-directories are different, or, to put it more generally,
if CM's state is frozen into a heap image and used in a different
environment).

Maybe it would have been possible to work around most of these
problems by fixing the path-lookup mechanism in the old CM and using
it extensively.  But path-lookup (as in the Unix-shell's "PATH") is
inherently dangerous because one can never be too sure what it will be
that is found on the path.  A new file in one of the directories early
in the path can upset the program that hopes to find something under
the same name later on the path.  Even when ignoring security-issues
like trojan horses and such, this definitely opens the door for
various unpleasant surprises.  (Who has never named a test version
of a program "test" an found that it acts strangely only to discover
later that /bin/test was run instead?)

Thus, the new scheme used by CM is a fixed mapping of what I call
"configuration anchors" to corresponding directories.  The mapping can
be changed, but one must do so explicitly.  In effect, it does not
depend on the contents of the file system.  Here is how it works:

If I specify a pathname that starts with a "$", then the first arc
between "$" and the first "/" is taken as the name of a so-called
"anchor".  CM knows a mapping from anchor names to directory names and
replaces the prefix $<anchor> with the name of the corresponding
directory.  Therefore, an anchored path has the general form

   $<anchor>/<path>

It is important that there is at least one arc in <path>.  In other
words, the form $<anchor> is NOT valid.  (Actually, it currently is
valid, but CM interprets it in a different way.)

Examples:

   $smlnj/compiler/all.cm
   $basis.cm/basis.cm
   $MLRISC/Control.cm

The special case where <anchor> coincides with the first arc of <path>
can be abbreviated by ommitting <anchor>.  This leads to the shorthand

  $/<anchor>/<more>...

for the longer

  $<anchor>/<anchor>/<more>...

Examples:

  $/foo/bar/baz.cm      (* same as $foo/foo/bar/baz.cm *)
  $/basis.cm            (* same as $basis.cm/basis.cm *)

Currently, CM accepts one additional shorthand for the case where
<path> has precisely one arc that coincides with <anchor>.  Here, the
slash "/" can be ommitted, too.

Examples:

  $basis.cm             (* same as $/basis.cm or $basis.cm/basis.cm *)
  $nw-ext.cm            (* same as $/nw-ext.cm or $nw-ext.cm/nw-ext.cm *)

Previously, CM used "implicit" anchors where anchored paths simply
have the form

   <anchor>/<more>...

The distinction between anchored paths and relative paths was made
based on whether or not <anchor> had a known mapping at the time it
was seen by CM.  Since this is hard to read and fragile, support for
implicit anchors (while still there) is considered obsolete and will
be faded out soon.  The meaning of an implicitly anchored path <path>
is the same as $/<path>.

Recognition of implicit anchors can be turned off by issuing the
following command:

   CM.autoload "$smlnj/cm.cm";
   #set CM.Control.implicit_anchors false;

+ Why anchored paths?

The important point is that one can change the mapping of the anchor,
and the tranlation of the (anchored) path name will also change
accordingly -- even very late in the game.  CM avoids "elaborating"
path names until it really needs them when it is time to open files.
CM is also willing to re-elaborate the same names if there is reason
to do so. Thus, the "basis.cm" library that was analyzed "here" but
then moved "there" will also be found "there" if the anchor has been
re-set accordingly.

The anchor mapping is (re-)initialized at startup time by reading two
configuration files.  Normally, those are the "../../lib/pathconfig"
file and the ".smlnj-pathconfig" file in your home directory (if such
exists).  During an ongoing session, function CM.Anchor.anchor can be
used to query and modify the anchor mapping.

+ Different configurations at different times:

During compilation of the compiler, CMB uses a path configuration that
is read from the file "pathconfig" located here in this directory.

At bootstrap time (while running "makeml"), the same anchors are
mapped to the corresponding sub-directory of the "boot" directory:
basis.cm is mapped to sml.boot.<arch>-<os>/basis.cm -- which means
that CM will look for a library named
sml.boot.<arch>-<os>/basis.cm/basis.cm -- and so forth.

[Note, there are some anchors in "pathconfig" that have no
corresponding sub-directory of the boot director.  Examples are
"root.cm", "cm", and so on.  The reason is that there are no stable
libraries whose description files are named using these anchors;
everything anchored at "$cm" is a group but not a library.]

By the way, you will perhaps notice that there is no file
	sml.boot.<arch>-<os>/basis.cm/basis.cm
but there _is_ the corresponding stable archive
	sml.boot.<arch>-<os>/basis.cm/CM/<arch>-<os>/basis.cm
CM always looks for stable archives first.

This mapping (from anchors to names in the boot directory) is the one
that will get frozen into the generated heap image at boot time.
Thus, unless it is changed, CM will look for its libraries in the boot
directory.  The aforementioned "testml" script will make sure (by
setting the environment variable CM_PATHCONFIG) that the mapping be
changed to the one specified in a new "pathconfig" file which was
created by makeml and placed into the test library directory.  It
points all anchors to the corresponding entry in the test library
directory.  Thus, "testml" will let a new heap image run with its
corresponding new libraries.

Normally, however, CM consults other pathconfig files at startup --
files that live in standard locations.  These files are used to modify
the path configuration to let anchors point to their "usual" places.
The names of the files that are read (if present) are configurable via
environment variables.  At the moment they default to
	/usr/lib/smlnj-pathconfig
and
	$HOME/.smlnj-pathconfig
The first one is configurable via CM_PATHCONFIG (and the default is
configurable at boot time via CM_PATHCONFIG_DEFAULT); the last is
configurable via CM_LOCAL_PATHCONFIG and CM_LOCAL_PATHCONFIG_DEFAULT.
In fact, the makeml script sets the CM_PATHCONFIG_DEFAULT variable
before making the heap image.  Therefore, heap images generated by
makeml will look for their global pathconfig file in

	../../lib/pathconfig

[Note: The "makeml" script will not re-set the CM_PATHCONFIG_DEFAULT
variable if it was already set before.  If it does re-set the
variable, it uses an absolute path name instead of the relative path
that I used for illustration above.]

For example, I always keep my "good" libraries in `pwd`/../../lib --
where both the main "install" script (in config/install.sh) and the
"installml" script (see above) also put them -- so I don't have to do
anything special about my pathconfig file.

Once I have new heap image and libraries working, I replace the old
"good" image with the new one:

  mv <image>.<arch>-<osvariant> ../../bin/.heap/sml.<arch>-<osvariant>

After this I must also move all libraries from <image>.libs/* to their
corresponding position in ../../lib.

Since this is cumbersome to do by hand, there is a script called
"installml" that automates this task.  Using the script has the added
advantage that it will not clobber libraries that belong to other than
the current architecture.  (A rather heavy-handed "rm/mv" approach
will delete all stable libraries for all architectures.)
"installml" also patches the ../../lib/pathconfig file as necessary.

Of course, you can organize things differently for yourself -- the
path configuration mechanism should be sufficiently flexible.  If you
do so, you will have to set CM_PATHCONFIG.  This must be done before
you start sml.  If you want to change the pathname mapping at the time
sml is already running, then use the functions in CM.Anchor.

* Libraries vs. Groups
----------------------

With the old CM, "group" was the primary concept while "library" and
"stabilization" could be considered afterthoughts.  This has changed.
Now "library" is the primary concept, "stabilization" is semantically
significant, and "groups" are a secondary mechanism.

Libraries are used to "structure the world"; groups are used to give
structure to libraries.  Each group can be used either in precisely
one library (in which case it cannot be used at the interactive
toplevel) or at the toplevel (in which case it cannot be used in any
library).  In other words, if you count the toplevel as a library,
then each group has a unique "owner" library.  Of course, there still
is no limit on how many times a group can be mentioned as a member of
other groups -- as long as all these other groups belong to the same
owner library.

Normally, collections of files that belong together should be made
into proper CM libraries.  CM groups (aka "library components") should
be used only when there are namespace problems within a library.

Aside from the fact that I find this design quite natural, there is
actually a technical reason for it: when you stabilize a library
(groups cannot be stabilized), then all its sub-groups (not
sub-libraries!)  get "sucked into" the stable archive of the library.
In other words, even if you have n+1 CM description files (1 for the
library, n for n sub-groups), there will be just one file representing
the one stable archive (per architecture/os) for the whole thing.  For
example, I structured the standard basis into one library with two
sub-groups, but once you compile it (CMB.make) there is only one
stable file that represents the whole basis library.  If groups were
allowed to appear in more than one library, then stabilization would
duplicate the group (its code, its environment data structures, and
even its dynamic state).

There is a small change to the syntax of group description files: they
must explicitly state which library they belong to. CM will verify
that.  The owner library is specified in parentheses after the "group"
keyword.  If the specification is missing (that's the "old" syntax),
then the the owner will be taken to be the interactive toplevel.

* Pervasive environment, core environment, the init library "init.cmi"
-------------------------------------------------------------------------

CMB.make starts out by building and compiling the
"init library".  This library cannot be described in the "usual" way
because it uses "magic" in three ways:
 - it is used to later tie in the runtime system
 - it binds the "_Core" structure
 - it exports the "pervasive" environment

The pervasive environment no longer includes the entire basis library
but only non-modular bindings (top-level bindings of variables and
types).

CM cannot automatically determine dependencies (or exports) for the
init library source files, but it still does use its regular cutoff
recompilation mechanism.  Therefore, dependencies must be given
explicitly.  This is done by a special description file which
currently lives in smlnj/init/init.cmi (as an anchored path:
"$smlnj/init/init.cmi").  See the long comment at the beginning of
that file for more details.

After it is built, $smlnj/init/init.cmi can be used as an "ordinary"
library by other libraries.  (This is done, for example, by the
implementation of the Basis library.)  Access to
"$smlnj/init/init.cmi" is protected by the privilege named
"primitive".  Also, note that the .cmi-file is not automatically
recognized as as CM description file. ("cmi" should remind you of "CM
- Initial library".)  Therefore, it must be given an explicit member
class:

     $smlnj/init/init.cmi : cm

* Autoloader
------------

The new system heavily relies on the autoloader.  As a result, almost
no static environments need to get unpickled at bootstrap time.  The
construction of such environments is deferred until they become
necessary.  Thanks to this, it was possible to reduce the size of the
heap image by more than one megabyte (depending on the architecture).
The downside (although not really terribly bad) is that there is a
short wait when you first touch an identifier that hasn't been touched
before.  (I acknowledge that the notion of "short" may depend on your
sense of urgency. :-)

The reliance on the autoloader (and therefore CM's library mechanism)
means that in order to be able to use the system, your paths must be
properly configured.

Two libraries get pre-registered at bootstap time: the basis library
("$/basis.cm") and CM itself ("$smlnj/cm/minimal.cm").  The latter is
crucial: without it one wouldn't be able to register any other
libraries via CM.autoload.  The registration of $/basis.cm is a mere
convenience.

Here are some other useful libraries that are not pre-registered but
which can easily be made accessible via CM.autoload (or, non-lazily,
via CM.make):

	$smlnj/cm.cm		- provides the actual ("full") structure CM
				  as described in the CM manual
        $smlnj/cm/full.cm	- same as $smlnj/cm.cm
	$smlnj/compiler.cm	- provides "structure Compiler"
        $smlnj/compiler/current.cm - same as $smlnj/compiler.cm
	$smlnj/cmb.cm		- provides "structure CMB"
	$smlnj/cmb/current.cm	- same as $smlnj/cmb.cm
	$smlnj/compiler/all.cm	- provides "structure <Arch>Compiler" and
				  "structure <Arch><OS>CMB" for various
				  values of <Arch> and <OS>
	$smlnj-lib.cm		- the SML/NJ library

[Note: The fact that $smlnj/compiler.cm is not among the
pre-registered libraries seems like an oversight and could lead to
some inconveniences to users who want to, for example, set compiler
flags.  However, pre-registration of this library significantly
increases the size of the heap image.  Moreover, since the library can
easily be loaded by giving the string as a command line argument, this
does not really appear to be a big burden to me.  Just create a shell
alias or a little wrapper script if you think you really need this.]

* Internal sharing
------------------

Dynamic values of loaded modules are shared.  This is true even for
those modules that are used by the interactive compiler itself.  If
you load a module from a library that is also used by the interactive
compiler, then "loading" means "loading the static environmnent" -- it
does not mean "loading the code and linking it".  Instead, you get to
share the compiler's dynamic values (and therefore the executable
code as well).

Of course, if you load a module that hasn't been loaded before and
also isn't used by the interactive system, then CM will get the code
and link (execute) it.

* Access control
----------------

In some places, you will find that the "group" and "library" keywords
in description files are preceeded by certain strings, sometimes in
parentheses.  These strings are the names of "privileges".  Don't
worry about them too much at the moment.  For the time being, access
control is not enforced, but the infrastructure is in place.

* Preprocessor
--------------

The syntax of expressions in #if and #elif clauses is now more ML-ish
instead of C-ish.  (Hey, this is ML after all!)  In particular, you
must use "andalso", "orelse", and "not" instead of "&&", "||" and "!".
Unary minus is "~".

A more interesting change is that you can now query the exports of
sources/subgroups/sublibraries:

  - Within the "members" section of the description (i.e., after "is"):
    The expression
	defined(<namespace> <name>)
    is true if any of the included members preceeding this clause exports
    a symbol "<namespace> <name>".
  - Within the "exports" section of the description (i.e., before "is"):
    The same expression is true if _any_ of the members exports the
    named symbol.
    (It would be more logical if the exports section would follow the
     members section, but for esthetic reasons I prefer the exports
     section to come first.)

Example:

	+--------------------------+
	|Library		   |
	|  structure Foo	   |
	|#if defined(structure Bar)|
	|  structure Bar	   |
	|#endif			   |
	|is			   |
	|#if SMLNJ_VERSION > 110   |
	|  new-foo.sml		   |
	|#else			   |
	|  old-foo.sml		   |
	|#endif			   |
	|#if defined(structure Bar)|
	|  bar-client.sml	   |
	|#else			   |
	|  no-bar-so-far.sml	   |
	|#endif			   |
	+--------------------------+

Here, the file "bar-client.sml" gets included if SMLNJ_VERSION is
greater than 110 and new-foo.sml exports a structure Bar _or_ if
SMLNJ_VERSION <= 110 and old-foo.sml exports structure Bar. Otherwise
"no-bar-so-far.sml" gets included instead.  In addition, the export of
structure Bar is guarded by its own existence.  (Structure Bar could
also be defined by "no-bar-so-far.sml" in which case it would get
exported regardless of the outcome of the other "defined" test.)

Some things to note:

  - For the purpose of the pre-processor, order among members is
    significant.  (For the purpose of dependency analysis, order continues
    to be not significant).
  - As a consequence, in some cases pre-processor dependencies and
    compilation-dependencies may end up to be opposites of each other.
    (This is not a problem; it may very well be a feature.)

* The Basis Library is no longer built-in
-----------------------------------------

The SML'97 basis is no longer built-in.  If you want to use it, you
must specify "basis.cm" as a member of your group/library.

* No more aliases
-----------------

The "alias" feature is no longer with us.  At first I thought I could
keep it, but it turns out that it causes some fairly fundamental
problems with the autoloader.  However, I don't think that this is a
big loss because path anchors make up for most of it.  Moreover,
stable libraries can now easily be moved to convenient locations
without having to move large source trees at the same time. (See my
new config/install.sh script for examples of that.)

It is possible to simulate aliases (in a way that is safer than the
original alias mechanism).  For example, the root.cm file (which is the
root of the whole system as far as CMB.make is concerned) acts as an
alias for $smlnj/internal/intsys.cm.  In this case, root.cm is a group
to avoid having a (trivial) stable library file built for it.

A library can act as an "alias" for another library if it has a
verbatim copy of the export list and mentions the other library as its
only member.  Examples for this are $smlnj/cm.cm (for
$smlnj/cm/full.cm), $smlnj/compiler.cm (for $smlnj/compiler/current.cm),
etc.  The stable library file for such an "alias" is typically very
small because it basically just points to the other library.  (For
example, the file representing $smlnj/cm.cm is currently 234 bytes
long.)

* Don't use relative or absolute pathnames to refer to libraries
----------------------------------------------------------------

Don't use relative or absolute pathnames to refer to libraries.  If
you do it anyway, you'll get an appropriate warning at the time when
you do CMB.make().  If you use relative or absolute pathnames to
refer to library B from library A, you will be committed to keeping B
in the same relative (to A) or absolute location.  This, clearly,
would be undesirable in many situations (although perhaps not always).

root@smlnj-gforge.cs.uchicago.edu
ViewVC Help
Powered by ViewVC 1.0.0