[smlnj] Diff of /sml/trunk/src/cm/Doc/manual.tex

revision 414, Fri Sep 3 00:30:34 1999 UTC revision 415, Fri Sep 3 08:15:09 1999 UTC
# Line 19  Line 19 
19  \setlength{\parindent}{0pt}  \setlength{\parindent}{0pt}
20  \setlength{\parskip}{6pt plus 3pt minus 2pt}  \setlength{\parskip}{6pt plus 3pt minus 2pt}
21    
22    \newcommand{\nt}[1]{{\it #1}}
23    \newcommand{\tl}[1]{{\underline{\bf #1}}}
24    \newcommand{\ttl}[1]{{\underline{\tt #1}}}
25    \newenvironment{syntax}{\begin{tabbing} xyzwww \=\kill}{\end{tabbing}}
26    \newcommand{\ar}{$\rightarrow$\ }
27    \newcommand{\vb}{~$|$~}
28    
29  \begin{document}  \begin{document}
30    
31  \bibliographystyle{alpha}  \bibliographystyle{alpha}
# Line 44  Line 51 
51  the programmer from having to specify a detailed module dependency  the programmer from having to specify a detailed module dependency
52  graph by hand~\cite{blume:depend99}.  graph by hand~\cite{blume:depend99}.
53    
54  The most important change with repect to the previous (``old'')  This new version of CM puts its emphasis on {\em working with libraries}.  This
55  implementation of CM is a change of emphasis.  Until now the focus was  contrasts with the previous implementation where the focus was on
56  on compilation management while libraries were added as an  compilation management while libraries were added as an afterthought.
57  afterthought.  Beginning now, CM takes a very library-centric view of  Beginning now, CM takes a very library-centric view of the world.  In
58  the world.  In fact, the implementation of SML/NJ itself has been  fact, the implementation of SML/NJ itself has been restructured to
59  restructured to meet this approach.  meet this approach.
60    
61  \section{The CM model}  \section{The CM model}
62    
63  When working with CM, the most important concept is the concept of a  A CM library is a collection of ML source files and references to
64  {\em library}.  A library is a collection of ML source files and  other libraries together with an explicit export interface.  The
65  references to other libraries together with an explicit export  export interface lists all toplevel-defined symbols of the library
66  interface.  The export interface lists all toplevel-defined symbols of  that shall be exported to its clients.  A library is described by the
67  the library that shall be exported to its clients.  A library is  contents of its {\em description file}.
 described by its {\em description file}.  
68    
69  \noindent Example:  \noindent Example:
70    
# Line 87  Line 93 
93  \subsection{Library descriptions}  \subsection{Library descriptions}
94    
95  Members of a library do not have to be listed in any particular order  Members of a library do not have to be listed in any particular order
96  since CM will automatically calculate the dependency graph.  Three  since CM will automatically calculate the dependency graph.  Some
97  minor restrictions on the source language are necessary to make this  minor restrictions on the source language are necessary to make this
98  work:  work:
99  \begin{enumerate}  \begin{enumerate}
# Line 123  Line 129 
129  definition listed in a (sub-)library's export list.  definition listed in a (sub-)library's export list.
130    
131  If a library is structured into library components using {\em groups}  If a library is structured into library components using {\em groups}
132  (see Section~\ref{sec:groups}), then each component (group) is treated  (see Section~\ref{sec:groups}), then---as far as name visibility is
133  like a separate library.  concerned---each component (group) is treated like a separate library.
134    
135  Dependencies among libraries, library components, or ML source files  Cyclic dependencies among libraries, library components, or ML source
136  within a library are detected and flagged as errors.  files within a library are detected and flagged as errors.
137    
138  \subsection{Groups}  \subsection{Groups}
139  \label{sec:groups}  \label{sec:groups}
# Line 138  Line 144 
144  involves the use of ``administrative'' libraries whose sole purpose is  involves the use of ``administrative'' libraries whose sole purpose is
145  to rename certain definitions.  to rename certain definitions.
146    
147  However, under CM, ``library'' does not only refer to a concept but  However, under CM, ``library'' does not only refer to namespace
148  often also to an actual file system object.  It would be inconvenient  management but often also to an actual file system object.  It would
149  if name resolution problems would result in a proliferation of  be inconvenient if name resolution problems would result in a
150  additional library files.  Therefore, CM also provides the notion of  proliferation of additional library files.  Therefore, CM also
151  groups (or: library components).  Name resolution for groups works  provides the notion of groups (or: library components).  Name
152  like name resolution for entire libraries, but grouping is entirely  resolution for groups works like name resolution for entire libraries,
153  internal to each library.  but grouping is entirely internal to each library.
154    
155  During development, each group has its own description file which well  During development, each group has its own description file which will
156  be referred to by the surrounding library or other components thereof.  be referred to by the surrounding library or other components thereof.
157  The syntax of group description files is the same as that of library  The syntax of group description files is the same as that of library
158  description files with the following exceptions:  description files with the following exceptions:
# Line 198  Line 204 
204  at most once.  at most once.
205  \item Libraries can be referred to freely from as many other groups or  \item Libraries can be referred to freely from as many other groups or
206  libraries as the programmer desires.  libraries as the programmer desires.
207  \item Each group cannot be used from outside the (uniquely defined)  \item A group cannot be used from outside the (uniquely defined)
208  library that it is a component of.  However, within that library it  library that it is a component of.  However, within that library it
209  can be referred to from arbitrarily many other groups.  can be referred to from arbitrarily many other groups.
210  \item The same ML source file cannot appear more than once.  If an ML  \item The same ML source file cannot appear more than once.  If an ML
211  source file is to be referred to by multiple clients it must first be  source file is to be referred to by multiple clients it must first be
212  ``wrapped'' into a library or (if sufficient) a group.  ``wrapped'' into a library (or---if that's sufficient---a group).
213  \end{itemize}  \end{itemize}
214    
215  \subsection{Top-level groups}  \subsection{Top-level groups}
# Line 212  Line 218 
218  allows groups to appear at top level, i.e., outside of any library.  allows groups to appear at top level, i.e., outside of any library.
219  Such groups must omit the parenthetical library specification and then  Such groups must omit the parenthetical library specification and then
220  cannot also be used within libraries. One could think of the top level  cannot also be used within libraries. One could think of the top level
221  itself as a ``virtual unnamed library'' whose components top-level  itself as a ``virtual unnamed library''.  Top-level groups are then
222  groups are.  components of this virtual library.
223    
224  \section{Naming objects in the file system}  \section{Naming objects in the file system}
225    
226  \subsection{Motivation}  \subsection{Motivation}
227    
228  File naming has been an area notorious for its problems and cause for  File naming has been an area notorious for its problems and cause of
229  most of the gripes from CM's users.  With this in mind, CM now takes a  most of the gripes from CM's users.  With this in mind, CM now takes a
230  different approach to file name resolution.  different approach to file name resolution.
231    
# Line 251  Line 257 
257  that use path names in native syntax are not portable across operating  that use path names in native syntax are not portable across operating
258  systems.  systems.
259    
260  Absolute pathnames are resolved in the usual operating-specific  Absolute pathnames are resolved in the usual manner specific to the
261  manner.  However, it is advisable to avoid absolute pathnames because  operating system.  However, it is advisable to avoid absolute
262  they are certain to ``break'' if the corresponding file moves to a  pathnames because they are certain to ``break'' if the corresponding
263  different location.  file moves to a different location.
264    
265  The resolution of relative pathnames is more complicated.  The resolution of relative pathnames is more complicated:
266    
267  \begin{itemize}  \begin{itemize}
268  \item If the first component of a relative pathname is a  \item If the first component of a relative pathname is a
# Line 276  Line 282 
282  \item If a non-anchored relative path is entered interactively, for  \item If a non-anchored relative path is entered interactively, for
283  example as an argument to one of CM's interface functions, then it  example as an argument to one of CM's interface functions, then it
284  will be resolved in the OS-specific manner, i.e., relative to the  will be resolved in the OS-specific manner, i.e., relative to the
285  current working directory.  However, CM will remember what that  current working directory.  However, CM will internally represent the
286  directory is at the time the name was first seen.  Should the working  name in such a way that it remembers the corresponding working
287  directory change during an ongoing CM session, then CM will switch its  directory.  Should the working directory change during an ongoing CM
288  mode of operation for that name and prepend the name of the original  session while there still is a reference to the name, then CM will
289  working directory.  In effect, the name will continue to refer to the  switch its mode of operation and prepend the path of the original working
290  same file system object regardless of what the current working  directory. As a result, two names specified using identical
291  directory is.  strings but with different working directories in effect will be kept
292    distinct and continue to refer to those file system locations that they
293    referred to when they were first seen.
294  \end{itemize}  \end{itemize}
295    
296  \subsection{Anchor configuration}  \subsection{Anchor configuration}
297  \label{sec:anchors}  \label{sec:anchors}
298    
299  The configuration of path name anchors to their corresponding  The association of path name anchors with their corresponding
300  directory names is a simple one-way mapping.  At startup time, this  directory names is a simple one-way mapping.  At startup time, this
301  mapping is initialized by reading two configuration files: an  mapping is initialized by reading two configuration files: an
302  installation-specific one and a user-specific one.  After that, the  installation-specific one and a user-specific one.  After that, the
# Line 302  Line 310 
310  CM\_PATHCONFIG\_DEFAULT}) at installation time by a path pointing to  CM\_PATHCONFIG\_DEFAULT}) at installation time by a path pointing to
311  wherever the installation actually puts the configuration file.  wherever the installation actually puts the configuration file.
312  The user can specify a new location at startup time using the  The user can specify a new location at startup time using the
313  environment variable {\tt CM\_PATCONFIG}.  environment variable {\tt CM\_PATHCONFIG}.
314    
315  The default location of the user-specific configuration file is {\tt  The default location of the user-specific configuration file is {\tt
316  .smlnj-pathconfig} in the user's home directory (which must be given  .smlnj-pathconfig} in the user's home directory (which must be given
317  by the {\tt HOME} environment varibale).  At startup time, this  by the {\tt HOME} environment variable).  At startup time, this
318  default can be overridden by a fixed location which must be given as  default can be overridden by a fixed location which must be given as
319  the value of the environment variable {\tt CM\_LOCAL\_PATHCONFIG}.  the value of the environment variable {\tt CM\_LOCAL\_PATHCONFIG}.
320    
# Line 325  Line 333 
333  \item A line with a single token that consists of a single minus sign  \item A line with a single token that consists of a single minus sign
334  {\bf -} cancels all existing anchors.  This typically makes sense only  {\bf -} cancels all existing anchors.  This typically makes sense only
335  at the beginning of the user-specific configuration file and  at the beginning of the user-specific configuration file and
336  eradicates any settings that were made by the installation-specific  erases any settings that were made by the installation-specific
337  configuration file.  configuration file.
338  \item Lines with no token (i.e., empty lines) will be silently ignored.  \item Lines with no token (i.e., empty lines) will be silently ignored.
339  \item Any other line is considered malformed and will cause a warning  \item Any other line is considered malformed and will cause a warning
# Line 369  Line 377 
377  After a library has been stabilized it can be used even if none of its  After a library has been stabilized it can be used even if none of its
378  original sources---including the description file---are present.  original sources---including the description file---are present.
379    
380    The boolean result of {\tt CM.recomp} and {\tt CM.stabilize} indicates
381    success or failure of the operation ({\tt true} = success).
382    
383  \subsubsection*{Linking}  \subsubsection*{Linking}
384    
385  In SML/NJ, linking means executing top-level code of each compilation  In SML/NJ, linking means executing top-level code of each compilation
# Line 401  Line 412 
412  {\tt CM.autoload} acts like {\tt CM.make}, only ``lazily''. See  {\tt CM.autoload} acts like {\tt CM.make}, only ``lazily''. See
413  Section~\ref{sec:autoload} for more information.  Section~\ref{sec:autoload} for more information.
414    
415    As before, the result of {\tt CM.make} indicates success or failure of
416    the operation.  The result of {\tt CM.autoload} indicates success or
417    failure of the {\em registration}.  (It does not know yet whether
418    loading will actually succeed.)
419    
420  \subsubsection*{Flags}  \subsubsection*{Flags}
421    
422  Several flags control the operation of CM.  Any invocation of the  Several flags control the operation of CM.  Any invocation of the
423  corresponding function reads the current value of the flag.  An  corresponding function reads the current value of the flag.  An
424  invocation with {\tt NONE} just reads it, an invocation with {\tt  invocation with {\tt NONE} just reads it, an invocation with {\tt
425  SOME} $v$ reads it and then replaces it with a new value $v$.  SOME} $v$ reads it and then replaces it with a new value $v$.  It is
426    always the old value that gets returned.
427    
428  \begin{verbatim}  \begin{verbatim}
429    val verbose : bool option -> bool    val verbose : bool option -> bool
# Line 423  Line 440 
440  In the case of a compile-time error {\tt CM.keep\_going} instructs the  In the case of a compile-time error {\tt CM.keep\_going} instructs the
441  {\tt CM.recomp} phase to continue working on parts of the dependency  {\tt CM.recomp} phase to continue working on parts of the dependency
442  graph that are not related to the error.  (This does not work for  graph that are not related to the error.  (This does not work for
443  syntax errors because a correct parse is needed before CM can  outright syntax errors because a correct parse is needed before CM can
444  construct its dependency graph.)  The default is {\em false} and can  construct the dependency graph.)  The default is {\em false} and can
445  be overriden at startup by the environment variable {\tt CM\_KEEP\_GOING}.  be overridden at startup by the environment variable {\tt
446    CM\_KEEP\_GOING}.
447    
448  {\tt CM.parse\_caching} sets a limit on how many parse trees are  {\tt CM.parse\_caching} sets a limit on how many parse trees are
449  cached in main memory.  In certain cases CM must parse source files in  cached in main memory.  In certain cases CM must parse source files in
450  order to be able to calculate the dependency graph.  Later, the same  order to be able to calculate the dependency graph.  Later, the same
451  files may need to be compiled, in which case an existing parse tree  files may need to be compiled, in which case an existing parse tree
452  saves the time to parse the file again.  Keeping parse trees can be  saves the time to parse the file again.  Keeping parse trees can be
453  expensive in memory usage.  Moreover, CM makes special efforts to  expensive in terms of memory usage.  Moreover, CM makes special
454  avoid parsing files unless they have actually been modified.  efforts to avoid re-parsing files in the first place unless they have
455  Therefore, it may not make much sense to set this value very high.  actually been modified.  Therefore, it may not make much sense to set
456  The default is {\em 100} and can be overriden at startup time by the  this value very high.  The default is {\em 100} and can be overridden
457  environment variable {\tt CM\_PARSE\_CACHING}.  at startup time by the environment variable {\tt CM\_PARSE\_CACHING}.
458    
459  This version of CM uses an ML-inspired syntax for expressions in its  This version of CM uses an ML-inspired syntax for expressions in its
460  conditional compilation subsystem.  However, for the time being it  conditional compilation subsystem (see Section~\ref{sec:preproc}).
461  will accept old C-inspired expressions but produce a warning for each  However, for the time being it will accept most of the original
462  occurrence. {\tt CM.warn\_obsolete} can be used to turn these warnings  C-inspired expressions but produces a warning for each occurrence of
463  off. The default is {\em true} and can be overriden at startup time by  an old-style operator. {\tt CM.warn\_obsolete} can be used to turn
464  the environment variable {\tt CM\_WARN\_OBSOLETE}.  these warnings off. The default is {\em true} and can be overridden at
465    startup time by the environment variable {\tt CM\_WARN\_OBSOLETE}.
466    
467  {\tt CM.debug} can be used to turn on debug mode.  This currently has  {\tt CM.debug} can be used to turn on debug mode.  This currently has
468  no effect since there is no separate debug mode. The default is {\em  no effect since there is no debug code in the implementation. The
469  false} and can be overriden at startup time by the environment  default is {\em false} and can be overridden at startup time by the
470  variable {\tt CM\_DEBUG}.  environment variable {\tt CM\_DEBUG}.
471    
472  \subsubsection*{Path anchors}  \subsubsection*{Path anchors}
473    
# Line 490  Line 509 
509    
510  {\tt CM.listLibs} lists to standard output the path names of library  {\tt CM.listLibs} lists to standard output the path names of library
511  description files for those stable libraries that are currently known  description files for those stable libraries that are currently known
512  to CM.  This list includes those libraries which have been accessed  to CM.  This list includes libraries which have been accessed
513  ``implicitly'' by virtue of being a sub-library of another library  ``implicitly'' by virtue of being a sub-library of another library
514  that has been accessed in the past.  Library state can take up  that has been accessed in the past.  Library state can take up
515  considerable space in main memory.  Use {\tt CM.dismissLib} (see  considerable space in main memory.  Use {\tt CM.dismissLib} (see
# Line 515  Line 534 
534  library.  Therefore, it is not always in the interest of  library.  Therefore, it is not always in the interest of
535  memory-conscious users to use this feature.  memory-conscious users to use this feature.
536    
537  Sharing of dynamic state created by the library is {\em not} affected  Sharing of link-time state created by the library is {\em not}
538  by this.  affected by this.
539    
540  {\tt CM.synchronize} updates tables internal to CM to reflect changes  {\tt CM.synchronize} updates tables internal to CM to reflect changes
541  in the file system.  In particular, this will be necessary when the  in the file system.  In particular, this will be necessary when the
# Line 545  Line 564 
564  before, it cannot be turned off since it provides many of the standard  before, it cannot be turned off since it provides many of the standard
565  pre-defined top-level bindings in the interactive system.  pre-defined top-level bindings in the interactive system.
566    
567  In essence, the autoloader is a convenient tool for virtually  The autoloader is a convenient tool for virtually ``loading'' an
568  ``loading'' an entire library without incurring an undue increase in  entire library without incurring an undue increase in memory
569  memory consumption for library modules that are not actually being  consumption for library modules that are not actually being used.
 used.  
570    
571  \subsection{Sharing of state}  \subsection{Sharing of state}
572  \label{sec:sharing}  \label{sec:sharing}
# Line 565  Line 583 
583  shared across multiple calls to {\tt CM.make} or {\tt CM.autoload}.  shared across multiple calls to {\tt CM.make} or {\tt CM.autoload}.
584  However, each such call incurs an associated {\em traversal} of the  However, each such call incurs an associated {\em traversal} of the
585  dependency graph, and during such a traversal each compilation unit  dependency graph, and during such a traversal each compilation unit
586  will be executed at most once.  In other words, the same program will  will be executed at most once.  In other words, the same ``program''
587  not see multiple instantiations of the same compilation unit.  will not see multiple instantiations of the same compilation unit
588    (where ``program'' refers to the code managed by one call to {\tt
589  As long as only {\tt CM.make} is involved, this is not difficult to  CM.make} or {\tt CM.autoload}).
590  describe since each traversal will have completed when the call to  
591  {\tt CM.make} returns.  However, that is not true in the case of {\tt  As long as only {\tt CM.make} is involved, this behavior is not
592  CM.autoload}.  {\tt CM.autoload} also initiates a traversal, but that  difficult to describe since each traversal will have completed when
593    the call to {\tt CM.make} returns.  However, that is not true in the
594    case of {\tt CM.autoload}.  Like {\tt CM.make}, {\tt CM.autoload}
595    initiates a traversal. But unlike in the case of {\tt CM.make}, that
596  traversal remains ``suspended'' and will be performed incrementally as  traversal remains ``suspended'' and will be performed incrementally as
597  necessary---driven by code compiled at the interactive top level.  And  necessary---driven by code entered at the interactive top level.  And
598  yet, it is still the case that each compilation unit will be linked at  yet, it is still the case that each compilation unit will be linked at
599  most once during this traversal and private state will not be confuse  most once during this traversal and private state will not be confused
600  with private state of other traversals that might be active at the same  with private state of other traversals that might be active at the
601  time.  same time.
602    
603  % Need a good example here.  % Need a good example here.
604    
605  \subsubsection*{Sharing annotations}  \subsubsection*{Sharing annotations}
606    
607    ML source files can be specified as being either {\em private} or {\em
608    shared}.  This is done by adding a {\em member class} specification
609    for the file in the library- or group description file (see
610    Section~\ref{sec:classes}).  In other words, to mark an ML file as
611    {\em private}, follow the file name with a colon {\bf :} and the word
612    {\tt private}.  For {\em shared} ML files, replace {\tt private} with
613    {\tt shared}.
614    
615    An ML source file that is not annotated will typically be treated as
616    {\em shared} unless it statically depends on some other {\em private}
617    source.  It is an error for a {\em shared} source to depend on a {\em
618    private} source.
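As a sketch of how these annotations might look in a description file
(the file names {\tt counter.sml} and {\tt util.sml} as well as the
structure names are hypothetical and serve only as illustration):

\begin{verbatim}
Library
  structure Counter
  structure Util
is
  counter.sml : private
  util.sml : shared
\end{verbatim}

Under the rule stated above, this fragment is acceptable only if {\tt
util.sml} does not statically depend on {\tt counter.sml}.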
619    
620    \subsubsection*{Sharing with the interactive system}
621    
622    The SML/NJ interactive system, which includes the compiler, is itself
623    created by linking various libraries, and some of these libraries can
624    also be used in user programs.  Examples are the Standard ML Basis
625    Library {\tt basis.cm}, the SML/NJ library {\tt smlnj-lib.cm}, and the
626    ML-Yacc library {\tt ml-yacc-lib.cm}.
627    
628    If a module from a library is used by both the interactive system and
629    a user program running under control of the interactive system, then
630    CM will let them share code and dynamic state.
631    
632    \section{Member classes and tools}
633    \label{sec:classes}
634    
635    In addition to using existing ML source files, CM can also invoke
636    tools that generate ML source code.  Examples of this are
637    program-generating programs such as ML-Yacc~\cite{tarditi90:yacc} or
638    ML-Lex~\cite{appel89:lex}, literate programming tools like
639    noweb~\cite{ramsey:simplified}, but also more generic ``generators''
640    such as the checkout program {\bf co} for RCS archives.
641    (Currently, CM knows ML-Yacc, ML-Lex, and ML-Burg, but other tools can
642    be added easily.)
643    
644    Typically, CM determines which tool to use by looking at clues like
645    the file name suffix.  However, it is also possible to explicitly tell
646    CM which tool to use by specifying the {\em member class} of the
647    source in the description file.  For this, the file name is followed
648    by a colon {\bf :} and the name of the member class.  Class names are
649    case-insensitive.
650    
651    In addition to genuine tool classes, there are a few member classes
652    that refer to facilities internal to CM: {\tt sml} is the class of
653    ordinary ML source files without sharing annotation, {\tt shared} is
654    the class of ML source files whose dynamic state must be shared across
655    invocations of {\tt CM.make} or {\tt CM.autoload}, {\tt private} is
656    the class of ML source files whose dynamic state cannot be shared
657    across invocations of {\tt CM.make} or {\tt CM.autoload}, and {\tt cm}
658    is the class of CM library or group description files.  Known tool
659    classes currently are {\tt mlyacc} for ML-Yacc sources, {\tt mllex}
660    for ML-Lex sources, and {\tt mlburg} for ML-Burg
661    sources~\cite{mlburg93}.
662    
663    CM automatically classifies files with a {\tt .sml} or {\tt .sig} suffix
664    as (unannotated) ML-source, file names ending in {\tt .cm} as CM
665    descriptions, {\tt .grm} or {\tt .y} files as input to ML-Yacc, {\tt
666    .lex} or {\tt .l} as input to ML-Lex, and file names ending in {\tt
667    .burg} as ML-Burg specifications.
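An explicit member class takes the place of this automatic
classification.  The following fragment is only a sketch: the file
names are made up, and other members that such a library would
normally need are omitted.

\begin{verbatim}
Library
  structure Parser
is
  grammar.spec : mlyacc
  scanner.spec : mllex
  driver.sml
\end{verbatim}

Here {\tt driver.sml} is classified by its suffix as ordinary ML
source, while the two {\tt .spec} files, whose suffix is not among
those listed above, are assigned to their tools explicitly.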
668    
669  \section{Conditional compilation}  \section{Conditional compilation}
670  \label{sec:preproc}  \label{sec:preproc}
671    
672    In its description files, CM offers a simple conditional
673    compilation facility inspired by the pre-processor for the C
674    language~\cite{k&r2}.  However, it is not really a pre-processor, and
675    the syntax of the controlling expressions is borrowed from SML.
676    
677    Sequences of members can be guarded by {\tt \#if}-{\tt \#endif}
678    brackets with optional {\tt \#elif} and {\tt \#else} lines in between.
679    The same guarding syntax can also be used to conditionalize the export
680    list.  {\tt \#if}-, {\tt \#elif}-, {\tt \#else}-, and {\tt
681    \#endif}-lines must start in the first column and always
682    extend to the end of the current line.  {\tt \#if} and {\tt \#elif}
683    must be followed by a boolean expression.
684    
685    Boolean expressions can be formed by comparing arithmetic expressions
686    (using operators {\tt <}, {\tt <=}, {\tt =}, {\tt >=}, {\tt >}, or
687    {\tt <>}), by logically combining two other boolean expressions (using
688    operators {\tt andalso}, {\tt orelse}, {\tt =}, or {\tt <>}), by
689    querying the existence of a CM symbol definition, or by querying the
690    existence of an exported ML definition.
691    
692    Arithmetic expressions can be numbers or references to CM symbols, or
693    can be formed from other arithmetic expressions using operators {\tt
694    +}, {\tt -} (subtraction), \verb|*|, {\tt div}, {\tt mod}, or $\tilde{~}$
695    (unary minus).  All arithmetic is done on signed integers.
696    
697    Any expression (arithmetic or boolean) can be surrounded by
698    parentheses to enforce precedence.
699    
700    \subsection{CM variables}
701    
702    CM provides a number of names that stand for certain integers.  The
703    exact set of provided variable names depends on the SML/NJ version number,
704    machine architecture, and operating system.  A reference to a CM
705    variable is considered an arithmetic expression. If the variable is
706    not defined, then it evaluates to 0.  The expression {\tt
707    defined}($v$) is a boolean expression that yields true if and only if
708    $v$ is a defined CM variable.
709    
710    The names of CM variables are formed starting with a letter followed
711    by zero or more occurrences of letters, decimal digits, apostrophes, or
712    underscores.
713    
714    The following variables will be defined and bound to 1:
715    \begin{itemize}
716    \item depending on the operating system: {\tt OPSYS\_UNIX}, {\tt
717    OPSYS\_WIN32}, {\tt OPSYS\_MACOS}, {\tt OPSYS\_OS2}, or {\tt
718    OPSYS\_BEOS}
719    \item depending on processor architecture: {\tt ARCH\_SPARC}, {\tt
720    ARCH\_ALPHA32}, {\tt ARCH\_MIPS}, {\tt ARCH\_X86}, {\tt ARCH\_HPPA},
721    {\tt ARCH\_RS6000}, or {\tt ARCH\_PPC}
722    \item depending on the processor's endianness: {\tt BIG\_ENDIAN} or
723    {\tt LITTLE\_ENDIAN}
724    \item depending on the native word size of the implementation: {\tt
725    SIZE\_32} or {\tt SIZE\_64}
726    \item the symbol {\tt NEW\_CM}
727    \end{itemize}
728    
729    Furthermore, the symbol {\tt SMLNJ\_VERSION} will be bound to the
730    major version number of SML/NJ (i.e., the number before the first dot)
731    and {\tt SMLNJ\_MINOR\_VERSION} will be bound to the system's minor
732    version number (i.e., the number after the first dot).
733    
734    \subsection{Querying exported definitions}
735    
736    An expression of the form {\tt defined}($n$ $s$) where $s$ is an ML
737    symbol and $n$ is an ML namespace specifier is a boolean expression
738    that yields true if and only if any member included before this test
739    exports a definition under this name.  Therefore, order among members
740    matters after all (but it remains unrelated to the problem of
741    determining static dependencies)!  The namespace specifier must be one
742    of: {\tt structure}, {\tt signature}, {\tt functor}, or {\tt funsig}.
743    
744    If the query takes place in the ``exports'' section of a description
745    file, then it yields true if {\em any} of the included members exports
746    the named symbol.
747    
748    \noindent Example:
749    
750    \begin{verbatim}
751    Library
752      structure Foo
753    #if defined(structure Bar)
754      structure Bar
755    #endif
756    is
757    #if SMLNJ_VERSION > 110
758      new-foo.sml
759    #else
760      old-foo.sml
761    #endif
762    #if defined(structure Bar)
763      bar-client.sml
764    #else
765      no-bar-so-far.sml
766    #endif
767    \end{verbatim}
768    
769    Here, the file {\tt bar-client.sml} gets included if {\tt
770    SMLNJ\_VERSION} is greater than 110 and {\tt new-foo.sml} exports a
771    structure {\tt Bar} {\em or} if {\tt SMLNJ\_VERSION <= 110} and {\tt
772    old-foo.sml} exports structure {\tt Bar}. Otherwise {\tt
773    no-bar-so-far.sml} gets included instead.  In addition, the export of
774    structure {\tt Bar} is guarded by its own existence.  (Structure {\tt
775    Bar} could also be defined by {\tt no-bar-so-far.sml} in which case it
776    would get exported regardless of the outcome of the other {\tt
777    defined} test.)
778    
779    \subsection{Explicit errors}
780    
781    A pseudo-member of the form {\tt \#error $\ldots$} which---like other
782    {\tt \#}-items---starts in the first column and extends to the end of
783    the line causes an explicit error message unless it gets excluded by
784    the conditional compilation logic.  The error message is given by the
785    remainder of the line after the word {\tt error}.
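
A typical use would be to reject configurations that a library does
not support.  The sketch below assumes a hypothetical library that
needs at least major version 110; the wording of the message is
arbitrary.

\begin{verbatim}
#if SMLNJ_VERSION < 110
#error this library requires SML/NJ version 110 or later
#endif
\end{verbatim}

When {\tt SMLNJ\_VERSION} is 110 or greater, the conditional
compilation logic excludes the {\tt \#error} line and no error is
reported.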
786    
787    \subsection{BNF for expressions}
788    
789    \begin{tabbing}
790    \nt{non-terminal}~\= \ar \kill
791    \nt{letter} \> \ar \tl{A} \vb $\ldots$ \vb \tl{Z} \vb
792                       \tl{a} \vb $\ldots$ \vb \tl{z} \\
793    \nt{digit}  \> \ar \tl{0} \vb $\ldots$ \vb \tl{9} \\
794    \nt{ldau}   \> \ar \nt{letter} \vb \nt{digit} \vb \tl{'} \vb \tl{\_} \\
795    \\
796    \nt{number} \> \ar \nt{digit} \{\nt{digit}\} \\
797    \nt{sym}    \> \ar \nt{letter} \{\nt{ldau}\} \\
798    \\
799    \nt{aatom}  \> \ar \nt{number} \vb \nt{sym} \vb \tl{(} \nt{asum} \tl{)} \vb
800                       \tl{$\tilde{~}$} \nt{aatom} \\
801    \nt{aprod}  \> \ar \{\nt{aatom} (\tl{*} \vb \tl{div} \vb \tl{mod})\} \nt{aatom} \\
802    \nt{asum}   \> \ar \{\nt{aprod} (\tl{+} \vb \tl{-})\} \nt{aprod} \\
803    \\
804    \nt{ns}     \> \ar \tl{structure} \vb \tl{signature} \vb \tl{functor} \vb
805                       \tl{funsig} \\
806    \nt{mlsym}  \> \ar {\em a Standard ML identifier} \\
807    \nt{query}  \> \ar \tl{defined} \tl{(} \nt{sym} \tl{)} \vb
808                       \tl{defined} \tl{(} \nt{ns} \nt{mlsym} \tl{)} \\
809    \\
810    \nt{acmp}   \> \ar \nt{asum} (\ttl{<} \vb \ttl{<=} \vb \ttl{>} \vb \ttl{>=} \vb
811                                  \ttl{=} \vb \ttl{<>}) \nt{asum} \\
812    \\
813    \nt{batom}  \> \ar \nt{query} \vb \nt{acmp} \vb \tl{not} \nt{batom} \vb
814                       \tl{(} \nt{bdisj} \tl{)} \\
815    \nt{bcmp}   \> \ar \nt{batom} [(\ttl{=} \vb \ttl{<>}) \nt{batom}] \\
816    \nt{bconj}  \> \ar \{\nt{bcmp} \tl{andalso}\} \nt{bcmp} \\
817    \nt{bdisj}  \> \ar \{\nt{bconj} \tl{orelse}\} \nt{bconj} \\
818    \\
819    \nt{expression} \> \ar \nt{bdisj}
820    \end{tabbing}
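
For orientation, here are a few expressions that this grammar is
meant to admit, one per line.  They are illustrative only: the
structure name {\tt Foo} and the numeric threshold are made up, while
the variable names are the predefined ones listed earlier.

\begin{verbatim}
defined(structure Foo) andalso not defined(OPSYS_WIN32)
SMLNJ_VERSION * 100 + SMLNJ_MINOR_VERSION >= 11025
(defined(ARCH_X86) orelse defined(ARCH_SPARC)) andalso defined(SIZE_32)
\end{verbatim}

The first is a conjunction of two queries, the second compares two
arithmetic sums, and the third uses parentheses to group a disjunction
inside a conjunction.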
821    
822  \section{Access control}
823  \label{sec:access}
824    
825    Access control is a new feature. At the moment, only the basic
826    mechanisms are implemented, but there is no enforcement.  In other
827    words, everybody is assumed to have every possible privilege.  CM
828    merely reports which privileges ``would have been required''.
829    
830    The basic idea behind CM's access control is the following: In their
831    description files, groups and libraries can specify a list of
832    privileges that a client must have in order to be able to use them.
833    Privileges at this level are just names (strings) and must be written
834    in front of the initial keyword {\tt Library} or {\tt Group}.  If one
835    group or library imports from another group or library, then
836    privileges (or rather: privilege requirements) are inherited.
837    In effect, to be able to use a program, one must have all privileges
838    for all its libraries, sub-libraries and library components,
839    components of sub-libraries, and so on.
840    
841    Of course, this alone would not yet be satisfactory because there
842    should also be the possibility of setting up a ``safety wall:'' a
843    library {\tt LSafe.cm} could ``wrap'' all the unsafe operations in
844    {\tt LUnsafe.cm} with enough error checking that they become safe.
845    Therefore, a user of {\tt LSafe.cm} should not also be required to
846    possess the privileges that would be required if one were to use {\tt
847    LUnsafe.cm} directly.
848    
849    In CM's access control model it is possible for a library to ``wrap''
850    privileges.  If a privilege $P$ has been wrapped, then the user of the
851    library does not need to have privilege $P$ even though the library is
852    using another library that requires privilege $P$.  In essence, the
853    library acts as a ``proxy'' who provides the necessary credentials for
854    privilege $P$ to the sub-library.
855    
856    Of course, not everybody can be allowed to establish a library with
857    such a ``wrapped'' privilege $P$.  The programmer who does that should at
858    least herself have privilege $P$ (but perhaps better, she should have
859    {\em permission to wrap $P$}---a stronger requirement).
860    
861    In CM, wrapping a privilege is done by specifying the name of that
862    privilege within parentheses.  The wrapping becomes effective once the
863    library gets stabilized via {\tt CM.stabilize}.  The (not yet
864    implemented) enforcement mechanism must ensure that anyone who
865    stabilizes a library that wraps $P$ has permission to wrap $P$.
866    
867    Note that privileges cannot be wrapped at the level of CM groups.
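
To make this concrete, the description file of {\tt LSafe.cm} might
look roughly as follows.  This is a sketch under assumptions: the
privilege names {\tt net-access} and {\tt unsafe-ops}, the structure
name, and the file name are all hypothetical, and the layout of the
privilege list is inferred from the description above (names in front
of the keyword {\tt Library}, wrapped names within parentheses).

\begin{verbatim}
net-access (unsafe-ops)
Library
  structure Safe
is
  safe-wrappers.sml
  LUnsafe.cm
\end{verbatim}

Assuming that {\tt LUnsafe.cm} requires {\tt unsafe-ops}, a client of
this library would need {\tt net-access} but not {\tt unsafe-ops},
because the latter is wrapped.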
868    
869    \section{The pervasive environment and primitive modules}
870    
871    \subsection{The pervasive environment}
872    
873    The {\em pervasive environment} can be thought of as a library that
874    all compilation units implicitly depend upon.  The pervasive
875    environment exports all non-modular bindings (types, values, infix
876    operators, overloaded symbols) that are mandated by the specification
877    for the Standard ML Basis Library~\cite{reppy99:basis}.  (All other
878    bindings of the Basis Library are exported by {\tt basis.cm} which is
879    a genuine CM library.)
880    
881    The pervasive environment is the only place where CM conveys
882    non-modular bindings from one compilation unit to another.
883    
884    \subsection{Primitive modules}
885    
886    CM also knows about some ``primitive'' modules.  These modules give
887    access to certain compiler internals which are implemented in a way
888    that is outside the usual CM compilation model.  A user program can
889    access a primitive module by listing its name as one of the members in
890    its CM description files.  However, usage of any primitive module $M$
891    is protected by requiring the client to possess  privilege $M$
892    (see Section~\ref{sec:access}), i.e., a privilege that goes by the
893    same name as the module itself.
894    
895    Currently, the following primitive module names are known: {\tt
896    built-in}, {\tt print-hook}, {\tt use-hook}, {\tt exn-info-hook}, {\tt
897    core}, {\tt init-utils}.
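
For instance, a library that needs the {\tt print-hook} module could
list it alongside its ordinary members.  The sketch below uses a
hypothetical structure name and file name; only the member name {\tt
print-hook} is taken from the list above.

\begin{verbatim}
Library
  structure MyPrinter
is
  print-hook
  my-printer.sml
\end{verbatim}

Under the access control model, a client of such a library would then
be reported as requiring the privilege {\tt print-hook}.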
898    
899  \section{Some history}
900    
901  Although its programming model is more general, CM's implementation is
