[smlnj] Diff of /sml/trunk/src/cm/Doc/manual.tex
revision 407, Thu Sep 2 09:16:50 1999 UTC revision 408, Thu Sep 2 14:22:37 1999 UTC
# Line 29  Line 29 
29    
30  This manual describes a new implementation of CM, the ``Compilation  This manual describes a new implementation of CM, the ``Compilation
31  and Library Manager'' for Standard ML of New Jersey (SML/NJ).  Like its  and Library Manager'' for Standard ML of New Jersey (SML/NJ).  Like its
32  previous version, CM is in change of managing separate compilation and  previous version, CM is in charge of managing separate compilation and
33  facilitates access to stable libraries.  facilitates access to stable libraries.
34    
35  Programming projects that use CM are typically composed of separate  Programming projects that use CM are typically composed of separate
36  {\em libraries}.  Libraries themselves can be internally  {\em libraries}.  Libraries themselves can be internally
37  sub-structured using CM's notion of {\em groups}.  Using libraries and  sub-structured using CM's notion of {\em groups}.  Using libraries and
38  groups, programs can be viewed as a {\em hierarchy of modules}.  Large  groups, programs can be viewed as a {\em hierarchy of modules}.  The
39  projects tend to benefit from this approach~\cite{blume:appel:cm99}.  organization of large projects tends to benefit from this
40    approach~\cite{blume:appel:cm99}.
41    
42  CM uses {\em cutoff} techniques~\cite{tichy94} to minimize  CM uses {\em cutoff} techniques~\cite{tichy94} to minimize
43  recompilation work and provides automatic dependency analysis to free  recompilation work and provides automatic dependency analysis to free
44  the programmer from having to specify the module dependency graph by  the programmer from having to specify a detailed module dependency
45  hand~\cite{blume:depend99}.  graph by hand~\cite{blume:depend99}.
46    
47  The most important change with respect to the previous (``old'')  The most important change with respect to the previous (``old'')
48  implementation of CM is a change of emphasis.  Until now the focus was  implementation of CM is a change of emphasis.  Until now the focus was
# Line 72  Line 73 
73      basis.cm      basis.cm
74  \end{verbatim}  \end{verbatim}
75    
76  This library consists of three ML source files ({\tt foo.sig}, {\tt  This library exports two definitions, one for a structure named {\tt
77  foo.sml}, and {\tt helper.sml}) and refers to one external library  Foo} and one for a signature named {\tt FOO}.  The specification for
78  ({\tt basis.cm}).  It exports two definitions, one for a structure  such exports appears between the keywords {\tt Library} and {\tt is}.
79  named {\tt Foo} and one for a signature named {\tt FOO}.  The entry  The {\em members} of the library are specified after the keyword {\tt
80  {\tt basis.cm} typically denotes the description file of the {\it  is}.  Here we have three ML source files ({\tt foo.sig}, {\tt
81  Standard ML Basis Library}~\cite{reppy99:basis}; most programs will  foo.sml}, and {\tt helper.sml}) and a reference to one external
82  want to list it in their description file(s).  library ({\tt basis.cm}).  The entry {\tt basis.cm} typically denotes
83    the description file for the {\it Standard ML Basis
84    Library}~\cite{reppy99:basis}; most programs will want to list it in
85    their own description file(s).
86    
87  \subsection{Library descriptions}  \subsection{Library descriptions}
88    
# Line 90  Line 94 
94  \item All top-level definitions must be {\em module} definitions  \item All top-level definitions must be {\em module} definitions
95  (structures, signatures, functors, or functor signatures).  In other  (structures, signatures, functors, or functor signatures).  In other
96  words, there can be no top-level type-, value-, or infix-definitions.  words, there can be no top-level type-, value-, or infix-definitions.
97  \item For a given symbol, there can be at most one file per library  \item For a given symbol, there can be at most one ML source file per
98  (or---more correctly---one file per library component; see  library (or---more correctly---one file per library component; see
99  Section~\ref{sec:groups}) that defines the symbol at top level.  Section~\ref{sec:groups}) that defines the symbol at top level.
100    \item For a given symbol, there can be at most one sub-library or one
101    sub-group that exports that symbol.
102  \item The use of ML's {\bf open} construct is not permitted at top  \item The use of ML's {\bf open} construct is not permitted at top
103  level.  level.
104  \end{enumerate}  \end{enumerate}
105    
106    Note that these rules do not require the exports of sub-groups or
107    sub-libraries to be distinct from the exports of ML source files.
108    Here, the disambiguating rule is that the definition from the ML
109    source overrides the definition imported from the group or library.
110    
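A minimal sketch of the overriding rule, using hypothetical file names
({\tt sublib.cm} and this particular {\tt foo.sml}) that are assumptions
made for illustration: suppose {\tt sublib.cm} exports a structure
{\tt Foo}, and {\tt foo.sml} also defines a structure {\tt Foo} at top
level.

\begin{verbatim}
Library
    structure Foo
is
    foo.sml
    sublib.cm
\end{verbatim}

Under the rule stated above, the {\tt Foo} exported by this library
would be the one defined in {\tt foo.sml}, not the one imported from
{\tt sublib.cm}.
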
111  The full syntax for library description files also includes provisions  The full syntax for library description files also includes provisions
112  for a simple ``conditional compilation'' facility (see  for a simple ``conditional compilation'' facility (see
113  Section~\ref{sec:preproc}), for access control (see  Section~\ref{sec:preproc}), for access control (see
# Line 144  Line 155 
155  \item The initial keyword {\tt Library} is replaced with {\tt Group}  \item The initial keyword {\tt Library} is replaced with {\tt Group}
156  followed by the name of the surrounding library's description file in  followed by the name of the surrounding library's description file in
157  parentheses.  parentheses.
158  \item The export list can be left empty.  In this case CM will  \item The export list can be left empty, in which case CM will
159  provide a default export list: all exports from ML source files plus  provide a default export list: all exports from ML source files plus
160  all exports from sub-components of the component.  (Note that this does  all exports from sub-components of the component.  (Note that this does
161  not include the exports of other libraries.)  not include the exports of other libraries.)
# Line 177  Line 188 
188      basis.cm      basis.cm
189  \end{verbatim}  \end{verbatim}
190    
191    \subsection{Multiple occurrences of the same member}
192    
193    The following rules apply to multiple occurrences of the same ML source
194    file, the same library, or the same group within a program:
195    
196    \begin{itemize}
197    \item Within the same description file, each member can be specified
198    at most once.
199    \item Libraries can be referred to freely from as many other groups or
200    libraries as the programmer desires.
201    \item A group cannot be used from outside the (uniquely defined)
202    library that it is a component of.  However, within that library it
203    can be referred to from arbitrarily many other groups.
204    \item The same ML source file cannot appear more than once.  If an ML
205    source file is to be referred to by multiple clients it must first be
206    ``wrapped'' into a library or (if sufficient) a group.
207    \end{itemize}
208    
209    \subsection{Top-level groups}
210    
211    Mainly to facilitate some superficial backward-compatibility, CM also
212    allows groups to appear at top level, i.e., outside of any library.
213    Such groups must omit the parenthetical library specification and then
214    cannot also be used within libraries. One could think of the top level
215    itself as a ``virtual unnamed library'' whose components are the
216    top-level groups.
217    
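As an illustration, a top-level group description might look like the
following sketch.  The file names {\tt main.sml} and {\tt util.sml} are
hypothetical; the point is only that the parenthesized library name
after {\tt Group} is absent.

\begin{verbatim}
Group is
    main.sml
    util.sml
    basis.cm
\end{verbatim}

Because the export list is left empty here, CM would supply the default
export list that applies to groups.
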
218  \section{Naming objects in the file system}  \section{Naming objects in the file system}
219    
220  \subsection{Motivation}  \subsection{Motivation}
# Line 349  Line 387 
387    
388  During the course of the same {\tt CM.make}, the code of each  During the course of the same {\tt CM.make}, the code of each
389  compilation module will be executed at most once.  Code in units that  compilation module will be executed at most once.  Code in units that
390  are marked as {\it private} will be executed exactly once.  Code in  are marked as {\it private} (see Section~\ref{sec:sharing}) will be
391  other units will be executed only if the unit has been recompiled  executed exactly once.  Code in other units will be executed only if
392  since it was executed last time or if it depends on another  the unit has been recompiled since it was executed last time or if it
393  compilation unit whose code has been executed since.  depends on another compilation unit whose code has been executed
394    since.
395    
396  In effect, different invocations of {\tt CM.make} (and {\tt  In effect, different invocations of {\tt CM.make} (and {\tt
397  CM.autoload}) will share dynamic state created at link time as much as  CM.autoload}) will share dynamic state created at link time as much as
398  possible unless the compilation units in question have been explicitly  possible unless the compilation units in question have been explicitly
399  marked private.  marked private.
400    
401  {\tt CM.autoload} acts like a ``lazy'' {\tt CM.make}.  The same  {\tt CM.autoload} acts like {\tt CM.make}, only ``lazily''. See
402  bindings that {\tt CM.make} would introduce into the top-level  Section~\ref{sec:autoload} for more information.
 enviroment are also introduced by the corresponding {\tt CM.autoload}.  
 However, most work will be deferred until some code entered at the  
 interactive top level later mentions one or more of the exported  
 symbols.  Only then will CM go and perform just the minimal work  
 necessary to provide the actual definitions for them.  
403    
404  \subsubsection*{Flags}  \subsubsection*{Flags}
405    
# Line 383  Line 417 
417  \end{verbatim}  \end{verbatim}
418    
419  {\tt CM.verbose} can be used to turn off CM's progress messages.  The  {\tt CM.verbose} can be used to turn off CM's progress messages.  The
420  default is {\em on}.  default is {\em true} and can be overridden at startup time by the
421    environment variable {\tt CM\_VERBOSE}.
422    
423  In the case of a compile-time error {\tt CM.keep\_going} instructs the  In the case of a compile-time error {\tt CM.keep\_going} instructs the
424  {\tt CM.recomp} phase to continue working on parts of the dependency  {\tt CM.recomp} phase to continue working on parts of the dependency
425  graph that are not related to the error.  (This does not work for  graph that are not related to the error.  (This does not work for
426  syntax errors because a correct parse is needed before CM can  syntax errors because a correct parse is needed before CM can
427  construct its dependency graph.)  The default is {\em off}.  construct its dependency graph.)  The default is {\em false} and can
428    be overridden at startup by the environment variable {\tt CM\_KEEP\_GOING}.
429    
430  {\tt CM.parse\_caching} sets a limit on how many parse trees are  {\tt CM.parse\_caching} sets a limit on how many parse trees are
431  cached in main memory.  In certain cases CM must parse source files in  cached in main memory.  In certain cases CM must parse source files in
# Line 399  Line 435 
435  expensive in memory usage.  Moreover, CM makes special efforts to  expensive in memory usage.  Moreover, CM makes special efforts to
436  avoid parsing files unless they have actually been modified.  avoid parsing files unless they have actually been modified.
437  Therefore, it may not make much sense to set this value very high.  Therefore, it may not make much sense to set this value very high.
438  The default is {\em 100}.  The default is {\em 100} and can be overridden at startup time by the
439    environment variable {\tt CM\_PARSE\_CACHING}.
440    
441  This version of CM uses an ML-inspired syntax for expressions in its  This version of CM uses an ML-inspired syntax for expressions in its
442  conditional compilation subsystem.  However, for the time being it  conditional compilation subsystem.  However, for the time being it
443  will accept old C-inspired expressions but produce a warning for each  will accept old C-inspired expressions but produce a warning for each
444  occurrence. {\tt CM.warn\_obsolete} can be used to turn these warnings  occurrence. {\tt CM.warn\_obsolete} can be used to turn these warnings
445  off. The default is {\em on}.  off. The default is {\em true} and can be overridden at startup time by
446    the environment variable {\tt CM\_WARN\_OBSOLETE}.
447    
448  {\tt CM.debug} can be used to turn on debug mode.  This currently has  {\tt CM.debug} can be used to turn on debug mode.  This currently has
449  no effect since there is no separate debug mode. The default is {\em off}.  no effect since there is no separate debug mode. The default is {\em
450    false} and can be overridden at startup time by the environment
451    variable {\tt CM\_DEBUG}.
452    
453  \subsubsection*{Path anchors}  \subsubsection*{Path anchors}
454    
# Line 443  Line 483 
483    val listLibs : unit -> unit    val listLibs : unit -> unit
484  \end{verbatim}  \end{verbatim}
485    
486  {\tt CM.showPending} lists the names of all symbols which are  {\tt CM.showPending} lists to standard output the names of all symbols
487  currently registered as being bound at top level via the autoloading  which are currently registered as being bound at top level via the
488  mechanism and which so far have not actually been resolved.  autoloading mechanism and which so far have not actually been
489    resolved.
490  {\tt CM.listLibs} shows the path names of library description files  
491  for those stable libraries that are currently known to CM.  This list  {\tt CM.listLibs} lists to standard output the path names of library
492  includes those libraries which have been accessed ``implicitly'' by  description files for those stable libraries that are currently known
493  virtue of being a sub-library of another library that has been  to CM.  This list includes those libraries which have been accessed
494  accessed in the past.  Library state can take up considerable space in  ``implicitly'' by virtue of being a sub-library of another library
495  main memory.  Use {\tt CM.dismissLib} (see below) to remove a library  that has been accessed in the past.  Library state can take up
496  from CM's registry.  considerable space in main memory.  Use {\tt CM.dismissLib} (see
497    below) to remove a library from CM's registry.
498    
499  \subsubsection*{Altering CM's internal state}  \subsubsection*{Altering CM's internal state}
500    
501  Sometimes it can become necessary to explicitly instruct CM to change  Sometimes it can become necessary to explicitly change or update CM's
502  or update its internal state.  internal state.
503    
504  \begin{verbatim}  \begin{verbatim}
505    val dismissLib : string -> unit    val dismissLib : string -> unit
# Line 471  Line 512 
512  Although removing a library from the registry may recover considerable  Although removing a library from the registry may recover considerable
513  amounts of main memory, doing so also eliminates any chance of sharing  amounts of main memory, doing so also eliminates any chance of sharing
514  the associated data structures with later references to the same  the associated data structures with later references to the same
515  library.  Therefore, doing so is not always in the interest of  library.  Therefore, it is not always in the interest of
516  memory-conscious users.  memory-conscious users to use this feature.
517    
518    Sharing of dynamic state created by the library is {\em not} affected
519    by this.
520    
521  {\tt CM.synchronize} updates tables internal to CM to reflect changes  {\tt CM.synchronize} updates tables internal to CM to reflect changes
522  in the file system.  In particular, this will be necessary when the  in the file system.  In particular, this will be necessary when the
523  association of file names to ``file IDs'' (in Unix: inode numbers)  association of file names to ``file IDs'' (in Unix: inode numbers)
524  changes during an ongoing session.  In practice, this tends to be  changes during an ongoing session.  In practice, the need for this
525  rare.  tends to be rare.
526    
527  {\tt CM.reset} completely erases all internal state in CM.  This is  {\tt CM.reset} completely erases all internal state in CM.  This is
528  not very advisable since it will also break the association with  not very advisable since it will also break the association with
# Line 488  Line 532 
532  \subsection{The auto loader}  \subsection{The auto loader}
533  \label{sec:autoload}  \label{sec:autoload}
534    
535    From the user's point of view, a call to {\tt CM.autoload} acts very
536    much like the corresponding call to {\tt CM.make} because the same
537    bindings that {\tt CM.make} would introduce into the top-level
538    environment are also introduced by {\tt CM.autoload}.  However, most
539    work will be deferred until some code entered later at the interactive
540    top level refers to one or more of these bindings.  Only then will CM
541    go and perform just the minimal work necessary to provide the actual
542    definitions.
543    
544    In this version of CM the autoloader plays a central role.  Unlike
545    before, it cannot be turned off since it provides many of the standard
546    pre-defined top-level bindings in the interactive system.
547    
548    In essence, the autoloader is a convenient tool for virtually
549    ``loading'' an entire library without incurring an undue increase in
550    memory consumption for library modules that are not actually being
551    used.
552    
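The deferred behavior can be sketched with a short interactive session.
This is only an illustration: the description file {\tt mylib.cm} and
the structure {\tt Foo} with a function {\tt run} are hypothetical
names, and the sketch assumes that {\tt CM.autoload} takes the name of
a description file in the same way {\tt CM.make} does.

\begin{verbatim}
(* register the bindings exported by mylib.cm; little work is done yet *)
CM.autoload "mylib.cm";
(* first mention of Foo: CM now compiles and links what Foo needs *)
Foo.run ();
\end{verbatim}

Modules of {\tt mylib.cm} that {\tt Foo} does not depend on would remain
unloaded until some later top-level input refers to them.
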
553    \subsection{Sharing of state}
554    \label{sec:sharing}
555    
556    By default, CM tries to let multiple invocations of {\tt CM.make} or
557    {\tt CM.autoload} share dynamic state created by link-time effects.
558    Of course, this is not possible if the compilation unit in question
559    has recently been recompiled or depends on another compilation unit
560    whose code has recently been re-executed.  The programmer can
561    explicitly mark certain ML files as {\em shared}, in which case CM
562    will issue a warning whenever the unit's code gets re-executed.
563    
564    State created by compilation units marked as {\em private} is never
565    shared across multiple calls to {\tt CM.make} or {\tt CM.autoload}.
566    However, each such call incurs an associated {\em traversal} of the
567    dependency graph, and during such a traversal each compilation unit
568    will be executed at most once.  In other words, the same program will
569    not see multiple instantiations of the same compilation unit.
570    
571    As long as only {\tt CM.make} is involved, this is not difficult to
572    describe since each traversal will have completed when the call to
573    {\tt CM.make} returns.  However, that is not true in the case of {\tt
574    CM.autoload}.  {\tt CM.autoload} also initiates a traversal, but that
575    traversal remains ``suspended'' and will be performed incrementally as
576    necessary---driven by code compiled at the interactive top level.  And
577    yet, it is still the case that each compilation unit will be linked at
578    most once during this traversal and private state will not be confused
579    with private state of other traversals that might be active at the same
580    time.
581    
582    % Need a good example here.
583    
584    \subsubsection*{Sharing annotations}
585    
586    
587    
588  \section{Conditional compilation}  \section{Conditional compilation}
589  \label{sec:preproc}  \label{sec:preproc}
590    
