13 |
|
|
14 |
\title{{\bf CM}\\ |
\title{{\bf CM}\\ |
15 |
The SML/NJ Compilation and Library Manager \\ |
The SML/NJ Compilation and Library Manager \\ |
16 |
{\it\small (for SML/NJ version 110.20 and later)} \\ |
{\it\small (for SML/NJ version 110.23 and later)} \\ |
17 |
User Manual} |
User Manual} |
18 |
|
|
19 |
\setlength{\parindent}{0pt} |
\setlength{\parindent}{0pt} |
39 |
facilitates access to stable libraries. |
facilitates access to stable libraries. |
40 |
|
|
41 |
Programming projects that use CM are typically composed of separate |
Programming projects that use CM are typically composed of separate |
42 |
{\em libraries}. Libraries themselves can be internally |
{\em libraries}. Libraries are collections of ML compilation units |
43 |
sub-structured using CM's notion of {\em groups}. Using libraries and |
and themselves can be internally sub-structured using CM's notion of |
44 |
groups, programs can be viewed as a {\em hierarchy of modules}. The |
{\em groups}. Using libraries and groups, programs can be viewed as a |
45 |
organization of large projects tends to benefit from this |
{\em hierarchy of modules}. The organization of large projects tends |
46 |
approach~\cite{blume:appel:cm99}. |
to benefit from this approach~\cite{blume:appel:cm99}. |
47 |
|
|
48 |
CM uses {\em cutoff} techniques~\cite{tichy94} to minimize |
CM uses {\em cutoff} techniques~\cite{tichy94} to minimize |
49 |
recompilation work and provides automatic dependency analysis to free |
recompilation work and provides automatic dependency analysis to free |
50 |
the programmer from having to specify a detailed module dependency |
the programmer from having to specify a detailed module dependency |
51 |
graph by hand~\cite{blume:depend99}. |
graph by hand~\cite{blume:depend99}. |
52 |
|
|
53 |
This new version of CM emphasis on {\em working with libraries}. This |
This new version of CM emphasizes {\em working with libraries}. This |
54 |
contrasts with the previous implementation where the focus was on |
contrasts with the previous implementation where the focus was on |
55 |
compilation management while libraries were added as an afterthought. |
compilation management while libraries were added as an afterthought. |
56 |
Beginning now, CM takes a very library-centric view of the world. In |
Beginning now, CM takes a very library-centric view of the world. In |
59 |
|
|
60 |
\section{The CM model} |
\section{The CM model} |
61 |
|
|
62 |
A CM library is a collection of ML source files and references to |
A CM library is a collection of ML source files and may also contain |
63 |
other libraries together with an explicit export interface. The |
references to other libraries together with an explicit export |
64 |
export interface lists all toplevel-defined symbols of the library |
interface. The export interface lists all toplevel-defined symbols of |
65 |
that shall be exported to its clients. A library is described by the |
the library that shall be exported to its clients. A library is |
66 |
contents of its {\em description file}. |
described by the contents of its {\em description file}. |
67 |
|
|
68 |
\noindent Example: |
\noindent Example: |
69 |
|
|
70 |
\begin{verbatim} |
\begin{verbatim} |
71 |
Library |
Library |
72 |
signature FOO |
signature BAR |
73 |
structure Foo |
structure Foo |
74 |
is |
is |
75 |
foo.sig |
bar.sig |
76 |
foo.sml |
foo.sml |
77 |
helper.sml |
helper.sml |
78 |
basis.cm |
basis.cm |
79 |
\end{verbatim} |
\end{verbatim} |
80 |
|
|
81 |
This library exports two definitions, one for a structure named {\tt |
This library exports two definitions, one for a structure named {\tt |
82 |
Foo} and one for a signature named {\tt FOO}. The specification for |
Foo} and one for a signature named {\tt BAR}. The specification for |
83 |
such exports appears between the keywords {\tt Library} and {\tt is}. |
such exports appears between the keywords {\tt Library} and {\tt is}. |
84 |
The {\em members} of the library are specified after the keyword {\tt |
The {\em members} of the library are specified after the keyword {\tt |
85 |
is}. Here we have three ML source files ({\tt foo.sig}, {\tt |
is}. Here we have three ML source files ({\tt bar.sig}, {\tt |
86 |
foo.sml}, and {\tt helper.sml}) and a reference to one external |
foo.sml}, and {\tt helper.sml}) and a reference to one external |
87 |
library ({\tt basis.cm}). The entry {\tt basis.cm} typically denotes |
library ({\tt basis.cm}). The entry {\tt basis.cm} typically denotes |
88 |
the description file for the {\it Standard ML Basis |
the description file for the {\it Standard ML Basis |
111 |
Note that these rules do not require the exports of sub-groups or |
Note that these rules do not require the exports of sub-groups or |
112 |
sub-libraries to be distinct from the exports of ML source files. |
sub-libraries to be distinct from the exports of ML source files. |
113 |
Here, the disambiguating rule is that the definition from the ML |
Here, the disambiguating rule is that the definition from the ML |
114 |
source overrides the definition imported from the group or library. |
source takes precedence over the definition imported from the group or |
115 |
|
library. |
116 |
|
|
117 |
The full syntax for library description files also includes provisions |
The full syntax for library description files also includes provisions |
118 |
for a simple ``conditional compilation'' facility (see |
for a simple ``conditional compilation'' facility (see |
124 |
|
|
125 |
In general, all definitions exported from members of a library are |
In general, all definitions exported from members of a library are |
126 |
visible in all ML source files of that library. The source code in |
visible in all ML source files of that library. The source code in |
127 |
those source files can refer to them directly. Here, ``exported'' |
those source files can refer to them directly without further |
128 |
means either a top-level definition within an ML source file or a |
qualification. Here, ``exported'' means either a top-level definition |
129 |
definition listed in a (sub-)library's export list. |
within an ML source file or a definition listed in a (sub-)library's |
130 |
|
export list. |
131 |
|
|
132 |
If a library is structured into library components using {\em groups} |
If a library is structured into library components using {\em groups} |
133 |
(see Section~\ref{sec:groups}), then---as far as name visibility is |
(see Section~\ref{sec:groups}), then---as far as name visibility is |
140 |
\label{sec:groups} |
\label{sec:groups} |
141 |
|
|
142 |
CM's group model eliminates a whole class of potential naming problems |
CM's group model eliminates a whole class of potential naming problems |
143 |
by providing control over name spaces for program linkage. This has |
by providing control over name spaces for program linkage. As has been |
144 |
been described separately~\cite{blume:appel:cm99} but it sometimes |
described separately~\cite{blume:appel:cm99}, it sometimes involves |
145 |
involves the use of ``administrative'' libraries whose sole purpose is |
the use of ``administrative'' libraries whose sole purpose is to |
146 |
to rename certain definitions. |
rename certain definitions. |
147 |
|
|
148 |
However, under CM, ``library'' does not only refer to namespace |
However, under CM, the term ``library'' does not only mean namespace |
149 |
management but often also to an actual file system object. It would |
management but also refers to actual file system objects. It would be |
150 |
be inconvenient if name resolution problems resulted in a |
inconvenient if name resolution problems resulted in a |
151 |
proliferation of additional library files. Therefore, CM also |
proliferation of additional library files. Therefore, CM also |
152 |
provides the notion of groups (or: library components). Name |
provides the notion of groups (or: library components). Name |
153 |
resolution for groups works like name resolution for entire libraries, |
resolution for groups works like name resolution for entire libraries, |
159 |
description files with the following exceptions: |
description files with the following exceptions: |
160 |
|
|
161 |
\begin{itemize} |
\begin{itemize} |
162 |
\item The initial keyword {\tt Library} is replaced with {\tt Group} |
\item The initial keyword {\tt Library} is replaced with {\tt Group}. |
163 |
followed by the name of the surrounding library's description file in |
It is followed by the name of the surrounding library's description |
164 |
parentheses. |
file in parentheses. |
165 |
\item The export list can be left empty, in which case CM will |
\item The export list can be left empty, in which case CM will |
166 |
provide a default export list: all exports from ML source files plus |
provide a default export list: all exports from ML source files plus |
167 |
all exports from sub-components of the component. (Note that this does |
all exports from sub-components of the component. (Note that this does |
209 |
library that it is a component of. However, within that library it |
library that it is a component of. However, within that library it |
210 |
can be referred to from arbitrarily many other groups. |
can be referred to from arbitrarily many other groups. |
211 |
\item The same ML source file cannot appear more than once. If an ML |
\item The same ML source file cannot appear more than once. If an ML |
212 |
source file is to be referred to by multiple clients it must first be |
source file is to be referred to by multiple clients, it must first be |
213 |
``wrapped'' into a library (or---if that's sufficient---a group). |
``wrapped'' into a library (or---if all references are from within the |
214 |
|
same library---a group). |
215 |
\end{itemize} |
\end{itemize} |
216 |
|
|
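\noindent As a minimal sketch of a group description file (the names
{\tt foo-lib.cm}, {\tt Helper}, and {\tt helper.sml} are hypothetical
and serve only as illustration), a component of the library described
by {\tt foo-lib.cm} might look like this:

\begin{verbatim}
Group (foo-lib.cm)
  structure Helper
is
  helper.sml
  basis.cm
\end{verbatim}

Here the export list is given explicitly. Leaving the part between
{\tt Group (foo-lib.cm)} and {\tt is} empty would instead request the
default export list.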
217 |
\subsection{Top-level groups} |
\subsection{Top-level groups} |
220 |
allows groups to appear at top level, i.e., outside of any library. |
allows groups to appear at top level, i.e., outside of any library. |
221 |
Such groups must omit the parenthetical library specification and then |
Such groups must omit the parenthetical library specification and then |
222 |
cannot also be used within libraries. One could think of the top level |
cannot also be used within libraries. One could think of the top level |
223 |
itself as a ``virtual unnamed library''. Top-level groups are then |
itself as a ``virtual unnamed library'' whose components are these |
224 |
components of this virtual library. |
top-level groups. |
225 |
|
|
226 |
\section{Naming objects in the file system} |
\section{Naming objects in the file system} |
227 |
|
|
234 |
The main difficulty lies in the fact that files or even whole |
The main difficulty lies in the fact that files or even whole |
235 |
directories may move after CM has already partially (but not fully) |
directories may move after CM has already partially (but not fully) |
236 |
processed them. For example, this happens when the {\em autoloader} |
processed them. For example, this happens when the {\em autoloader} |
237 |
(see Section~\ref{sec:autoload}) has been used before saving an ML |
(see Section~\ref{sec:autoload}) has been invoked and the session |
238 |
session via {\tt SMLofNJ.exportML}. Under a correct installation, CM |
(including CM's internal state) is then saved via {\tt |
239 |
will now be able to resume such a session even when operating in a |
SMLofNJ.exportML}. CM is now able to resume such a session even when |
240 |
different environment, perhaps on a different machine with different |
operating in a different environment, perhaps on a different machine |
241 |
file system mounts, or a different location of the SML/NJ |
with different file systems mounted, or a different location of the |
242 |
installation. |
SML/NJ installation. |
243 |
|
|
244 |
For this, CM provides a configurable mechanism for locating file |
For this, CM provides a configurable mechanism for locating file |
245 |
system objects. Moreover, it invokes this mechanism as late as |
system objects. Moreover, it invokes this mechanism as late as |
288 |
name in such a way that it remembers the corresponding working |
name in such a way that it remembers the corresponding working |
289 |
directory. Should the working directory change during an ongoing CM |
directory. Should the working directory change during an ongoing CM |
290 |
session while there still is a reference to the name, then CM will |
session while there still is a reference to the name, then CM will |
291 |
switch its mode of operation prepend the path of the original working |
switch its mode of operation and prepend the path of the original |
292 |
directory. As a result, two names specified using identical |
working directory. As a result, two names specified using identical |
293 |
strings but with different working directories in effect will be kept |
strings but at different times when different working directories were |
294 |
distinct and continue to refer to those file system location that they |
in effect will be kept distinct and continue to refer to the file |
295 |
referred to when they were first seen. |
system location that they referred to when they were first seen. |
296 |
\end{itemize} |
\end{itemize} |
297 |
|
|
298 |
\subsection{Anchor configuration} |
\subsection{Anchor configuration} |
306 |
CM.setAnchor}, {\tt CM.cancelAnchor}, and {\tt CM.resetPathConfig} |
CM.setAnchor}, {\tt CM.cancelAnchor}, and {\tt CM.resetPathConfig} |
307 |
(see Section~\ref{sec:api}). |
(see Section~\ref{sec:api}). |
308 |
|
|
309 |
The default location of the installation specific configuration file |
The default location of the installation-specific configuration file |
310 |
is {\tt /usr/lib/smlnj-pathconfig}. However, normally this default |
is {\tt /usr/lib/smlnj-pathconfig}. However, normally this default |
311 |
gets replaced (via an environment variable named {\tt |
gets replaced (via an environment variable named {\tt |
312 |
CM\_PATHCONFIG\_DEFAULT}) at installation time by a path pointing to |
CM\_PATHCONFIG\_DEFAULT}) at installation time by a path pointing to |
351 |
structure named {\tt CM}. This structure itself is exported from a |
structure named {\tt CM}. This structure itself is exported from a |
352 |
library called {\tt host-cm.cm}. Other libraries can exploit CM's |
library called {\tt host-cm.cm}. Other libraries can exploit CM's |
353 |
functionality simply by putting a {\tt host-cm.cm} entry into their |
functionality simply by putting a {\tt host-cm.cm} entry into their |
354 |
own description files. Section~\ref{sec:dynlink} shows one |
own description file. Section~\ref{sec:dynlink} shows one |
355 |
interesting use of this feature. |
interesting use of this feature. |
356 |
|
|
357 |
The library is pre-registered at the interactive prompt which is more |
The {\tt host-cm.cm} library is pre-registered at the interactive |
358 |
than a mere convenience: Structure {\tt CM} must be known a priori at |
prompt. This is more than a mere convenience: structure {\tt CM} must |
359 |
top level because otherwise there would also be no way to make it known. |
already be known a priori at top level because otherwise there would |
360 |
|
also be no way to make it known. |
361 |
|
|
362 |
Here is a description of the structure's members: |
Here is a description of the structure's members: |
363 |
|
|
395 |
\subsubsection*{Linking} |
\subsubsection*{Linking} |
396 |
|
|
397 |
In SML/NJ, linking means executing top-level code of each compilation |
In SML/NJ, linking means executing top-level code of each compilation |
398 |
unit. The resulting bindings can then be bound at the interactive top |
unit. The resulting bindings can then be registered at the interactive top |
399 |
level. |
level. |
400 |
|
|
401 |
\begin{verbatim} |
\begin{verbatim} |
476 |
startup time by the environment variable {\tt CM\_WARN\_OBSOLETE}. |
startup time by the environment variable {\tt CM\_WARN\_OBSOLETE}. |
477 |
|
|
478 |
{\tt CM.debug} can be used to turn on debug mode. This currently has |
{\tt CM.debug} can be used to turn on debug mode. This currently has |
479 |
no effect since there is no debug code in the implementation. The |
the effect of dumping a trace of the master-slave protocol for |
480 |
default is {\em false} and can be overridden at startup time by the |
parallel and distributed compilation (see Section~\ref{sec:parmake}) |
481 |
environment variable {\tt CM\_DEBUG}. |
to {\tt TextIO.stdOut}. The default is {\em false} and can be overridden at |
482 |
|
startup time by the environment variable {\tt CM\_DEBUG}. |
483 |
|
|
484 |
\subsubsection*{Path anchors} |
\subsubsection*{Path anchors} |
485 |
|
|
531 |
|
|
532 |
Some care is necessary as {\tt CM.symval} does not check whether the |
Some care is necessary as {\tt CM.symval} does not check whether the |
533 |
syntax of the argument string is valid. (However, the worst thing |
syntax of the argument string is valid. (However, the worst thing |
534 |
that could happen is that the name of a variable cannot be written out |
that could happen is that there is no way of naming the variable |
535 |
in CM's description files and that, therefore, the associated value |
within CM's description files, and that, therefore, the associated |
536 |
cannot be queried.) |
value cannot be accessed.) |
537 |
|
|
538 |
\subsubsection*{Status inspection} |
\subsubsection*{Status inspection} |
539 |
|
|
546 |
|
|
547 |
{\tt CM.showPending} lists to standard output the names of all symbols |
{\tt CM.showPending} lists to standard output the names of all symbols |
548 |
which are currently registered as being bound at top level via the |
which are currently registered as being bound at top level via the |
549 |
autoloading mechanism and which so far have not actually been |
autoloading mechanism but so far have not actually been resolved. |
|
resolved. |
|
550 |
|
|
551 |
{\tt CM.listLibs} lists to standard output the path names of library |
{\tt CM.listLibs} lists to standard output the path names of library |
552 |
description files for those stable libraries that are currently known |
description files for those stable libraries that are currently known |
589 |
pre-loaded libraries. It may be a useful tool for determining the |
pre-loaded libraries. It may be a useful tool for determining the |
590 |
amount of space taken up by the internal state, though. |
amount of space taken up by the internal state, though. |
591 |
|
|
592 |
|
\subsubsection*{Compile servers} |
593 |
|
|
594 |
|
On Unix-like systems, CM supports parallel compilation. For computers |
595 |
|
connected using a LAN, this can be extended to distributed compilation |
596 |
|
using a network file system and the operating system's ``rsh'' |
597 |
|
facility. For a detailed discussion, see Section~\ref{sec:parmake}. |
598 |
|
|
599 |
|
\begin{verbatim} |
600 |
|
val server_start : |
601 |
|
{ name: string, |
602 |
|
cmd: string * string list, |
603 |
|
pathtrans: (string -> string) option, |
604 |
|
pref: int } -> bool |
605 |
|
val server_stop : string -> unit |
606 |
|
val server_kill : string -> unit |
607 |
|
\end{verbatim} |
608 |
|
|
609 |
|
CM is put into ``parallel'' mode by attaching at least one compile |
610 |
|
server. Compile servers are attached using invocations of {\tt |
611 |
|
CM.server\_start}. The function takes the name of the server (as an |
612 |
|
arbitrary but unique string) ({\tt name}), the Unix command used to |
613 |
|
start the server in a form suitable as an argument to {\tt |
614 |
|
Unix.execute} ({\tt cmd}), an optional ``path transformation |
615 |
|
function'' for converting local absolute path names to remote absolute |
616 |
|
pathnames ({\tt pathtrans}), and a numeric ``preference'' value that |
617 |
|
is used to choose servers at times when more than one is idle ({\tt |
618 |
|
pref}). The boolean result indicates success or failure of the attach |
619 |
|
operation. |
620 |
|
|
621 |
|
An existing server can be shut down and detached using either {\tt |
622 |
|
CM.server\_stop} or {\tt CM.server\_kill}. The string argument in |
623 |
|
either case must match the {\tt name} argument of the corresponding |
624 |
|
{\tt server\_start} call. Function {\tt server\_stop} uses CM's |
625 |
|
master-slave protocol to instruct the server to shut down gracefully. |
626 |
|
Only if this fails should it become necessary to use {\tt server\_kill}, |
627 |
|
which will send a Unix KILL signal to destroy the server. |
628 |
|
|
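\noindent The following sketch shows how a compile server might be
attached and later detached. The server name {\tt "A"} and the
command (here {\tt ./sml} with the argument {\tt @CMslave}) are
assumptions made for illustration; the command that actually starts a
compile server depends on the installation.

\begin{verbatim}
(* attach a compile server named "A"; no path translation *)
val attached =
    CM.server_start { name = "A",
                      cmd = ("./sml", ["@CMslave"]),
                      pathtrans = NONE,
                      pref = 0 };

(* ... later compilations can make use of the server ... *)

(* ask the server to shut down gracefully *)
val _ = CM.server_stop "A";
\end{verbatim}

The value {\tt attached} indicates whether the attach operation
succeeded. The string passed to {\tt CM.server\_stop} is the same
{\tt name} that was given to {\tt CM.server\_start}.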
629 |
\subsection{The autoloader} |
\subsection{The autoloader} |
630 |
\label{sec:autoload} |
\label{sec:autoload} |
631 |
|
|
632 |
From the user's point of view, a call to {\tt CM.autoload} acts very |
From the user's point of view, a call to {\tt CM.autoload} acts very |
633 |
much like the corresponding call ot {\tt CM.make} because the same |
much like the corresponding call to {\tt CM.make} because the same |
634 |
bindings that {\tt CM.make} would introduce into the top-level |
bindings that {\tt CM.make} would introduce into the top-level |
635 |
environment are also introduced by {\tt CM.autoload}. However, most |
environment are also introduced by {\tt CM.autoload}. However, most |
636 |
work will be deferred until some code entered later at the interactive |
work will be deferred until some code that is entered later refers to |
637 |
top level refers to one or more of these bindings. Only then will CM |
one or more of these bindings. Only then will CM go and perform just |
638 |
go and perform just the minimal work necessary to provide the actual |
the minimal work necessary to provide the actual definitions. |
639 |
definitions. |
|
640 |
|
The autoloader plays a central role for the interactive system. |
641 |
In this version of CM the autoloader plays a central role. Unlike |
Unlike in earlier versions, it cannot be turned off since it provides |
642 |
before, it cannot be turned off since it provides many of the standard |
many of the standard pre-defined top-level bindings. |
|
pre-defined top-level bindings in the interactive system. |
|
643 |
|
|
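\noindent For illustration, consider a hypothetical library described
by {\tt foo.cm} that exports a structure {\tt Foo} with a member {\tt
bar}. The sketch assumes that {\tt CM.autoload}, like {\tt CM.make},
takes the name of the description file as a string:

\begin{verbatim}
(* register the exports of foo.cm; most work is deferred *)
val _ = CM.autoload "foo.cm";

(* the first reference to Foo triggers the remaining work *)
val x = Foo.bar;
\end{verbatim}

Only the second input causes CM to provide the actual definition of
{\tt Foo}.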
644 |
The autoloader is a convenient tool for virtually ``loading'' an |
The autoloader is a convenient tool for virtually ``loading'' an |
645 |
entire library without incurring an undue increase in memory |
entire library without incurring an undue increase in memory |
648 |
\subsection{Sharing of state} |
\subsection{Sharing of state} |
649 |
\label{sec:sharing} |
\label{sec:sharing} |
650 |
|
|
651 |
By default, CM tries to let multiple invocations of {\tt CM.make} or |
Whenever it is legal, CM lets multiple invocations of {\tt CM.make} or |
652 |
{\tt CM.autoload} share dynamic state created by link-time effects. |
{\tt CM.autoload} share dynamic state created by link-time effects. |
653 |
Of course, this is not possible if the compilation unit in question |
Of course, sharing is not possible (and hence not ``legal'') if the |
654 |
has recently been recompiled or depends on another compilation unit |
compilation unit in question has recently been recompiled or depends |
655 |
whose code has recently been re-executed. The programmer can |
on another compilation unit whose code has recently been re-executed. |
656 |
explicitly mark certain ML files as {\em shared}, in which case CM |
The programmer can explicitly mark certain ML files as {\em shared}, |
657 |
will issue a warning whenever the unit's code has to be re-executed. |
in which case CM will issue a warning whenever the unit's code has to |
658 |
|
be re-executed. |
659 |
|
|
660 |
State created by compilation units marked as {\em private} is never |
State created by compilation units marked as {\em private} is never |
661 |
shared across multiple calls to {\tt CM.make} or {\tt CM.autoload}. |
shared across multiple calls to {\tt CM.make} or {\tt CM.autoload}. |
778 |
\subsection{CM variables} |
\subsection{CM variables} |
779 |
\label{sec:cmvars} |
\label{sec:cmvars} |
780 |
|
|
781 |
CM provides a number of names that stand for certain integers. The |
CM provides a number of ``variables'' (names that stand for certain |
782 |
exact set of provided variable names depends on SML/NJ version number, |
integers). These variables may appear in expressions of the |
783 |
machine architecture, and operating system. A reference to a CM |
conditional-compilation facility. The exact set of provided variable |
784 |
variable is considered an arithmetic expression. If the variable is |
names depends on SML/NJ version number, machine architecture, and |
785 |
not defined, then it evaluates to 0. The expression {\tt |
operating system. A reference to a CM variable is considered an |
786 |
defined}($v$) is a boolean expression that yields true if and only if |
arithmetic expression. If the variable is not defined, then it |
787 |
$v$ is a defined CM variable. |
evaluates to 0. The expression {\tt defined}($v$) is a boolean |
788 |
|
expression that yields true if and only if $v$ is a defined CM |
789 |
|
variable. |
790 |
|
|
791 |
The names of CM variables are formed starting with a letter followed |
The names of CM variables are formed starting with a letter followed |
792 |
by zero or more occurrences of letters, decimal digits, apostrophes, or |
by zero or more occurrences of letters, decimal digits, apostrophes, or |
853 |
Here, the file {\tt bar-client.sml} gets included if {\tt |
Here, the file {\tt bar-client.sml} gets included if {\tt |
854 |
SMLNJ\_VERSION} is greater than 110 and {\tt new-foo.sml} exports a |
SMLNJ\_VERSION} is greater than 110 and {\tt new-foo.sml} exports a |
855 |
structure {\tt Bar} {\em or} if {\tt SMLNJ\_VERSION <= 110} and {\tt |
structure {\tt Bar} {\em or} if {\tt SMLNJ\_VERSION <= 110} and {\tt |
856 |
old-foo.sml} exports structure {\tt Bar}. \\ Otherwise {\tt |
old-foo.sml} exports structure {\tt Bar}. \\ |
857 |
no-bar-so-far.sml} gets included instead. In addition, the export of |
Otherwise {\tt no-bar-so-far.sml} gets included instead. In addition, |
858 |
structure {\tt Bar} is guarded by its own existence. (Structure {\tt |
the export of structure {\tt Bar} is guarded by its own existence. |
859 |
Bar} could also be defined by {\tt no-bar-so-far.sml} in which case it |
(Structure {\tt Bar} could also be defined by {\tt no-bar-so-far.sml} |
860 |
would get exported regardless of the outcome of the other {\tt |
in which case it would get exported regardless of the outcome of the |
861 |
defined} test.) |
other {\tt defined} test.) |
862 |
|
|
863 |
\subsection{Explicit errors} |
\subsection{Explicit errors} |
864 |
|
|
879 |
\nt{sym} &\ar& \nt{letter} \{\nt{ldau}\} \\ |
\nt{sym} &\ar& \nt{letter} \{\nt{ldau}\} \\ |
880 |
\\ |
\\ |
881 |
\nt{aatom} &\ar& \nt{number} \vb \nt{sym} \vb \tl{(} \nt{asum} \tl{)} \vb \tl{$\tilde{~}$} \nt{aatom} \\ |
\nt{aatom} &\ar& \nt{number} \vb \nt{sym} \vb \tl{(} \nt{asum} \tl{)} \vb \tl{$\tilde{~}$} \nt{aatom} \\ |
882 |
\nt{aprod} &\ar& \{\nt{aprod} (\tl{*} \vb \tl{div} \vb \tl{mod})\} \nt{aatom} \\ |
\nt{aprod} &\ar& \{\nt{aatom} (\tl{*} \vb \tl{div} \vb \tl{mod})\} \nt{aatom} \\ |
883 |
\nt{asum} &\ar& \{\nt{asum} (\tl{+} \vb \tl{-})\} \nt{aprod} \\ |
\nt{asum} &\ar& \{\nt{aprod} (\tl{+} \vb \tl{-})\} \nt{aprod} \\ |
884 |
\\ |
\\ |
885 |
\nt{ns} &\ar& \tl{structure} \vb \tl{signature} \vb \tl{functor} \vb \tl{funsig} \\ |
\nt{ns} &\ar& \tl{structure} \vb \tl{signature} \vb \tl{functor} \vb \tl{funsig} \\ |
886 |
\nt{mlsym} &\ar& {\em a Standard ML identifier} \\ |
\nt{mlsym} &\ar& {\em a Standard ML identifier} \\ |
891 |
\nt{batom} &\ar& \nt{query} \vb \nt{acmp} \vb \tl{not} \nt{batom} \vb \tl{(} \nt{bdisj} \tl{)} \\ |
\nt{batom} &\ar& \nt{query} \vb \nt{acmp} \vb \tl{not} \nt{batom} \vb \tl{(} \nt{bdisj} \tl{)} \\ |
892 |
\nt{bcmp} &\ar& \nt{batom} [(\ttl{=} \vb \ttl{<>}) \nt{batom}] \\ |
\nt{bcmp} &\ar& \nt{batom} [(\ttl{=} \vb \ttl{<>}) \nt{batom}] \\ |
893 |
\nt{bconj} &\ar& \{\nt{bcmp} \tl{andalso}\} \nt{bcmp} \\ |
\nt{bconj} &\ar& \{\nt{bcmp} \tl{andalso}\} \nt{bcmp} \\ |
894 |
\nt{bdisj} &\ar& \{\nt{bconj} \tl{orelse}\} \nt{bdisj} \\ |
\nt{bdisj} &\ar& \{\nt{bconj} \tl{orelse}\} \nt{bconj} \\ |
895 |
\\ |
\\ |
896 |
\nt{expression} &\ar& \nt{bdisj} |
\nt{expression} &\ar& \nt{bdisj} |
897 |
\end{tabular} |
\end{tabular} |
900 |
\label{sec:access} |
\label{sec:access} |
901 |
|
|
902 |
The basic idea behind CM's access control is the following: In their |
The basic idea behind CM's access control is the following: In their |
903 |
description files groups and libraries can specify a list of |
description files, groups and libraries can specify a list of |
904 |
{\em privileges} that the client must have in order to be able to use it. |
{\em privileges} that the client must have in order to be able to use them. |
905 |
Privileges at this level are just names (strings) and must be written |
Privileges at this level are just names (strings) and must be written |
906 |
in front of the initial keyword {\tt Library} or {\tt Group}. If one |
in front of the initial keyword {\tt Library} or {\tt Group}. If one |
907 |
group or library imports from another group or library, then |
group or library imports from another group or library, then |
910 |
for all its libraries, sub-libraries and library components, |
for all its libraries, sub-libraries and library components, |
911 |
components of sub-libraries, and so on. |
components of sub-libraries, and so on. |
912 |
|
|
913 |
Of course, this alone would not yet be satisfactory because there |
Of course, this alone would not yet be satisfactory. The main service |
914 |
should also be the possibility of setting up a ``safety wall:'' a |
of the access control system is that it can let a client use an |
915 |
library {\tt LSafe.cm} could ``wrap'' all the unsafe operations in |
``unsafe'' library ``safely''. For example, a library {\tt LSafe.cm} |
916 |
{\tt LUnsafe.cm} with enough error checking that they become safe. |
could ``wrap'' all the unsafe operations in {\tt LUnsafe.cm} with |
917 |
Therefore, a user of {\tt LSafe.cm} should not also be required to |
enough error checking that they become safe. Therefore, a user of |
918 |
possess the privileges that would be required if one were to use {\tt |
{\tt LSafe.cm} should not also be required to possess the privileges |
919 |
LUnsafe.cm} directly. |
that would be required if one were to use {\tt LUnsafe.cm} directly. |
920 |
|
|
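\noindent As a sketch, assume a hypothetical privilege named {\tt
unsafe} and a hypothetical source file {\tt lunsafe.sml}. The
description file {\tt LUnsafe.cm} could then state the required
privilege in front of the keyword {\tt Library}:

\begin{verbatim}
unsafe
Library
  structure LUnsafe
is
  lunsafe.sml
\end{verbatim}

A client that uses {\tt LUnsafe.cm} directly would need to possess
the privilege {\tt unsafe}.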
921 |
In CM's access control model it is possible for a library to ``wrap'' |
In CM's access control model it is possible for a library to ``wrap'' |
922 |
privileges. If a privilege $P$ has been wrapped, then the user of the |
privileges. If a privilege $P$ has been wrapped, then the user of the |
969 |
(see Section~\ref{sec:access}), i.e., a privilege that goes by the |
(see Section~\ref{sec:access}), i.e., a privilege that goes by the |
970 |
same name as the module itself. |
same name as the module itself. |
971 |
|
|
972 |
Currently, the following primitive module names are known: {\tt |
User programs are strongly discouraged from accessing primitive modules |
973 |
built-in}, {\tt print-hook}, {\tt use-hook}, {\tt exn-info-hook}, {\tt |
directly. The exact set of primitive modules known to CM is subject to |
974 |
core}, {\tt init-utils}. |
change without notice. |
975 |
|
|
976 |
\section{Files} |
\section{Files} |
977 |
|
|
988 |
\item {\it Binfiles} are the SML/NJ equivalent of object files. They |
\item {\it Binfiles} are the SML/NJ equivalent of object files. They |
989 |
contain executable code and a symbol table for the associated ML |
contain executable code and a symbol table for the associated ML |
990 |
source file. |
source file. |
991 |
\item {\it Library files} (sometimes called: {\em stablefiles}) |
\item {\it Library files} (sometimes called: {\em stablefiles}) contain |
992 |
dependency graph, executable code, and symbol tables for an entire CM |
the dependency graph, executable code, and symbol tables for an entire CM |
993 |
library including all of its components (groups). |
library including all of its components (groups). |
994 |
\end{enumerate} |
\end{enumerate} |
1068 |
sml = true } |
sml = true } |
1069 |
\end{verbatim} |
\end{verbatim} |
1070 |
|
|
1071 |
This code can either by packaged as a CM library or entered at the |
This code can either be packaged as a CM library or entered at the |
1072 |
interactive top level after loading the {\tt cm-tools.cm} library |
interactive top level after loading the {\tt cm-tools.cm} library |
1073 |
(e.g., via {\tt CM.autoload}). |
(e.g., via {\tt CM.autoload}). |
1074 |
|
|
1075 |
The call to {\tt Tools.newCmdGetterSetter} makes a `command |
The call to {\tt Tools.newCmdGetterSetter} makes a ``command |
1076 |
getter-setter'' which is a value of type {\tt \{ get: unit -> string, |
getter-setter'' (a value of type {\tt \{ get: unit -> string, |
1077 |
set: string -> unit \} }. It can be invoked to query or set the |
set: string -> unit \} }). It can be used to query or set the |
1078 |
command string for the tool. Here, the default string is {\tt |
command string for the tool. Here, the default string is {\tt |
1079 |
new-ml-yacc} and can be customized at startup time using the |
new-ml-yacc} and can be customized at startup time using the |
1080 |
environment variable {\tt CM\_NYACC}. |
environment variable {\tt CM\_NYACC}. |
1114 |
Less common kinds of rules can also be defined using the generic |
Less common kinds of rules can also be defined using the generic |
1115 |
interface {\tt Tools.registerClass}. |
interface {\tt Tools.registerClass}. |
1116 |
|
|
1117 |
|
\section{Parallel and distributed compilation} |
1118 |
|
\label{sec:parmake} |
1119 |
|
|
1120 |
|
To speed up recompilation of large projects with many ML source files, |
1121 |
|
CM can exploit parallelism that is inherent in the dependency graph. |
1122 |
|
Currently, the only kind of operating system for which this is |
1123 |
|
implemented is Unix ({\tt OPSYS\_UNIX}), where separate processes are |
1124 |
|
used. From there, one can distribute the work across a network of |
1125 |
|
machines by taking advantage of the network file system and the |
1126 |
|
``rsh'' facility. |
1127 |
|
|
1128 |
|
To perform parallel compilations, one must attach ``compile servers'' to |
1129 |
|
CM. This is done using the function {\tt CM.server\_start} with the following |
1130 |
|
signature: |
1131 |
|
|
1132 |
|
\begin{verbatim} |
1133 |
|
val server_start : |
1134 |
|
{ name: string, |
1135 |
|
cmd: string * string list, |
1136 |
|
pathtrans: (string -> string) option, |
1137 |
|
pref: int } -> bool |
1138 |
|
\end{verbatim} |
1139 |
|
|
1140 |
|
|
1141 |
|
Here, {\tt name} is a string uniquely identifying the server and {\tt |
1142 |
|
cmd} is a value suitable as argument to {\tt Unix.execute}. |
1143 |
|
|
1144 |
|
The program to be specified by {\tt cmd} should be another instance of |
1145 |
|
CM---running in ``slave mode''. To start CM in slave mode, start {\tt |
1146 |
|
sml} with a single command-line argument of {\tt @CMslave}. For |
1147 |
|
example, if {\tt sml} is installed as {\tt /path/to/smlnj/bin/sml}, then a |
1148 |
|
server process on the local machine could be started by |
1149 |
|
|
1150 |
|
\begin{verbatim} |
1151 |
|
CM.server_start { name = "A", pathtrans = NONE, pref = 0, |
1152 |
|
cmd = ("/path/to/smlnj/bin/sml", |
1153 |
|
["@CMslave"]) }; |
1154 |
|
\end{verbatim} |
1155 |
|
|
1156 |
|
To run a process on a remote machine, e.g., ``thatmachine'', as a |
1157 |
|
compute server, one can use ``rsh''. Unfortunately, at the moment it |
1158 |
|
is necessary to specify the full path to ``rsh'' because {\tt |
1159 |
|
Unix.execute} (and therefore {\tt CM.server\_start}) does not perform |
1160 |
|
a {\tt PATH} search. The remote machine |
1161 |
|
must share the file system with the local machine, for example via NFS. |
1162 |
|
|
1163 |
|
\begin{verbatim} |
1164 |
|
CM.server_start { name = "thatmachine", |
1165 |
|
pathtrans = NONE, pref = 0, |
1166 |
|
cmd = ("/usr/ucb/rsh", |
1167 |
|
["thatmachine", |
1168 |
|
"/path/to/smlnj/bin/sml", |
1169 |
|
"@CMslave"]) }; |
1170 |
|
\end{verbatim} |
1171 |
|
|
1172 |
|
You can start as many servers as you want, but they all must have |
1173 |
|
different names. If you attach any servers at all, then you should |
1174 |
|
attach at least two (unless you want to attach one that runs on a |
1175 |
|
machine vastly more powerful than your local one). Local servers make |
1176 |
|
sense on multi-CPU machines: start as many servers as there are CPUs. |
1177 |
|
(Be careful, though. Since there is no memory-sharing to speak of |
1178 |
|
between separate instances of {\tt sml}, you should be sure to check |
1179 |
|
that your machine has enough main memory.) |
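
As a minimal sketch of the ``one server per CPU'' advice, the following
helper attaches {\tt n} local servers with distinct names. The function
name {\tt startLocal} and the naming scheme {\tt local0}, {\tt local1},
\ldots{} are illustrative assumptions and not part of CM; the record
fields are those of {\tt CM.server\_start} as given above.

\begin{verbatim}
(* startLocal is a hypothetical helper, not part of CM.
 * It attaches n local compile servers named local0 .. local(n-1)
 * and returns the list of results of CM.server_start. *)
fun startLocal (n : int) : bool list =
    List.tabulate (n, fn i =>
      CM.server_start { name = "local" ^ Int.toString i,
                        pathtrans = NONE, pref = 0,
                        cmd = ("/path/to/smlnj/bin/sml",
                               ["@CMslave"]) })

(* e.g., on a machine assumed to have four CPUs *)
val results = startLocal 4;
\end{verbatim}

Generating the names from the index keeps them unique, which is what
the rule about different server names requires.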
1180 |
|
|
1181 |
|
If servers on machines of different power are attached, one can give |
1182 |
|
some preference to faster ones by setting the {\tt pref} value higher. |
1183 |
|
(But since the {\tt pref} value is consulted only in the rare case |
1184 |
|
that more than one server is idle, this will rarely lead to vastly |
1185 |
|
better throughput.) All attached servers must use the same |
1186 |
|
architecture-OS combination as the controlling machine. |
1187 |
|
|
1188 |
|
The {\tt pathtrans} argument is used when connecting to a machine with |
1189 |
|
a different file-system layout. |
1190 |
|
For local servers, it can safely be left at {\tt NONE}. The ``path |
1191 |
|
transformation'' function is used to translate local absolute path |
1192 |
|
names to their remote counterparts. |
1193 |
|
This can be a bit tricky to get right, especially if the |
1194 |
|
machines use automounters or similar devices. |
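
The following sketch shows what a simple path transformation might
look like. It assumes, purely for illustration, that the local
directory {\tt /home/me/} is visible on the remote machine as
{\tt /net/thismachine/home/me/}; both prefixes and the name
{\tt toRemote} are hypothetical and must be adapted to the actual
file-system layout.

\begin{verbatim}
(* Hypothetical prefixes; adjust to the real mount points. *)
val localPrefix  = "/home/me/"
val remotePrefix = "/net/thismachine/home/me/"

(* Rewrite local absolute paths that start with localPrefix;
 * leave all other paths unchanged. *)
fun toRemote (p : string) : string =
    if String.isPrefix localPrefix p
    then remotePrefix ^ String.extract (p, size localPrefix, NONE)
    else p

val attached =
    CM.server_start { name = "thatmachine",
                      pathtrans = SOME toRemote, pref = 0,
                      cmd = ("/usr/ucb/rsh",
                             ["thatmachine",
                              "/path/to/smlnj/bin/sml",
                              "@CMslave"]) };
\end{verbatim}

A plain prefix substitution like this is only adequate when the two
layouts differ by a fixed mount point; automounters may call for a
more elaborate mapping.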
1195 |
|
|
1196 |
|
Once servers have been attached, one can invoke functions like |
1197 |
|
{\tt CM.recomp}, {\tt CM.make}, and {\tt CM.stabilize}. They should |
1198 |
|
work the way they always do, but during compilation they will take |
1199 |
|
advantage of parallelism. |
1200 |
|
|
1201 |
|
When CM is interrupted using Control-C (or such), one will sometimes |
1202 |
|
experience a certain delay if servers are currently attached and busy. |
1203 |
|
This is because the interrupt-handling code will wait for the servers |
1204 |
|
to finish what they are currently doing and bring them back to an |
1205 |
|
``idle'' state first. |
1206 |
|
|
1207 |
|
\subsection*{Parallel bootstrap compilation} |
1208 |
|
|
1209 |
|
The bootstrap compiler\footnote{otherwise not mentioned in this |
1210 |
|
document} with its functions {\tt CMB.make}, {\tt CMB.deliver}, and |
1211 |
|
the corresponding cross-compilation variants of the bootstrap compiler |
1212 |
|
will also use any attached compile servers. If one intends to |
1213 |
|
exclusively use the bootstrap compiler, one can even attach servers |
1214 |
|
that run on machines with different architecture or operating system. |
1215 |
|
|
1216 |
|
Since the master-slave protocol is fairly simple, it cannot handle |
1217 |
|
complicated scenarios such as the one necessary for setting up the |
1218 |
|
initial (pervasive) environment during {\tt CMB.make}. Therefore, |
1219 |
|
this will always be done locally by the master process. |
1220 |
|
|
1221 |
\section{Example: Dynamic linking} |
\section{Example: Dynamic linking} |
1222 |
\label{sec:dynlink} |
\label{sec:dynlink} |
1223 |
|
|
1224 |
Autoloading is convenient and avoids wasted memory for modules that |
Autoloading is convenient and avoids wasted memory for modules that |
1225 |
have not been mentioned yet. However, sometimes one wants to be more |
should be available at the interactive prompt but have not actually |
1226 |
|
been used so far. However, sometimes one wants to be even more |
1227 |
aggressive and save the space needed for a function until---at |
aggressive and save the space needed for a function until---at |
1228 |
runtime---that function is actually being dynamically invoked. |
runtime---that function is actually being dynamically invoked. |
1229 |
|
|