[smlnj] Annotation of /sml/trunk/src/cm/Overview
Revision 338

1 : blume 338 OVERVIEW
2 :     --------
3 :    
4 :     The operation of CM is best understood by looking at its most
5 :     central data structure: the dependency graph. You can find the
6 :     definitions of its associated types in depend/graph.sml. There is
7 :     also a coarse-grained "group graph" data structure. Its definition is
8 :     in depend/ggraph.sml.
9 :    
10 :     One can roughly divide CM into front-end and back-end. It is the
11 :     front-end's responsibility to establish the dependency graph for a
12 :     given project. The back-end implements various ways of traversing the
13 :     graph, thereby performing the operations that the user expects:
14 :     consistency checking, recompilation, linking, stabilization,
15 :     generation of listings or other statistics, etc.
16 :    
17 :     The central component of the front-end is the parser. It builds the
18 :     dependency graph incrementally with help from the dependency analyzer.
19 :    
20 :     * Analysis MUST be performed incrementally because the parser
21 :     occasionally wants to know what the exported symbols of sub-groups and
22 :     sub-libraries are. (This is required for the parser's conditional
23 :     compilation facility.)
24 :    
25 :     * Analysis CAN be performed incrementally because the sub-graphs that
26 :     correspond to sub-groups or sub-libraries are independent of how they
27 :     are being used.
28 :    
29 :     The dependency analyzer must inspect the ML source code of the
30 :     project. Within CM, handling of ML source code is centralized -- all
31 :     information pertaining to one ML source file is bundled as an abstract
32 :     data type (SmlInfo.info). You can find the definition (and the
33 :     implementation) of that type in smlfile/smlinfo.sml. In particular,
34 :     one important optimization that saves many repeated invocations of
35 :     the compiler's parser is to strip the ML abstract syntax tree of all
36 :     information that CM does not need and to store the "compressed"
37 :     version in some sort of cache. I call such compressed ML
38 :     syntax information a "skeleton". You can find the definition of the
39 :     skeleton type in smlfile/skeleton.sml. Associated code is in the same
40 :     directory.
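The caching idea described above can be sketched as follows. This is a toy Python model, not CM's SML code: the class name SkeletonCache, the parse and compress callbacks, and the use of a timestamp as the validity check are all assumptions made for illustration.

```python
class SkeletonCache:
    """Toy model: remember one compressed "skeleton" per source file."""

    def __init__(self, parse, compress):
        self._parse = parse          # expensive: source text -> full syntax tree
        self._compress = compress    # cheap: full tree -> skeleton
        self._entries = {}           # path -> (timestamp, skeleton)
        self.parse_calls = 0         # counts invocations of the expensive parser

    def skeleton(self, path, timestamp, read_source):
        """Return the skeleton for `path`, parsing only if the cached entry
        is missing or was made from a different timestamp."""
        entry = self._entries.get(path)
        if entry is not None and entry[0] == timestamp:
            return entry[1]
        self.parse_calls += 1
        skel = self._compress(self._parse(read_source()))
        self._entries[path] = (timestamp, skel)
        return skel


# Stand-ins for a real parser: a "tree" is a list of (kind, name, body)
# triples, and the skeleton keeps only the (kind, name) pairs.
def toy_parse(text):
    return [tuple(line.split(" ", 2)) for line in text.splitlines() if line]


def toy_compress(tree):
    return tuple((kind, name) for kind, name, _body in tree)


cache = SkeletonCache(toy_parse, toy_compress)
source = "structure A = struct val x = 1 end\nfunctor F (X : S) = struct end"
first = cache.skeleton("a.sml", 100, lambda: source)
second = cache.skeleton("a.sml", 100, lambda: source)   # served from the cache
third = cache.skeleton("a.sml", 101, lambda: source)    # timestamp changed: reparse
```

The point of the sketch is only that the expensive parser runs once per version of a file, while every later request is answered from the much smaller skeleton.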
41 :    
42 :     The dependency analyzer operates on skeletons. Its implementation can
43 :     be found in depend/build.sml.
44 :    
45 :    
46 :     PRIVILEGES (access control)
47 :     ---------------------------
48 :    
49 :     The basic mechanisms for access control are implemented: CM can
50 :     correctly detect which "privileges" would be required to access groups
51 :     and libraries. However, nothing has been done to actually enforce
52 :     anything. In other words, everybody is assumed to have every possible
53 :     privilege. CM merely reports which privileges "would have been
54 :     required". For the time being this is not really critical.
55 :     Enforcement must be tied into some form of OS-specific enforcement
56 :     mechanism (such as Unix file permissions or something similar), and I
57 :     haven't really thought out a way of doing this nicely and cleanly.
58 :    
59 :     The basic idea behind CM's "privileges" is quite easy to understand.
60 :     In their description files, groups and libraries can specify a list of
61 :     privileges that the user of such a group/library must possess in order
62 :     to be able to use it. Privileges at this level are just names
63 :     (strings). If one group/library imports from another group/library,
64 :     then privileges are inherited. In effect, to be able to use a
65 :     program, one must have all privileges for all its libraries/groups,
66 :     sub-libraries/groups, sub-sub-libraries/groups, etc.
67 :    
68 :     Of course, this is not yet satisfactory because there should also be
69 :     the possibility of setting up a "safety wall": a library LSafe.cm
70 :     could "wrap" all the unsafe operations in LUnsafe.cm with enough error
71 :     checking that they become safe. Therefore, a user of LSafe.cm should
72 :     not also be required to possess the privileges that would be required
73 :     if one were to use LUnsafe.cm directly.
74 :    
75 :     To this end, in CM's model of privileges it is possible for a
76 :     group/library to "wrap" privileges. If a privilege P is wrapped, then
77 :     the user of the library does not need to have privilege P even though
78 :     the library is using another library that requires privilege P. In
79 :     essence, the library acts as a "proxy" that provides the necessary
80 :     privilege P to the sub-library.
81 :    
82 :     Of course, not everybody can be allowed to establish a library with
83 :     such a "wrapped" privilege P. The programmer who does that should at
84 :     least herself have privilege P (but perhaps better, she should have
85 :     "permission to wrap P" -- a stronger requirement).
86 :    
87 :     In CM, wrapping a privilege is done by specifying the name of that
88 :     privilege within parentheses. The wrapping becomes active once the
89 :     library gets "stabilized" (see below). The (not yet implemented)
90 :     enforcement mechanism must ensure that anyone who stabilizes a library
91 :     that wraps P has permission to wrap P.
92 :     (In CM's source code and comments, "wrapped" privileges are referred
93 :     to as "granted" privileges -- which doesn't quite seem to capture the
94 :     actual meaning.)
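The inheritance and wrapping rules of this section can be modelled in a few lines. The following Python sketch is an illustration under assumptions, not CM's implementation: the dictionary layout, the field names, the privilege names "unsafe" and "net", and the client "app.cm" are all hypothetical. Only LSafe.cm and LUnsafe.cm come from the text above.

```python
def required_privileges(name, descriptions):
    """Privileges a user needs in order to use the group/library `name`.

    `descriptions` maps a name to a dict with four hypothetical fields:
      "requires": privileges listed directly in the description file
      "wraps":    privileges listed in parentheses (wrapped)
      "imports":  names of sub-groups/libraries
      "stable":   whether the group/library has been stabilized
    """
    d = descriptions[name]
    inherited = set()
    for sub in d["imports"]:
        inherited |= required_privileges(sub, descriptions)
    # Wrapping only takes effect once the library is stabilized.
    if d["stable"]:
        inherited -= set(d["wraps"])
    return set(d["requires"]) | inherited


descriptions = {
    "LUnsafe.cm": {"requires": {"unsafe"}, "wraps": set(),
                   "imports": [], "stable": True},
    "LSafe.cm": {"requires": set(), "wraps": {"unsafe"},
                 "imports": ["LUnsafe.cm"], "stable": True},
    "app.cm": {"requires": {"net"}, "wraps": set(),
               "imports": ["LSafe.cm"], "stable": False},
}

needed = required_privileges("app.cm", descriptions)
```

Because the stabilized LSafe.cm wraps "unsafe", a user of app.cm needs only the privilege that app.cm lists itself. If LSafe.cm were not stabilized, "unsafe" would propagate upward as well.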
95 :    
96 :    
97 :     STABILIZATION
98 :     -------------
99 :    
100 :     Aside from the issues concerning privileges, stabilization is a way of
101 :     putting an entire pre-compiled library -- together with its
102 :     pre-computed dependency graph -- into one single container. Once this
103 :     is done, CM will never need to have access to the original ML source
104 :     code. Before actually consulting the description file for a
105 :     group/library, the parser will always check and see if there is a
106 :     stable container. If so, it will suck the dependency graph out of the
107 :     container and be done.
108 :    
109 :     Because of ML's "open" feature, it sometimes is necessary for the
110 :     dependency analyzer of a group to consult the contents (i.e., the
111 :     definitions) of signatures, structures, or functors that are imported
112 :     from sub-groups/libraries. Since the pre-computed dependency graph
113 :     does not contain such information, it will then become necessary to
114 :     recover it in a different way.
115 :    
116 :     Remember, the ML source code shouldn't have to be available at this
117 :     point. However, the same information is contained in the static
118 :     environment that is stored in every "binfile". (The binfile is the
119 :     result of compiling one ML source file. It contains executable code
120 :     and a pickled representation of the static environment that is
121 :     exported from the compilation unit.) Aside from the dependency graph,
122 :     the container for a stabilized group/library also stores all the
123 :     associated binfiles.
124 :    
125 :     Loading (stable) binfiles for the purpose of dependency analysis is
126 :     sometimes necessary, but since it is expensive we do it as seldom as
127 :     we can (i.e., lazily). The implementation of this mechanism (which is
128 :     really just a hook into the actual implementation provided by
129 :     GenericVC) is in depend/se2dae.sml. (See the comments there.) It is
130 :     used in stable/stabilize.sml. (Look for "cvtMemo"!)
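The lazy loading described above follows the usual memoization pattern: delay the expensive step until someone asks for the result, then remember it. The Python sketch below shows only that pattern. It is not the code behind "cvtMemo", and the names Memo, load_static_env, and "a.bin" are made up for the example.

```python
class Memo:
    """Delay an expensive computation until first use, then remember it."""

    _UNSET = object()   # sentinel, so that None is a legal computed value

    def __init__(self, compute):
        self._compute = compute
        self._value = Memo._UNSET

    def get(self):
        if self._value is Memo._UNSET:
            self._value = self._compute()
            self._compute = None     # drop the closure once it has run
        return self._value


loads = []   # records each (simulated) binfile load


def load_static_env(binfile):
    loads.append(binfile)
    return {"structures": ["A", "B"]}   # stand-in for an unpickled environment


# Creating the memo loads nothing; the first env.get() triggers the load,
# and later calls reuse the stored result.
env = Memo(lambda: load_static_env("a.bin"))
```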
131 :    
132 :     Information pertaining to members of stabilized groups/libraries is
133 :     managed by the abstract datatype BinInfo.info (see
134 :     stable/bininfo.sml). In some sense, BinInfo.info is to stabilized ML
135 :     code what SmlInfo.info is to not-yet-stabilized ML code.
136 :    
137 :    
138 :     DEPENDENCY GRAPH
139 :     ----------------
140 :    
141 :     The division into non-stabilized and stabilized groups/libraries is
142 :     clearly visible in the definition of the types that make up dependency
143 :     graphs. There are "BNODE"s that mention BinInfo.info and there are
144 :     "SNODE"s that mention SmlInfo.info. (There are also "PNODE"s that
145 :     facilitate access to "primitive" internal environments that have to do
146 :     with bootstrapping.)
147 :    
148 :     You will notice that one can never go from a BNODE to an SNODE. This
149 :     mirrors our intention that a subgroup of a stabilized group must also
150 :     be stabilized. From SNODEs, on the other hand, you can either go to
151 :     other SNODEs or to BNODEs. All the "localimports" of an SNODE (i.e.,
152 :     the imports that come from the same group) are also SNODEs. To go to
153 :     a BNODE one must look into the list of "globalimport"s. Global
154 :     imports refer to "far" nodes -- nodes that are within other groups.
155 :     The edge that goes to such a node can have an export filter attached.
156 :     Therefore, a farbnode is a bnode with an optional filter; a farsbnode
157 :     is either a BNODE or an SNODE with an optional filter attached.
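The shape of these node types can be approximated with Python dataclasses. This is a rough model for orientation only. The authoritative definitions are the SML types in depend/graph.sml, and the field names below, the field layout of BNode, and the use of plain strings in place of BinInfo.info and SmlInfo.info are assumptions. PNODEs are left out.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional, Union

Filter = Optional[FrozenSet[str]]    # None means "no export filter"


@dataclass
class BNode:                         # member of a stabilized group/library
    bininfo: str                     # stand-in for BinInfo.info
    local_imports: List["BNode"] = field(default_factory=list)
    global_imports: List["FarBNode"] = field(default_factory=list)


@dataclass
class FarBNode:                      # a BNode plus an optional filter
    filter: Filter
    node: BNode


@dataclass
class SNode:                         # member of a not-yet-stabilized group
    smlinfo: str                     # stand-in for SmlInfo.info
    local_imports: List["SNode"] = field(default_factory=list)
    global_imports: List["FarSBNode"] = field(default_factory=list)


@dataclass
class FarSBNode:                     # a BNode or SNode plus an optional filter
    filter: Filter
    node: Union[BNode, SNode]


def reachable(node):
    """All nodes reachable from `node` via local and global imports."""
    seen, stack = [], [node]
    while stack:
        n = stack.pop()
        if any(n is s for s in seen):
            continue
        seen.append(n)
        stack.extend(n.local_imports)
        stack.extend(far.node for far in n.global_imports)
    return seen


b1 = BNode("lib/a.bin")
b2 = BNode("lib/b.bin", local_imports=[b1])
s1 = SNode("x.sml")
s2 = SNode("y.sml", local_imports=[s1],
           global_imports=[FarSBNode(frozenset({"A"}), b2)])
```

In this model the type annotations express the rule that a BNode only ever refers to other BNodes, so everything reachable from b2 is stable, while s2 reaches nodes of both kinds.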
158 :    
159 :     Imports and exports of a group are represented by "impexp"s. Impexps
160 :     are essentially just farsbnodes, but they also contain the dependency
161 :     analyzer's "analysis environment", which contains information about the
162 :     actual definition (contents) of exported structures/functors. As noted
163 :     earlier, this is necessary to handle the "open" construct of ML.
164 :    
165 :     The exports of a group are then simply a mapping from exported symbols
166 :     to corresponding impexps. (See depend/ggraph.sml.)
167 :    
168 :    
169 :     RECOMPILATION AND EXECUTION
170 :     ---------------------------
171 :    
172 :     There is a generic traversal routine that is used to implement both
173 :     recompilation traversals and execution (link-) traversals
174 :     (compile/generic.sml). Which kind of traversal is implemented is
175 :     determined by the functor argument: the "compilation type".
176 :     A signature describing compilation types abstractly is in
177 :     compile/compile-type.sml. In essence, it provides compilation
178 :     environments and associated operations abstractly.
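The idea of one traversal routine specialized by a "compilation type" can be illustrated outside of ML. In the Python sketch below, a function that takes an object plays the role of the functor. The single operation name step, the classes Recomp and Exec, and the dictionary representation of the graph are assumptions for the example and do not mirror the signature in compile/compile-type.sml.

```python
def make_traversal(ct):
    """Build a traversal from a "compilation type" `ct`.

    `ct` must supply one hypothetical operation:
      ct.step(node, import_envs) -- the environment that results from
                                    handling `node`, given its imports' results
    """
    def traverse(graph, root):
        memo = {}                    # each node is handled at most once

        def visit(node):
            if node not in memo:
                envs = [visit(dep) for dep in graph[node]]
                memo[node] = ct.step(node, envs)
            return memo[node]

        return visit(root)

    return traverse


class Recomp:
    """Compilation-like instance: records the order in which nodes are handled."""

    def __init__(self):
        self.compiled = []

    def step(self, node, import_envs):
        self.compiled.append(node)
        return "static env of " + node


class Exec:
    """Execution-like instance: runs a node's code on its imports' values."""

    def __init__(self, code):
        self.code = code

    def step(self, node, import_envs):
        return self.code[node](*import_envs)


graph = {"main": ["util", "base"], "util": ["base"], "base": []}
code = {"base": lambda: 1,
        "util": lambda b: b + 1,
        "main": lambda u, b: u * 10 + b}

recomp = Recomp()
make_traversal(recomp)(graph, "main")
result = make_traversal(Exec(code))(graph, "main")
```

Both runs share the same walking and memoization logic. Only the meaning of an "environment" and of handling one node differs between the two instances.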
179 :    
180 :     Concrete instantiations of this signature are in compile/recomp.sml
181 :     and in compile/exec.sml. As you will see, these are also implemented
182 :     as functors parameterized by an abstraction of "persistent state".
183 :     Persistent state is used to remember the results of traversals from
184 :     invocation to invocation of CM. This avoids needless recompilation in
185 :     the case of recomp.sml and facilitates sharing of dynamic values in
186 :     the case of exec.sml. (However, the two cases are otherwise quite
187 :     dissimilar.)
188 :    
189 :     Persistent state comes in two varieties: "recomp" and "full". Full
190 :     state is actually an extension of recomp state and can also be used
191 :     where recomp state is expected. The "normal" CM uses full state
192 :     because it implements both recompilation and execution. The same
193 :     state is passed to both ExecFn and RecompFn, so it will be properly
194 :     shared by recompilation and execution traversals. In the case of the
195 :     bootstrap compiler, however, we never actually execute the code that
196 :     comes out of the compiler. (The code will be executed by the runtime
197 :     system when bootstrapping.) Therefore, for the bootstrap compiler we
198 :     don't use full state but simply recomp state. (If we cross-compile
199 :     for a different architecture, we cannot possibly execute the code
200 :     anyway.)
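The relationship between the two varieties of persistent state can be pictured as one type extending the other. The Python sketch below uses inheritance to model "full state is an extension of recomp state". The class names, fields, and the recompile and execute helpers are illustrative assumptions, not the structures used by RecompFn and ExecFn.

```python
class RecompState:
    """What recompilation traversals remember between invocations."""

    def __init__(self):
        self.compiled = {}      # source name -> stand-in for a compiled result


class FullState(RecompState):
    """Recomp state plus what execution traversals remember."""

    def __init__(self):
        super().__init__()
        self.dynamic = {}       # source name -> stand-in for a dynamic value


def recompile(state, name):
    """Accepts RecompState or FullState; compiles `name` at most once."""
    if name not in state.compiled:
        state.compiled[name] = "code for " + name
    return state.compiled[name]


def execute(state, name):
    """Needs FullState; runs `name` at most once, reusing compiled code."""
    if name not in state.dynamic:
        state.dynamic[name] = "value of " + recompile(state, name)
    return state.dynamic[name]


full = FullState()              # the "normal" case: compile and execute
value = execute(full, "a.sml")

boot = RecompState()            # the bootstrap case: compile only
recompile(boot, "a.sml")
```

Passing the same FullState object to both helpers is what lets execution reuse the results of recompilation, while the bootstrap case gets by with the smaller state.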
