OVERVIEW
--------

The operation of CM is best understood by looking at its most
central data structure: the dependency graph. You can find the
definitions of its associated types in depend/graph.sml. There is
also a coarse-grained "group graph" data structure. Its definition is
in depend/ggraph.sml.

One can roughly divide CM into a front-end and a back-end. It is the
front-end's responsibility to establish the dependency graph for a
given project. The back-end implements various ways of traversing the
graph, thereby performing the operations that the user expects:
consistency checking, recompilation, linking, stabilization,
generation of listings or other statistics, etc.

The central component of the front-end is the parser. It builds the
dependency graph incrementally with help from the dependency analyzer.

* Analysis MUST be performed incrementally because the parser
occasionally wants to know what the exported symbols of sub-groups and
sub-libraries are. (This is required for the parser's conditional
compilation facility.)

* Analysis CAN be performed incrementally because the sub-graphs that
correspond to sub-groups or sub-libraries are independent of how they
are being used.

The dependency analyzer must inspect the ML source code of the
project. Within CM, handling of ML source code is centralized -- all
information pertaining to one ML source file is bundled as an abstract
data type (SmlInfo.info). You can find the definition (and the
implementation) of that type in smlfile/smlinfo.sml. In particular,
one important optimization that saves many repeated invocations of
the compiler's parser is to strip from the ML abstract syntax tree all
information that is unnecessary as far as CM is concerned and to store
the "compressed" version in a cache. I call such compressed ML
syntax information a "skeleton". You can find the definition of the
skeleton type in smlfile/skeleton.sml. Associated code is in the same
directory.

The dependency analyzer operates on skeletons. Its implementation can
be found in depend/build.sml.
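
The skeleton cache described above can be illustrated with a small
sketch. This is not CM's code (CM is written in SML). The Skeleton
fields, the SkeletonCache class, the fake_parse helper, and the use of
a modification time to invalidate entries are all assumptions made for
illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class Skeleton:
    """Hypothetical stand-in: only the names a file defines and mentions."""
    defines: FrozenSet[str]
    uses: FrozenSet[str]


class SkeletonCache:
    """Caches skeletons so the (expensive) parser runs once per file version."""

    def __init__(self, parse_and_compress: Callable[[str], Skeleton]) -> None:
        self._parse = parse_and_compress
        self._entries: Dict[str, Tuple[float, Skeleton]] = {}
        self.parser_calls = 0  # bookkeeping for the demonstration below

    def get(self, path: str, mtime: float) -> Skeleton:
        entry = self._entries.get(path)
        if entry is not None and entry[0] == mtime:
            return entry[1]  # cache hit: the parser is not invoked
        self.parser_calls += 1
        skeleton = self._parse(path)
        self._entries[path] = (mtime, skeleton)
        return skeleton


def fake_parse(path: str) -> Skeleton:
    # Assumption: a real version would parse `path` and strip the syntax tree.
    return Skeleton(frozenset({"A"}), frozenset({"B"}))


cache = SkeletonCache(fake_parse)
cache.get("a.sml", 1.0)  # first request: parser runs
cache.get("a.sml", 1.0)  # unchanged file: served from the cache
cache.get("a.sml", 2.0)  # file changed: parser runs again
```

The dependency analyzer would then consume only Skeleton values and
never the full syntax tree.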


PRIVILEGES (access control)
---------------------------

The basic mechanisms for access control are implemented: CM can
correctly detect which "privileges" would be required to access groups
and libraries. However, nothing has been done to actually enforce
anything. In other words, everybody is assumed to have every possible
privilege. CM merely reports which privileges "would have been
required". For the time being this is not really critical.
Enforcement must be tied into some form of OS-specific enforcement
mechanism (such as Unix file permissions or something similar), and I
haven't really thought out a way of doing this nicely and cleanly.

The basic idea behind CM's "privileges" is quite easy to understand.
In their description files, groups and libraries can specify a list of
privileges that the user of such a group/library must possess in order
to be able to use it. Privileges at this level are just names
(strings). If one group/library imports from another group/library,
then privileges are inherited. In effect, to be able to use a
program, one must have all privileges for all its libraries/groups,
sub-libraries/groups, sub-sub-libraries/groups, etc.

Of course, this is not yet satisfactory because there should also be
the possibility of setting up a "safety wall": a library LSafe.cm
could "wrap" all the unsafe operations in LUnsafe.cm with enough error
checking that they become safe. Therefore, a user of LSafe.cm should
not also be required to possess the privileges that would be required
if one were to use LUnsafe.cm directly.

To this end, in CM's model of privileges it is possible for a
group/library to "wrap" privileges. If a privilege P is wrapped, then
the user of the library does not need to have privilege P even though
the library is using another library that requires privilege P. In
essence, the library acts as a "proxy" that provides the necessary
privilege P to the sub-library.

Of course, not everybody can be allowed to establish a library with
such a "wrapped" privilege P. The programmer who does that should at
least herself have privilege P (but perhaps better, she should have
"permission to wrap P" -- a stronger requirement).

In CM, wrapping a privilege is done by specifying the name of that
privilege within parentheses. The wrapping becomes active once the
library gets "stabilized" (see below). The (not yet implemented)
enforcement mechanism must ensure that anyone who stabilizes a library
that wraps P has permission to wrap P.
(In CM's source code and comments, "wrapped" privileges are referred
to as "granted" privileges -- which doesn't quite seem to capture the
actual meaning.)
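
The inheritance and wrapping rules above can be sketched as a small
recursive computation. This is an illustration, not CM's
implementation: the Library record, its field names, and the privilege
name "unsafe" are hypothetical. The sketch assumes that a wrapped
privilege is subtracted only from what is inherited from imports, and
only once the wrapping library is stable.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class Library:
    """Hypothetical model of a group/library description."""
    name: str
    required: FrozenSet[str] = frozenset()  # privileges a user must possess
    wrapped: FrozenSet[str] = frozenset()   # privileges named in parentheses
    stable: bool = False                    # wrapping is active only if True
    imports: List["Library"] = field(default_factory=list)


def privileges_needed(lib: Library) -> FrozenSet[str]:
    """Privileges a user of `lib` would have been required to hold."""
    inherited: FrozenSet[str] = frozenset()
    for sub in lib.imports:
        inherited |= privileges_needed(sub)
    if lib.stable:
        inherited -= lib.wrapped  # the library acts as a proxy for these
    return lib.required | inherited


lunsafe = Library("LUnsafe.cm", required=frozenset({"unsafe"}))
lsafe = Library("LSafe.cm", wrapped=frozenset({"unsafe"}),
                stable=True, imports=[lunsafe])
draft = Library("LSafe.cm", wrapped=frozenset({"unsafe"}),
                stable=False, imports=[lunsafe])
```

Here a user of the stabilized LSafe.cm needs no privileges, while a
user of the not-yet-stabilized version still inherits "unsafe".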


STABILIZATION
-------------

Aside from the issues concerning privileges, stabilization is a way of
putting an entire pre-compiled library -- together with its
pre-computed dependency graph -- into one single container. Once this
is done, CM will never need to have access to the original ML source
code. Before actually consulting the description file for a
group/library, the parser will always check whether there is a
stable container. If so, it will extract the dependency graph from the
container and be done.
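
The "check for a stable container first" rule can be sketched as
follows. The function and parameter names are hypothetical, and a
graph is modelled as a plain dictionary. How CM actually locates and
reads a container is not described here, so that part is passed in as
a callback.

```python
from typing import Callable, Dict, List, Optional

Graph = Dict[str, List[str]]  # hypothetical stand-in for a dependency graph


def load_graph(description: str,
               find_stable: Callable[[str], Optional[Graph]],
               parse_and_analyze: Callable[[str], Graph]) -> Graph:
    """Prefer the pre-computed graph; fall back to the description file."""
    stable = find_stable(description)
    if stable is not None:
        return stable  # the description file and ML sources are not consulted
    return parse_and_analyze(description)


# Demonstration with in-memory stand-ins.
containers: Dict[str, Graph] = {"lib.cm": {"a.sml": []}}
parsed: List[str] = []


def parse(description: str) -> Graph:
    parsed.append(description)
    return {"main.sml": []}


from_stable = load_graph("lib.cm", containers.get, parse)
from_source = load_graph("app.cm", containers.get, parse)
```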

Because of ML's "open" feature, it is sometimes necessary for the
dependency analyzer of a group to consult the contents (i.e., the
definitions) of signatures, structures, or functors that are imported
from sub-groups/libraries. Since the pre-computed dependency graph
does not contain such information, it then becomes necessary to
recover it in a different way.

Remember, the ML source code shouldn't have to be available at this
point. However, the same information is contained in the static
environment that is stored in every "binfile". (The binfile is the
result of compiling one ML source file. It contains executable code
and a pickled representation of the static environment that is
exported from the compilation unit.) Aside from the dependency graph,
the container for a stabilized group/library also stores all the
associated binfiles.

Loading (stable) binfiles for the purpose of dependency analysis is
sometimes necessary, but since it is expensive we do it as seldom as
we can (i.e., lazily). The implementation of this mechanism (which is
really just a hook into the actual implementation provided by
GenericVC) is in depend/se2dae.sml. (See the comments there.) It is
used in stable/stabilize.sml. (Look for "cvtMemo"!)
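
The general idea of lazy loading can be sketched with a memoized
thunk. This is only an illustration of the technique. It does not
reproduce cvtMemo or the code in depend/se2dae.sml, and the Memo class,
load_env function, and file name are hypothetical.

```python
from typing import Callable, Dict, Generic, List, Optional, TypeVar

T = TypeVar("T")


class Memo(Generic[T]):
    """Delays an expensive computation and runs it at most once."""

    def __init__(self, compute: Callable[[], T]) -> None:
        self._compute = compute
        self._done = False
        self._value: Optional[T] = None

    def force(self) -> T:
        if not self._done:
            self._value = self._compute()
            self._done = True
        return self._value  # type: ignore[return-value]


loads: List[str] = []  # records every (simulated) binfile load


def load_env() -> Dict[str, str]:
    # Assumption: stands in for unpickling a static environment.
    loads.append("unit.bin")
    return {"A": "structure"}


env = Memo(load_env)  # nothing has been loaded at this point
```

Only the first call to env.force() triggers the load. Later calls
return the remembered value.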

Information pertaining to members of stabilized groups/libraries is
managed by the abstract datatype BinInfo.info (see
stable/bininfo.sml). In some sense, BinInfo.info is to stabilized ML
code what SmlInfo.info is to not-yet-stabilized ML code.


DEPENDENCY GRAPH
----------------

The division into non-stabilized and stabilized groups/libraries is
clearly visible in the definition of the types that make up dependency
graphs. There are "BNODE"s that mention BinInfo.info and there are
"SNODE"s that mention SmlInfo.info. (There are also "PNODE"s that
facilitate access to "primitive" internal environments that have to do
with bootstrapping.)

You will notice that one can never go from a BNODE to an SNODE. This
mirrors our intention that a subgroup of a stabilized group must also
be stabilized. From SNODEs, on the other hand, you can either go to
other SNODEs or to BNODEs. All the "localimports" of an SNODE (i.e.,
the imports that come from the same group) are also SNODEs. To go to
a BNODE one must look into the list of "globalimport"s. Global
imports refer to "far" nodes -- nodes that are within other groups.
The edge that goes to such a node can have an export filter attached.
Therefore, a farbnode is a BNODE with an optional filter, and a
farsbnode is either a BNODE or an SNODE with an optional filter
attached.

Imports and exports of a group are represented by "impexp"s. Impexps
are essentially just farsbnodes, but they also contain the dependency
analyzer's "analysis environment", which contains information about the
actual definition (contents) of exported structures/functors. As
mentioned earlier, this is necessary to handle the "open" construct of
ML.

The exports of a group are then simply a mapping from exported symbols
to corresponding impexps. (See depend/ggraph.sml.)
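
The shape of these types can be approximated in a short sketch. The
real definitions are the SML datatypes in depend/graph.sml and
depend/ggraph.sml. The field names below, the use of strings in place
of BinInfo.info and SmlInfo.info, and the import fields on BNode are
assumptions. PNODEs are omitted. Since Python does not enforce the
"never from a BNODE to an SNODE" rule statically, the sketch checks it
at run time.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterator, Optional, Tuple, Union


@dataclass(frozen=True)
class BNode:
    bininfo: str  # stand-in for BinInfo.info
    localimports: Tuple["BNode", ...] = ()
    globalimports: Tuple["FarBNode", ...] = ()


@dataclass(frozen=True)
class SNode:
    smlinfo: str  # stand-in for SmlInfo.info
    localimports: Tuple["SNode", ...] = ()
    globalimports: Tuple["FarSBNode", ...] = ()


@dataclass(frozen=True)
class FarBNode:
    node: BNode
    export_filter: Optional[FrozenSet[str]] = None


@dataclass(frozen=True)
class FarSBNode:
    node: Union[BNode, SNode]
    export_filter: Optional[FrozenSet[str]] = None


@dataclass
class ImpExp:
    far: FarSBNode
    env: Dict[str, object]  # stand-in for the analysis environment


Exports = Dict[str, ImpExp]  # exported symbol -> impexp


def imports_of(node: Union[BNode, SNode]) -> Iterator[Union[BNode, SNode]]:
    yield from node.localimports
    for far in node.globalimports:
        yield far.node


def respects_stability(node: Union[BNode, SNode],
                       below_bnode: bool = False) -> bool:
    """True if no SNode is reachable from any BNode under `node`."""
    if below_bnode and isinstance(node, SNode):
        return False
    below = below_bnode or isinstance(node, BNode)
    return all(respects_stability(n, below) for n in imports_of(node))


lib = BNode("basis.bin")
main = SNode("main.sml", globalimports=(FarSBNode(lib),))
exports: Exports = {"Main": ImpExp(FarSBNode(main), {})}
broken = BNode("x.bin", localimports=(SNode("y.sml"),))  # violates the rule
```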


RECOMPILATION AND EXECUTION
---------------------------

There is a generic traversal routine that is used to implement both
recompilation traversals and execution (link-) traversals
(compile/generic.sml). Which kind of traversal is implemented is
determined by the functor argument: the "compilation type".
A signature describing compilation types abstractly is in
compile/compile-type.sml. In essence, it provides compilation
environments and associated operations abstractly.

Concrete instantiations of this signature are in compile/recomp.sml
and in compile/exec.sml. As you will see, these are also implemented
as functors parameterized by an abstraction of "persistent state".
Persistent state is used to remember the results of traversals from
invocation to invocation of CM. This avoids needless recompilation in
the case of recomp.sml and facilitates sharing of dynamic values in
the case of exec.sml. (However, the two cases are otherwise quite
dissimilar.)
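
The idea of one generic traversal specialized by a "compilation type"
can be sketched with a higher-order function standing in for the
functor. Everything here is hypothetical: the graph is a dictionary
from node names to import lists, `process` plays the role of the
compilation type, and `state` plays the role of persistent state.

```python
from typing import Callable, Dict, FrozenSet, List, TypeVar

R = TypeVar("R")
Graph = Dict[str, List[str]]  # node -> nodes it imports


def make_traversal(process: Callable[[str, List[R]], R],
                   state: Dict[str, R]) -> Callable[[Graph, str], R]:
    """Builds a depth-first traversal that remembers results in `state`."""

    def traverse(graph: Graph, node: str) -> R:
        if node in state:
            return state[node]  # result remembered from an earlier visit
        results = [traverse(graph, dep) for dep in graph[node]]
        state[node] = process(node, results)
        return state[node]

    return traverse


compiled: List[str] = []  # order in which units were (re)compiled


def recomp(node: str, imports: List[str]) -> str:
    compiled.append(node)
    return node + ".bin"  # placeholder for a compilation result


def execute(node: str, imports: List[FrozenSet[str]]) -> FrozenSet[str]:
    return frozenset({node}).union(*imports)  # units linked so far


graph: Graph = {
    "a.sml": [],
    "b.sml": ["a.sml"],
    "c.sml": ["a.sml"],
    "main.sml": ["b.sml", "c.sml"],
}

recompile = make_traversal(recomp, {})
link = make_traversal(execute, {})

first = recompile(graph, "main.sml")
second = recompile(graph, "main.sml")  # served from the remembered state
linked = link(graph, "main.sml")
```

Both traversals share the same walking logic. Only the per-node
operation and the kind of result differ.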

Persistent state comes in two varieties: "recomp" and "full". Full
state is actually an extension of recomp state and can also be used
where recomp state is expected. The "normal" CM uses full state
because it implements both recompilation and execution. The same
state is passed to both ExecFn and RecompFn, so it will be properly
shared by recompilation and execution traversals. In the case of the
bootstrap compiler, however, we never actually execute the code that
comes out of the compiler. (The code will be executed by the runtime
system when bootstrapping.) Therefore, for the bootstrap compiler we
don't use full state but simply recomp state. (If we cross-compile
for a different architecture we could not possibly execute the code
anyway.)
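
The relationship between the two varieties of state can be sketched
with subclassing, where "full" extends "recomp". The class, field, and
function names are hypothetical and the stored values are placeholders.
ExecFn and RecompFn themselves are SML functors and are not modelled
here.

```python
from typing import Dict


class RecompState:
    """Remembers compilation results between invocations."""

    def __init__(self) -> None:
        self.compiled: Dict[str, str] = {}


class FullState(RecompState):
    """Extends recomp state with dynamic values produced by execution."""

    def __init__(self) -> None:
        super().__init__()
        self.dynamic_values: Dict[str, str] = {}


def recompile(state: RecompState, unit: str) -> str:
    if unit not in state.compiled:
        state.compiled[unit] = unit + ".bin"  # placeholder result
    return state.compiled[unit]


def execute(state: FullState, unit: str) -> str:
    binfile = recompile(state, unit)  # full state works where recomp is expected
    if unit not in state.dynamic_values:
        state.dynamic_values[unit] = "value of " + binfile
    return state.dynamic_values[unit]


normal = FullState()        # "normal" CM: recompilation and execution
execute(normal, "a.sml")

bootstrap = RecompState()   # bootstrap compiler: recompilation only
recompile(bootstrap, "a.sml")
```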