Annotation of /sml/trunk/src/cm/Overview
Revision 348
OVERVIEW
--------

The operation of CM can be understood best by looking at its most
central data structure: the dependency graph. You can find the
definitions of its associated types in depend/graph.sml. There is
also a coarse-grain "group graph" data structure. Its definition is
in depend/ggraph.sml.

One can roughly divide CM into front-end and back-end. It is the
front-end's responsibility to establish the dependency graph for a
given project. The back-end implements various ways of traversing the
graph, thereby performing the operations that the user expects:
consistency checking, recompilation, linking, stabilization,
generation of listings or other statistics, etc.

The central component of the front-end is the parser. It builds the
dependency graph incrementally with help from the dependency analyzer.

* Analysis CAN be performed incrementally because the sub-graphs that
correspond to sub-groups or sub-libraries are independent of how they
are being used.

* We DO perform analysis incrementally because the parser occasionally
wants to know what the exported symbols of sub-groups and
sub-libraries are. (This is required for the parser's conditional
compilation facility.) While it would probably be possible to achieve
this using a more cursory analysis, the extra effort of implementing
it would definitely not be outweighed by any gains.

The dependency analyzer must inspect the ML source code of the
project. Within CM, handling of ML source code is centralized -- all
information pertaining to one ML source file is bundled as an abstract
data type (SmlInfo.info). You find the definition (and the
implementation) of that type in smlfile/smlinfo.sml. In particular,
one important optimization that saves many repeated invocations of
the compiler's parser is to strip the ML abstract syntax tree of all
unnecessary (as far as CM is concerned) information and store the
"compressed" version in some sort of cache. I call such compressed ML
syntax information a "skeleton". You find the definition of the
skeleton type in smlfile/skeleton.sml. Associated code is in the same
directory.

The dependency analyzer operates on skeletons. Its implementation can
be found in depend/build.sml.

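To give a feel for what "analysis on skeletons" can mean, here is a
minimal sketch in Standard ML. The skel datatype and the free function
are hypothetical simplifications invented for this example; they are
not the definitions in smlfile/skeleton.sml or depend/build.sml. The
sketch keeps only module-level names and computes which structure
names a file mentions without defining them itself, which is the raw
material for dependency edges.

```sml
(* Hypothetical, heavily simplified skeleton: only module-level names. *)
datatype skel
  = Bind of string * skel     (* structure X = <body> *)
  | Ref of string list        (* mention of a path such as A.B *)
  | Open of string list       (* open A.B, treated here like a mention *)
  | Seq of skel list          (* declarations in order *)

fun member (x : string, xs) = List.exists (fn y => x = y) xs

(* free (bound, s) returns (bound', names): the names bound once s has
   been processed, and the head names s uses that were not bound. *)
fun free (bound, Ref (h :: _)) =
      (bound, if member (h, bound) then [] else [h])
  | free (bound, Ref []) = (bound, [])
  | free (bound, Open p) = free (bound, Ref p)
  | free (bound, Bind (n, body)) =
      let val (_, f) = free (bound, body)
      in (n :: bound, f) end
  | free (bound, Seq ss) =
      List.foldl (fn (s, (b, f)) =>
                     let val (b', f') = free (b, s)
                     in (b', f @ f') end)
                 (bound, []) ss

(* structure A = B.C  open A  ... D ... *)
val example = Seq [Bind ("A", Ref ["B", "C"]), Open ["A"], Ref ["D"]]
val (defined, needed) = free ([], example)
```

Note that the sketch treats "open" as a plain mention. Handling it
properly requires knowing what the opened structure contains, which is
the complication discussed under STABILIZATION.
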

PRIVILEGES (access control)
---------------------------

The basic mechanisms for access control are implemented: CM can
correctly detect which "privileges" would be required to access groups
and libraries. However, nothing has been done to actually enforce
anything. In other words, everybody is assumed to have every possible
privilege. CM merely reports which privileges "would have been
required". For the time being this is not really critical.
Enforcement must be tied into some form of OS-specific enforcement
mechanism (such as Unix file permissions or something similar), and I
haven't really thought out a way of doing this nicely and cleanly.

The basic idea behind CM's "privileges" is quite easy to understand.
In their description files, groups and libraries can specify a list of
privileges that the user of such a group/library must possess in order
to be able to use it. Privileges at this level are just names
(strings). If one group/library imports from another group/library,
then privileges are inherited. In effect, to be able to use a
program, one must have all privileges for all its libraries/groups,
sub-libraries/groups, sub-sub-libraries/groups, etc.

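The inheritance rule can be sketched as a simple recursive union. The
lib datatype and the functions below are assumptions made for
illustration only; CM's real representation of groups and libraries is
different.

```sml
(* Hypothetical model of a group or library; not CM's actual types. *)
datatype lib = LIB of { name : string,
                        required : string list,  (* named in the description file *)
                        subs : lib list }        (* imported groups/libraries *)

(* Append the elements of ys to xs, skipping those already present. *)
fun union (xs, ys) =
    List.foldl (fn (y, acc) =>
                   if List.exists (fn x => x = y) acc then acc
                   else acc @ [y])
               xs ys

(* Everything a user must possess: own privileges plus inherited ones. *)
fun privileges (LIB { required, subs, ... }) =
    List.foldl (fn (l, acc) => union (acc, privileges l))
               (union ([], required)) subs

val lUnsafe = LIB { name = "LUnsafe.cm", required = ["unsafe"], subs = [] }
val app = LIB { name = "app.cm", required = ["net"], subs = [lUnsafe] }
val needed = privileges app
```

Here needed lists both "net" and "unsafe": the user of app.cm inherits
the requirement of LUnsafe.cm.
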
Of course, this is not yet satisfactory because there should also be
the possibility of setting up a "safety wall": a library LSafe.cm
could "wrap" all the unsafe operations in LUnsafe.cm with enough error
checking that they become safe. Therefore, a user of LSafe.cm should
not also be required to possess the privileges that would be required
if one were to use LUnsafe.cm directly.

To this end, in CM's model of privileges it is possible for a
group/library to "wrap" privileges. If a privilege P is wrapped, then
the user of the library does not need to have privilege P even though
the library is using another library that requires privilege P. In
essence, the library acts as a "proxy" who provides the necessary
privilege P to the sub-library.

Of course, not everybody can be allowed to establish a library with
such a "wrapped" privilege P. The programmer who does that should at
least herself have privilege P (but perhaps better, she should have
"permission to wrap P" -- a stronger requirement).

In CM, wrapping a privilege is done by specifying the name of that
privilege within parentheses. The wrapping becomes active once the
library gets "stabilized" (see below). The (not yet implemented)
enforcement mechanism must ensure that anyone who stabilizes a library
that wraps P has permission to wrap P.

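The effect of wrapping can be sketched as a filter on inherited
privileges that only applies once the library is stable. As before,
the lib datatype, its field names, and the privileges function are
hypothetical; the sketch models the rule as described in the text and
does not model the permission-to-wrap check.

```sml
(* Hypothetical model; the fields are invented for this example. *)
datatype lib = LIB of { name : string,
                        required : string list,  (* own privileges *)
                        wrapped : string list,   (* privileges in parentheses *)
                        stable : bool,           (* has it been stabilized? *)
                        subs : lib list }

(* Remove duplicates, keeping first occurrences. *)
fun dedup [] = []
  | dedup (x :: xs) = x :: dedup (List.filter (fn y => y <> x) xs)

(* Privileges a user of the library must possess. Inherited privileges
   that are wrapped are dropped, but only if the library is stable. *)
fun privileges (LIB { required, wrapped, stable, subs, ... }) =
    let val inherited = List.concat (map privileges subs)
        fun isWrapped p =
            stable andalso List.exists (fn w => w = p) wrapped
    in dedup (required @ List.filter (not o isWrapped) inherited)
    end

val lUnsafe = LIB { name = "LUnsafe.cm", required = ["unsafe"],
                    wrapped = [], stable = true, subs = [] }
fun lSafe isStable =
    LIB { name = "LSafe.cm", required = [], wrapped = ["unsafe"],
          stable = isStable, subs = [lUnsafe] }
```

With these definitions, privileges (lSafe true) is empty, while
privileges (lSafe false) still contains "unsafe" because the wrapping
has not become active yet.
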

STABILIZATION
-------------

Aside from the issues concerning privileges, stabilization is a way of
putting an entire pre-compiled library -- together with its
pre-computed dependency graph -- into one single container. Once this
is done, CM will never need to have access to the original ML source
code. Before actually consulting the description file for a library,
the parser will always check and see if there is a stable container.
If so, it will suck the dependency graph out of the container and be
done.

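The "check for a stable container first" step might look roughly like
the following sketch. Everything in it is an assumption: the source
datatype, the locate function, and in particular the ".stable" suffix
are made up for illustration and say nothing about where CM really
keeps its containers. The file-existence test is passed in as a
parameter so that the logic can be tried without touching the file
system.

```sml
(* Where the parser gets a library's dependency graph from. *)
datatype source
  = Stable of string        (* path of a stable container *)
  | Description of string   (* path of the description file *)

(* Hypothetical naming convention, for illustration only. *)
fun stableName descr = descr ^ ".stable"

(* Prefer the stable container whenever one exists. *)
fun locate (exists : string -> bool) (descr : string) : source =
    let val s = stableName descr
    in if exists s then Stable s else Description descr
    end

val withContainer = locate (fn p => p = "lib.cm.stable") "lib.cm"
val withoutContainer = locate (fn _ => false) "lib.cm"
```

In a real setting the predicate could be built from the Basis Library,
for example fn p => OS.FileSys.access (p, [OS.FileSys.A_READ]).
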
Because of ML's "open" feature, it sometimes is necessary for the
dependency analyzer of a group to consult the contents (i.e., the
definitions) of signatures, structures, or functors that are imported
from sub-libraries. Since the pre-computed dependency graph does not
contain such information, it will then become necessary to recover it
in a different way.

Remember, the ML source code shouldn't have to be available at this
point. However, the same information is contained in the static
environment that is stored in every "binfile". (The binfile is the
result of compiling one ML source file. It contains executable code
and a pickled representation of the static environment that is
exported from the compilation unit.) Aside from the dependency graph,
the container for a stabilized library also stores all the associated
binfiles.

Loading (stable) binfiles for the purpose of dependency analysis is
sometimes necessary, but since it is expensive we do it as seldom as
we can (i.e., lazily). The implementation of this mechanism (which is
really just a hook into the actual implementation provided by
GenericVC) is in depend/se2dae.sml. (See the comments there.) It is
used in stable/stabilize.sml. (Look for "cvtMemo"!)

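The general technique behind such lazy loading is a memoized thunk:
the expensive load runs at most once, and only if somebody asks. The
memo function below is a generic sketch of that idea; it is not the
code in depend/se2dae.sml, and the loader passed to it is a stand-in
for reading a binfile.

```sml
(* Turn an expensive computation into one that runs at most once. *)
fun memo (f : unit -> 'a) : unit -> 'a =
    let val cache = ref NONE
    in fn () =>
          case !cache of
              SOME v => v
            | NONE => let val v = f ()
                      in cache := SOME v; v end
    end

(* Stand-in for loading a binfile; counts how often it really runs. *)
val loads = ref 0
val staticEnv =
    memo (fn () => (loads := !loads + 1; "pickled static environment"))

(* Nothing has been loaded so far; force the thunk twice. *)
val before = !loads
val first = staticEnv ()
val second = staticEnv ()
val after = !loads
```

After the two calls the counter stands at one: the second call is
answered from the cache.
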
Information pertaining to members of stabilized libraries is managed
by the abstract datatype BinInfo.info (see stable/bininfo.sml). In
some sense, BinInfo.info is to stabilized ML code what SmlInfo.info is
to not-yet-stabilized ML code.

By the way, only libraries can be stabilized. A stabilized library
will encompass its own sources as well as the sources of sub-groups
(and their sub-groups, and so on). Sub-libraries of the library, on
the other hand, will be referred to symbolically (they do not get
"sucked" in like groups do). In effect, sub-grouping of a library
becomes convenient for resolving name-spacing issues without
compromising the "one single container" paradigm of stable libraries.


DEPENDENCY GRAPH
----------------

The division into non-stabilized and stabilized libraries is clearly
visible in the definition of the types that make up dependency graphs.
There are "BNODE"s that mention BinInfo.info and there are "SNODE"s
that mention SmlInfo.info. (There are also "PNODE"s that facilitate
access to "primitive" internal environments that have to do with
bootstrapping.)

You will notice that one can never go from a BNODE to an SNODE. This
mirrors our intention that a sub-library of a stabilized library must
also be stabilized. From SNODEs, on the other hand, you can either go
to other SNODEs or to BNODEs. All the "localimports" of an SNODE
(i.e., the imports that come from the same group/library) are also
SNODEs. To go to a BNODE one must look into the list of
"globalimport"s. Global imports refer to "far" nodes -- nodes that
are within other groups/libraries. The edge that goes to such a node
can have an export filter attached. Therefore, a farbnode is a bnode
with an optional filter, and a farsbnode is either a BNODE or an SNODE
with an optional filter attached.

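The shape just described can be written down as a pair of datatype
declarations. The following is a simplified sketch under stated
assumptions, not a copy of depend/graph.sml: plain strings stand in
for BinInfo.info and SmlInfo.info, a filter is modelled as an optional
list of symbol names, PNODEs are left out, and the constructor names
SB_BNODE and SB_SNODE are chosen for this example.

```sml
type filter = string list option  (* assumption: exported symbol names *)
type bininfo = string             (* stand-in for BinInfo.info *)
type smlinfo = string             (* stand-in for SmlInfo.info *)

(* Stable nodes can only ever refer to other stable nodes. *)
datatype bnode =
    BNODE of { bininfo : bininfo,
               localimports : bnode list,
               globalimports : farbnode list }
withtype farbnode = filter * bnode

(* Source nodes: local imports are SNODEs, global ones may be either. *)
datatype snode =
    SNODE of { smlinfo : smlinfo,
               localimports : snode list,
               globalimports : farsbnode list }
and sbnode = SB_BNODE of bnode | SB_SNODE of snode
withtype farsbnode = filter * sbnode

(* Source files reachable from a node. The walk stops at BNODEs, since
   no ML source is needed behind them. Shared nodes are listed once
   per path; a real traversal would memoize. *)
fun sources (SNODE { smlinfo, localimports, globalimports }) =
    smlinfo :: List.concat (map sources localimports
                            @ map far globalimports)
and far (_, SB_SNODE s) = sources s
  | far (_, SB_BNODE _) = []

val b = BNODE { bininfo = "lib.sml", localimports = [],
                globalimports = [] }
val s1 = SNODE { smlinfo = "a.sml", localimports = [],
                 globalimports = [(NONE, SB_BNODE b)] }
val s2 = SNODE { smlinfo = "b.sml", localimports = [s1],
                 globalimports = [] }
```

Because bnode is declared without any reference to snode, the type
checker itself rules out an edge from a BNODE to an SNODE.
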
Imports and exports of a group/library are represented by "impexp"s.
Impexps are essentially just farsbnodes, but they also contain the
dependency analyzer's "analysis environment", which contains
information about the actual definition (contents) of exported
structures/functors. As said earlier, this is necessary to handle the
"open" construct of ML.

The exports of a group/library are then simply represented by a
mapping from exported symbols to corresponding impexps. (See
depend/ggraph.sml.)

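As a rough illustration of why the analysis environment travels with
each export, consider the sketch below. All types in it are
stand-ins chosen for this example: an association list replaces the
real symbol map, a node is just a file name, and the environment is
reduced to the list of names defined inside the exported structure.

```sml
type symbol = string
type env = symbol list                      (* stand-in analysis environment *)
type impexp = { node : string, env : env }  (* stand-in for farsbnode + env *)
type exports = (symbol * impexp) list       (* stand-in for a symbol map *)

(* Find the impexp that provides a given exported symbol. *)
fun lookup (ex : exports, sym : symbol) : impexp option =
    Option.map (fn (_, ie) => ie)
               (List.find (fn (s, _) => s = sym) ex)

(* Names that "open sym" would bring into scope. Without the stored
   environment this could not be answered for a stable library. *)
fun opened (ex : exports, sym : symbol) : symbol list =
    case lookup (ex, sym) of
        SOME { env, ... } => env
      | NONE => []

val ex : exports = [("A", { node = "a.sml", env = ["x", "y"] })]
```
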

RECOMPILATION AND EXECUTION
---------------------------

There is a generic traversal routine that is used to implement both
recompilation traversals and execution (link-) traversals
(compile/generic.sml). The decision of which kind of traversal is
implemented comes from the functor argument: the "compilation type".
A signature describing compilation types abstractly is in
compile/compile-type.sml. In essence, it provides compilation
environments and associated operations abstractly.

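The general pattern, one traversal functor specialized by a
"compilation type" argument, can be sketched as follows. The signature
COMPILATION_TYPE, its members, the node datatype, and the Listing
instance are all invented for this example and are much simpler than
what compile/compile-type.sml and compile/generic.sml contain. In
particular, the sketch does not memoize, so a shared node is visited
once per path.

```sml
(* Hypothetical, minimal notion of a compilation type. *)
signature COMPILATION_TYPE = sig
    type env
    val empty : env
    val layer : env * env -> env      (* layer (new, old): combine *)
    val visit : string * env -> env   (* handle one node, given the
                                         environment of its imports *)
end

(* The traversal itself knows nothing about what "visit" does. *)
functor TraversalFn (CT : COMPILATION_TYPE) = struct
    datatype node = NODE of { name : string, imports : node list }

    fun traverse (NODE { name, imports }) =
        let val e = List.foldl
                        (fn (n, acc) => CT.layer (traverse n, acc))
                        CT.empty imports
        in CT.visit (name, e)
        end
end

(* One instantiation: the "environment" is just the list of files in
   the order in which they would have to be processed. *)
structure Listing = TraversalFn (struct
    type env = string list
    val empty = []
    fun layer (new, old) = old @ new
    fun visit (name, e) = e @ [name]
end)

val leaf = Listing.NODE { name = "a.sml", imports = [] }
val top = Listing.NODE { name = "b.sml", imports = [leaf] }
val order = Listing.traverse top
```

Swapping in a different argument structure changes what the traversal
does at each node without touching the traversal code.
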
Concrete instantiations of this signature are in compile/recomp.sml
and in compile/exec.sml. As you will see, these are also implemented
as functors parameterized by an abstraction of "persistent state".
Persistent state is used to remember the results of traversals from
invocation to invocation of CM. This avoids needless recompilation in
the case of recomp.sml and facilitates sharing of dynamic values in
the case of exec.sml. (However, the two cases are otherwise quite
dissimilar.)

Persistent state comes in two varieties: "recomp" and "full". Full
state is actually an extension of recomp state and can also be used
where recomp state is expected. The "normal" CM uses full state
because it implements both recompilation and execution. The same
state is passed to both ExecFn and RecompFn, so it will be properly
shared by recompilation and execution traversals. In the case of the
bootstrap compiler, however, we never actually execute the code that
comes out of the compiler. (The code will be executed by the runtime
system when bootstrapping.) Therefore, for the bootstrap compiler we
don't use full state but simply recomp state. (If we cross-compile
for a different architecture we could not possibly execute the code
anyway.)

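The relationship between the two varieties of state maps naturally
onto signature inclusion in ML. The sketch below is an illustration
under assumptions: the signatures RECOMP_STATE and FULL_STATE, their
members, and the bodies of the two functors are invented here (only
the functor names RecompFn and ExecFn come from the text), and
"compiling" and "executing" are reduced to string and integer
stand-ins.

```sml
signature RECOMP_STATE = sig
    val binfile : string -> string option    (* remembered compile result *)
    val recordBin : string * string -> unit
end

(* Full state extends recomp state with remembered dynamic values. *)
signature FULL_STATE = sig
    include RECOMP_STATE
    val value : string -> int option         (* stand-in dynamic value *)
    val recordValue : string * int -> unit
end

structure FullState : FULL_STATE = struct
    val bins : (string * string) list ref = ref []
    val vals : (string * int) list ref = ref []
    fun find tbl k =
        Option.map (fn (_, v) => v)
                   (List.find (fn (k', _) => k' = k) (!tbl))
    fun binfile k = find bins k
    fun recordBin kv = bins := kv :: !bins
    fun value k = find vals k
    fun recordValue kv = vals := kv :: !vals
end

functor RecompFn (S : RECOMP_STATE) = struct
    (* "Compile" a source unless a result is already remembered. *)
    fun compile src =
        case S.binfile src of
            SOME b => b
          | NONE => let val b = src ^ ".bin"
                    in S.recordBin (src, b); b end
end

functor ExecFn (S : FULL_STATE) = struct
    (* "Execute" a compiled source; needs the shared compile result. *)
    fun run src =
        case (S.value src, S.binfile src) of
            (SOME v, _) => SOME v
          | (NONE, SOME b) => (S.recordValue (src, size b); SOME (size b))
          | (NONE, NONE) => NONE
end

(* Full state is accepted where recomp state is expected. *)
structure Recomp = RecompFn (FullState)
structure Exec = ExecFn (FullState)

val beforeCompile = Exec.run "a.sml"
val bin = Recomp.compile "a.sml"
val afterCompile = Exec.run "a.sml"
```

Because both functors receive the same FullState structure, the result
recorded by Recomp.compile is visible to Exec.run. A bootstrap
compiler in this model would apply only RecompFn, to a structure that
matches RECOMP_STATE alone.
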