Diff of /sml/trunk/NOTES/HISTORY

revision 1115, Sun Mar 3 18:16:31 2002 UTC revision 1171, Fri Mar 22 20:41:43 2002 UTC
# Line 13  Line 13 
13  Description:
14    
15  ----------------------------------------------------------------------
16    Name: Lal George
17    Date: 2002/03/22 14:18:25 EST
18    Tag: george-20020322-cps-branch-prob
19    Description:
20    
21    Implemented the Ball-Larus branch-prediction heuristics, and
22    incorporated graphical viewers for control flow graphs.
23    
24    Ball-Larus Heuristics:
25    ---------------------
26    See the file compiler/CodeGen/cpscompile/cpsBranchProb.sml.
27    
28    By design it uses the Dempster-Shafer theory for combining
29    probabilities.  For example, in the function:
30    
31        fun f(n,acc) = if n = 0 then acc else f(n-1, n*acc)
32    
33    the Ball-Larus heuristics predict that n=0 is unlikely
34    (OH-heuristic), and the 'then' branch is unlikely because of the
35    RH-heuristic -- giving the 'then' branch an even lower combined
36    probability using the Dempster-Shafer theory.
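As a rough illustration of the combination step described above (not the code in cpsBranchProb.sml), two independent estimates p1 and p2 that the same branch is taken can be merged as p1*p2 / (p1*p2 + (1-p1)*(1-p2)). The name `combine` is hypothetical, and the estimates 0.16 and 0.28 are placeholder values chosen only to show the effect.

```sml
(* Dempster-Shafer combination of two independent estimates that the
 * same two-way branch is taken.  Hypothetical helper, not the
 * compiler's own code.  The result is NaN if one estimate is exactly
 * 1.0 and the other exactly 0.0. *)
fun combine (p1 : real, p2 : real) : real =
    let
      val taken = p1 * p2
      val notTaken = (1.0 - p1) * (1.0 - p2)
    in
      taken / (taken + notTaken)
    end

(* Placeholder estimates for the 'then' branch of f: two heuristics
 * that each consider it unlikely. *)
val pThen = combine (0.16, 0.28)   (* roughly 0.069 *)
```

When both estimates are below 0.5, the combined value is lower than either one, which matches the "even lower combined probability" noted above.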
37    
38    Finally, John Reppy's loop analysis in MLRISC further lowers the
39    probability of the 'then' branch because of the loop in the else
40    branch.
41    
42    
43    Graphical Viewing:
44    ------------------
45    I merely plugged Allen's graphical viewers into the compiler. The
46    additional code is not much. At the top level, saying:
47    
48            Control.MLRISC.getFlag "cfg-graphical-view" := true;
49    
50    will display the graphical view of the control flow graph just before
51    back-patching.  daVinci must be in your path for this to work. If
52    daVinci is not available, then the default viewer can be changed
53    using:
54    
55            Control.MLRISC.getString "viewer"
56    
57    which can be set to "dot" or "vcg" for the corresponding viewers. Of
58    course, these viewers must be in your path.
59    
60    The above will display the compilation unit at the level of clusters,
61    many of which are small, boring, and uninteresting. Also setting:
62    
63            Control.MLRISC.getInt "cfg-graphical-view_size"
64    
65    will display only clusters that are larger than the value it is set to.
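Putting the three settings together, a top-level session might look like the sketch below. It assumes that getString and getInt return ref cells in the same way getFlag does above; the viewer "dot" and the threshold 10 are arbitrary example values.

```sml
(* Enable the CFG viewer, pick a viewer other than daVinci, and
 * skip small clusters.  The values "dot" and 10 are examples only. *)
Control.MLRISC.getFlag "cfg-graphical-view" := true;
Control.MLRISC.getString "viewer" := "dot";
Control.MLRISC.getInt "cfg-graphical-view_size" := 10;
```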
66    
67    
68    ----------------------------------------------------------------------
69    Name: Matthias Blume
70    Date: 2002/03/21 22:20:00 EST
71    Tag: blume-20020321-kmp-bugfix
72    Description:
73    
74    Changed the interface to the KMP routine in PreString and fixed
75    a minor bug in one place where it was used.
76    
77    ----------------------------------------------------------------------
78    Name: Allen Leung
79    Date: 2002/03/21 20:30:00 EST
80    Tag: leunga-20020321-cfg
81    Description:
82    
83      Fixed a potential problem in cfg edge splitting.
84    
85    ----------------------------------------------------------------------
86    Name: Allen Leung
87    Date: 2002/03/21 17:15:00 EST
88    Tag: leunga-20020321-x86-fp-cfg
89    Description:
90    
91      1. Recoded the buggy parts of x86-fp.
92    
93         a. All the block reordering code has been removed.
94            We now depend on the block placement phases to do this work.
95    
96         b. Critical edge splitting code has been simplified and moved into the
97            CFG modules, where they belong.
98    
99         Both of these were quite buggy and complex.  The code is now much, much
100         simpler.
101    
102      2. X86 backend.
103    
104         a. Added instructions for 64-bit support.  Instruction selection for
105            64-bit has not been committed, however, since that
106            requires changes to MLTREE which haven't been approved by
107            Lal and John.
108    
109         b. Added support for FUCOMI and FUCOMIP when generating code for
110            PentiumPro and above.  We only generate these instructions in
111            the fast-fp mode.
112    
113         c. Added cases for JP and JNP in X86FreqProps.
114    
115      3. CFG
116    
117         CFG now has a bunch of methods for edge splitting and merging.
118    
119      4. Machine description.
120    
121         John's simplification of MLTREE_BASIS.fcond broke a few machine
122         description things:
123    
124         rtl-build.{sig,sml} and hppa.mdl fixed.
125    
126         NOTE: the machine description stuff in the repository is still broken.
127               Again, I can't put my fixes in because that involves
128               changes to MLTREE.
129    
130    ----------------------------------------------------------------------
131    Name: Matthias Blume
132    Date: 2002/03/20 15:55:00 EST
133    Tag: blume-20020320-kmp
134    Description:
135    
136    Implemented Knuth-Morris-Pratt string matching in PreString and used
137    it for String.isSubstring, Substring.isSubstring, and
138    Substring.position.
139    
140    (Might need some stress-testing.  Simple examples worked fine.)
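For readers unfamiliar with the technique, here is a minimal, self-contained sketch of Knuth-Morris-Pratt search over plain strings. It is not the PreString routine; the name `kmpFind` and its interface are assumptions made for illustration.

```sml
(* Knuth-Morris-Pratt substring search.  Returns the index of the
 * first occurrence of pat in text, if any.  Illustrative sketch only. *)
fun kmpFind (pat : string) (text : string) : int option =
    let
      val m = String.size pat
      val n = String.size text
      (* fail[i] = length of the longest proper prefix of pat that is
       * also a suffix of pat[0..i] *)
      val fail = Array.array (Int.max (m, 1), 0)
      fun p i = String.sub (pat, i)
      (* Given k matched characters and next character c, return the
       * new match length, falling back through shorter borders. *)
      fun extend (k, c) =
          if k > 0 andalso p k <> c then extend (Array.sub (fail, k - 1), c)
          else if p k = c then k + 1
          else 0
      fun build (i, k) =
          if i >= m then ()
          else
            let val k' = extend (k, p i)
            in Array.update (fail, i, k'); build (i + 1, k')
            end
      fun search (i, k) =
          if k = m then SOME (i - m)
          else if i >= n then NONE
          else search (i + 1, extend (k, String.sub (text, i)))
    in
      if m = 0 then SOME 0
      else (build (1, 0); search (0, 0))
    end

val found = kmpFind "aab" "aaab"   (* SOME 1 *)
```

The text index never moves backwards, so the search is linear in the combined length of pattern and text.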
141    
142    ----------------------------------------------------------------------
143    Name: Matthias Blume
144    Date: 2002/03/19 16:37:00 EST
145    Tag: blume-20020319-witnesses
146    Description:
147    
148    Added a structure C.W and functions convert/Ptr.convert to ml-nlffi-lib.
149    
150    This implements a generic mechanism for changing constness qualifiers
151    anywhere within big C types without resorting to outright "casts".
152    (So far, functions such as C.rw/C.ro or C.Ptr.rw/C.Ptr.ro only let you
153    modify the constness at the outermost level.)
154    The implementation of "convert" is based on the idea of "witness"
155    values -- values that are not used by the operation but whose types
156    "testify" to their applicability.  On the implementation side, "convert"
157    is simply a projection (returning its second curried argument).  With
158    cross-module inlining, it should not result in any machine code being
159    generated.
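The witness idea can be modelled in a much reduced form as follows. This sketch is not the ml-nlffi-lib interface: the signature PTR, the structure Ptr, and all of their members are hypothetical, and it handles constness at a single level only. It shows how a value that is ignored at run time can still gate a conversion through its type.

```sml
signature PTR = sig
  type ro
  type rw
  type 'c ptr                        (* 'c is ro or rw *)
  type ('a, 'b) witness              (* "converting 'a to 'b is allowed" *)
  val fromAddr : int -> rw ptr
  val toAddr : 'c ptr -> int
  val dropWrite : (rw ptr, ro ptr) witness
  val convert : ('c1 ptr, 'c2 ptr) witness -> 'c1 ptr -> 'c2 ptr
end

structure Ptr :> PTR = struct
  datatype ro = RO
  datatype rw = RW
  type 'c ptr = int                  (* stand-in for a C address *)
  type ('a, 'b) witness = unit       (* carries no run-time data *)
  fun fromAddr a = a
  fun toAddr p = p
  val dropWrite = ()
  (* A projection: ignores the witness, returns the second argument. *)
  fun convert _ p = p
end

val p : Ptr.rw Ptr.ptr = Ptr.fromAddr 4096
val q : Ptr.ro Ptr.ptr = Ptr.convert Ptr.dropWrite p
```

Because the signature exports no witness of type (ro ptr, rw ptr) witness, client code cannot use convert to turn a read-only pointer back into a writable one.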
160    
161    ----------------------------------------------------------------------
162    Name: Matthias Blume
163    Date: 2002/03/15 16:40:00 EST
164    Tag: blume-20020315-basis
165    Description:
166    
167    Provided (preliminary?) implementations for
168    
169      {String,Substring}.{concatWith,isSuffix,isSubstring}
170    
171    and
172    
173      Substring.full
174    
175    Those are in the Basis spec but they were missing in SML/NJ.
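A few sample calls, using the signatures given in the Basis specification (the argument strings are arbitrary examples):

```sml
val csv    = String.concatWith ", " ["a", "b", "c"]      (* "a, b, c" *)
val isSml  = String.isSuffix ".sml" "cpsBranchProb.sml"  (* true *)
val hasKmp = String.isSubstring "KMP" "the KMP routine"  (* true *)

(* Substring.full views an entire string as a substring. *)
val whole  = Substring.full "HISTORY"
val back   = Substring.string whole                      (* "HISTORY" *)
```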
176    
177    ----------------------------------------------------------------------
178    Name: Matthias Blume
179    Date: 2002/03/14 21:30:00 EST
180    Tag: blume-20020314-controls
181    Description:
182    
183    Controls:
184    ---------
185    
186    1. Factored out the recently-added Controls : CONTROLS stuff and put
187       it into its own library $/controls-lib.cm.  The source tree for
188       this is under src/smlnj-lib/Controls.
189    
190    2. Changed the names of types and functions in this interface, so they
191       make a bit more "sense":
192    
193          module -> registry
194          'a registry -> 'a group
195    
196    3. The interface now deals in ref cells only.  The getter/setter interface
197       is (mostly) gone.
198    
199    4. Added a function that lets one register an already-existing ref cell.
200    
201    5. Made the corresponding modifications to the rest of the code so that
202       everything compiles again.
203    
204    6. Changed the implementation of Controls.MLRISC back to something closer
205       to the original.  In particular, this module (and therefore MLRISC)
206       does not depend on Controls.  There now is some link-time code in
207       int-sys.sml that registers the MLRISC controls with the Controls
208       module.
209    
210    CM:
211    ---
212    
213      * One can now specify the lambda-split aggressiveness in init.cmi.
214    
215    ----------------------------------------------------------------------
216    Name: Allen Leung
217    Date: 2002/03/13 17:30:00 EST
218    Tag: leunga-20020313-x86-fp-unary
219    Description:
220    
221    Bug fix for:
222    
223    > leunga@weaselbane:~/Yale/tmp/sml-dist{21} bin/sml
224    > Standard ML of New Jersey v110.39.1 [FLINT v1.5], March 08, 2002
225    > - fun f(x,(y,z)) = Real.~ y;
226    > [autoloading]
227    > [autoloading done]
228    >       fchsl   (%eax), 184(%esp)
229    > Error: MLRisc bug: X86MCEmitter.emitInstr
230    >
231    > uncaught exception Error
232    >   raised at: ../MLRISC/control/mlriscErrormsg.sml:16.14-16.19
233    
234    The problem was that the code generator did not generate any fp registers
235    in this case, and the register allocator didn't know that it needed to run the X86FP phase to
236    translate the pseudo fp instruction.   This only happened with unary fp
237    operators in certain situations.
238    
239    ----------------------------------------------------------------------
240    Name: Matthias Blume
241    Date: 2002/03/13 14:00:00 EST
242    Tag: blume-20020313-overload-etc
243    Description:
244    
245    1. Added _overload as a synonym for overload for backward compatibility.
246       (Control.overloadKW must be true for either version to be accepted.)
247    
248    2. Fixed bug in install script that caused more things to be installed
249       than what was requested in config/targets.
250    
251    3. Made CM aware of the (_)overload construct so that autoloading
252       works.
253    
254    ----------------------------------------------------------------------
255    Name: Matthias Blume
256    Date: 2002/03/12 22:03:00 EST
257    Tag: blume-20020312-url
258    Description:
259    
260    Forgot to update BOOT and srcarchiveurl.
261    
262    ----------------------------------------------------------------------
263    Name: Matthias Blume
264    Date: 2002/03/12 17:30:00 EST
265    Tag: blume-20020312-version110392
266    Description:
267    
268    Yet another version number bump (because of small changes to the
269    binfile format).  Version number is now 110.39.2.  NEW BOOTFILES!
270    
271    Changes:
272    
273      The new pid generation scheme described a few weeks ago was overly
274      complicated.  I implemented a new mechanism that is simpler and
275      provides a bit more "stability":  Once CM has seen a compilation
276      unit, it keeps its identity constant (as long as you do not delete
277      those crucial CM/GUID/* files).  This means that when you change
278      an interface, compile, then go back to the old interface, and
279      compile again, you arrive at the original pid.
280    
281      There now also is a mechanism that instructs CM to use the plain
282      environment hash as a module's pid (effectively making its GUID
283      the empty string).  For this, "noguid" must be specified as an
284      option to the .sml file in question within its .cm file.
285      This is most useful for code that is being generated by tools such
286      as ml-nlffigen (because during development programmers tend to
287      erase the tool's entire output directory tree including CM's cached
288      GUIDs).  "noguid" is somewhat dangerous (since it can be used to locally
289      revert to the old, broken behavior of SML/NJ), but in specific cases
290      where there is no danger of interface confusion, its use is ok
291      (I think).
292    
293      ml-nlffigen by default generates "noguid" annotations.  They can be
294      turned off by specifying -guid in its command line.
295    
296    ----------------------------------------------------------------------
297    Name: Lal George
298    Date: 2002/03/12 14:42:36 EST
299    Tag: george-20020312-frequency-computation
300    Description:
301    
302    Integrated jump chaining and static block frequency into the
303    compiler. More details and numbers later.
304    
305    ----------------------------------------------------------------------
306    Name: Lal George
307    Date: 2002/03/11 22:38:53 EST
308    Tag: george-20020311-jump-chain-elim
309    Description:
310    
311    Tested the jump chain elimination on all architectures (except the
312    hppa).  This is on by default right now and is profitable for the
313    alpha and x86; however, it may not be profitable for the sparc and ppc
314    when compiling the compiler.
315    
316    The gc test will typically jump to a label at the end of the cluster,
317    where there is another jump to an external cluster containing the actual
318    code to invoke gc. This is to allow factoring of common gc invocation
319    sequences. That is to say, we generate:
320    
321            f:
322               testgc
323               ja   L1      % jump if above to L1
324    
325            L1:
326               jmp L2
327    
328    
329    After jump chain elimination the 'ja L1' instruction is converted to
330    'ja L2'. On the sparc and ppc, many of the 'ja L2' instructions may end
331    up being implemented in their long form (if L2 is far away) using:
332    
333            jbe     L3      % jump if below or equal to L3
334            jmp     L2
335         L3:
336            ...
337    
338    
339    For large compilation units L2 may be far away.
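The retargeting step itself can be sketched as follows. This is not the MLRISC implementation: the list-of-pairs representation and the name `resolve` are assumptions made for illustration. Each pair (l, t) stands for a block labelled l that contains nothing but an unconditional jump to t.

```sml
(* Follow a chain of jump-only blocks to its final target.
 * Hypothetical representation; labels are plain strings. *)
fun resolve (chains : (string * string) list) (lab : string) : string =
    let
      fun next l =
          case List.find (fn (l', _) => l' = l) chains of
              SOME (_, t) => SOME t
            | NONE => NONE
      (* seen guards against cycles such as L1: jmp L1 *)
      fun follow (l, seen) =
          case next l of
              SOME t => if List.exists (fn s => s = t) seen then l
                        else follow (t, t :: seen)
            | NONE => l
    in
      follow (lab, [lab])
    end

(* 'ja L1' with L1: jmp L2 becomes 'ja L2' *)
val target = resolve [("L1", "L2")] "L1"   (* "L2" *)
```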
340    
341    
342    ----------------------------------------------------------------------
343    Name: Matthias Blume
344    Date: 2002/03/11 13:30:00 EST
345    Tag: blume-20020311-mltreeeval
346    Description:
347    
348    A functor parameter was missing.
349    
350    ----------------------------------------------------------------------
351    Name: Allen Leung
352    Date: 2002/03/11 10:30:00 EST
353    Tag: leunga-20020311-runtime-string0
354    Description:
355    
356       The representation of the empty string now points to a
357    legal null-terminated C string instead of unit.  It is now possible
358    to convert an ML string into a C string with InlineT.CharVector.getData.
359    This compiles into a single machine instruction.
360    
361    ----------------------------------------------------------------------
362    Name: Allen Leung
363    Date: 2002/03/10 23:55:00 EST
364    Tag: leunga-20020310-x86-call
365    Description:
366    
367       Added machine code generation for the CALL instruction (relative displacement mode).
368    
369    ----------------------------------------------------------------------
370    Name: Matthias Blume
371    Date: 2002/03/08 16:05:00
372    Tag: blume-20020308-entrypoints
373    Description:
374    
375    Version number bumped to 110.39.1.  NEW BOOTFILES!
376    
377    Entrypoints: non-zero offset into a code object where execution should begin.
378    
379    - Added the notion of an entrypoint to CodeObj.
380    - Added reading/writing of entrypoint info to Binfile.
381    - Made runtime system bootloader aware of entrypoints.
382    - Use the address of the label of the first function given to mlriscGen
383      as the entrypoint.  This address is currently always 0, but it will
384      not be 0 once we turn on block placement.
385    - Removed the linkage cluster code (which was The Other Way(tm) of dealing
386      with entry points) from mlriscGen.
387    
388    ----------------------------------------------------------------------
389    Name: Allen Leung
390    Date: 2002/03/07 20:45:00 EST
391    Tag: leunga-20020307-x86-cmov
392    Description:
393    
394       Bug fixes for CMOVcc on x86.
395    
396       1. Added machine code generation for CMOVcc
397       2. CMOVcc is now generated in preference over SETcc on PentiumPro or above.
398       3. CMOVcc cannot have an immediate operand as argument.
399    
400    ----------------------------------------------------------------------
401    Name: Matthias Blume
402    Date: 2002/03/07 16:15:00 EST
403    Tag: blume-20020307-controls
404    Description:
405    
406    This is a very large but mostly boring patch which makes (almost)
407    every tuneable compiler knob (i.e., pretty much everything under
408    Control.* plus a few other things) configurable via both the command
409    line and environment variables in the style CM did its configuration
410    until now.
411    
412    Try starting sml with '-h' (or, if you are brave, '-H')
413    
414    To this end, I added a structure Controls : CONTROLS to smlnj-lib.cm which
415    implements the underlying generic mechanism.
416    
417    The interface to some of the existing such facilities has changed somewhat.
418    For example, the MLRiscControl module now provides mkFoo instead of getFoo.
419    (The getFoo interface is still there for backward-compatibility, but its
420    use is deprecated.)
421    
422    The ml-build script passes -Cxxx=yyy command-line arguments through so
423    that one can now twiddle the compiler settings when using this "batch"
424    compiler.
425    
426    TODO items:
427    
428    We should go through and throw out all controls that are no longer
429    connected to anything.  Moreover, we should go through and provide
430    meaningful (and correct!) documentation strings for those controls
431    that still are connected.
432    
433    Currently, multiple calls to Controls.new are accepted (only the first
434    has any effect).  Eventually we should make sure that every control
435    is being made (via Controls.new) exactly once.  Future access can then
436    be done using Controls.acc.
437    
438    Finally, it would probably be a good idea to use the getter-setter
439    interface to controls rather than ref cells.  For the time being, both
440    styles are provided by the Controls module, but getter-setter pairs are
441    better if thread-safety is of any concern because they can be wrapped.
442    
443    *****************************************
444    
445    One bug fix: The function blockPlacement in three of the MLRISC
446    backpatch files used to be hard-wired to one of two possibilities at
447    link time (according to the value of the placementFlag).  But (I
448    think) it should rather sense the flag every time.
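The difference between the two behaviors can be shown with a small stand-alone model. The placement functions below are stand-ins (one is the identity, the other simply reverses the list), and every name except placementFlag is hypothetical.

```sml
val placementFlag = ref false

fun placeDefault blocks = blocks
fun placeWeighted blocks = List.rev blocks   (* stand-in for real placement *)

(* Hard-wired: the flag is read once, when this binding is evaluated. *)
val blockPlacementOld : int list -> int list =
    if !placementFlag then placeWeighted else placeDefault

(* Senses the flag on every call. *)
fun blockPlacementNew (blocks : int list) =
    if !placementFlag then placeWeighted blocks else placeDefault blocks

val () = placementFlag := true
val resultOld = blockPlacementOld [1, 2, 3]   (* still [1, 2, 3] *)
val resultNew = blockPlacementNew [1, 2, 3]   (* [3, 2, 1] *)
```

With the first form, changing the flag after link time has no effect, which is the behavior the fix removes.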
449    
450    *****************************************
451    
452    Other assorted changes (by other people who did not supply a HISTORY entry):
453    
454    1. the cross-module inliner now works much better (Monnier)
455    2. representation of weights, frequencies, and probabilities in MLRISC
456       changed in preparation for using those for weighted block placement
457       (Reppy, George)
458    
459    ----------------------------------------------------------------------
460    Name: Lal George
461    Date: 2002/03/07 14:44:24 EST
462    Tag: george-20020307-weighted-block-placement
463    
464    Tested the weighted block placement optimization on all architectures
465    (except the hppa) using AMPL to generate the block and edge frequencies.
466    Changes were required in the machine properties to correctly
467    categorize trap instructions. There is an MLRISC flag
468    "weighted-block-placement" that can be used to enable weighted block
469    placement, but this will be ineffective without block/edge
470    frequencies (coming soon).
471    
472    
473    ----------------------------------------------------------------------
474    Name: Lal George
475    Date: 2002/03/05 17:24:48 EST
476    Tag: george-20020305-linkage-cluster
477    
478    In order to support the block placement optimization, a new cluster
479    is generated as the very first cluster (called the linkage cluster).
480    It contains a single jump to the 'real' entry point for the compilation
481    unit. Block placement has no effect on the linkage cluster itself, but
482    all the other clusters have full freedom in the manner in which they
483    reorder blocks or functions.
484    
485    On the x86 the typical linkage code that is generated is:
486       ----------------------
487            .align 2
488       L0:
489            addl    $L1-L0, 72(%esp)
490            jmp     L1
491    
492    
493            .align  2
494       L1:
495       ----------------------
496    
497    72(%esp) is the memory location for the stdlink register. This
498    must contain the address of the CPS function being called. In the
499    above example, it contains the address of L0; before
500    calling L1 (the real entry point for the compilation unit), it
501    must contain the address for L1, and hence
502    
503            addl $L1-L0, 72(%esp)
504    
505    I have tested this on all architectures except the hppa. The increase
506    in code size is, of course, negligible.
507    
508    ----------------------------------------------------------------------
509  Name: Allen Leung
510  Date: 2002/03/03 13:20:00 EST
511  Tag: leunga-20020303-mlrisc-tools
