# Diff of /sml/trunk/HISTORY

revision 1003, Fri Dec 7 02:45:32 2001 UTC revision 1168, Fri Mar 22 19:19:32 2002 UTC
# Line 11
11  Date: yyyy/mm/dd
12  Tag: <post-commit CVS tag>
13  Description:
14
15    ----------------------------------------------------------------------
16    Name: Lal George
17    Date: 2002/03/22 14:18:25 EST
18    Tag: blume-20020321-cps-branch-prob
19    Description:
20
21    Implemented the Ball-Larus branch-prediction heuristic and
22    incorporated graphical viewers for control flow graphs.
23
24    Ball-Larus Heuristic:
25    ---------------------
26    See the file compiler/CodeGen/cpscompile/cpsBranchProb.sml.
27
28    By design it uses the Dempster-Shafer theory for combining
29    probabilities.  For example, in the function:
30
31        fun f(n,acc) = if n = 0 then acc else f(n-1, n*acc)
32
33    the Ball-Larus heuristics predict that n=0 is unlikely
34    (OH-heuristic), and the 'then' branch is unlikely because of the
35    RH-heuristic -- giving the 'then' branch an even lower combined
36    probability using the Dempster-Shafer theory.
37
38    Finally, John Reppy's loop analysis in MLRISC further lowers the
39    probability of the 'then' branch because of the loop in the else
40    branch.
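
A minimal sketch of how two branch-probability estimates can be combined
with the Dempster-Shafer rule, which is the step the description above
relies on.  The function name and the sample probabilities are
illustrative assumptions; they are not taken from cpsBranchProb.sml.

```sml
(* Dempster-Shafer combination of two independent estimates p1 and p2
   that the same branch is taken.  Hypothetical helper, for illustration
   only.  The result is undefined (nan) when the estimates contradict
   each other completely, e.g. p1 = 1.0 and p2 = 0.0. *)
fun combine (p1 : real, p2 : real) : real =
  let
    val taken    = p1 * p2                   (* both say "taken" *)
    val notTaken = (1.0 - p1) * (1.0 - p2)   (* both say "not taken" *)
  in
    taken / (taken + notTaken)
  end

(* Two heuristics that each consider the 'then' branch unlikely.
   The values 0.2 and 0.3 are made up for this example. *)
val thenProb = combine (0.2, 0.3)
```

When both estimates are below 0.5, the combined value is lower than
either of them, which matches the "even lower combined probability"
described above.  An estimate of exactly 0.5 carries no information and
leaves the other estimate unchanged.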
41
42
43    Graphical Viewing:
44    ------------------
45    I merely plugged Allen's graphical viewers into the compiler. The
46    additional code is not much. At the top level, saying:
47
48            Control.MLRISC.getFlag "cfg-graphical-view" := true;
49
50    will display the graphical view of the control flow graph just before
51    back-patching.  daVinci must be in your path for this to work. If
52    daVinci is not available, then the default viewer can be changed
53    using:
54
55            Control.MLRISC.getString "viewer"
56
57    which can be set to "dot" or "vcg" for the corresponding viewers. Of
58    course, these viewers must be in your path.
59
60    The above will display the compilation unit at the level of clusters,
61    many of which are small, boring, and uninteresting. Also setting:
62
63            Control.MLRISC.getInt "cfg-graphical-view_size"
64
65    will display clusters that are larger than the value set by the above.
66
67
68    ----------------------------------------------------------------------
69    Name: Matthias Blume
70    Date: 2002/03/21 22:20:00 EST
71    Tag: blume-20020321-kmp-bugfix
72    Description:
73
74    Changed the interface to the KMP routine in PreString and fixed
75    a minor bug in one place where it was used.
76
77    ----------------------------------------------------------------------
78    Name: Allen Leung
79    Date: 2002/03/21 20:30:00 EST
80    Tag: leunga-20020321-cfg
81    Description:
82
83      Fixed a potential problem in cfg edge splitting.
84
85    ----------------------------------------------------------------------
86    Name: Allen Leung
87    Date: 2002/03/21 17:15:00 EST
88    Tag: leunga-20020321-x86-fp-cfg
89    Description:
90
91      1. Recoded the buggy parts of x86-fp.
92
93         a. All the block reordering code has been removed.
94            We now depend on the block placement phases to do this work.
95
96         b. Critical edge splitting code has been simplified and moved into the
97            CFG modules, where it belongs.
98
99         Both of these were quite buggy and complex.  The code is now much, much
100         simpler.
101
102      2. X86 backend.
103
104         a. Added instructions for 64-bit support.  Instruction selection for
105            64-bit has not been committed, however, since that
106            requires changes to MLTREE which haven't been approved by
107            Lal and John.
108
109         b. Added support for FUCOMI and FUCOMIP when generating code for
110            PentiumPro and above.  We only generate these instructions in
111            the fast-fp mode.
112
113         c. Added cases for JP and JNP in X86FreqProps.
114
115      3. CFG
116
117         CFG now has a bunch of methods for edge splitting and merging.
118
119      4. Machine description.
120
121         John's simplification of MLTREE_BASIS.fcond broke a few machine
122         description things:
123
124         rtl-build.{sig,sml} and hppa.mdl fixed.
125
126         NOTE: the machine description stuff in the repository is still broken.
127               Again, I can't put my fixes in because that involves
128               changes to MLTREE.
129
130    ----------------------------------------------------------------------
131    Name: Matthias Blume
132    Date: 2002/03/20 15:55:00 EST
133    Tag: blume-20020320-kmp
134    Description:
135
136    Implemented Knuth-Morris-Pratt string matching in PreString and used
137    it for String.isSubstring, Substring.isSubstring, and
138    Substring.position.
139
140    (Might need some stress-testing.  Simple examples worked fine.)
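
For readers unfamiliar with the algorithm, here is a self-contained
sketch of Knuth-Morris-Pratt matching in plain Standard ML.  It is not
the PreString code; the names failureTable, kmpFind and isSubstring are
chosen for this example, with the argument order of isSubstring following
the Basis convention (pattern first).

```sml
(* fail[i] = length of the longest proper prefix of p[0..i] that is
   also a suffix of p[0..i]. *)
fun failureTable (p : string) : int array =
  let
    val m = String.size p
    val fail = Array.array (m, 0)
    fun loop (i, k) =
      if i >= m then ()
      else if String.sub (p, i) = String.sub (p, k) then
        (Array.update (fail, i, k + 1); loop (i + 1, k + 1))
      else if k > 0 then loop (i, Array.sub (fail, k - 1))
      else (Array.update (fail, i, 0); loop (i + 1, 0))
  in
    if m > 1 then loop (1, 0) else ();
    fail
  end

(* SOME i for the first index at which p occurs in s, otherwise NONE.
   The text index i never moves backwards. *)
fun kmpFind (p : string) (s : string) : int option =
  let
    val m = String.size p
    val n = String.size s
    val fail = failureTable p
    fun loop (i, k) =
      if k = m then SOME (i - m)
      else if i >= n then NONE
      else if String.sub (s, i) = String.sub (p, k) then loop (i + 1, k + 1)
      else if k > 0 then loop (i, Array.sub (fail, k - 1))
      else loop (i + 1, 0)
  in
    loop (0, 0)
  end

fun isSubstring (p : string) (s : string) : bool =
  Option.isSome (kmpFind p s)
```

With these definitions, kmpFind "abab" "abaabab" evaluates to SOME 3, and
the empty pattern matches at index 0.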
141
142    ----------------------------------------------------------------------
143    Name: Matthias Blume
144    Date: 2002/03/19 16:37:00 EST
145    Tag: blume-20020319-witnesses
146    Description:
147
148    Added a structure C.W and functions convert/Ptr.convert to ml-nlffi-lib.
149
150    This implements a generic mechanism for changing constness qualifiers
151    anywhere within big C types without resorting to outright "casts".
152    (So far, functions such as C.rw/C.ro or C.Ptr.rw/C.Ptr.ro only let you
153    modify the constness at the outermost level.)
154    The implementation of "convert" is based on the idea of "witness"
155    values -- values that are not used by the operation but whose types
156    "testify" to their applicability.  On the implementation side, "convert"
157    is simply a projection (returning its second curried argument).  With
158    cross-module inlining, it should not result in any machine code being
159    generated.
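
The witness idea can be sketched in a toy module.  Everything below
(MiniC and all of its members) is hypothetical and much simpler than the
real ml-nlffi-lib interface; it only shows how a value that is ignored at
run time can carry the typing evidence, and how "convert" reduces to a
projection.

```sml
structure MiniC :> sig
  type ro                    (* phantom tag: read-only *)
  type rw                    (* phantom tag: read-write *)
  type 'c int_t              (* phantom: a C int with constness 'c *)
  type ('t, 'c) ptr_t        (* phantom: pointer, constness 'c, target 't *)
  type 't obj                (* a C object whose phantom type is 't *)
  type ('a, 'b) witness      (* evidence: an 'a obj may be viewed as a 'b obj *)

  val intRo   : ('c int_t, ro int_t) witness
  val ptr     : ('a, 'b) witness -> (('a, 'c) ptr_t, ('b, ro) ptr_t) witness
  val convert : ('a, 'b) witness -> 'a obj -> 'b obj

  val new    : int -> rw int_t obj
  val get    : 'c int_t obj -> int
  val set    : rw int_t obj * int -> unit
  val addrOf : 't obj -> ('t, rw) ptr_t obj
  val deref  : ('t, 'c) ptr_t obj -> 't obj
end = struct
  type ro = unit
  type rw = unit
  type 'c int_t = unit
  type ('t, 'c) ptr_t = unit
  type 't obj = int ref      (* toy representation: every object is one cell *)
  type ('a, 'b) witness = unit

  val intRo = ()
  fun ptr () = ()
  fun convert () x = x       (* a projection: returns its second argument *)
  fun new i = ref i
  fun get r = !r
  fun set (r, i) = r := i
  fun addrOf x = x           (* toy: a "pointer" simply aliases its target *)
  fun deref x = x
end

val x  = MiniC.new 41
val () = MiniC.set (x, 42)

(* Constness changed below the outermost level: the target of the
   pointer becomes read-only, and no cast is involved. *)
val q = MiniC.convert (MiniC.ptr MiniC.intRo) (MiniC.addrOf x)
val fortyTwo = MiniC.get (MiniC.deref q)
```

An attempt such as MiniC.set (MiniC.deref q, 0) is rejected by the type
checker, because the target of q now has type "ro int_t obj".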
160
161    ----------------------------------------------------------------------
162    Name: Matthias Blume
163    Date: 2002/03/15 16:40:00 EST
164    Tag: blume-20020315-basis
165    Description:
166
167    Provided (preliminary?) implementations for
168
169      {String,Substring}.{concatWith,isSuffix,isSubstring}
170
171    and
172
173      Substring.full
174
175    Those are in the Basis spec but they were missing in SML/NJ.
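
As a reminder of the intended behavior, here are simple reference
definitions written against the rest of the Basis.  They are a sketch for
illustration, not the code that was committed, and they favor clarity
over speed.

```sml
(* concatWith sep [s1, ..., sn] = s1 ^ sep ^ ... ^ sep ^ sn *)
fun concatWith (sep : string) (ss : string list) : string =
  case ss of
    [] => ""
  | first :: rest =>
      String.concat
        (first :: List.foldr (fn (s, acc) => sep :: s :: acc) [] rest)

(* isSuffix suffix s is true when s ends with suffix. *)
fun isSuffix (suffix : string) (s : string) : bool =
  let
    val n = String.size suffix
    val m = String.size s
  in
    n <= m andalso String.extract (s, m - n, NONE) = suffix
  end

(* full s is the substring covering all of s. *)
fun full (s : string) : Substring.substring =
  Substring.extract (s, 0, NONE)
```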
176
177    ----------------------------------------------------------------------
178    Name: Matthias Blume
179    Date: 2002/03/14 21:30:00 EST
180    Tag: blume-20020314-controls
181    Description:
182
183    Controls:
184    ---------
185
186    1. Factored out the recently-added Controls : CONTROLS stuff and put
187       it into its own library $/controls-lib.cm.  The source tree for
188       this is under src/smlnj-lib/Controls.
189
190    2. Changed the names of types and functions in this interface, so they
191       make a bit more "sense":
192
193          module -> registry
194          'a registry -> 'a group
195
196    3. The interface now deals in ref cells only.  The getter/setter interface
197       is (mostly) gone.
198
199    4. Added a function that lets one register an already-existing ref cell.
200
201    5. Made the corresponding modifications to the rest of the code so that
202       everything compiles again.
203
204    6. Changed the implementation of Controls.MLRISC back to something closer
205       to the original.  In particular, this module (and therefore MLRISC)
206       does not depend on Controls.  There now is some link-time code in
207       int-sys.sml that registers the MLRISC controls with the Controls
208       module.
209
210    CM:
211    ---
212
213      * One can now specify the lambda-split aggressiveness in init.cmi.
214
215    ----------------------------------------------------------------------
216    Name: Allen Leung
217    Date: 2002/03/13 17:30:00 EST
218    Tag: leunga-20020313-x86-fp-unary
219    Description:
220
221    Bug fix for:
222
223    > leunga@weaselbane:~/Yale/tmp/sml-dist{21} bin/sml
224    > Standard ML of New Jersey v110.39.1 [FLINT v1.5], March 08, 2002
225    > - fun f(x,(y,z)) = Real.~ y;
228    >       fchsl   (%eax), 184(%esp)
229    > Error: MLRisc bug: X86MCEmitter.emitInstr
230    >
231    > uncaught exception Error
232    >   raised at: ../MLRISC/control/mlriscErrormsg.sml:16.14-16.19
233
234    The problem was that the code generator did not generate any fp registers
235    in this case, and the ra didn't know that it needed to run the X86FP phase to
236    translate the pseudo fp instruction.   This only happened with unary fp
237    operators in certain situations.
238
239    ----------------------------------------------------------------------
240    Name: Matthias Blume
241    Date: 2002/03/13 14:00:00 EST
243    Description:
244
245    1. Added _overload as a synonym for overload for backward compatibility.
246       (Control.overloadKW must be true for either version to be accepted.)
247
248    2. Fixed bug in install script that caused more things to be installed
249       than what was requested in config/targets.
250
252       works.
253
254    ----------------------------------------------------------------------
255    Name: Matthias Blume
256    Date: 2002/03/12 22:03:00 EST
257    Tag: blume-20020312-url
258    Description:
259
260    Forgot to update BOOT and srcarchiveurl.
261
262    ----------------------------------------------------------------------
263    Name: Matthias Blume
264    Date: 2002/03/12 17:30:00 EST
265    Tag: blume-20020312-version110392
266    Description:
267
268    Yet another version number bump (because of small changes to the
269    binfile format).  Version number is now 110.39.2.  NEW BOOTFILES!
270
271    Changes:
272
273      The new pid generation scheme described a few weeks ago was overly
274      complicated.  I implemented a new mechanism that is simpler and
275      provides a bit more "stability":  Once CM has seen a compilation
276      unit, it keeps its identity constant (as long as you do not delete
277      those crucial CM/GUID/* files).  This means that when you change
278      an interface, compile, then go back to the old interface, and
279      compile again, you arrive at the original pid.
280
281      There now also is a mechanism that instructs CM to use the plain
282      environment hash as a module's pid (effectively making its GUID
283      the empty string).  For this, "noguid" must be specified as an
284      option to the .sml file in question within its .cm file.
285      This is most useful for code that is being generated by tools such
286      as ml-nlffigen (because during development programmers tend to
287      erase the tool's entire output directory tree including CM's cached
288      GUIDs).  "noguid" is somewhat dangerous (since it can be used to locally
289      revert to the old, broken behavior of SML/NJ), but in specific cases
290      where there is no danger of interface confusion, its use is ok
291      (I think).
292
293      ml-nlffigen by default generates "noguid" annotations.  They can be
294      turned off by specifying -guid in its command line.
295
296    ----------------------------------------------------------------------
297    Name: Lal George
298    Date: 2002/03/12 14:42:36 EST
299    Tag: george-20020312-frequency-computation
300    Description:
301
302    Integrated jump chaining and static block frequency into the
303    compiler. More details and numbers later.
304
305    ----------------------------------------------------------------------
306    Name: Lal George
307    Date: 2002/03/11 22:38:53 EST
308    Tag: george-20020311-jump-chain-elim
309    Description:
310
311    Tested the jump chain elimination on all architectures (except the
312    hppa).  This is on by default right now and is profitable for the
313    alpha and x86; however, it may not be profitable for the sparc and ppc
314    when compiling the compiler.
315
316    The gc test will typically jump to a label at the end of the cluster,
317    where there is another jump to an external cluster containing the actual
318    code to invoke gc. This is to allow factoring of common gc invocation
319    sequences. That is to say, we generate:
320
321            f:
322               testgc
323               ja   L1      % jump if above to L1
324
325            L1:
326               jmp L2
327
328
329    After jump chain elimination the 'ja L1' instruction is converted to
330    'ja L2'. On the sparc and ppc, many of the 'ja L2' instructions may end
331    up being implemented in their long form (if L2 is far away) using:
332
333            jbe     L3      % jump if below or equal to L3
334            jmp     L2
335         L3:
336            ...
337
338
339    For large compilation units L2  may be far away.
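
The transformation itself can be sketched on a toy instruction
representation.  The datatype and function names below are assumptions
made for this example and have nothing to do with the MLRISC data
structures; a block counts as a chain link when it consists of a single
unconditional jump.

```sml
datatype instr
  = Jmp of string               (* unconditional jump to a label *)
  | Jcc of string * string      (* condition code, target label *)
  | Other of string             (* any non-branch instruction *)

type block = string * instr list   (* label, instructions *)

(* Follow jump-only blocks starting at a label until a block that does
   real work is reached.  Labels already visited are remembered so that
   a cycle of jumps cannot cause an endless loop. *)
fun resolve (blocks : block list) (start : string) : string =
  let
    fun follow (l, seen) =
      case List.find (fn (l', _) => l' = l) blocks of
        SOME (_, [Jmp next]) =>
          if List.exists (fn s => s = next) seen then l
          else follow (next, next :: seen)
      | _ => l
  in
    follow (start, [start])
  end

(* Retarget every branch to the end of its jump chain. *)
fun elimChains (blocks : block list) : block list =
  let
    val target = resolve blocks
    fun rewrite (Jmp l) = Jmp (target l)
      | rewrite (Jcc (cc, l)) = Jcc (cc, target l)
      | rewrite i = i
  in
    List.map (fn (l, is) => (l, List.map rewrite is)) blocks
  end

(* The gc-test example from above. *)
val input : block list =
  [ ("f",  [Other "testgc", Jcc ("a", "L1")]),
    ("L1", [Jmp "L2"]),
    ("L2", [Other "invoke gc"]) ]

val output = elimChains input
```

In output, the branch in f targets L2 directly.  The sketch does not
remove the now-unreferenced L1 block, and it does not model the
short-form versus long-form branch encodings discussed above.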
340
341
342    ----------------------------------------------------------------------
343    Name: Matthias Blume
344    Date: 2002/03/11 13:30:00 EST
345    Tag: blume-20020311-mltreeeval
346    Description:
347
348    A functor parameter was missing.
349
350    ----------------------------------------------------------------------
351    Name: Allen Leung
352    Date: 2002/03/11 10:30:00 EST
353    Tag: leunga-20020311-runtime-string0
354    Description:
355
356       The representation of the empty string now points to a
357    legal null terminated C string instead of unit.  It is now possible
358    to convert an ML string into C string with InlineT.CharVector.getData.
359    This compiles into one single machine instruction.
360
361    ----------------------------------------------------------------------
362    Name: Allen Leung
363    Date: 2002/03/10 23:55:00 EST
364    Tag: leunga-20020310-x86-call
365    Description:
366
367       Added machine code generation for the CALL instruction (relative displacement mode)
368
369    ----------------------------------------------------------------------
370    Name: Matthias Blume
371    Date: 2002/03/08 16:05:00
372    Tag: blume-20020308-entrypoints
373    Description:
374
375    Version number bumped to 110.39.1.  NEW BOOTFILES!
376
377    Entrypoints: non-zero offset into a code object where execution should begin.
378
379    - Added the notion of an entrypoint to CodeObj.
380    - Added reading/writing of entrypoint info to Binfile.
381    - Made runtime system bootloader aware of entrypoints.
382    - Use the address of the label of the first function given to mlriscGen
383      as the entrypoint.  This address is currently always 0, but it will
384      not be 0 once we turn on block placement.
385    - Removed the linkage cluster code (which was The Other Way(tm) of dealing
386      with entry points) from mlriscGen.
387
388    ----------------------------------------------------------------------
389    Name: Allen Leung
390    Date: 2002/03/07 20:45:00 EST
391    Tag: leunga-20020307-x86-cmov
392    Description:
393
394       Bug fixes for CMOVcc on x86.
395
396       1. Added machine code generation for CMOVcc
397       2. CMOVcc is now generated in preference over SETcc on PentiumPro or above.
398       3. CMOVcc cannot have an immediate operand as argument.
399
400    ----------------------------------------------------------------------
401    Name: Matthias Blume
402    Date: 2002/03/07 16:15:00 EST
403    Tag: blume-20020307-controls
404    Description:
405
406    This is a very large but mostly boring patch which makes (almost)
407    every tuneable compiler knob (i.e., pretty much everything under
408    Control.* plus a few other things) configurable via both the command
409    line and environment variables in the style CM did its configuration
410    until now.
411
412    Try starting sml with '-h' (or, if you are brave, '-H')
413
414    To this end, I added a structure Controls : CONTROLS to smlnj-lib.cm which
415    implements the underlying generic mechanism.
416
417    The interface to some of the existing such facilities has changed somewhat.
418    For example, the MLRiscControl module now provides mkFoo instead of getFoo.
419    (The getFoo interface is still there for backward-compatibility, but its
420    use is deprecated.)
421
422    The ml-build script passes -Cxxx=yyy command-line arguments through so
423    that one can now twiddle the compiler settings when using this "batch"
424    compiler.
425
426    TODO items:
427
428    We should go through and throw out all controls that are no longer
429    connected to anything.  Moreover, we should go through and provide
430    meaningful (and correct!) documentation strings for those controls
431    that still are connected.
432
433    Currently, multiple calls to Controls.new are accepted (only the first
434    has any effect).  Eventually we should make sure that every control
435    is being made (via Controls.new) exactly once.  Future access can then
436    be done using Controls.acc.
437
438    Finally, it would probably be a good idea to use the getter-setter
439    interface to controls rather than ref cells.  For the time being, both
440    styles are provided by the Controls module, but getter-setter pairs are
441    better if thread-safety is of any concern because they can be wrapped.
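
To illustrate why getter-setter pairs can be wrapped while bare ref
cells cannot, here is a small sketch.  The type and function names are
hypothetical and do not reproduce the Controls interface; the
enter/leave hooks stand in for whatever locking a threaded client would
use.

```sml
(* A control seen only through a getter and a setter. *)
type 'a control = { get : unit -> 'a, set : 'a -> unit }

(* View an existing ref cell as a control. *)
fun fromRef (r : 'a ref) : 'a control =
  { get = fn () => !r, set = fn x => r := x }

(* Run enter before and leave after every access.  Exceptions are not
   handled here; a real wrapper would also have to call leave on the
   exceptional path. *)
fun guarded (enter : unit -> unit, leave : unit -> unit)
            (c : 'a control) : 'a control =
  { get = fn () =>
            (enter (); let val v = #get c () in leave (); v end),
    set = fn x => (enter (); #set c x; leave ()) }

(* Example: count accesses instead of taking a lock. *)
val accesses = ref 0
val knob = guarded (fn () => accesses := !accesses + 1, fn () => ())
                   (fromRef (ref 3))
val () = #set knob 7
val current = #get knob ()
```

A client that holds the underlying ref cell can always bypass such
hooks, which is the reason the getter-setter style is preferable when
thread-safety matters.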
442
443    *****************************************
444
445    One bug fix: The function blockPlacement in three of the MLRISC
446    backpatch files used to be hard-wired to one of two possibilities at
447    link time (according to the value of the placementFlag).  But (I
448    think) it should rather sense the flag every time.
449
450    *****************************************
451
452    Other assorted changes (by other people who did not supply a HISTORY entry):
453
454    1. the cross-module inliner now works much better (Monnier)
455    2. representation of weights, frequencies, and probabilities in MLRISC
456       changed in preparation of using those for weighted block placement
457       (Reppy, George)
458
459    ----------------------------------------------------------------------
460    Name: Lal George
461    Date: 2002/03/07 14:44:24 EST
462    Tag: george-20020307-weighted-block-placement
463
464    Tested the weighted block placement optimization on all architectures
465    (except the hppa) using AMPL to generate the block and edge frequencies.
466    Changes were required in the machine properties to correctly
467    categorize trap instructions. There is an MLRISC flag
468    "weighted-block-placement" that can be used to enable weighted block
469    placement, but this will be ineffective without block/edge
470    frequencies (coming soon).
471
472
473    ----------------------------------------------------------------------
474    Name: Lal George
475    Date: 2002/03/05 17:24:48 EST
477
478    In order to support the block placement optimization, a new cluster
479    is generated as the very first cluster (called the linkage cluster).
480    It contains a single jump to the 'real' entry point for the compilation
481    unit. Block placement has no effect on the linkage cluster itself, but
482    all the other clusters  have full freedom in the manner in which they
483    reorder blocks or functions.
484
485    On the x86 the typical linkage code that is generated is:
486       ----------------------
487            .align 2
488       L0:
489            addl    $L1-L0, 72(%esp)
490            jmp     L1
491
492
493            .align  2
494       L1:
495       ----------------------
496
497    72(%esp) is the memory location for the stdlink register. This
498    must contain the address of the CPS function being called. In the
499    above example, it contains the address of  L0; before
500    calling L1 (the real entry point for the compilation unit), it
501    must contain the address for L1, and hence
502
503            addl $L1-L0, 72(%esp)
504
505    I have tested this on all architectures except the hppa. The increase
506    in code size is of course negligible.
507
508    ----------------------------------------------------------------------
509    Name: Allen Leung
510    Date: 2002/03/03 13:20:00 EST
511    Tag: leunga-20020303-mlrisc-tools
512
513      Added #[ ... ] expressions to mlrisc tools
514
515    ----------------------------------------------------------------------
516    Name: Matthias Blume
517    Date: 2002/02/27 12:29:00 EST
518    Tag: blume-20020227-cdebug
519    Description:
520
521    - made the types in structures C and C_Debug equal
522    - got rid of code duplication (c-int.sml vs. c-int-debug.sml)
523    - there no longer is a C_Int_Debug (C_Debug is directly derived from C)
524
525    ----------------------------------------------------------------------
526    Name: Matthias Blume
527    Date: 2002/02/26 12:00:00 EST
528    Tag: blume-20020226-ffi
529    Description:
530
531    1. Fixed a minor bug in CM's "noweb" tool:
532       If numbering is turned off, then truly don't number (i.e., do not
533       supply the -L option to noweb).  The previous behavior was to supply
534       -L'' -- which caused noweb to use the "default" line numbering scheme.
535       Thanks to Chris Richards for pointing this out (and supplying the fix).
536
537    2. Once again, I reworked some aspects of the FFI:
538
539       A. The incomplete/complete type business:
540
541       - Signatures POINTER_TO_INCOMPLETE_TYPE and accompanying functors are
542         gone!
543       - ML types representing an incomplete type are now *equal* to
544         ML types representing their corresponding complete types (just like
545         in C).  This is still safe because ml-nlffigen will not generate
546         RTTI for incomplete types, nor will it generate functions that
547         require access to such RTTI.   But when ML code generated from both
548         incomplete and complete versions of the C type meet, the ML types
549         are trivially interoperable.
550
551         NOTE:  These changes restore the full generality of the translation
552         (which was previously lost when I eliminated functorization)!
553
554       B. Enum types:
555
556       - Structure C now has a type constructor "enum" that is similar to
557         how the "su" constructor works.  However, "enum" is not a phantom
558         type because each "T enum" has values (and is isomorphic to
559         MLRep.Signed.int).
560       - There are generic access operations for enum objects (using
561         MLRep.Signed.int).
562       - ml-nlffigen will generate a structure E_foo for each "enum foo".
563         * The structure contains the definition of type "mlrep" (the ML-side
564         representation type of the enum).  Normally, mlrep is the same
565         as "MLRep.Signed.int", but if ml-nlffigen was invoked with "-ec",
566         then mlrep will be defined as a datatype -- thus facilitating
567         pattern matching on mlrep values.
568         ("-ec" will be suppressed if there are duplicate values in an
569          enumeration.)
570         * Constructors ("-ec") or values (no "-ec") e_xxx of type mlrep
571         will be generated for each C enum constant xxx.
572         * Conversion functions m2i and i2m convert between mlrep and
573         MLRep.Signed.int.  (Without "-ec", these functions are identities.)
574         * Conversion functions c and ml convert between mlrep and "tag enum".
575         * Access functions (get/set) fetch and store mlrep values.
576       - By default (unless ml-nlffigen was invoked with "-nocollect"), unnamed
577         enumerations are merged into one single enumeration represented by
578         structure E_'.
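
As an illustration of the "-ec" case, the fragment below shows roughly
what such a structure could look like for a hypothetical C declaration
"enum color { RED, GREEN, BLUE };".  It is hand-written, not actual
ml-nlffigen output: plain int stands in for MLRep.Signed.int, the
behavior of i2m on a non-member value is an assumption, and the c, ml,
get and set members are left out.

```sml
structure E_color = struct
  (* With "-ec": one constructor per C enum constant, which allows
     pattern matching on mlrep values. *)
  datatype mlrep = e_RED | e_GREEN | e_BLUE

  (* mlrep -> integer representation *)
  fun m2i e_RED   = 0
    | m2i e_GREEN = 1
    | m2i e_BLUE  = 2

  (* integer representation -> mlrep; raising Fail on other values is
     an assumption made for this sketch. *)
  fun i2m 0 = e_RED
    | i2m 1 = e_GREEN
    | i2m 2 = e_BLUE
    | i2m _ = raise Fail "E_color.i2m: not a member of enum color"
end

(* Pattern matching on the generated datatype. *)
fun describe E_color.e_RED   = "red"
  | describe E_color.e_GREEN = "green"
  | describe E_color.e_BLUE  = "blue"
```

Without "-ec", mlrep would simply be the integer type, e_RED and friends
would be integer values, and m2i and i2m would be identity functions, as
described above.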
579
580    ----------------------------------------------------------------------
581    Name: Allen Leung
582    Date: 2002/02/25 04:45:00 EST
583    Tag: leunga-20020225-cps-spill
584
585    This is a new implementation of the CPS spill phase.
586    The new phase is in the new file compiler/CodeGen/cpscompile/spill-new.sml
587    In case of problems, replace it with the old file spill.sml
588
589    The current compiler runs into some serious performance problems when
590    constructing a large record.  This can happen when we try to compile a
591    structure with many items.  Even a very simple structure like the following
592    makes the compiler slow down.
593
594        structure Foo = struct
595           val x_1 = 0w1 : Word32.int
596           val x_2 = 0w2 : Word32.int
597           val x_3 = 0w3 : Word32.int
598           ...
599           val x_N = 0wN : Word32.int
600        end
601
602    The following table shows the compile time, from N=1000 to N=4000,
603    with the old compiler:
604
605    N
606    1000   CPS 100 spill                           0.04u  0.00s  0.00g
607           MLRISC ra                               0.06u  0.00s  0.05g
608              (spills = 0 reloads = 0)
609           TOTAL                                   0.63u  0.07s  0.21g
610
611    1100   CPS 100 spill                           8.25u  0.32s  0.64g
612           MLRISC ra                               5.68u  0.59s  3.93g
613              (spills = 0 reloads = 0)
614           TOTAL                                   14.71u  0.99s  4.81g
615
616    1500   CPS 100 spill                           58.55u  2.34s  1.74g
617           MLRISC ra                               5.54u  0.65s  3.91g
618              (spills = 543 reloads = 1082)
619           TOTAL                                   65.40u  3.13s  6.00g
620
621    2000   CPS 100 spill                           126.69u  4.84s  3.08g
622           MLRISC ra                               0.80u  0.10s  0.55g
623              (spills = 42 reloads = 84)
624           TOTAL                                   129.42u  5.10s  4.13g
625
626    3000   CPS 100 spill                           675.59u  19.03s  11.64g
627           MLRISC ra                               2.69u  0.27s  1.38g
628              (spills = 62 reloads = 124)
629           TOTAL                                   682.48u  19.61s  13.99g
630
631    4000   CPS 100 spill                           2362.82u  56.28s  43.60g
632           MLRISC ra                               4.96u  0.27s  2.72g
633              (spills = 85 reloads = 170)
634           TOTAL                                   2375.26u  57.21s  48.00g
635
636    As you can see, the old cps spill module suffers from some serious
637    performance problems.  But since I cannot decipher the old code fully,
638    instead of patching the problems up, I'm reimplementing it
639    with a different algorithm.  The new code is more modular,
640    smaller when compiled, and substantially faster
641    (O(n log n) time and O(n) space).  Timing of the new spill module:
642
643    4000  CPS 100 spill                           0.02u  0.00s  0.00g
644          MLRISC ra                               0.25u  0.02s  0.15g
646          TOTAL                                   7.74u  0.34s  1.62g
647
648    Implementation details:
649
650    As far as I can tell, the purpose of the CPS spill module is to make sure the
651    number of live variables at any program point (the bandwidth)
652    does not exceed a certain limit, which is determined by the
653    size of the spill area.
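
The notion of bandwidth can be made concrete with a small sketch over
straight-line code.  The stmt type and the bandwidth function are
assumptions made for this example; the real phase works on CPS trees and
also applies the single-use rule for record addresses described below,
which this sketch ignores.

```sml
(* One statement: defines a variable from some used variables. *)
type stmt = { def : string, uses : string list }

(* Maximum number of simultaneously live variables, found by walking
   the statements backwards from the variables live at the end. *)
fun bandwidth (stmts : stmt list, liveOut : string list) : int =
  let
    fun insert (x, s) = if List.exists (fn y => y = x) s then s else x :: s
    fun remove (x, s) = List.filter (fn y => y <> x) s
    fun step ({def, uses} : stmt, (live, maxLive)) =
      let
        (* live before the statement = (live after - def) + uses *)
        val live' = List.foldl insert (remove (def, live)) uses
      in
        (live', Int.max (maxLive, length live'))
      end
    val (_, m) = List.foldr step (liveOut, length liveOut) stmts
  in
    m
  end

(* A three-field version of structure Foo: the final record keeps all
   of its components live at once. *)
val prog : stmt list =
  [ { def = "x_1", uses = [] },
    { def = "x_2", uses = [] },
    { def = "x_3", uses = [] },
    { def = "r",   uses = ["x_1", "x_2", "x_3"] } ]

val width = bandwidth (prog, ["r"])
```

For a structure with N such bindings the same calculation yields N,
which is why large structures push the bandwidth past the limit.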
654
655    When the bandwidth is too large, we decrease the register pressure by
656    packing live variables into spill records.  How we achieve this is
657    completely different than what we did in the old code.
658
659    First, there is something about the MLRiscGen code generator
660    that we should be aware of:
661
662    o MLRiscGen performs code motion!
663
664       In particular, it will move floating point computations and
665       address computations involving only the heap pointer to
666       their use sites (if there is only a single use).
667       What this means is that if we have a CPS record construction
668       statement
669
670           RECORD(k,vl,w,e)
671
672       we should never count the new record address w as live if w
673       has only one use (which is often the case).
674
675       We should do something similar to floating point, but the transformation
676       there is much more complex, so I won't deal with that.
677
678    Secondly, there are now two new cps primops at our disposal:
679
680     1. rawrecord of record_kind option
681        This pure operator allocates some uninitialized storage from the heap.
682        There are two forms:
683
684         rawrecord NONE [INT n]  allocates a tagless record of length n
685         rawrecord (SOME rk) [INT n] allocates a tagged record of length n
686                                     and initializes the tag.
687
688     2. rawupdate of cty
689          rawupdate cty (v,i,x)
690          Assigns x to the ith component of record v.
691          The storelist is not updated.
692
693    We use these new primops for both spilling and incremental record construction.
694
695     1. Spilling.
696
697        This is implemented with a linear scan algorithm (but generalized
698        to trees).  The algorithm will create a single spill record at the
699        beginning of the cps function and use rawupdate to spill to it,
700        and SELECT or SELp to reload from it.  So both spills and reloads
701        are fine-grain operations.  In contrast, in the old algorithm
702        "spills" have to be bundled together in records.
703
704        Ideally, we should sink the spill record construction to where
705        it is needed.  We can even split the spill record into multiple ones
706        at the places where they are needed.  But CPS is not a good
707        representation for global code motion, so I'll keep it simple and
708        am not attempting this.
709
710     2. Incremental record construction (aka record splitting).
711
712        Long records with many component values which are simultaneously live
713        (recall that single use record addresses are not considered to
714         be live) are constructed with rawrecord and rawupdate.
715        We allocate space on the heap with rawrecord first, then gradually
716        fill it in with rawupdate.  This is the technique suggested to me
717        by Matthias.
718
719        Some restrictions on when this is applicable:
720        1. It is not a VECTOR record.  The code generator currently does not handle
721           this case. VECTOR record uses double indirection like arrays.
722        2. All the record component values are defined in the same "basic block"
723           as the record constructor.  This is to prevent speculative
724           record construction.
725
726    ----------------------------------------------------------------------
727    Name: Allen Leung
728    Date: 2002/02/22 01:02:00 EST
729    Tag: leunga-20020222-mlrisc-tools
730
731    Minor bug fixes in the parser and rewriter
732
733    ----------------------------------------------------------------------
734    Name: Allen Leung
735    Date: 2002/02/21 20:20:00 EST
736    Tag: leunga-20020221-peephole
737
738    Regenerated the peephole files.  Some contained typos in the specification
739    and some didn't compile because of pretty printing bugs in the old version
740    of 'nowhere'.
741
742    ----------------------------------------------------------------------
743    Name: Allen Leung
744    Date: 2002/02/19 20:20:00 EST
745    Tag: leunga-20020219-mlrisc-tools
746    Description:
747
748       Minor bug fixes to the mlrisc-tools library:
749
750       1.  Fixed up parsing colon suffixed keywords
751       2.  Added the ability to shut the error messages up
752       3.  Reimplemented the pretty printer and fixed up/improved
753           the pretty printing of handle and -> types.
754       4.  Fixed up generation of literal symbols in the nowhere tool.
755       5.  Added some SML keywords to sml.sty
756
757    ----------------------------------------------------------------------
758    Name: Matthias Blume
759    Date: 2002/02/19 16:20:00 EST
760    Tag: blume-20020219-cmffi
761    Description:
762
763    A wild mix of changes, some minor, some major:
764
765    * All C FFI-related libraries are now anchored under \$c:
766        \$/c.cm      --> \$c/c.cm
767        \$/c-int.cm  --> \$c/internals/c-int.cm
768        \$/memory.cm --> \$c/memory/memory.cm
769
770    * "make" tool (in CM) now treats its argument pathname slightly
771      differently:
772        1. If the native expansion is an absolute name, then before invoking
773           the "make" command on it, CM will apply OS.Path.mkRelative
774           (with relativeTo = OS.FileSys.getDir()) to it.
775        2. The argument will be passed through to subsequent phases of CM
776           processing without "going native".  In particular, if the argument
777           was an anchored path, then "make" will not lose track of that anchor.
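
Step 1 above can be illustrated with Python's `posixpath.relpath`, used here only as a rough analogue of `OS.Path.mkRelative`. The helper name `make_argument` is hypothetical and this is not CM's code.

```python
import posixpath


def make_argument(native_path: str, cwd: str) -> str:
    """Return the name handed to "make": absolute names become relative to cwd."""
    if posixpath.isabs(native_path):
        # Analogue of OS.Path.mkRelative {path = ..., relativeTo = cwd}
        return posixpath.relpath(native_path, start=cwd)
    return native_path  # already relative: passed through unchanged
```

For example, `make_argument("/usr/lib/c/memory", "/usr/lib")` yields `c/memory`.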
778
779    * Compiler backends now "know" their respective C calling conventions
780      instead of having to be told about it by ml-nlffigen.  This relieves
781      ml-nlffigen from one of its burdens.
782
783    * The X86Backend has been split into X86CCallBackend and X86StdCallBackend.
784
785    * Export C_DEBUG and C_Debug from \$c/c.cm.
786
787    * C type encoding in ml-nlffi-lib has been improved to model the conceptual
788      subtyping relationship between incomplete pointers and their complete
789      counterparts.  For this, ('t, 'c) ptr has been changed to 'o ptr --
790      with the convention of instantiating 'o with ('t, 'c) obj whenever
791      the pointer target type is complete.  In the incomplete case, 'o
792      will be instantiated with some "'c iobj" -- a type obtained by
793      using one of the functors PointerToIncompleteType or PointerToCompleteType.
794
795      Operations that work on both incomplete and complete pointer types are
796      typed as taking an 'o ptr while operations that require the target to
797      be known are typed as taking some ('t, 'c) obj ptr.
798
799      voidptr is now a bit "more concrete", namely "type voidptr = void ptr'"
800      where void is an eqtype without any values.  This makes it possible
801      to work on voidptr values using functions meant to operate on light
802      incomplete pointers.
803
804    * As a result of the above, signature POINTER_TO_INCOMPLETE_TYPE has
805      been vastly simplified.
806
807    ----------------------------------------------------------------------
808    Name: Matthias Blume
809    Date: 2002/02/19 10:48:00 EST
810    Tag: blume-20020219-pqfix
811    Description:
812
813    Applied Chris Okasaki's bug fix for priority queues.
814
815    ----------------------------------------------------------------------
816    Name: Matthias Blume
817    Date: 2002/02/15 17:05:00
818    Tag: Release_110_39
819    Description:
820
821    Last-minute retagging is becoming a tradition... :-(
822
823    This is the working release 110.39.
824
825    ----------------------------------------------------------------------
826    Name: Matthias Blume
827    Date: 2002/02/15 16:00:00 EST
828    Tag: Release_110_39-orig
829    Description:
830
831    Working release 110.39.  New bootfiles.
832
833    (Update: There was a small bug in the installer so it wouldn't work
834    with all shells.  So I retagged. -Matthias)
835
836    ----------------------------------------------------------------------
837    Name: Matthias Blume
838    Date: 2002/02/15 14:17:00 EST
839    Tag: blume-20020215-showbindings
840    Description:
841
842    Added EnvRef.listBoundSymbols and CM.State.showBindings.  Especially
843    the latter can be useful for exploring what bindings are available at
844    the interactive prompt.  (The first function returns only the list
845    of symbols that are really bound, the second prints those but also the
847
848    ----------------------------------------------------------------------
849    Name: Matthias Blume
850    Date: 2002/02/15 12:08:00 EST
851    Tag: blume-20020215-iptrs
852    Description:
853
854    Two improvements to ml-nlffigen:
855
856      1. Write files only if they do not exist or if their current contents
857         do not coincide with what's being written.  (That is, avoid messing
858         with the time stamps unless absolutely necessary.)
859
860      2. Implement a "repository" mechanism for generated files related
861         to "incomplete pointer types".   See the README file for details.
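
The first improvement (write only when the contents differ) can be sketched as follows. This is a Python illustration of the idea, not ml-nlffigen's SML code; `write_if_changed` is a hypothetical name.

```python
from pathlib import Path


def write_if_changed(path: Path, contents: str) -> bool:
    """Write contents to path only when needed; return True if a write happened."""
    if path.exists() and path.read_text() == contents:
        return False  # identical contents: leave the time stamp alone
    path.write_text(contents)
    return True
```

Leaving the time stamp untouched means that tools which compare modification times do not see an unchanged generated file as new.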
862
863    ----------------------------------------------------------------------
864    Name: Matthias Blume
865    Date: 2002/02/14 11:50:00 EST
866    Tag: blume-20020214-quote
867    Description:
868
869    Added a type 't t_' to tag.sml (in ml-nlffi-lib.cm).  This is required
870    because of the new and improved tag generation scheme.  (Thanks to Allen
871    Leung for pointing it out.)
872
873    ----------------------------------------------------------------------
874    Name: Lal George
875    Date: 2002/02/14 09:55:27 EST 2002
876    Tag: george-20020214-isabelle-bug
877    Description:
878
879    Fixed the MLRISC bug sent by Markus Wenzel regarding the compilation
880    of Isabelle on the x86.
881
882    From Allen:
883    -----------
884     I've found the problem:
885
886         in ra-core.sml, I use the counter "blocked" to keep track of the
887         true number of elements in the freeze queue.  When the counter goes
888         to zero, I skip examining the queue.  But I've messed up the
889         bookkeeping in combine():
890
891             else ();
892             case !ucol of
893               PSEUDO => (if !cntv > 0 then
894                     (if !cntu > 0 then blocked := !blocked - 1 else ();
895                                        ^^^^^^^^^^^^^^^^^^^^^^^
896                      moveu := mergeMoveList(!movev, !moveu)
897                     )
898                  else ();
899
900         combine() is called to coalesce two nodes u and v.
901         I think I was thinking that if the move counts of u and v are both
902         greater than zero then after they are coalesced then one node is
903         removed from the freeze queue.  Apparently I was thinking that
904         both u and v are of low degree, but that's clearly not necessarily true.
905
906
907    02/12/2002:
908        Here's the patch.  HOL now compiles.
909
910        I don't know how this impacts performance (compile
911        time or runtime).  This bug caused the RA (especially on the x86)
912        to go thru the potential spill phase when there are still nodes on the
913        freeze queue.
914
915
916
917
918    ----------------------------------------------------------------------
919    Name: Matthias Blume
920    Date: 2002/02/13 22:40:00 EST
921    Tag: blume-20020213-fptr-rtti
922    Description:
923
924    Fixed a bug in ml-nlffigen that was introduced with one of the previous
926
927    ----------------------------------------------------------------------
928    Name: Matthias Blume
929    Date: 2002/02/13 16:41:00 EST
930    Tag: blume-20020213-cmlpq
931    Description:
932
933    Added new priority queue export symbols (which have just been added to
934    smlnj-lib.cm) to CML's version of smlnj-lib.cm.  (Otherwise CML would
935    not compile and the installer would choke.)
936
937    ----------------------------------------------------------------------
938    Name: Matthias Blume
939    Date: 2002/02/13 16:15:00 EST
940    Tag: blume-20020213-various
941    Description:
942
943    1. More tweaks to ml-nlffigen:
944
945       - better internal data structures (resulting in slight speedup)
946       - "-match" option requires exact match
947       - "localized" gensym counters (untagged structs/unions nested within
948         other structs/unions or within typedefs get a fresh counter; their
949         tag will be prefixed by a concatenation of their parents' tags)
950       - bug fixes (related to calculation of transitive closure of types
951         to be included in the output)
952
953    2. Minor Basis updates:
954
955       - added implementations for List.collate and Option.app
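
The "localized" gensym counters from item 1 can be pictured as one counter per enclosing scope. The sketch below is in Python, and the `TagNamer` class and the `prefix_N` name format are assumptions; the entry only says that nested untagged structs/unions get a fresh counter and a prefix built from their parents' tags.

```python
from collections import defaultdict
from typing import DefaultDict, Tuple


class TagNamer:
    """Hypothetical generator of tags for untagged structs/unions."""

    def __init__(self) -> None:
        # One counter per enclosing scope instead of a single global counter.
        self._counters: DefaultDict[str, int] = defaultdict(int)

    def fresh(self, parents: Tuple[str, ...] = ()) -> str:
        prefix = "".join(parents)  # concatenation of the parents' tags
        n = self._counters[prefix]
        self._counters[prefix] += 1
        return f"{prefix}_{n}"  # the exact name format is an assumption
```

With per-scope counters, adding an unnamed member to one struct does not renumber the generated tags inside another.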
956
957    ----------------------------------------------------------------------
958    Name: Matthias Blume
959    Date: 2002/02/11 15:55:00 EST
960    Tag: blume-20020211-gensym
961    Description:
962
963    Added a "-gensym" option to command line of ml-nlffigen.  This can be
964    used to specify a "stem" -- a string that is inserted in all "gensym'd"
965    names (ML structure names that correspond to unnamed C structs, unions,
966    and enums), so that separate runs of ml-nlffigen do not clash.
967
968    ----------------------------------------------------------------------
969    Name: Matthias Blume
970    Date: 2002/02/11 12:05:00 EST
971    Tag: blume-20020211-gensml
972    Description:
973
974    A quick fix for a problem with GenSML (in the pgraph-util library):
975    Make generation of toplevel "local" optional.  (Strictly speaking,
976    signature definitions within "local" are not legal SML.)
977
978    Other than that: updates to INSTALL and cm/TODO.
979
980    ----------------------------------------------------------------------
981    Name: Matthias Blume
982    Date: 2002/02/08 15:00:00 EST
983    Tag: blume-20020208-uniquepid
984    Description:
985
986    0. Version number has been bumped to 110.38.1.  NEW BOOTFILES!!!
987
988    1. The installer (config/install.sh) has gotten smarter:
989
990         - Configuration options are a bit easier to specify now
991           (in config/targets).
992         - Bug in recognizing .tar.bz2 files fixed.
993         - Installer automatically resolves dependencies between
994           configuration options (e.g., if you ask for eXene, you will
995           also get cml -- regardless of whether you asked for it or not).
996         - Installer can run in "quieter mode" by setting the environment
997           variable INSTALL_QUIETLY to "true".  "Quieter" does not mean
998           "completely silent", though.
999         - Build HashCons library as part of smlnj-lib.
1000
1001    2. A new scheme for assigning persistent identifiers to compilation
1002       units (and, by extension, to types etc.) has been put into place.
1003       This fixes a long-standing bug where types and even dynamic values
1004       can get internally confused, thereby compromising type safety
1005       (abstraction) and dynamic correctness.  See
1006
1007         http://cm.bell-labs.com/cm/cs/who/blume/pid-confusion.tgz
1008
1009       for an example of how things could go wrong until now.
1010
1011       The downside of the new scheme is that pids are not quite as
1012       persistent as they used to be: CM will generate a fresh pid
1013       for every compilation unit that it thinks it sees for the first
1014       time.  That means that if you compile starting from a clean, fresh
1015       source tree at two different times, you end up with different
1016       binaries.
1017
1018       Cutoff recompilation, however, has not been compromised because
1019       CM keeps pid information in special caches between runs.
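
The installer's automatic dependency resolution from item 1 amounts to taking a transitive closure over the requested targets. The real installer is a shell script (config/install.sh); this Python sketch only shows the idea, and apart from the eXene/cml pair named above, the dependency table is a placeholder.

```python
from typing import Dict, Iterable, Set

# Only the eXene -> cml edge comes from the entry above.
DEPS: Dict[str, Set[str]] = {"eXene": {"cml"}}


def resolve(requested: Iterable[str], deps: Dict[str, Set[str]]) -> Set[str]:
    """Return the requested targets plus everything they depend on."""
    selected: Set[str] = set()
    stack = list(requested)
    while stack:
        target = stack.pop()
        if target in selected:
            continue  # already handled; also guards against cycles
        selected.add(target)
        stack.extend(deps.get(target, ()))
    return selected
```

Here `resolve(["eXene"], DEPS)` selects both `eXene` and `cml`.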
1020
1021    ----------------------------------------------------------------------
1022    Name: Lal George
1023    Date: 2002/02/07 15:34:13 EST 2002
1024    Tag: <none>
1025    Description:
1026
1027    Compilers that generate assembly code may produce global labels
1028    whose value is resolved at link time. The various peephole optimization
1029    modules did not take this into account.
1030
1031    TODO. The Labels.addrOf function should really return an option
1032    type so that clients are forced to deal with this issue, rather
1033    than an exception being raised.
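
A sketch of the interface the TODO asks for, in Python rather than SML: the lookup returns an optional address, so a client has to handle the link-time case explicitly. `addr_of` mirrors `Labels.addrOf` in name only; the `can_shorten_jump` client and its `max_disp` limit are hypothetical.

```python
from typing import Dict, Optional


def addr_of(label: str, resolved: Dict[str, int]) -> Optional[int]:
    """Return the label's address, or None if it is only known at link time."""
    return resolved.get(label)


def can_shorten_jump(label: str, here: int, resolved: Dict[str, int],
                     max_disp: int = 127) -> bool:
    """Hypothetical peephole check that must cope with unresolved labels."""
    target = addr_of(label, resolved)
    if target is None:
        return False  # global label: stay conservative
    return abs(target - here) <= max_disp
```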
1034
1035    ----------------------------------------------------------------------
1036    Name: Lal George
1037    Date: 2002/02/06 13:55:02 EST
1038    Tag: george-20020206-ra-breakup
1039    Description:
1040
1041    1. A bug fix from Allen.
1042
1043        A typo causes extra fstp %st(0)'s to be generated at compensation
1044        edges, which might cause stack underflow traps at runtime.  This
1045        occurs in fft where there are extraneous fstps right before the 'into'
1046        trap instruction (in this case they are harmless since none of the
1047        integers overflow.)
1048
1049    2. Pulled out various utility modules that were embedded in the modules
1050       of the register allocator. I need these modules for other purposes, but
1051       they are not complete enough to put into a library (just yet).
1052    ----------------------------------------------------------------------
1053    Name: Matthias Blume
1054    Date: 2002/01/31 16:05:00 EST
1055    Tag: blume-20020131-sparc-ccalls
1056    Description:
1057
1058    1. C-calls on Sparc needlessly allocated a huge chunk (96 bytes)
1059       of extra stack space by mistake.  Fixed.
1060
1061    2. Bug in logic of handling of command-line options in ml-nlffigen fixed.
1062
1063    ----------------------------------------------------------------------
1064    Name: Allen Leung
1065    Date: 2002/01/30
1066    Tag: leunga-20020130-nowhere-bug-fix
1067    Description:
1068
1069       MLRISC bug fixes:
1070       1. Fixed a bindings computation bug in the 'nowhere' program generator tool.
1071       2. MachineInt.fromString was negating its value.
1072
1073    ----------------------------------------------------------------------
1074    Name: Matthias Blume
1075    Date: 2002/01/29
1076    Tag: blume-20020129-INSTALL
1077    Description:
1078
1079    - Added somewhat detailed installation instructions (file INSTALL).
1080    - Fixed curl-detection bug in config/install.sh.
1081    - It is now possible to select the URL getter using the URLGETTER
1082      environment variable:
1083
1084          not set / "unknown"      --> automatic detection (script tries wget,
1085                                       curl, and lynx)
1086          "wget" / "curl" / "lynx" --> use the specified program (script "knows"
1087                                       how to properly invoke them)
1088          other                    --> use \$URLGETTER directly, it must take
1089                                       precisely two command-line arguments
1090                                       (source URL and destination file name)
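
The selection rule in the table above can be sketched as follows. The installer itself is a shell script; this Python version only models which program gets chosen, not how it is invoked, and `choose_getter` is a hypothetical name.

```python
import shutil
from typing import Callable, Optional

KNOWN = ("wget", "curl", "lynx")  # tried in this order during detection


def choose_getter(setting: Optional[str],
                  which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Pick the URL getter from the value of URLGETTER (None means not set)."""
    if setting is None or setting == "unknown":
        for prog in KNOWN:
            if which(prog):
                return prog  # first program found on the path wins
        raise RuntimeError("no URL getter found")
    # Either one of KNOWN (the script knows its invocation) or a custom
    # command that must accept exactly: <source URL> <destination file>.
    return setting
```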
1091
1092    ----------------------------------------------------------------------
1093    Name: Matthias Blume
1094    Date: 2002/01/28
1095    Tag: blume-20020128-sparc-ccalls
1096    Description:
1097
1098    - Fixed problem with calculation of "used" registers in sparc-c-calls.
1099    - Make use of the allocParam argument in sparc-c-calls.
1100
1101    ----------------------------------------------------------------------
1102    Name: Matthias Blume
1103    Date: 2002/01/28
1104    Tag: blume-20020128-allocParam
1105    Description:
1106
1107    John Reppy:  Changes c-calls API to accept client-callback for
1108    allocating extra stack space.
1109    me: Corresponding changes to mlriscGen (using a dummy argument that
1110        does not change the current behavior).
1111
1112    ----------------------------------------------------------------------
1113    Name: Matthias Blume
1114    Date: 2002/01/28 12:00:00
1115    Tag: Release_110_38
1116    Description:
1117
1118    This time for real!!!
1119
1120    ----------------------------------------------------------------------
1121    Name: Matthias Blume
1122    Date: 2002/01/28 10:56:00 EST
1123    Tag: blume-20020128-retraction
1124    Description:
1125
1126    0. Retracted earlier 110.38.  (The Release_110_38 tag has been replaced
1127       with blume-Release_110_38-retracted.)
1128
1129    1. Fixed a problem with incorrect rounding modes in real64.sml.
1130       (Thanks to Andrew Mccreight <andrew.mccreight@yale.edu>.)
1131
1132    2. A bug in ml-nlffigen related to the handling of unnamed structs, unions,
1133       and enums fixed.  The naming of corresponding ML identifiers should
1134       now be consistent again.
1135
1136    ----------------------------------------------------------------------
1137    Name: Allen Leung
1138    Date: 2002/01/27
1139    Tag: leunga-20020127-nowhere
1140    Description:
1141
1142       Added a target called nowhere in the configuration scripts.
1143       Enabling this will build the MLRISC 'nowhere' tool (for translating
1144       programs with where-clauses into legal SML code) during installation.
1145
1146    ----------------------------------------------------------------------
1147    Name: Matthias Blume
1148    Date: 2002/01/25 21:27:00 EST
1149    Tag: blume-Release_110_38-retracted
1150    Description:
1151
1152    Call it a (working) release!  Version is 110.38. Bootfiles are ready.
1153
1155
1156    !!! NOTE:  Re-tagged as blume-Release_110_38-retracted. Original tag
1157    (Release_110_38) removed.  Reason: Last-minute bug fixes.
1158
1159    ----------------------------------------------------------------------
1160    Name: Matthias Blume
1161    Date: 2002/01/25
1162    Tag: blume-20020125-ffi
1163    Description:
1164
1165    A large number of tweaks and improvements to ml-nlffi-lib and
1166    ml-nlffigen:
1167
1168       - ML representation types have been streamlined
1169       - getter and setter functions work with concrete values, not abstract
1170         ones where possible
1171       - ml-nlffigen command line more flexible (see README file there)
1172       - some bugs have been fixed (hopefully)
1173
1174    ----------------------------------------------------------------------
1175    Name: Lal George
1176    Date: 2002/01/24
1177    Tag: george-20020124-risc-ra-interface
1178    Description:
1179
1180       There is a dramatic simplification in the interface to the
1181       register allocator for RISC architectures as a result of making
1182       parallel copy instructions explicit.
1183
1184    ----------------------------------------------------------------------
1185    Name: Matthias Blume
1186    Date: 2002/01/22
1187    Tag: blume-20020122-x86-ccalls
1188    Description:
1189
1190    Bug fix for c-calls on x86 (having to do with how char- and
1191    short-arguments are being handled).
1192
1193    ----------------------------------------------------------------------
1194    Name: Matthias Blume
1195    Date: 2002/01/21
1196    Tag: blume-20020121-ff
1197    Description:
1198
1199    Another day of fiddling with the FFI...
1200
1201    1. Bug fix/workaround:  CKIT does not complain about negative array
1202       dimensions, so ml-nlffigen has to guard itself against this possibility.
1203       (Otherwise a negative dimension would send it into an infinite loop.)
1204
1205    2. Some of the abstract types (light objects, light pointers, most "base"
1206       types) in structure C are now eqtypes.
1207
1208    3. Added constructors and test functions for NULL function pointers.
1209
1210    ----------------------------------------------------------------------
1211    Name: Matthias Blume
1212    Date: 2002/01/18
1214    Description:
1215
1216    Made config/srcarchiveurl point to a new place.  (Will provide boot
1217    files shortly.)
1218
1219    Maybe we christen this to be 110.38?
1220
1221    ----------------------------------------------------------------------
1222    Name: Matthias Blume
1223    Date: 2002/01/18
1224    Tag: blume-20020118-more-ffifiddle
1225    Description:
1226
1227    Today's FFI fiddling:
1228
1229      - Provided a structure CGetSet with "convenient" versions of C.Get.* and
1230        C.Set.* that use concrete (MLRep.*) arguments and results instead
1231        of abstract ones.
1232
1233      - Provided word-style bit operations etc. for "int" representation
1234        types in MLRep.S<Foo>Bitops where <Foo> ranges over Char, Int, Short,
1235        and Long.
1236
1237    ----------------------------------------------------------------------
1238    Name: Matthias Blume
1239    Date: 2002/01/18
1240    Tag: blume-20020118-use-x86-fp
1241    Description:
1242
1243    Now that x86-fast-fp seems to be working, I turned it back on again
1244    by default.  (Seems to work fine now, even with the FFI.)
1245
1246    Other than that, I added some documentation about the FFI to
1247    src/ml-nlffigen/README and updated the FFI test examples in
1248    src/ml-nlffi-lib/Tests/*.
1249
1250    ----------------------------------------------------------------------
1251    Name: Allen Leung
1252    Date: 2002/01/17
1253    Tag: leunga-20020117-x86-fast-fp-call
1254    Description:
1255
1256       1. Fixed a problem with handling return fp values when x86's fast fp
1257          mode is turned on.
1258
1259       2. Minor pretty printing fix for cellset.  Print %st(0) as %st(0) instead
1260          of %f32.
1261
1262       3. Added a constructor INT32lit to the ast of MLRISC tools.
1263
1264    ----------------------------------------------------------------------
1265    Name: Matthias Blume
1266    Date: 2002/01/16
1267    Tag: blume-20020116-ffifiddle
1268    Description:
1269
1270    More fiddling with the FFI interface:
1271
1272     - Make constness 'c instead of rw wherever possible.  This eliminates
1273       the need for certain explicit coercions.  (However, due to ML's
1274       value polymorphism, there will still be many cases where explicit
1275       coercions are necessary.  Phantom types are not the whole answer
1276       to modeling a subtyping relationship in ML.)
1277
1278     - ro/rw coercions for pointers added.  (Avoids the detour through */&.)
1279
1280     - "printf" test example added to src/ml-nlffi-lib/Tests.  (Demonstrates
1281       clumsy workaround for varargs problem.)
1282
1283    ----------------------------------------------------------------------
1284    Name: Lal George
1285    Date: 2002/01/15
1286    Tag: <none>
1287    Description:
1288
1289    1. Since COPY instructions are no longer native to the architecture,
1290       a generic functor can be used to implement the expandCopies function.
1291
1292    2. Allowed EXPORT and IMPORT pseudo-op declarations to appear inside a
1293       TEXT segment.
1294
1295    ----------------------------------------------------------------------
1296    Name: Matthias Blume
1297    Date: 2002/01/15
1299    Description:
1300
1301    1. Fix for bug resulting in single-precision float values being returned
1302       incorrectly from FFI calls.
1303
1304    2. Small modifications to C FFI API:
1305
1306        - memory-allocation routines return straight objects (no options)
1307          and raise an exception in out-of-memory situations
1308        - unsafe extensions to cast between function pointers and pointers
1309          from/to ints
1310        - added structure C_Debug as an alternative to structure C where
1311          pointer-dereferencing (|*| and |*!) always check for null-pointers
1312        - added open_lib' to DynLinkage;  open_lib' works like open_lib
1313          but also takes a (possibly empty) list of existing library handles
1314          that the current library depends on
1315
1316    ----------------------------------------------------------------------
1317    Name: Matthias Blume
1318    Date: 2002/01/10
1319    Tag: blume-20020110-newffigen
1320    Description:
1321
1322    1. Updates to portable graph code.
1323
1324    2. Major update to ml-nlffigen and ml-nlffi-lib.  Things are much
1325       more scalable now so that even huge interfaces such as the one
1326       for GTK compile in finite time and space. :-)
1327       See src/ml-nlffigen/README for details on what's new.
1328
1329    ----------------------------------------------------------------------
1330    Name: Lal George
1331    Date: 2002/01/09 14:31:35 EST 2002
1332    Tag: george-20011206-rm-native-copy
1333    Description:
1334
1335            Removed the native COPY and FCOPY instructions
1336            from all the architectures and replaced them with the
1337            explicit COPY instruction from the previous commit.
1338
1339            It is now possible to simplify many of the optimization
1340            modules that manipulate copies. This has not been
1341            done in this change.
1342
1343  ----------------------------------------------------------------------  ----------------------------------------------------------------------
1344  Name: Lal George  Name: Lal George
1345  Date: 2001/12/06 16:50:13 EST 2001  Date: 2001/12/06 16:50:13 EST 2001
# Line 341  Line 1670
1670
1671  ----------------------------------------------------------------------  ----------------------------------------------------------------------
1672  Name: Matthias Blume  Name: Matthias Blume
>>>>>>> 1.169
1673  Date: 2001/09/18 15:35:00 EDT  Date: 2001/09/18 15:35:00 EDT