Diff of /sml/trunk/HISTORY

revision 972, Sun Oct 28 00:37:02 2001 UTC revision 1185, Mon Apr 1 22:06:47 2002 UTC
8  The form of an entry should be:
9  
10  Name:
11  Date: yyyy/mm/dd
12  Tag: <post-commit CVS tag>
13  Description:
14  
15  ----------------------------------------------------------------------
16    Name: Matthias Blume
17    Date: 2002/04/01 (no joke!) 17:07:00 EST
18    Tag: blume-20020401-x86div
19    Description:
20    
21    Added full support for div/mod/rem/quot on the x86, using the machine
22    instruction's two results (without clumsily recomputing the remainder)
23    directly where appropriate.
24    
25    Some more extensive power-of-two support was added to the x86 instruction
26    selector (avoiding expensive divs, mods, and muls where they can be
27    replaced with cheaper shifts and masks).  However, this sort of thing
28    ought to be done earlier, e.g., within the CPS optimizer so that
29    all architectures benefit from it.
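
As a rough illustration of the kind of power-of-two replacement described above, the following sketch covers the unsigned (word) case only. The helper names (isPow2, log2, divPow2, modPow2, mulPow2) are hypothetical and are not taken from the x86 instruction selector; signed div/mod need extra care because of rounding.

```sml
(* Sketch only: hypothetical helpers, not code from the instruction selector. *)

(* true iff w has exactly one bit set *)
fun isPow2 (w : word) = w <> 0w0 andalso Word.andb (w, w - 0w1) = 0w0

(* position of the single set bit; assumes isPow2 d *)
fun log2 (d : word) : word =
  let fun loop (w, k) = if w = 0w1 then k else loop (Word.>> (w, 0w1), k + 0w1)
  in loop (d, 0w0) end

(* for d a power of two: division is a right shift, remainder is a mask,
 * and multiplication is a left shift *)
fun divPow2 (x : word, d : word) = Word.>> (x, log2 d)
fun modPow2 (x : word, d : word) = Word.andb (x, d - 0w1)
fun mulPow2 (x : word, d : word) = Word.<< (x, log2 d)
```

For example, divPow2 (0w100, 0w8) and Word.div (0w100, 0w8) both evaluate to 0w12.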
30    
31    The compiler compiles to a fixed point, but changes might be somewhat
32    fragile nevertheless.  Please report any strange things that you might
33    see wrt. div/mod/quot/rem...
34    
35    ----------------------------------------------------------------------
36    Name: Matthias Blume
37    Date: 2002/03/29 17:22:00
38    Tag: blume-20020329-div
39    Description:
40    
41    Fixed my broken div/mod logic.  Unfortunately, this means that the
42    inline code for div/mod now has one more comparison than before.
43    Fast paths (quotient > 0 or remainder = 0) are not affected, though.
44    The problem was with quotient = 0, because that alone does not tell
45    us which way the rounding went.  One then has to look at whether
46    remainder and divisor have the same sign...  :(
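
The rule described above can be sketched in plain SML as follows. This is only an illustration written against the Basis functions Int.quot and Int.rem; the function name divMod is hypothetical and the code is not the inline sequence that the compiler emits.

```sml
(* Sketch only: derive (div, mod) from the round-to-zero (quot, rem). *)
fun divMod (x, y) =
  let
    val q = Int.quot (x, y)
    val r = Int.rem (x, y)
  in
    (* fast paths: positive quotient or exact division *)
    if q > 0 orelse r = 0 then (q, r)
    (* the extra comparison: same sign means quot already rounded down *)
    else if (r < 0) = (y < 0) then (q, r)
    (* otherwise quot rounded up towards zero, so adjust both results *)
    else (q - 1, r + y)
  end
```

The case q = 0 shows why the sign test is needed: divMod (1, 2) must give (0, 1), while divMod (1, ~2) must give (~1, ~1), and both start from q = 0.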
47    
48    Anyway, I replaced the bootfiles with fresh ones...
49    
50    ----------------------------------------------------------------------
51    Name: Matthias Blume
52    Date: 2002/03/29 14:10:00 EST
53    Tag: blume-20020329-inlprims
54    Description:
55    
56    NEW BOOTFILES!!!    Version number bumped to 110.39.3.
57    
58    Primops have changed. This means that the bin/boot-file formats have
59    changed as well.
60    
61    To make sure that there is no confusion, I made a new version.
62    
63    
64    CHANGES:
65    
66    * removed REMT from mltree (remainder should never overflow).
67    
68    * added primops to deal with divisions of all flavors to the frontend
69    
70    * handled these primops all the way through so they map to their respective
71      MLRISC support
72    
73    * used these primops in the implementation of Int, Int32, Word, Word32
74    
75    * removed INLDIV, INLMOD, and INLREM as they are no longer necessary
76    
77    * parameterized INLMIN, INLMAX, and INLABS by a numkind
78    
79    * translate.sml now deals with all flavors of INL{MIN,MAX,ABS}, including
80      floating point
81    
82    * used INL{MIN,MAX,ABS} in the implementation of Int, Int32, Word, Word32,
83      and Real (but Real.abs maps to a separate floating-point-only primop)
84    
85    
86    TODO items:
87    
88    * Hacked Alpha32 instruction selection, disabling the selection of REMx
89      instructions because the machine instruction encoder cannot handle
90      them.  (Hppa, PPC, and Sparc instruction selection did not handle
91      REM in the first place, and REM is supported by the x86 machine coder.)
92    
93    * Handle DIV and MOD with DIV_TO_NEGINF directly in the x86 instruction
94      selection phase.  (The two can be streamlined because the hardware
95      delivers both quotient and remainder at the same time anyway.)
96    
97    * Think about what to do with "valOf(Int32.minInt) div ~1" and friends.
98      (Currently the behavior is inconsistent both across architectures and
99      wrt. the draft Basis spec.)
100    
101    * Word8 should eventually be handled natively, too.
102    
103    * There seems to be one serious bug in mltree-gen.sml.  It appears, though,
104      as if there currently is no execution path that could trigger it in
105      SML/NJ.  (The assumptions underlying functions arith and promotable do not
106      hold for things like multiplication and division.)
107    
108    ----------------------------------------------------------------------
109    Name: Matthias Blume
110    Date: 2002/03/27 16:27:00 EST
111    Tag: blume-20020327-mlrisc-divisions
112    Description:
113    
114    Added support for all four division operations (ML's div, mod, quot,
115    and rem) to MLRISC.  In the course of doing so, I also rationalized
116    the naming (no more annoying switch-around of DIV and QUOT), by
117    parameterizing the operation by div_rounding_mode (which can be either
118    DIV_TO_ZERO or DIV_TO_NEGINF).
119    
120    The generic MLTreeGen functor takes care of compiling all four
121    operations down to only round-to-zero div.
122    
123    Missing pieces:
124    
125      * Doing something smarter than relying on MLTreeGen on architectures
126        like, e.g., the x86 where hardware division delivers both quotient and
127        remainder at the same time.  With this, the implementation of the
128        round-to-neginf operations could be further streamlined.
129    
130      * Remove inlining support for div/mod/rem from the frontend and replace it
131        with primops that get carried through to the backend.  Do this for all
132        int and word types.
133    
134    ----------------------------------------------------------------------
135    Name: Matthias Blume
136    Date: 2002/03/25 17:25:00 EST
137    Tag: blume-20020325-divmod
138    Description:
139    
140    I improved (hopefully without breaking them) the implementation of Int.div,
141    Int.mod, and Int.rem.   For this, the code in translate.sml now takes
142    advantage of the following observations:
143    
144      Let  q = x quot y      r = x rem y
145           d = x div  y      m = x mod y
146    
147    where "quot" is the round-to-zero version of integer division that
148    hardware usually provides.  Then we have:
149    
150         r = x - q * y        where neither the * nor the - will overflow
151         d = if q >= 0 orelse x = q * y then q else q - 1
152                              where neither the * nor the - will overflow
153         m = if q >= 0 orelse r = 0 then r else r + y
154                              where the + will not overflow
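
For reference, the three rules above can be transcribed literally into SML and compared against the Basis operators. The function name rules is hypothetical. The sketch also shows the q = 0 case that the later entry blume-20020329-div (2002/03/29) reports as broken.

```sml
(* Literal transcription of the rules above; not the code in translate.sml. *)
fun rules (x, y) =
  let
    val q = Int.quot (x, y)
    val r = x - q * y
    val d = if q >= 0 orelse x = q * y then q else q - 1
    val m = if q >= 0 orelse r = 0 then r else r + y
  in
    { q = q, r = r, d = d, m = m }
  end

(* the rules agree with div and mod for a negative quotient ... *)
val ok = #d (rules (~7, 2)) = ~7 div 2 andalso #m (rules (~7, 2)) = ~7 mod 2

(* ... but not when q = 0 and the operands have opposite signs *)
val bad = #d (rules (1, ~2)) <> 1 div ~2
```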
155    
156    This results in substantial simplification of the generated code.
157    The following table shows the number of CFG nodes and edges generated
158    for
159            fun f (x, y) = x OPER y
160            (* with OPER \in div, mod, quot, rem *)
161    
162    
163        OPER | nodes(old) | edges(old) | nodes(new) | edges(new)
164        --------------------------------------------------------
165         div |         24 |         39 |         12 |         16
166         mod |         41 |         71 |         12 |         16
167        quot |          8 |         10 |          8 |         10
168         rem |         10 |         14 |          8 |         10
169    
170    
171    ----------------------------------------------------------------------
172    Name: Matthias Blume
173    Date: 2002/03/25 22:06:00 EST
174    Tag: blume-20020325-cprotobug
175    Description:
176    
177    Fixed a bug in cproto (the C prototype decoder).
178    
179    ----------------------------------------------------------------------
180    Name: Matthias Blume
181    Date: 2002/03/25 16:00:00 EST
182    Tag: blume-20020325-raw-primops
183    Description:
184    
185    I did some cleanup to Allen's new primop code and
186    replaced yesterday's bootfiles with new ones.
187    (But they are stored in the same place.)
188    
189    ----------------------------------------------------------------------
190    Name: Matthias Blume
191    Date: 2002/03/24 22:40:00 EST
192    Tag: blume-20020324-bootfiles
193    Description:
194    
195    Made the bootfiles that Allen asked for.
196    
197    ----------------------------------------------------------------------
198    Name: Allen Leung
199    Date: 2002/03/23 15:50:00 EST
200    Tag: leunga-20020323-flint-cps-rcc-primops
201    Description:
202    
203      1. Changes to FLINT primops:
204    
205        (* make a call to a C-function;
206         * The primop carries C function prototype information and specifies
207         * which of its (ML-) arguments are floating point. C prototype
208         * information is for use by the backend, ML information is for
209         * use by the CPS converter. *)
210      | RAW_CCALL of { c_proto: CTypes.c_proto,
211                       ml_args: ccall_type list,
212                       ml_res_opt: ccall_type option,
213                       reentrant : bool
214                     } option
215       (* Allocate uninitialized storage on the heap.
216        * The record is meant to hold short-lived C objects, i.e., they
217        * are not ML pointers.  With the tag, the representation is
218        * the same as RECORD with tag tag_raw32 (sz=4), or tag_fblock (sz=8)
219        *)
220      | RAW_RECORD of {tag:bool,sz:int}
221      and ccall_type = CCALL_INT32 | CCALL_REAL64 | CCALL_ML_PTR
222    
223      2.  These CPS primops are now overloaded:
224    
225           rawload of {kind:numkind}
226           rawstore of {kind:numkind}
227    
228          The one-argument form is:
229    
230             rawload {kind} address
231    
232          The two-argument form is:
233    
234             rawload {kind} [ml object, byte-offset]
235    
236      3. RAW_CCALL/RCC now takes two extra arguments:
237    
238         a. The first is whether the C call is reentrant, i.e., whether
239            ML state should be saved and restored.
240         b. The second argument is a string specifying the name of
241            the library and the C function.
242    
243         These things are not yet handled in the code generator.
244    
245      4. In CProto,
246    
247         An encoding type of "bool" means "ml object" and is mapped to
248         a C prototype of PTR.  Note that "bool" is different from "string",
249         even though "string" is also mapped to PTR, because "bool"
250         is assigned a CPS type of BOGt, while "string" is assigned INT32t.
251    
252      5. Pickler/unpickler
253    
254         Changed to handle RAW_RECORD and the newest RAW_CCALL
255    
256      6. MLRiscGen,
257    
258         1. Changed to handle the new rawload/rawstore/rawrecord operators.
259         2. Code for handling C Calls has been moved to a new module CPSCCalls,
260            in the file CodeGen/cpscompile/cps-c-calls.sml
261    
262      7. Added the conditional move operator
263    
264             condmove of branch
265    
266         to cps.  Generation of this is still buggy so it is currently
267         disabled.
268    
269    ----------------------------------------------------------------------
270    Name: Lal George
271    Date: 2002/03/22 14:18:25 EST
272    Tag: george-20020322-cps-branch-prob
273    Description:
274    
275    Implemented the Ball-Larus branch-prediction heuristics, and
276    incorporated graphical viewers for control flow graphs.
277    
278    Ball-Larus Heuristics:
279    ---------------------
280    See the file compiler/CodeGen/cpscompile/cpsBranchProb.sml.
281    
282    By design it uses the Dempster-Shafer theory for combining
283    probabilities.  For example, in the function:
284    
285        fun f(n,acc) = if n = 0 then acc else f(n-1, n*acc)
286    
287    the Ball-Larus heuristics predict that n=0 is unlikely
288    (OH-heuristic), and the 'then' branch is unlikely because of the
289    RH-heuristic -- giving the 'then' branch an even lower combined
290    probability using the Dempster-Shafer theory.
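
To illustrate how two heuristics can yield a lower combined probability, here is a sketch of the Dempster-Shafer combination of two independent estimates for the same branch. The function name combine and the probabilities 0.16 and 0.28 are illustrative placeholders; they are not claimed to be the code or the values used in cpsBranchProb.sml.

```sml
(* Sketch only: combine two estimates p1 and p2 that a branch is taken. *)
fun combine (p1 : real, p2 : real) : real =
  let
    val taken = p1 * p2
    val notTaken = (1.0 - p1) * (1.0 - p2)
  in
    taken / (taken + notTaken)
  end

(* two heuristics that each call the 'then' branch unlikely (made-up values) *)
val pThen = combine (0.16, 0.28)
```

When both inputs are below 0.5 the result is lower than either input, and combining with 0.5 (no information) leaves the other estimate unchanged.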
291    
292    Finally, John Reppy's loop analysis in MLRISC further lowers the
293    probability of the 'then' branch because of the loop in the else
294    branch.
295    
296    
297    Graphical Viewing:
298    ------------------
299    I merely plugged Allen's graphical viewers into the compiler. The
300    additional code is not much. At the top level, saying:
301    
302            Control.MLRISC.getFlag "cfg-graphical-view" := true;
303    
304    will display the graphical view of the control flow graph just before
305    back-patching.  daVinci must be in your path for this to work. If
306    daVinci is not available, then the default viewer can be changed
307    using:
308    
309            Control.MLRISC.getString "viewer"
310    
311    which can be set to "dot" or "vcg" for the corresponding viewers. Of
312    course, these viewers must be in your path.
313    
314    The above will display the compilation unit at the level of clusters,
315    many of which are small, boring, and uninteresting. Additionally setting:
316    
317            Control.MLRISC.getInt "cfg-graphical-view_size"
318    
319    will display only clusters that are larger than the value set.
320    
321    
322    ----------------------------------------------------------------------
323    Name: Matthias Blume
324    Date: 2002/03/21 22:20:00 EST
325    Tag: blume-20020321-kmp-bugfix
326    Description:
327    
328    Changed the interface to the KMP routine in PreString and fixed
329    a minor bug in one place where it was used.
330    
331    ----------------------------------------------------------------------
332    Name: Allen Leung
333    Date: 2002/03/21 20:30:00 EST
334    Tag: leunga-20020321-cfg
335    Description:
336    
337      Fixed a potential problem in cfg edge splitting.
338    
339    ----------------------------------------------------------------------
340    Name: Allen Leung
341    Date: 2002/03/21 17:15:00 EST
342    Tag: leunga-20020321-x86-fp-cfg
343    Description:
344    
345      1. Recoded the buggy parts of x86-fp.
346    
347         a. All the block reordering code has been removed.
348            We now depend on the block placement phases to do this work.
349    
350         b. Critical edge splitting code has been simplified and moved into the
351            CFG modules, where it belongs.
352    
353         Both of these were quite buggy and complex.  The code is now much, much
354         simpler.
355    
356      2. X86 backend.
357    
358         a. Added instructions for 64-bit support.  Instruction selection for
359            64-bit has not been committed, however, since that
360            requires changes to MLTREE which haven't been approved by
361            Lal and John.
362    
363         b. Added support for FUCOMI and FUCOMIP when generating code for
364            PentiumPro and above.  We only generate these instructions in
365            the fast-fp mode.
366    
367         c. Added cases for JP and JNP in X86FreqProps.
368    
369      3. CFG
370    
371         CFG now has a bunch of methods for edge splitting and merging.
372    
373      4. Machine description.
374    
375         John's simplification of MLTREE_BASIS.fcond broke a few machine
376         description things:
377    
378         rtl-build.{sig,sml} and hppa.mdl fixed.
379    
380         NOTE: the machine description stuff in the repository is still broken.
381               Again, I can't put my fixes in because that involves
382               changes to MLTREE.
383    
384    ----------------------------------------------------------------------
385    Name: Matthias Blume
386    Date: 2002/03/20 15:55:00 EST
387    Tag: blume-20020320-kmp
388    Description:
389    
390    Implemented Knuth-Morris-Pratt string matching in PreString and used
391    it for String.isSubstring, Substring.isSubstring, and
392    Substring.position.
393    
394    (Might need some stress-testing.  Simple examples worked fine.)
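
For readers unfamiliar with the algorithm, a stand-alone sketch of Knuth-Morris-Pratt matching follows. The function name kmpFind and its interface are hypothetical; this is not the PreString code, only an illustration of the technique.

```sml
(* Sketch only: index of the first occurrence of pat in text, if any. *)
fun kmpFind (pat : string, text : string) : int option =
  let
    val m = size pat
    val n = size text
    fun p i = String.sub (pat, i)
    (* fail[i] = length of the longest proper border of pat[0..i] *)
    val fail = Array.array (m, 0)
    (* shrink the current match length k until pat[k] = c or k = 0 *)
    fun fallback (k, c) =
      if k > 0 andalso p k <> c then fallback (Array.sub (fail, k - 1), c)
      else k
    fun build (i, k) =
      if i >= m then ()
      else
        let
          val k' = fallback (k, p i)
          val k'' = if p k' = p i then k' + 1 else k'
        in
          Array.update (fail, i, k'');
          build (i + 1, k'')
        end
    val () = build (1, 0)
    (* i = next text position, k = number of pattern chars matched so far *)
    fun search (i, k) =
      if k = m then SOME (i - m)
      else if i >= n then NONE
      else
        let
          val c = String.sub (text, i)
          val k' = fallback (k, c)
        in
          search (i + 1, if p k' = c then k' + 1 else k')
        end
  in
    search (0, 0)
  end
```

A substring test in the style of String.isSubstring would then be isSome (kmpFind (pat, text)).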
395    
396    ----------------------------------------------------------------------
397    Name: Matthias Blume
398    Date: 2002/03/19 16:37:00 EST
399    Tag: blume-20020319-witnesses
400    Description:
401    
402    Added a structure C.W and functions convert/Ptr.convert to ml-nlffi-lib.
403    
404    This implements a generic mechanism for changing constness qualifiers
405    anywhere within big C types without resorting to outright "casts".
406    (So far, functions such as C.rw/C.ro or C.Ptr.rw/C.Ptr.ro only let you
407    modify the constness at the outermost level.)
408    The implementation of "convert" is based on the idea of "witness"
409    values -- values that are not used by the operation but whose types
410    "testify" to their applicability.  On the implementation side, "convert"
411    is simply a projection (returning its second curried argument).  With
412    cross-module inlining, it should not result in any machine code being
413    generated.
414    
415    ----------------------------------------------------------------------
416    Name: Matthias Blume
417    Date: 2002/03/15 16:40:00 EST
418    Tag: blume-20020315-basis
419    Description:
420    
421    Provided (preliminary?) implementations for
422    
423      {String,Substring}.{concatWith,isSuffix,isSubstring}
424    
425    and
426    
427      Substring.full
428    
429    Those are in the Basis spec but they were missing in SML/NJ.
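
A few example calls, following the Basis signatures for these functions (the value names are arbitrary):

```sml
val csv    = String.concatWith ", " ["a", "b", "c"]    (* "a, b, c" *)
val suf    = String.isSuffix "lo" "hello"              (* true *)
val sub    = String.isSubstring "ell" "hello"          (* true *)
val whole  = Substring.full "hello"                    (* substring covering all of "hello" *)
val sub'   = Substring.isSubstring "ell" whole         (* true *)
val joined = Substring.concatWith "-" [whole, whole]   (* "hello-hello" *)
```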
430    
431    ----------------------------------------------------------------------
432    Name: Matthias Blume
433    Date: 2002/03/14 21:30:00 EST
434    Tag: blume-20020314-controls
435    Description:
436    
437    Controls:
438    ---------
439    
440    1. Factored out the recently-added Controls : CONTROLS stuff and put
441       it into its own library $/controls-lib.cm.  The source tree for
442       this is under src/smlnj-lib/Controls.
443    
444    2. Changed the names of types and functions in this interface, so they
445       make a bit more "sense":
446    
447          module -> registry
448          'a registry -> 'a group
449    
450    3. The interface now deals in ref cells only.  The getter/setter interface
451       is (mostly) gone.
452    
453    4. Added a function that lets one register an already-existing ref cell.
454    
455    5. Made the corresponding modifications to the rest of the code so that
456       everything compiles again.
457    
458    6. Changed the implementation of Controls.MLRISC back to something closer
459       to the original.  In particular, this module (and therefore MLRISC)
460       does not depend on Controls.  There now is some link-time code in
461       int-sys.sml that registers the MLRISC controls with the Controls
462       module.
463    
464    CM:
465    ---
466    
467      * One can now specify the lambda-split aggressiveness in init.cmi.
468    
469    ----------------------------------------------------------------------
470    Name: Allen Leung
471    Date: 2002/03/13 17:30:00 EST
472    Tag: leunga-20020313-x86-fp-unary
473    Description:
474    
475    Bug fix for:
476    
477    > leunga@weaselbane:~/Yale/tmp/sml-dist{21} bin/sml
478    > Standard ML of New Jersey v110.39.1 [FLINT v1.5], March 08, 2002
479    > - fun f(x,(y,z)) = Real.~ y;
480    > [autoloading]
481    > [autoloading done]
482    >       fchsl   (%eax), 184(%esp)
483    > Error: MLRisc bug: X86MCEmitter.emitInstr
484    >
485    > uncaught exception Error
486    >   raised at: ../MLRISC/control/mlriscErrormsg.sml:16.14-16.19
487    
488    The problem was that the code generator did not generate any fp registers
489    in this case, and the ra didn't know that it needed to run the X86FP phase to
490    translate the pseudo fp instruction.   This only happened with unary fp
491    operators in certain situations.
492    
493    ----------------------------------------------------------------------
494    Name: Matthias Blume
495    Date: 2002/03/13 14:00:00 EST
496    Tag: blume-20020313-overload-etc
497    Description:
498    
499    1. Added _overload as a synonym for overload for backward compatibility.
500       (Control.overloadKW must be true for either version to be accepted.)
501    
502    2. Fixed bug in install script that caused more things to be installed
503       than what was requested in config/targets.
504    
505    3. Made CM aware of the (_)overload construct so that autoloading
506       works.
507    
508    ----------------------------------------------------------------------
509    Name: Matthias Blume
510    Date: 2002/03/12 22:03:00 EST
511    Tag: blume-20020312-url
512    Description:
513    
514    Forgot to update BOOT and srcarchiveurl.
515    
516    ----------------------------------------------------------------------
517    Name: Matthias Blume
518    Date: 2002/03/12 17:30:00 EST
519    Tag: blume-20020312-version110392
520    Description:
521    
522    Yet another version number bump (because of small changes to the
523    binfile format).  Version number is now 110.39.2.  NEW BOOTFILES!
524    
525    Changes:
526    
527      The new pid generation scheme described a few weeks ago was overly
528      complicated.  I implemented a new mechanism that is simpler and
529      provides a bit more "stability":  Once CM has seen a compilation
530      unit, it keeps its identity constant (as long as you do not delete
531      those crucial CM/GUID/* files).  This means that when you change
532      an interface, compile, then go back to the old interface, and
533      compile again, you arrive at the original pid.
534    
535      There now also is a mechanism that instructs CM to use the plain
536      environment hash as a module's pid (effectively making its GUID
537      the empty string).  For this, "noguid" must be specified as an
538      option to the .sml file in question within its .cm file.
539      This is most useful for code that is being generated by tools such
540      as ml-nlffigen (because during development programmers tend to
541      erase the tool's entire output directory tree including CM's cached
542      GUIDs).  "noguid" is somewhat dangerous (since it can be used to locally
543      revert to the old, broken behavior of SML/NJ), but in specific cases
544      where there is no danger of interface confusion, its use is ok
545      (I think).
546    
547      ml-nlffigen by default generates "noguid" annotations.  They can be
548      turned off by specifying -guid in its command line.
549    
550    ----------------------------------------------------------------------
551    Name: Lal George
552    Date: 2002/03/12 14:42:36 EST
553    Tag: george-20020312-frequency-computation
554    Description:
555    
556    Integrated jump chaining and static block frequency into the
557    compiler. More details and numbers later.
558    
559    ----------------------------------------------------------------------
560    Name: Lal George
561    Date: 2002/03/11 22:38:53 EST
562    Tag: george-20020311-jump-chain-elim
563    Description:
564    
565    Tested the jump chain elimination on all architectures (except the
566    hppa).  This is on by default right now and is profitable for the
567    alpha and x86; however, it may not be profitable for the sparc and ppc
568    when compiling the compiler.
569    
570    The gc test will typically jump to a label at the end of the cluster,
571    where there is another jump to an external cluster containing the actual
572    code to invoke gc. This is to allow factoring of common gc invocation
573    sequences. That is to say, we generate:
574    
575            f:
576               testgc
577               ja   L1      % jump if above to L1
578    
579            L1:
580               jmp L2
581    
582    
583    After jump chain elimination the 'ja L1' instruction is converted to
584    'ja L2'. On the sparc and ppc, many of the 'ja L2' instructions may end
585    up being implemented in their long form (if L2 is far away) using:
586    
587            jbe     L3      % jump if below or equal to L3
588            jmp     L2
589         L3:
590            ...
591    
592    
593    For large compilation units L2  may be far away.
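
The retargeting step itself can be sketched as follows. Blocks that consist solely of an unconditional jump are modelled as (label, target) pairs. The function name resolve and this representation are hypothetical and do not reflect the MLRISC data structures.

```sml
(* Sketch only: follow chains of jump-only blocks to their final target. *)
fun resolve (jumps : (string * string) list) (l : string) : string =
  let
    (* target of lab if lab is a jump-only block *)
    fun next lab =
      case List.find (fn (src, _) => src = lab) jumps of
        SOME (_, tgt) => SOME tgt
      | NONE => NONE
    (* seen guards against cycles of jump-only blocks *)
    fun follow (lab, seen) =
      case next lab of
        SOME tgt =>
          if List.exists (fn s => s = tgt) seen then lab
          else follow (tgt, tgt :: seen)
      | NONE => lab
  in
    follow (l, [l])
  end

(* the example above: L1 contains only "jmp L2", so "ja L1" becomes "ja L2" *)
val target = resolve [("L1", "L2")] "L1"
```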
594    
595    
596    ----------------------------------------------------------------------
597    Name: Matthias Blume
598    Date: 2002/03/11 13:30:00 EST
599    Tag: blume-20020311-mltreeeval
600    Description:
601    
602    A functor parameter was missing.
603    
604    ----------------------------------------------------------------------
605    Name: Allen Leung
606    Date: 2002/03/11 10:30:00 EST
607    Tag: leunga-20020311-runtime-string0
608    Description:
609    
610       The representation of the empty string now points to a
611    legal null-terminated C string instead of unit.  It is now possible
612    to convert an ML string into a C string with InlineT.CharVector.getData.
613    This compiles into a single machine instruction.
614    
615    ----------------------------------------------------------------------
616    Name: Allen Leung
617    Date: 2002/03/10 23:55:00 EST
618    Tag: leunga-20020310-x86-call
619    Description:
620    
621       Added machine code generation for the CALL instruction (relative displacement mode)
622    
623    ----------------------------------------------------------------------
624    Name: Matthias Blume
625    Date: 2002/03/08 16:05:00
626    Tag: blume-20020308-entrypoints
627    Description:
628    
629    Version number bumped to 110.39.1.  NEW BOOTFILES!
630    
631    Entrypoints: non-zero offset into a code object where execution should begin.
632    
633    - Added the notion of an entrypoint to CodeObj.
634    - Added reading/writing of entrypoint info to Binfile.
635    - Made runtime system bootloader aware of entrypoints.
636    - Use the address of the label of the first function given to mlriscGen
637      as the entrypoint.  This address is currently always 0, but it will
638      not be 0 once we turn on block placement.
639    - Removed the linkage cluster code (which was The Other Way(tm) of dealing
640      with entry points) from mlriscGen.
641    
642    ----------------------------------------------------------------------
643    Name: Allen Leung
644    Date: 2002/03/07 20:45:00 EST
645    Tag: leunga-20020307-x86-cmov
646    Description:
647    
648       Bug fixes for CMOVcc on x86.
649    
650       1. Added machine code generation for CMOVcc
651       2. CMOVcc is now generated in preference over SETcc on PentiumPro or above.
652       3. CMOVcc cannot have an immediate operand as argument.
653    
654    ----------------------------------------------------------------------
655    Name: Matthias Blume
656    Date: 2002/03/07 16:15:00 EST
657    Tag: blume-20020307-controls
658    Description:
659    
660    This is a very large but mostly boring patch which makes (almost)
661    every tuneable compiler knob (i.e., pretty much everything under
662    Control.* plus a few other things) configurable via both the command
663    line and environment variables in the style CM did its configuration
664    until now.
665    
666    Try starting sml with '-h' (or, if you are brave, '-H')
667    
668    To this end, I added a structure Controls : CONTROLS to smlnj-lib.cm which
669    implements the underlying generic mechanism.
670    
671    The interface to some of the existing such facilities has changed somewhat.
672    For example, the MLRiscControl module now provides mkFoo instead of getFoo.
673    (The getFoo interface is still there for backward-compatibility, but its
674    use is deprecated.)
675    
676    The ml-build script passes -Cxxx=yyy command-line arguments through so
677    that one can now twiddle the compiler settings when using this "batch"
678    compiler.
679    
680    TODO items:
681    
682    We should go through and throw out all controls that are no longer
683    connected to anything.  Moreover, we should go through and provide
684    meaningful (and correct!) documentation strings for those controls
685    that still are connected.
686    
687    Currently, multiple calls to Controls.new are accepted (only the first
688    has any effect).  Eventually we should make sure that every control
689    is being made (via Controls.new) exactly once.  Future access can then
690    be done using Controls.acc.
691    
692    Finally, it would probably be a good idea to use the getter-setter
693    interface to controls rather than ref cells.  For the time being, both
694    styles are provided by the Controls module, but getter-setter pairs are
695    better if thread-safety is of any concern because they can be wrapped.
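
The point about wrapping can be sketched as follows. The type control and the functions fromRef and guarded are hypothetical illustrations and are not the interface of the Controls module; enter and leave stand in for whatever locking a threaded client would use, and exceptions are ignored for brevity.

```sml
(* Sketch only: a control as a getter/setter pair. *)
type 'a control = { get : unit -> 'a, set : 'a -> unit }

(* a ref cell can always be presented as a getter/setter pair *)
fun fromRef (r : 'a ref) : 'a control =
  { get = fn () => !r, set = fn v => r := v }

(* unlike a bare ref cell, a getter/setter pair can be wrapped *)
fun guarded (enter : unit -> unit, leave : unit -> unit)
            ({ get, set } : 'a control) : 'a control =
  { get = fn () => (enter (); get () before leave ()),
    set = fn v => (enter (); set v; leave ()) }

(* usage: count how often the wrapped control is entered *)
val flag = ref false
val entries = ref 0
val c = guarded (fn () => entries := !entries + 1, fn () => ()) (fromRef flag)
val () = #set c true
val now = #get c ()
```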
696    
697    *****************************************
698    
699    One bug fix: The function blockPlacement in three of the MLRISC
700    backpatch files used to be hard-wired to one of two possibilities at
701    link time (according to the value of the placementFlag).  But (I
702    think) it should rather sense the flag every time.
703    
704    *****************************************
705    
706    Other assorted changes (by other people who did not supply a HISTORY entry):
707    
708    1. the cross-module inliner now works much better (Monnier)
709    2. representation of weights, frequencies, and probabilities in MLRISC
710       changed in preparation of using those for weighted block placement
711       (Reppy, George)
712    
713    ----------------------------------------------------------------------
714    Name: Lal George
715    Date: 2002/03/07 14:44:24 EST
716    Tag: george-20020307-weighted-block-placement
717    
718    Tested the weighted block placement optimization on all architectures
719    (except the hppa) using AMPL to generate the block and edge frequencies.
720    Changes were required in the machine properties to correctly
721    categorize trap instructions. There is an MLRISC flag
722    "weighted-block-placement" that can be used to enable weighted block
723    placement, but this will be ineffective without block/edge
724    frequencies (coming soon).
725    
726    
727    ----------------------------------------------------------------------
728    Name: Lal George
729    Date: 2002/03/05 17:24:48 EST
730    Tag: george-20020305-linkage-cluster
731    
732    In order to support the block placement optimization, a new cluster
733    is generated as the very first cluster (called the linkage cluster).
734    It contains a single jump to the 'real' entry point for the compilation
735    unit. Block placement has no effect on the linkage cluster itself, but
736    all the other clusters  have full freedom in the manner in which they
737    reorder blocks or functions.
738    
739    On the x86 the typical linkage code that is generated is:
740       ----------------------
741            .align 2
742       L0:
743            addl    $L1-L0, 72(%esp)
744            jmp     L1
745    
746    
747            .align  2
748       L1:
749       ----------------------
750    
751    72(%esp) is the memory location for the stdlink register. This
752    must contain the address of the CPS function being called. In the
753    above example, it contains the address of  L0; before
754    calling L1 (the real entry point for the compilation unit), it
755    must contain the address for L1, and hence
756    
757            addl $L1-L0, 72(%esp)
758    
759    I have tested this on all architectures except the hppa.  The increase
760    in code size is, of course, negligible.
761    
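The stdlink adjustment in the linkage code above is plain address
arithmetic.  A minimal sketch, assuming two made-up addresses for L0 and
L1 (the names and values below are illustrative only, not taken from any
real compilation unit):

```sml
(* Hypothetical addresses of the two labels in the linkage cluster. *)
val addrL0 = 0wx1000 : word
val addrL1 = 0wx1008 : word

(* On entry, stdlink holds the address of the function being called (L0). *)
val stdlinkAtL0 = addrL0

(* Effect of "addl $L1-L0, 72(%esp)": add the label difference to stdlink. *)
val stdlinkAtL1 = stdlinkAtL0 + (addrL1 - addrL0)

(* stdlinkAtL1 now equals addrL1, as the real entry point expects. *)
val adjusted = (stdlinkAtL1 = addrL1)
```

Because only the difference L1-L0 appears in the instruction, the
adjustment does not depend on where the code ends up in memory.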
762    ----------------------------------------------------------------------
763    Name: Allen Leung
764    Date: 2002/03/03 13:20:00 EST
765    Tag: leunga-20020303-mlrisc-tools
766    
767      Added #[ ... ] expressions to mlrisc tools
768    
769    ----------------------------------------------------------------------
770    Name: Matthias Blume
771    Date: 2002/02/27 12:29:00 EST
772    Tag: blume-20020227-cdebug
773    Description:
774    
775    - made the types in structures C and C_Debug equal
776    - got rid of code duplication (c-int.sml vs. c-int-debug.sml)
777    - there no longer is a C_Int_Debug (C_Debug is directly derived from C)
778    
779    ----------------------------------------------------------------------
780    Name: Matthias Blume
781    Date: 2002/02/26 12:00:00 EST
782    Tag: blume-20020226-ffi
783    Description:
784    
785    1. Fixed a minor bug in CM's "noweb" tool:
786       If numbering is turned off, then truly don't number (i.e., do not
787       supply the -L option to noweb).  The previous behavior was to supply
788       -L'' -- which caused noweb to use the "default" line numbering scheme.
789       Thanks to Chris Richards for pointing this out (and supplying the fix).
790    
791    2. Once again, I reworked some aspects of the FFI:
792    
793       A. The incomplete/complete type business:
794    
795       - Signatures POINTER_TO_INCOMPLETE_TYPE and accompanying functors are
796         gone!
797       - ML types representing an incomplete type are now *equal* to
798         ML types representing their corresponding complete types (just like
799         in C).  This is still safe because ml-nlffigen will not generate
800         RTTI for incomplete types, nor will it generate functions that
801         require access to such RTTI.   But when ML code generated from both
802         incomplete and complete versions of the C type meet, the ML types
803         are trivially interoperable.
804    
805         NOTE:  These changes restore the full generality of the translation
806         (which was previously lost when I eliminated functorization)!
807    
808       B. Enum types:
809    
810       - Structure C now has a type constructor "enum" that is similar to
811         how the "su" constructor works.  However, "enum" is not a phantom
812         type because each "T enum" has values (and is isomorphic to
813         MLRep.Signed.int).
814       - There are generic access operations for enum objects (using
815         MLRep.Signed.int).
816       - ml-nlffigen will generate a structure E_foo for each "enum foo".
817         * The structure contains the definition of type "mlrep" (the ML-side
818         representation type of the enum).  Normally, mlrep is the same
819         as "MLRep.Signed.int", but if ml-nlffigen was invoked with "-ec",
820         then mlrep will be defined as a datatype -- thus facilitating
821         pattern matching on mlrep values.
822         ("-ec" will be suppressed if there are duplicate values in an
823          enumeration.)
824         * Constructors ("-ec") or values (no "-ec") e_xxx of type mlrep
825         will be generated for each C enum constant xxx.
826         * Conversion functions m2i and i2m convert between mlrep and
827         MLRep.Signed.int.  (Without "-ec", these functions are identities.)
828         * Conversion functions c and ml convert between mlrep and "tag enum".
829         * Access functions (get/set) fetch and store mlrep values.
830       - By default (unless ml-nlffigen was invoked with "-nocollect"), unnamed
831         enumerations are merged into one single enumeration represented by
832         structure E_'.
833    
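For orientation, here is a rough sketch of the shape such a structure
might take when ml-nlffigen is run with "-ec".  It is not actual tool
output: the C declaration "enum color { RED, GREEN = 5, BLUE }" is made
up, plain int stands in for MLRep.Signed.int so that the fragment is
self-contained, the choice of exception in i2m is an assumption, and the
c/ml conversions and get/set access functions are omitted.

```sml
(* Hypothetical sketch for: enum color { RED, GREEN = 5, BLUE };
   plain int is a stand-in for MLRep.Signed.int. *)
structure E_color = struct
    (* With "-ec", mlrep is a datatype with one constructor per constant. *)
    datatype mlrep = e_RED | e_GREEN | e_BLUE

    (* mlrep -> integer representation *)
    fun m2i e_RED = 0
      | m2i e_GREEN = 5
      | m2i e_BLUE = 6

    (* integer representation -> mlrep; raising Domain on an unknown
       value is an assumption of this sketch *)
    fun i2m 0 = e_RED
      | i2m 5 = e_GREEN
      | i2m 6 = e_BLUE
      | i2m _ = raise Domain
end

(* The datatype makes pattern matching on enum values possible. *)
fun describe E_color.e_RED = "red"
  | describe E_color.e_GREEN = "green"
  | describe E_color.e_BLUE = "blue"
```

Without "-ec", mlrep would simply be the integer type, e_RED etc. would
be ordinary values, and m2i/i2m would be identity functions.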
834    ----------------------------------------------------------------------
835    Name: Allen Leung
836    Date: 2002/02/25 04:45:00 EST
837    Tag: leunga-20020225-cps-spill
838    
839    This is a new implementation of the CPS spill phase.
840    The new phase is in the new file compiler/CodeGen/cpscompile/spill-new.sml
841    In case of problems, replace it with the old file spill.sml
842    
843    The current compiler runs into some serious performance problems when
844    constructing a large record.  This can happen when we try to compile a
845    structure with many items.  Even a very simple structure like the following
846    makes the compiler slow down.
847    
848        structure Foo = struct
849           val x_1 = 0w1 : Word32.word
850           val x_2 = 0w2 : Word32.word
851           val x_3 = 0w3 : Word32.word
852           ...
853           val x_N = 0wN : Word32.word
854        end
855    
856    The following table shows the compile time, from N=1000 to N=4000,
857    with the old compiler:
858    
859    N
860    1000   CPS 100 spill                           0.04u  0.00s  0.00g
861           MLRISC ra                               0.06u  0.00s  0.05g
862              (spills = 0 reloads = 0)
863           TOTAL                                   0.63u  0.07s  0.21g
864    
865    1100   CPS 100 spill                           8.25u  0.32s  0.64g
866           MLRISC ra                               5.68u  0.59s  3.93g
867              (spills = 0 reloads = 0)
868           TOTAL                                   14.71u  0.99s  4.81g
869    
870    1500   CPS 100 spill                           58.55u  2.34s  1.74g
871           MLRISC ra                               5.54u  0.65s  3.91g
872              (spills = 543 reloads = 1082)
873           TOTAL                                   65.40u  3.13s  6.00g
874    
875    2000   CPS 100 spill                           126.69u  4.84s  3.08g
876           MLRISC ra                               0.80u  0.10s  0.55g
877              (spills = 42 reloads = 84)
878           TOTAL                                   129.42u  5.10s  4.13g
879    
880    3000   CPS 100 spill                           675.59u  19.03s  11.64g
881           MLRISC ra                               2.69u  0.27s  1.38g
882              (spills = 62 reloads = 124)
883           TOTAL                                   682.48u  19.61s  13.99g
884    
885    4000   CPS 100 spill                           2362.82u  56.28s  43.60g
886           MLRISC ra                               4.96u  0.27s  2.72g
887              (spills = 85 reloads = 170)
888           TOTAL                                   2375.26u  57.21s  48.00g
889    
890    As you can see, the old cps spill module suffers from some serious
891    performance problems.  But since I cannot decipher the old code fully,
892    instead of patching the problems up, I'm reimplementing it
893    with a different algorithm.  The new code is more modular,
894    smaller when compiled, and substantially faster
895    (O(n log n) time and O(n) space).  Timing of the new spill module:
896    
897    4000  CPS 100 spill                           0.02u  0.00s  0.00g
898          MLRISC ra                               0.25u  0.02s  0.15g
899             (spills=1 reloads=3)
900          TOTAL                                   7.74u  0.34s  1.62g
901    
902    Implementation details:
903    
904    As far as I can tell, the purpose of the CPS spill module is to make sure the
905    number of live variables at any program point (the bandwidth)
906    does not exceed a certain limit, which is determined by the
907    size of the spill area.
908    
909    When the bandwidth is too large, we decrease the register pressure by
910    packing live variables into spill records.  How we achieve this is
911    completely different than what we did in the old code.
912    
913    First, there is something about the MLRiscGen code generator
914    that we should be aware of:
915    
916    o MLRiscGen performs code motion!
917    
918       In particular, it will move floating point computations and
919       address computations involving only the heap pointer to
920       their use sites (if there is only a single use).
921       What this means is that if we have a CPS record construction
922       statement
923    
924           RECORD(k,vl,w,e)
925    
926       we should never count the new record address w as live if w
927       has only one use (which is often the case).
928    
929       We should do something similar to floating point, but the transformation
930       there is much more complex, so I won't deal with that.
931    
932    Secondly, there are now two new cps primops at our disposal:
933    
934     1. rawrecord of record_kind option
935        This pure operator allocates some uninitialized storage from the heap.
936        There are two forms:
937    
938         rawrecord NONE [INT n]  allocates a tagless record of length n
939         rawrecord (SOME rk) [INT n] allocates a tagged record of length n
940                                     and initializes the tag.
941    
942     2. rawupdate of cty
943          rawupdate cty (v,i,x)
944          Assigns x to the ith component of record v.
945          The storelist is not updated.
946    
947    We use these new primops for both spilling and incremental record construction.
948    
949     1. Spilling.
950    
951        This is implemented with a linear scan algorithm (but generalized
952        to trees).  The algorithm will create a single spill record at the
953        beginning of the cps function and use rawupdate to spill to it,
954        and SELECT or SELp to reload from it.  So both spills and reloads
955        are fine-grain operations.  In contrast, in the old algorithm
956        "spills" have to be bundled together in records.
957    
958        Ideally, we should sink the spill record construction to where
959        it is needed.  We can even split the spill record into multiple ones
960        at the places where they are needed.  But CPS is not a good
961        representation for global code motion, so I'll keep it simple and
962        am not attempting this.
963    
964     2. Incremental record construction (aka record splitting).
965    
966        Long records with many component values which are simultaneously live
967        (recall that single use record addresses are not considered to
968         be live) are constructed with rawrecord and rawupdate.
969        We allocate space on the heap with rawrecord first, then gradually
970        fill it in with rawupdate.  This is the technique suggested to me
971        by Matthias.
972    
973        Some restrictions on when this is applicable:
974        1. It is not a VECTOR record.  The code generator currently does not handle
975           this case. VECTOR record uses double indirection like arrays.
976        2. All the record component values are defined in the same "basic block"
977           as the record constructor.  This is to prevent speculative
978           record construction.
979    
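The idea behind incremental record construction can be modelled at the
source level.  This is only an analogy, not the CPS transformation
itself: an int array stands in for the raw heap record, and the two
helper functions below are hypothetical stand-ins named after the new
primops.

```sml
(* Model only: an int array plays the role of the uninitialized record. *)
fun rawrecord n = Array.array (n, 0)
fun rawupdate (v, i, x) = Array.update (v, i, x)

(* Allocate the record first, then fill it in one field at a time.
   Each field value is computed and stored immediately, so at most one
   field value needs to be live at any point, instead of all n values
   being live just before a single RECORD construction. *)
fun buildIncrementally (n, field) =
    let val r = rawrecord n
        fun fill i =
            if i >= n then ()
            else (rawupdate (r, i, field i); fill (i + 1))
    in
        fill 0; r
    end

val r = buildIncrementally (4, fn i => i * i)
```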
980    ----------------------------------------------------------------------
981    Name: Allen Leung
982    Date: 2002/02/22 01:02:00 EST
983    Tag: leunga-20020222-mlrisc-tools
984    
985    Minor bug fixes in the parser and rewriter
986    
987    ----------------------------------------------------------------------
988    Name: Allen Leung
989    Date: 2002/02/21 20:20:00 EST
990    Tag: leunga-20020221-peephole
991    
992    Regenerated the peephole files.  Some contained typos in the specification
993    and some didn't compile because of pretty printing bugs in the old version
994    of 'nowhere'.
995    
996    ----------------------------------------------------------------------
997    Name: Allen Leung
998    Date: 2002/02/19 20:20:00 EST
999    Tag: leunga-20020219-mlrisc-tools
1000    Description:
1001    
1002       Minor bug fixes to the mlrisc-tools library:
1003    
1004       1.  Fixed up parsing colon suffixed keywords
1005       2.  Added the ability to shut the error messages up
1006       3.  Reimplemented the pretty printer and fixed up/improved
1007           the pretty printing of handle and -> types.
1008       4.  Fixed up generation of literal symbols in the nowhere tool.
1009       5.  Added some SML keywords to sml.sty
1010    
1011    ----------------------------------------------------------------------
1012    Name: Matthias Blume
1013    Date: 2002/02/19 16:20:00 EST
1014    Tag: blume-20020219-cmffi
1015    Description:
1016    
1017    A wild mix of changes, some minor, some major:
1018    
1019    * All C FFI-related libraries are now anchored under $c:
1020        $/c.cm      --> $c/c.cm
1021        $/c-int.cm  --> $c/internals/c-int.cm
1022        $/memory.cm --> $c/memory/memory.cm
1023    
1024    * "make" tool (in CM) now treats its argument pathname slightly
1025      differently:
1026        1. If the native expansion is an absolute name, then before invoking
1027           the "make" command on it, CM will apply OS.Path.mkRelative
1028           (with relativeTo = OS.FileSys.getDir()) to it.
1029        2. The argument will be passed through to subsequent phases of CM
1030           processing without "going native".  In particular, if the argument
1031           was an anchored path, then "make" will not lose track of that anchor.
1032    
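The path adjustment described in item 1 of the "make" tool change can
be sketched with the Basis library as follows.  The function name
makeArgument is hypothetical and this is not CM's actual code; it only
shows the OS.Path.mkRelative call with relativeTo set to the current
working directory.

```sml
(* Sketch only: makeArgument is a made-up name, not part of CM. *)
fun makeArgument nativeName =
    if OS.Path.isAbsolute nativeName
    then OS.Path.mkRelative { path = nativeName,
                              relativeTo = OS.FileSys.getDir () }
    else nativeName   (* relative names are passed through unchanged *)
```
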
1033    * Compiler backends now "know" their respective C calling conventions
1034      instead of having to be told about it by ml-nlffigen.  This relieves
1035      ml-nlffigen from one of its burdens.
1036    
1037    * The X86Backend has been split into X86CCallBackend and X86StdCallBackend.
1038    
1039    * Export C_DEBUG and C_Debug from $c/c.cm.
1040    
1041    * C type encoding in ml-nlffi-lib has been improved to model the conceptual
1042      subtyping relationship between incomplete pointers and their complete
1043      counterparts.  For this, ('t, 'c) ptr has been changed to 'o ptr --
1044      with the convention of instantiating 'o with ('t, 'c) obj whenever
1045      the pointer target type is complete.  In the incomplete case, 'o
1046      will be instantiated with some "'c iobj" -- a type obtained by
1047      using one of the functors PointerToIncompleteType or PointerToCompleteType.
1048    
1049      Operations that work on both incomplete and complete pointer types are
1050      typed as taking an 'o ptr while operations that require the target to
1051      be known are typed as taking some ('t, 'c) obj ptr.
1052    
1053      voidptr is now a bit "more concrete", namely "type voidptr = void ptr'"
1054      where void is an eqtype without any values.  This makes it possible
1055      to work on voidptr values using functions meant to operate on light
1056      incomplete pointers.
1057    
1058    * As a result of the above, signature POINTER_TO_INCOMPLETE_TYPE has
1059      been vastly simplified.
1060    
1061    ----------------------------------------------------------------------
1062    Name: Matthias Blume
1063    Date: 2002/02/19 10:48:00 EST
1064    Tag: blume-20020219-pqfix
1065    Description:
1066    
1067    Applied Chris Okasaki's bug fix for priority queues.
1068    
1069    ----------------------------------------------------------------------
1070    Name: Matthias Blume
1071    Date: 2002/02/15 17:05:00
1072    Tag: Release_110_39
1073    Description:
1074    
1075    Last-minute retagging is becoming a tradition... :-(
1076    
1077    This is the working release 110.39.
1078    
1079    ----------------------------------------------------------------------
1080    Name: Matthias Blume
1081    Date: 2002/02/15 16:00:00 EST
1082    Tag: Release_110_39-orig
1083    Description:
1084    
1085    Working release 110.39.  New bootfiles.
1086    
1087    (Update: There was a small bug in the installer so it wouldn't work
1088    with all shells.  So I retagged. -Matthias)
1089    
1090    ----------------------------------------------------------------------
1091    Name: Matthias Blume
1092    Date: 2002/02/15 14:17:00 EST
1093    Tag: blume-20020215-showbindings
1094    Description:
1095    
1096    Added EnvRef.listBoundSymbols and CM.State.showBindings.  Especially
1097    the latter can be useful for exploring what bindings are available at
1098    the interactive prompt.  (The first function returns only the list
1099    of symbols that are really bound, the second prints those but also the
1100    ones that CM's autoloading mechanism knows about.)
1101    
1102    ----------------------------------------------------------------------
1103    Name: Matthias Blume
1104    Date: 2002/02/15 12:08:00 EST
1105    Tag: blume-20020215-iptrs
1106    Description:
1107    
1108    Two improvements to ml-nlffigen:
1109    
1110      1. Write files only if they do not exist or if their current contents
1111         do not coincide with what's being written.  (That is, avoid messing
1112         with the time stamps unless absolutely necessary.)
1113    
1114      2. Implement a "repository" mechanism for generated files related
1115         to "incomplete pointer types".   See the README file for details.
1116    
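The first improvement amounts to comparing before writing.  A minimal
sketch of that idea, assuming hypothetical helper names (readAll and
writeIfChanged are not taken from the ml-nlffigen sources):

```sml
(* Read a whole file; NONE if it cannot be opened or read. *)
fun readAll path =
    let val s = TextIO.openIn path
        val contents = TextIO.inputAll s
    in
        TextIO.closeIn s; SOME contents
    end
    handle IO.Io _ => NONE

(* Write new to path only if the file is missing or its contents differ.
   Returns true if the file was written, false if it was left untouched
   (and its time stamp therefore preserved). *)
fun writeIfChanged (path, new) =
    if readAll path = SOME new then false
    else
        let val s = TextIO.openOut path
        in
            TextIO.output (s, new); TextIO.closeOut s; true
        end
```

Leaving unchanged files alone keeps their time stamps stable, so tools
that depend on them have no reason to redo work.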
1117    ----------------------------------------------------------------------
1118    Name: Matthias Blume
1119    Date: 2002/02/14 11:50:00 EST
1120    Tag: blume-20020214-quote
1121    Description:
1122    
1123    Added a type 't t_' to tag.sml (in ml-nlffi-lib.cm).  This is required
1124    because of the new and improved tag generation scheme.  (Thanks to Allen
1125    Leung for pointing it out.)
1126    
1127    ----------------------------------------------------------------------
1128    Name: Lal George
1129    Date: 2002/02/14 09:55:27 EST
1130    Tag: george-20020214-isabelle-bug
1131    Description:
1132    
1133    Fixed the MLRISC bug sent by Markus Wenzel regarding the compilation
1134    of Isabelle on the x86.
1135    
1136    From Allen:
1137    -----------
1138     I've found the problem:
1139    
1140         in ra-core.sml, I use the counter "blocked" to keep track of the
1141         true number of elements in the freeze queue.  When the counter goes
1142         to zero, I skip examining the queue.  But I've messed up the
1143         bookkeeping in combine():
1144    
1145             else ();
1146             case !ucol of
1147               PSEUDO => (if !cntv > 0 then
1148                     (if !cntu > 0 then blocked := !blocked - 1 else ();
1149                                        ^^^^^^^^^^^^^^^^^^^^^^^
1150                      moveu := mergeMoveList(!movev, !moveu)
1151                     )
1152                  else ();
1153    
1154         combine() is called to coalesce two nodes u and v.
1155         I think I was thinking that if the move counts of u and v are both
1156         greater than zero then after they are coalesced then one node is
1157         removed from the freeze queue.  Apparently I was thinking that
1158         both u and v are of low degree, but that's clearly not necessarily true.
1159    
1160    
1161    02/12/2002:
1162        Here's the patch.  HOL now compiles.
1163    
1164        I don't know how this impacts performance (compile
1165        time or runtime).  This bug caused the RA (especially on the x86)
1166        to go thru the potential spill phase when there are still nodes on the
1167        freeze queue.
1168    
1169    
1170    
1171    
1172    ----------------------------------------------------------------------
1173    Name: Matthias Blume
1174    Date: 2002/02/13 22:40:00 EST
1175    Tag: blume-20020213-fptr-rtti
1176    Description:
1177    
1178    Fixed a bug in ml-nlffigen that was introduced with one of the previous
1179    updates.
1180    
1181    ----------------------------------------------------------------------
1182    Name: Matthias Blume
1183    Date: 2002/02/13 16:41:00 EST
1184    Tag: blume-20020213-cmlpq
1185    Description:
1186    
1187    Added new priority queue export symbols (which have just been added to
1188    smlnj-lib.cm) to CML's version of smlnj-lib.cm.  (Otherwise CML would
1189    not compile and the installer would choke.)
1190    
1191    ----------------------------------------------------------------------
1192    Name: Matthias Blume
1193    Date: 2002/02/13 16:15:00 EST
1194    Tag: blume-20020213-various
1195    Description:
1196    
1197    1. More tweaks to ml-nlffigen:
1198    
1199       - better internal datastructures (resulting in slight speedup)
1200       - "-match" option requires exact match
1201       - "localized" gensym counters (untagged structs/unions nested within
1202         other structs/unions or within typedefs get a fresh counter; their
1203         tag will be prefixed by a concatenation of their parents' tags)
1204       - bug fixes (related to calculation of transitive closure of types
1205         to be included in the output)
1206    
1207    2. Minor Basis updates:
1208    
1209       - added implementations for List.collate and Option.app
1210    
1211    ----------------------------------------------------------------------
1212    Name: Matthias Blume
1213    Date: 2002/02/11 15:55:00 EST
1214    Tag: blume-20020211-gensym
1215    Description:
1216    
1217    Added a "-gensym" option to command line of ml-nlffigen.  This can be
1218    used to specify a "stem" -- a string that is inserted in all "gensym'd"
1219    names (ML structure names that correspond to unnamed C structs, unions,
1220    and enums), so that separate runs of ml-nlffigen do not clash.
1221    
1222    ----------------------------------------------------------------------
1223    Name: Matthias Blume
1224    Date: 2002/02/11 12:05:00 EST
1225    Tag: blume-20020211-gensml
1226    Description:
1227    
1228    A quick fix for a problem with GenSML (in the pgraph-util library):
1229    Make generation of toplevel "local" optional.  (Strictly speaking,
1230    signature definitions within "local" are not legal SML.)
1231    
1232    Other than that: updates to INSTALL and cm/TODO.
1233    
1234    ----------------------------------------------------------------------
1235    Name: Matthias Blume
1236    Date: 2002/02/08 15:00:00 EST
1237    Tag: blume-20020208-uniquepid
1238    Description:
1239    
1240    0. Version number has been bumped to 110.38.1.  NEW BOOTFILES!!!
1241    
1242    1. The installer (config/install.sh) has gotten smarter:
1243    
1244         - Configuration options are a bit easier to specify now
1245           (in config/targets).
1246         - Bug in recognizing .tar.bz2 files fixed.
1247         - Installer automatically resolves dependencies between
1248           configuration options (e.g., if you ask for eXene, you will
1249           also get cml -- regardless whether you asked for it or not).
1250         - Installer can run in "quieter mode" by setting the environment
1251           variable INSTALL_QUIETLY to "true".  "Quieter" does not mean
1252           "completely silent", though.
1253         - Build HashCons library as part of smlnj-lib.
1254    
1255    2. A new scheme for assigning persistent identifiers to compilation
1256       units (and, by extension, to types etc.) has been put into place.
1257       This fixes a long-standing bug where types and even dynamic values
1258       can get internally confused, thereby compromising type safety
1259       (abstraction) and dynamic correctness.  See
1260    
1261         http://cm.bell-labs.com/cm/cs/who/blume/pid-confusion.tgz
1262    
1263       for an example of how things could go wrong until now.
1264    
1265       The downside of the new scheme is that pids are not quite as
1266       persistent as they used to be: CM will generate a fresh pid
1267       for every compilation unit that it thinks it sees for the first
1268       time.  That means that if you compile starting from a clean, fresh
1269       source tree at two different times, you end up with different
1270       binaries.
1271    
1272       Cutoff recompilation, however, has not been compromised because
1273       CM keeps pid information in special caches between runs.
1274    
1275    ----------------------------------------------------------------------
1276    Name: Lal George
1277    Date: 2002/02/07 15:34:13 EST
1278    Tag: <none>
1279    Description:
1280    
1281    Compilers that generate assembly code may produce  global labels
1282    whose value is resolved at link time. The various peephole optimization
1283    modules did not take this into account.
1284    
1285    TODO. The Labels.addrOf function should really return an option
1286    type so that clients are forced to deal with this issue, rather
1287    than an exception being raised.
1288    
1289    ----------------------------------------------------------------------
1290    Name: Lal George
1291    Date: 2002/02/06 13:55:02 EST
1292    Tag: george-20020206-ra-breakup
1293    Description:
1294    
1295    1. A bug fix from Allen.
1296    
1297        A typo causes extra fstp %st(0)'s to be generated at compensation
1298        edges, which might cause stack underflow traps at runtime.  This
1299        occurs in fft where there are extraneous fstps right before the 'into'
1300        trap instruction (in this case they are harmless since none of the
1301        integers overflow.)
1302    
1303    2. Pulled out various utility modules that were embedded in the modules
1304       of the register allocator. I need these modules for other purposes, but
1305       they are not complete enough to put into a library (just yet).
1306    ----------------------------------------------------------------------
1307    Name: Matthias Blume
1308    Date: 2002/01/31 16:05:00 EST
1309    Tag: blume-20020131-sparc-ccalls
1310    Description:
1311    
1312    1. C-calls on Sparc needlessly allocated a huge chunk (96 bytes)
1313       of extra stack space by mistake.  Fixed.
1314    
1315    2. Bug in logic of handling of command-line options in ml-nlffigen fixed.
1316    
1317    ----------------------------------------------------------------------
1318    Name: Allen Leung
1319    Date: 2002/01/30
1320    Tag: leunga-20020130-nowhere-bug-fix
1321    Description:
1322    
1323       MLRISC bug fixes:
1324       1. Fixed a bindings computation bug in the 'nowhere' program generator tool.
1325       2. MachineInt.fromString was negating its value.
1326    
1327    ----------------------------------------------------------------------
1328    Name: Matthias Blume
1329    Date: 2002/01/29
1330    Tag: blume-20020129-INSTALL
1331    Description:
1332    
1333    - Added somewhat detailed installation instructions (file INSTALL).
1334    - Fixed curl-detection bug in config/install.sh.
1335    - It is now possible to select the URL getter using the URLGETTER
1336      environment variable:
1337    
1338          not set / "unknown"      --> automatic detection (script tries wget,
1339                                       curl, and lynx)
1340          "wget" / "curl" / "lynx" --> use the specified program (script "knows"
1341                                       how to properly invoke them)
1342          other                    --> use $URLGETTER directly, it must take
1343                                       precisely two command-line arguments
1344                                       (source URL and destination file name)
1345    
1346    ----------------------------------------------------------------------
1347    Name: Matthias Blume
1348    Date: 2002/01/28
1349    Tag: blume-20020128-sparc-ccalls
1350    Description:
1351    
1352    - Fixed problem with calculation of "used" registers in sparc-c-calls.
1353    - Make use of the allocParam argument in sparc-c-calls.
1354    
1355    ----------------------------------------------------------------------
1356    Name: Matthias Blume
1357    Date: 2002/01/28
1358    Tag: blume-20020128-allocParam
1359    Description:
1360    
1361    John Reppy:  Changes c-calls API to accept client-callback for
1362    allocating extra stack space.
1363    me: Corresponding changes to mlriscGen (using a dummy argument that
1364        does not change the current behavior).
1365    
1366    ----------------------------------------------------------------------
1367    Name: Matthias Blume
1368    Date: 2002/01/28 12:00:00
1369    Tag: Release_110_38
1370    Description:
1371    
1372    This time for real!!!
1373    
1374    ----------------------------------------------------------------------
1375    Name: Matthias Blume
1376    Date: 2002/01/28 10:56:00 EST
1377    Tag: blume-20020128-retraction
1378    Description:
1379    
1380    0. Retracted earlier 110.38.  (The Release_110_38 tag has been replaced
1381       with blume-Release_110_38-retracted.)
1382    
1383    1. Fixed a problem with incorrect rounding modes in real64.sml.
1384       (Thanks to Andrew Mccreight <andrew.mccreight@yale.edu>.)
1385    
1386    2. A bug in ml-nlffigen related to the handling of unnamed structs, unions,
1387       and enums fixed.  The naming of corresponding ML identifiers should
1388       now be consistent again.
1389    
1390    ----------------------------------------------------------------------
1391    Name: Allen Leung
1392    Date: 2002/01/27
1393    Tag: leunga-20020127-nowhere
1394    Description:
1395    
1396       Added a target called nowhere in the configuration scripts.
1397       Enabling this will build the MLRISC 'nowhere' tool (for translating
1398       programs with where-clauses into legal SML code) during installation.
1399    
1400    ----------------------------------------------------------------------
1401    Name: Matthias Blume
1402    Date: 2002/01/25 21:27:00 EST
1403    Tag: blume-Release_110_38-retracted
1404    Description:
1405    
1406    Call it a (working) release!  Version is 110.38. Bootfiles are ready.
1407    
1408    README will be added later.
1409    
1410    !!! NOTE:  Re-tagged as blume-Release_110_38-retracted. Original tag
1411    (Release_110_38) removed.  Reason: Last-minute bug fixes.
1412    
1413    ----------------------------------------------------------------------
1414    Name: Matthias Blume
1415    Date: 2002/01/25
1416    Tag: blume-20020125-ffi
1417    Description:
1418    
1419    A large number of tweaks and improvements to ml-nlffi-lib and
1420    ml-nlffigen:
1421    
1422       - ML representation types have been streamlined
1423       - getter and setter functions work with concrete values, not abstract
1424         ones where possible
1425       - ml-nlffigen command line more flexible (see README file there)
1426       - some bugs have been fixed (hopefully)
1427    
1428    ----------------------------------------------------------------------
1429    Name: Lal George
1430    Date: 2002/01/24
1431    Tag: george-20020124-risc-ra-interface
1432    Description:
1433    
1434       There is a dramatic simplification in the interface to the
1435       register allocator for RISC architectures as a result of making
1436       parallel copy instructions explicit.
1437    
1438    ----------------------------------------------------------------------
1439    Name: Matthias Blume
1440    Date: 2002/01/22
1441    Tag: blume-20020122-x86-ccalls
1442    Description:
1443    
1444    Bug fix for c-calls on x86 (having to do with how char- and
1445    short-arguments are being handled).
1446    
1447    ----------------------------------------------------------------------
1448    Name: Matthias Blume
1449    Date: 2002/01/21
1450    Tag: blume-20020121-ff
1451    Description:
1452    
1453    Another day of fiddling with the FFI...
1454    
1455    1. Bug fix/workaround:  CKIT does not complain about negative array
1456       dimensions, so ml-nlffigen has to guard itself against this possibility.
1457       (Otherwise a negative dimension would send it into an infinite loop.)
1458    
1459    2. Some of the abstract types (light objects, light pointers, most "base"
1460       types) in structure C are now eqtypes.
1461    
1462    3. Added constructors and test functions for NULL function pointers.
1463    
1464    ----------------------------------------------------------------------
1465    Name: Matthias Blume
1466    Date: 2002/01/18
1467    Tag: blume-20020118-ready-for-new-release
1468    Description:
1469    
1470    Made config/srcarchiveurl point to a new place.  (Will provide boot
1471    files shortly.)
1472    
1473    Maybe we christen this to be 110.38?
1474    
1475    ----------------------------------------------------------------------
1476    Name: Matthias Blume
1477    Date: 2002/01/18
1478    Tag: blume-20020118-more-ffifiddle
1479    Description:
1480    
1481    Today's FFI fiddling:
1482    
1483      - Provided a structure CGetSet with "convenient" versions of C.Get.* and
1484        C.Set.* that use concrete (MLRep.*) arguments and results instead
1485        of abstract ones.
1486    
1487      - Provided word-style bit operations etc. for "int" representation
1488        types in MLRep.S<Foo>Bitops where <Foo> ranges over Char, Int, Short,
1489        and Long.
1490    
1491    ----------------------------------------------------------------------
1492    Name: Matthias Blume
1493    Date: 2002/01/18
1494    Tag: blume-20020118-use-x86-fp
1495    Description:
1496    
1497    Now that x86-fast-fp seems to be working, I turned it back on again
1498    by default.  (Seems to work fine now, even with the FFI.)
1499    
1500    Other than that, I added some documentation about the FFI to
1501    src/ml-nlffigen/README and updated the FFI test examples in
1502    src/ml-nlffi-lib/Tests/*.
1503    
1504    ----------------------------------------------------------------------
1505    Name: Allen Leung
1506    Date: 2002/01/17
1507    Tag: leunga-20020117-x86-fast-fp-call
1508    Description:
1509    
1510       1. Fixed a problem with handling return fp values when x86's fast fp
1511          mode is turned on.
1512    
1513       2. Minor pretty printing fix for cellset.  Print %st(0) as %st(0) instead
1514          of %f32.
1515    
1516       3. Added a constructor INT32lit to the ast of MLRISC tools.
1517    
1518    ----------------------------------------------------------------------
1519    Name: Matthias Blume
1520    Date: 2002/01/16
1521    Tag: blume-20020116-ffifiddle
1522    Description:
1523    
1524    More fiddling with the FFI interface:
1525    
1526     - Make constness 'c instead of rw wherever possible.  This eliminates
1527       the need for certain explicit coercions.  (However, due to ML's
1528       value polymorphism, there will still be many cases where explicit
1529       coercions are necessary.  Phantom types are not the whole answer
1530       to modeling a subtyping relationship in ML.)
1531    
1532     - ro/rw coercions for pointers added.  (Avoids the detour through */&.)
1533    
1534     - "printf" test example added to src/ml-nlffi-lib/Tests.  (Demonstrates
1535       clumsy workaround for varargs problem.)
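The constness change above can be pictured with a small phantom-type sketch. Everything here is hypothetical: the structure Ptr and its functions new, get, set and ro are illustrative stand-ins, not the actual interface of structure C in ml-nlffi-lib. The point is that an operation which is polymorphic in the constness tag 'c (get) accepts both kinds of pointer without a coercion, while an operation fixed at rw (set) does not.

```sml
(* Hypothetical miniature of a pointer type with a phantom constness tag. *)
structure Ptr :> sig
  type ro
  type rw
  type 'c ptr                         (* 'c is the phantom constness tag *)
  val new : int -> rw ptr
  val get : 'c ptr -> int             (* reading works for any constness *)
  val set : rw ptr * int -> unit      (* writing requires rw *)
  val ro  : 'c ptr -> ro ptr          (* explicit coercion to read-only *)
end = struct
  type ro = unit                      (* opaque ascription keeps ro and rw distinct *)
  type rw = unit
  type 'c ptr = int ref
  fun new x = ref x
  fun get p = !p
  fun set (p, x) = p := x
  fun ro p = p
end

val p = Ptr.new 1
val () = Ptr.set (p, 42)
val q = Ptr.ro p                      (* q : Ptr.ro Ptr.ptr, so Ptr.set (q, 0) is rejected *)
val viaRw = Ptr.get p                 (* no coercion needed *)
val viaRo = Ptr.get q
```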
1536    
1537    ----------------------------------------------------------------------
1538    Name: Lal George
1539    Date: 2002/01/15
1540    Tag: <none>
1541    Description:
1542    
1543    1. Since COPY instructions are no longer native to the architecture,
1544       a generic functor can be used to implement the expandCopies function.
1545    
1546    2. Allowed EXPORT and IMPORT pseudo-op declarations to appear inside a
1547       TEXT segment.
1548    
1549    ----------------------------------------------------------------------
1550    Name: Matthias Blume
1551    Date: 2002/01/15
1552    Tag: blume-20020115-ffiupdates
1553    Description:
1554    
1555    1. Fix for bug resulting in single-precision float values being returned
1556       incorrectly from FFI calls.
1557    
1558    2. Small modifications to C FFI API:
1559    
1560        - memory-allocation routines return straight objects (no options)
1561          and raise an exception in out-of-memory situations
1562        - unsafe extensions to cast between function pointers and pointers
1563          from/to ints
1564        - added structure C_Debug as an alternative to structure C where
1565          pointer-dereferencing (|*| and |*!) always checks for null-pointers
1566        - added open_lib' to DynLinkage;  open_lib' works like open_lib
1567          but also takes a (possibly empty) list of existing library handles
1568          that the current library depends on
1569    
1570    ----------------------------------------------------------------------
1571    Name: Matthias Blume
1572    Date: 2002/01/10
1573    Tag: blume-20020110-newffigen
1574    Description:
1575    
1576    1. Updates to portable graph code.
1577    
1578    2. Major update to ml-nlffigen and ml-nlffi-lib.  Things are much
1579       more scalable now so that even huge interfaces such as the one
1580       for GTK compile in finite time and space. :-)
1581       See src/ml-nlffigen/README for details on what's new.
1582    
1583    ----------------------------------------------------------------------
1584    Name: Lal George
1585    Date: 2002/01/09 14:31:35 EST
1586    Tag: george-20011206-rm-native-copy
1587    Description:
1588    
1589            Removed the native COPY and FCOPY instructions
1590            from all the architectures and replaced them with the
1591            explicit COPY instruction from the previous commit.
1592    
1593            It is now possible to simplify many of the optimization
1594            modules that manipulate copies. This has not been
1595            done in this change.
1596    
1597    ----------------------------------------------------------------------
1598    Name: Lal George
1599    Date: 2001/12/06 16:50:13 EST 2001
1600    Tag: george-20011206-mlrisc-instruction
1601    Description:
1602    
1603    Changed the representation of instructions from being fully abstract
1604    to being partially concrete. That is to say:
1605    
1606      from
1607            type instruction
1608    
1609      to
1610            type instr                              (* machine instruction *)
1611    
1612            datatype instruction =
1613                LIVE of {regs: C.cellset, spilled: C.cellset}
1614              | KILL of {regs: C.cellset, spilled: C.cellset}
1615              | COPYXXX of {k: CB.cellkind, dst: CB.cell list, src: CB.cell list}
1616              | ANNOTATION of {i: instruction, a: Annotations.annotation}
1617              | INSTR of instr
1618    
1619    This makes the handling of certain special instructions that appear on
1620    all architectures easier and uniform.
1621    
1622    LIVE and KILL say that a list of registers is live or killed at the
1623    program point where the instruction appears. No spill code is generated
1624    when an element of the 'regs' field is spilled; instead the register is
1625    moved to the 'spilled' field (which is present more for debugging than
1625    anything else).
1626    
1627    LIVE replaces the (now deprecated) DEFFREG instruction on the alpha.
1628    We used to generate:
1629    
1630            DEFFREG f1
1631            f1 := f2 + f3
1632            trapb
1633    
1634    but now generate:
1635    
1636            f1 := f2 + f3
1637            trapb
1638            LIVE {regs=[f1,f2,f3], spilled=[]}
1639    
1640    Furthermore, the DEFFREG (hack) required that all floating point instructions
1641    use all registers mentioned in the instruction. Therefore f1 := f2 + f3
1642    defines f1 and uses [f1,f2,f3]! This hack is no longer required, resulting
1643    in a cleaner alpha implementation. (Hopefully, Intel will not get rid of
1644    this architecture).
1645    
1646    COPYXXX is intended to replace the parallel COPY and FCOPY available on
1647    all the architectures. This will result in further simplification of the
1648    register allocator that must be aware of them for coalescing purposes, and
1649    will also simplify certain aspects of the machine description that provides
1650    callbacks related to parallel copies.
1651    
1652    ANNOTATION should be obvious, and now INSTR represents the honest to God
1653    machine instruction set!
1654    
1655    The <arch>/instructions/<arch>Instr.sml files define certain utility
1656    functions for making porting easier -- essentially converting upper case
1657    to lower case. All machine instructions (of type instr) are in upper case,
1658    and the lower case form generates an MLRISC instruction. For example on
1659    the alpha we have:
1660    
1661      datatype instr =
1662         LDA of {r:cell, b:cell, d:operand}
1663       | ...
1664    
1665      val lda : {r:cell, b:cell, d:operand} -> instruction
1666        ...
1667    
1668    where lda is just (INSTR o LDA), etc.
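A minimal sketch of the wrapping described in this entry, reduced to two cases of each datatype. The definitions of cell and operand, the NOP constructor and the isPseudo function are hypothetical stand-ins for illustration; the real MLRISC types are richer.

```sml
(* Hypothetical stand-ins for the real MLRISC cell and operand types. *)
type cell = int
datatype operand = IMMED of int | REG of cell

(* Machine instructions use upper-case constructors. *)
datatype instr =
    LDA of {r: cell, b: cell, d: operand}
  | NOP

(* The partially concrete instruction type, cut down to two cases. *)
datatype instruction =
    LIVE of {regs: cell list, spilled: cell list}
  | INSTR of instr

(* Lower-case helpers wrap a machine instruction in INSTR. *)
val lda : {r: cell, b: cell, d: operand} -> instruction = INSTR o LDA
val nop : instruction = INSTR NOP

(* Architecture-independent code can match the shared cases directly. *)
fun isPseudo (LIVE _) = true
  | isPseudo (INSTR _) = false
```

With these definitions, lda {r=1, b=2, d=IMMED 0} builds the same value as INSTR (LDA {r=1, b=2, d=IMMED 0}).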
1669    
1670    ----------------------------------------------------------------------
1671    Name: Matthias Blume
1672    Date: 2001/11/22 21:40:00 EST
1673    Tag: Release_110_37
1674    Description:
1675    
1676    Release 110.37.  This time for real.
1677    
1678    ----------------------------------------------------------------------
1679    Name: Matthias Blume
1680    Date: 2001/11/21 16:35:00 EST
1681    Tag: blume-20011121-foot-in-mouth
1682    Description:
1683    
1684    Removed the "Release_110_37" tag because of a serious bug.
1685    This will be re-tagged once the bug is fixed.
1686    
1687    ----------------------------------------------------------------------
1688    Name: Matthias Blume
1689    Date: 2001/11/21 16:14:00 EST
1690    Tag: blume-20011121-forgottenfile
1691    Description:
1692    
1693    Forgot to add a file.  (Just a .tex-file -- part of
1694    the CM manual source.)
1695    
1696    ----------------------------------------------------------------------
1697    Name: Matthias Blume
1698    Date: 2001/11/21 16:10:00 EST
1699    Tag: blume-20011121-invalid_110_37
1700    Description:
1701    
1702    Note: I removed the original tag "Release_110_37" from this commit
1703          because we found a serious bug in all non-x86 backends.
1704          - Matthias
1705    
1706    1. Modifications to the SML/NJ code generator and to the runtime system
1707       so that code object name strings are directly inserted into code
1708       objects at code generation time.  The only business the runtime system
1709       has with this is now to read the name strings on occasion.
1710       (The encoding of the name string has also changed somewhat.)
1711    
1712    2. CM now implements a simple "set calculus" for specifying export lists.
1713       In particular, it is now possible to refer to the export lists of
1714       other libraries/groups/sources and form unions as well as differences.
1715       See the latest CM manual for details.
1716    
1717    3. A separate notion of "proxy" libraries has again been eliminated from
1718       CM's model.  (Proxy libraries are now simply a special case of using
1719       the export list calculus.)
1720    
1721    4. Some of the existing libraries now take advantage of the new set
1722       calculus.
1723       (Notice that not all libraries have been converted because some
1724       of the existing .cm-files are supposed to be backward compatible
1725       with 110.0.x.)
1726    
1727    5. Some cleanup in stand-alone programs.  (Don't use "exnMessage" -- use
1728       "General.exnMessage"!  The former relies on a certain hook to be
1729       initialized, and that often does not happen in the stand-alone case.)
1730    
1731    ----------------------------------------------------------------------
1732    Name: Lal George
1733    Date: 2001/11/21  13:56:18 EST
1734    Tag: george-2001121-pseudo-ops
1735    Description:
1736    
1737      Implemented a complete redesign of MLRISC pseudo-ops. Now there
1738      ought to never be any question of incompatibilities with
1739      pseudo-op syntax expected by host assemblers.
1740    
1741      For now, only modules supporting GAS syntax are implemented
1742      but more should follow, such as MASM, and vendor assembler
1743      syntax, e.g. IBM as, Sun as, etc.
1744    
1745    ----------------------------------------------------------------------
1746    Name: Matthias Blume
1747    Date: 2001/11/14 11:52:00 EST
1748    Tag: blume-20011114-srcname
1749    Description:
1750    
1751    1. Routed the name of the current source file to mlriscgen where it
1752       should be directly emitted into the code object.  (This last part
1753       is yet to be done.)
1754    
1755    2. Some cleanup of the pgraph code to make it match the proposal that
1756       I put out the other day.  (The proposal notwithstanding, things are
1757       still in flux here.)
1758    
1759    ----------------------------------------------------------------------
1760    Name: Lal George
1761    Date: 2001/11/14 09:44:04 EST
1762    Tag:
1763    Description:
1764    
1765      Fix for a backpatching bug reported by Allen.
1766    
1767      Because the boundary between short and long span-dependent
1768      instructions is +/- 128, there are an astounding number of
1769      span-dependent instructions whose size is over-estimated.
1770    
1771      Allen came up with the idea of letting the size of span
1772      dependent instructions be non-monotonic, for a maxIter
1773      number of times, after which the size must be monotonically
1774      increasing.
1775    
1776      This table shows the number of span-dependent instructions
1777      whose size was over-estimated as a function of maxIter, for the
1778      file Parse/parse/ml.grm.sml:
1779    
1780         maxIter            # of instructions:
1781            10                      687
1782            20                      438
1783            30                      198
1784            40                        0
1785    
1786      In compiling the compiler, there is no significant difference in
1787      compilation speed between maxIter=10 and maxIter=40. Actually,
1788      my measurements showed that maxIter=40 was a tad faster than
1789      maxIter=10! Also 96% of the files in the compiler reach a fixed
1790      point within 13 iterations, so fixing maxIter at 40, while high,
1791      is okay.
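The idea can be sketched on a toy model. This is an illustration under stated assumptions, not the MLRISC back-patching code: the item type, the layout function, the 2-byte short and 5-byte long jump sizes, and the choice to measure displacement from the end of the jump are all hypothetical. Sizes may shrink or grow freely during the first maxIter rounds; after that a size may only grow, which forces termination because every size is bounded.

```sml
(* Toy program: fixed-size items and jumps to another item's index. *)
datatype item = Fixed of int | Jump of int

fun layout (maxIter : int, items : item list) : int list =
  let
    val prog = Vector.fromList items
    val n = Vector.length prog
    (* Optimistic start: every jump is assumed to be short. *)
    fun initial (Fixed k) = k
      | initial (Jump _) = 2
    (* Start address of every item under the current sizes. *)
    fun addresses sizes =
      let
        fun go (i, a, acc) =
          if i = n then Vector.fromList (rev acc)
          else go (i + 1, a + Vector.sub (sizes, i), a :: acc)
      in
        go (0, 0, [])
      end
    fun step (iter, sizes) =
      let
        val addr = addresses sizes
        fun resize (i, old) =
          case Vector.sub (prog, i) of
              Fixed k => k
            | Jump t =>
                let
                  (* displacement from the end of the jump to its target *)
                  val d = Vector.sub (addr, t) - (Vector.sub (addr, i) + old)
                  val new = if ~128 <= d andalso d <= 127 then 2 else 5
                in
                  (* non-monotonic at first, monotonically increasing later *)
                  if iter < maxIter then new else Int.max (old, new)
                end
        val sizes' = Vector.mapi resize sizes
      in
        if sizes' = sizes then Vector.foldr (op ::) [] sizes
        else step (iter + 1, sizes')
      end
  in
    step (0, Vector.map initial prog)
  end

val near = layout (40, [Jump 2, Fixed 100, Fixed 1])
val far  = layout (40, [Jump 2, Fixed 200, Fixed 1])
```

In this model the jump in near stays short, while the jump in far has to skip 200 bytes and is widened to the long form.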
1792    
1793    ----------------------------------------------------------------------
1794    Name: Matthias Blume
1795    Date: 2001/10/31 15:25:00 EST
1796    Tag: blume-20011031-pgraph
1797    Description:
1798    
1799    CKIT:
1800    * Changed the "Function" constructor of type Ast.ctype to carry optional
1801      argument identifiers.
1802    * Changed the return type of TypeUtil.getFunction accordingly.
1803    * Type equality ignores the argument names.
1804    * TypeUtil.composite tries to preserve argument names but gives up quickly
1805      if there is a mismatch.
1806    
1807    installation script:
1808    * attempts to use "curl" if available (unless "wget" is available as well)
1809    
1810    CM:
1811    * has an experimental implementation of "portable graphs" which I will
1812      soon propose as an implementation-independent library format
1813    * there are also new libraries $/pgraph.cm and $/pgraph-util.cm
1814    
1815    NLFFI-LIB:
1816    * some cleanup (all cosmetic)
1817    
1818    NLFFIGEN:
1819    * temporarily disabled the mechanism that suppresses ML output for
1820      C definitions whose identifiers start with an underscore character
1821    * generate val bindings for enum constants
1822    * user can request that only one style (light or heavy) is being used;
1823      default is to use both (command-line arguments: -heavy and -light)
1824    * fixed bug in handling of function types involving incomplete pointers
1825    * generate ML entry points that take record arguments (i.e., using
1826      named arguments) for C functions that have a prototype with named
1827      arguments
1828      (see changes to CKIT)
1829    
1830    ----------------------------------------------------------------------
1831  Name: Allen Leung  Name: Allen Leung
1832  Date: 2001/10/27 20:34:00 EDT  Date: 2001/10/27 20:34:00 EDT
1833  Tag: leunga-20011027-x86-fast-fp-call  Tag: leunga-20011027-x86-fast-fp-call
# Line 109  Line 1924 
1924    
1925  ----------------------------------------------------------------------  ----------------------------------------------------------------------
1926  Name: Matthias Blume  Name: Matthias Blume
 >>>>>>> 1.169  
1927  Date: 2001/09/18 15:35:00 EDT  Date: 2001/09/18 15:35:00 EDT
1928  Tag: blume-20010918-readme11036  Tag: blume-20010918-readme11036
1929  Description:  Description:

