[smlnj] Diff of /sml/trunk/HISTORY

revision 1069, Fri Feb 15 21:00:05 2002 UTC revision 1208, Tue May 21 02:22:43 2002 UTC
# Line 14  Line 14 
14    
15  ----------------------------------------------------------------------  ----------------------------------------------------------------------
16  Name: Matthias Blume  Name: Matthias Blume
17  Date: 2002/02/15 16:00:00 EST  Date: 2002/05/20 22:25:00 EDT
18    Tag: blume-20020520-controls
19    Description:
20    
21    - Added ControlUtil structure to control-lib.cm.
22    - Use it throughout.
23    - Used Controls facility to define MLRISC controls (as opposed to
24      registering MLRISC control ref cells with Controls after the fact)
25    - Fixed messed-up controls priorities.
26    
27    * Removed again all the stuff from config/preloads that one wouldn't
28      be able to preload at the time the initial heap image is built.
29      (Many libraries, e.g., CML, do not exist yet at this time.  The
30       only libraries that can be preloaded via config/preloads are those
31       that come bundled with the bootfiles.)
32    
33    ----------------------------------------------------------------------
34    Name: Matthias Blume
35    Date: 2002/05/20 10:59:00 EDT
36    Tag: blume-20020520-preloads
37    Description:
38    
39    Added a lot of commented-out suggestions for things to be included
40    in config/preloads.
41    
42    ----------------------------------------------------------------------
43    Name: Allen Leung
44    Date: 2002/05/18 14:20:00 EDT
45    Tag: leunga-20020518-mdl
46    Description:
47    
48    o Made the mdl tool stuff compile and run again.
49    o I've disabled all the stuff that depends on RTL specifications; they
50      are all badly broken anyway.
51    
52    ----------------------------------------------------------------------
53    Name: Matthias Blume
54    Date: 2002/05/17 16:49:00 EDT
55    Tag: blume-20020517-controls
56    Description:
57    
58    0. John Reppy made several modifications to the SML/NJ library.
59       In particular, there is a shiny new controls-lib.cm.
60    
61    1. Pushed new controls interface through compiler so that everything
62       compiles again.
63    
64    2. Added FormatComb and FORMAT_COMB to the CML version of the
65       SML/NJ library (so that CML compiles again).
66    
67    3. Modified init scripts because XXX_DEFAULT environment variables
68       are no longer with us.  (Boot-time initialization is now done
69       using the same environment variables that are also used for
70       startup-time initialization of controls.)
71    
72    ----------------------------------------------------------------------
73    Name: Lal George
74    Date: 2002/05/15 09:20:10 EDT
75    Tag: george-20020515-pseudo-op-decls
76    Description:
77    
78            All pseudo-ops emitted before the first segment declaration
79    such as TEXT, DATA, and BSS directives are assumed to be global
80    declarations and are emitted first in the assembly file. This is
81    useful in a number of situations where one has pseudo-ops that are not
82    specific to any segment, and also works around the constraint that one
83    cannot have client pseudo-ops in the TEXT segment.
84    
85            Because no segment is associated with these declarations it is
86    an error to allocate any space or objects before the first segment
87    directive and an exception will be raised. However, we cannot make
88    this check for client pseudo-ops.
89    
90            These top level declarations are a field in the CFG graph_info.
91    In theory you can continue to add to this field after the CFG has been
92    built -- provided you know what you are doing;-)
93    
94    ----------------------------------------------------------------------
95    Name: Matthias Blume
96    Date: 2002/05/13 16:40:00 EDT
97    Tag: blume-20020513-pp-etc
98    Description:
99    
100    A few minor bugfixes:
101    
102      - Stopgap measure for bug recently reported by Elsa Gunter (ppDec).
103        (Bogus printouts for redefined bindings still occur.  Compiler
104        bug should no longer occur now.  We need to redo the prettyprinter
105        from scratch.)
106    
107      - CM pathname printer now also adds escape sequences for ( and )
108    
109      - comment and documentation fixes for ml-nlffi
110    
111    ----------------------------------------------------------------------
112    Name: Matthias Blume
113    Date: 2002/05/10 16:40:00 EDT
114    Tag: blume-20020510-erg-textio
115    Description:
116    
117    Applied the following bugfix provided by Emden Gansner:
118    
119        Output is corrupted when outputSubstr is used rather than output.
120    
121        The problem occurs when a substring
122    
123            ss = (s, dataStart, dataLen)
124    
125        where dataStart > 0, fills a stream buffer with avail bytes left.
126        avail bytes of s, starting at index dataStart, are copied into the
127        buffer, the buffer is flushed, and then the remaining dataLen-avail
128        bytes of ss are copied into the beginning of the buffer. Instead of
129        starting this copy at index dataStart+avail in s, the current code
130        starts the copy at index avail.
131    
132        Fix:
133        In text-io-fn.sml, change line 695 from
134             val needsFlush = copyVec(v, avail, dataLen-avail, buf, 0)
135        to
136             val needsFlush = copyVec(v, dataStart+avail, dataLen-avail, buf, 0)
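
The effect of the wrong start index can be reproduced in a small model. The sketch below is not the code from text-io-fn.sml: `copy_vec`, `write_substring`, `buf_size`, `used`, and the `fixed` switch are hypothetical names chosen for illustration, and only the argument order of `copy_vec` mirrors the `copyVec` call quoted above. It assumes the part of the substring left over after the flush fits into one buffer.

```python
def copy_vec(src, src_start, length, dst, dst_start):
    """Copy `length` items of src, starting at src_start, into dst at dst_start."""
    dst[dst_start:dst_start + length] = src[src_start:src_start + length]


def write_substring(s, data_start, data_len, buf_size, used, fixed=True):
    """Model writing the substring (s, data_start, data_len) through a buffer.

    `used` slots of the buffer are already occupied (by '.' filler here).
    Returns everything that reached the output, filler included.
    """
    buf = ["."] * used + [None] * (buf_size - used)
    avail = buf_size - used
    if data_len <= avail:
        # Everything fits; no flush is needed.
        copy_vec(s, data_start, data_len, buf, used)
        return "".join(buf[:used + data_len])
    out = []
    # Fill the remaining space and flush the full buffer.
    copy_vec(s, data_start, avail, buf, used)
    out.extend(buf)
    # Copy the rest to the start of the buffer.  The buggy version
    # starts reading at index avail instead of data_start + avail.
    rest_start = data_start + avail if fixed else avail
    copy_vec(s, rest_start, data_len - avail, buf, 0)
    out.extend(buf[:data_len - avail])
    return "".join(out)
```

With `data_start = 0` both variants agree, which is consistent with the report that the corruption only shows up for substrings that do not start at the beginning of the underlying string.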
137    
138    ----------------------------------------------------------------------
139    Name: Matthias Blume
140    Date: 2002/04/12 13:55:00 EDT
141    Tag: blume-20020412-assyntax
142    Description:
143    
144    1. Grabbed newer assyntax.h from the XFree86 project.
145    2. Fiddled with how to compile X86.prim.asm without warnings.
146    3. (Very) Minor cleanup in CM.
147    
148    ----------------------------------------------------------------------
149    Name: Matthias Blume
150    Date: 2002/04/01 (no joke!) 17:07:00 EST
151    Tag: blume-20020401-x86div
152    Description:
153    
154    Added full support for div/mod/rem/quot on the x86, using the machine
155    instruction's two results (without clumsily recomputing the remainder)
156    directly where appropriate.
157    
158    Some more extensive power-of-two support was added to the x86 instruction
159    selector (avoiding expensive divs, mods, and muls where they can be
160    replaced with cheaper shifts and masks).  However, this sort of thing
161    ought to be done earlier, e.g., within the CPS optimizer so that
162    all architectures benefit from it.
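
The kind of replacement meant here can be sketched with the usual identities for powers of two. This is an illustration only, not the x86 instruction selector's code; the function names are hypothetical, and Python's unbounded integers stand in for machine words, so overflow is not modelled.

```python
def is_power_of_two(n):
    """True if n is 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def div_pow2(x, k):
    """x div 2**k (rounding toward negative infinity) as an arithmetic right shift."""
    return x >> k


def mod_pow2(x, k):
    """x mod 2**k (result in 0 .. 2**k - 1) as a bit mask."""
    return x & ((1 << k) - 1)


def mul_pow2(x, k):
    """x * 2**k as a left shift."""
    return x << k
```

The shift and mask match div and mod directly because both round toward negative infinity; quot and rem round toward zero and would need an extra adjustment for negative dividends.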
163    
164    The compiler compiles to a fixed point, but changes might be somewhat
165    fragile nevertheless.  Please, report any strange things that you might
166    see wrt. div/mod/quot/rem...
167    
168    ----------------------------------------------------------------------
169    Name: Matthias Blume
170    Date: 2002/03/29 17:22:00
171    Tag: blume-20020329-div
172    Description:
173    
174    Fixed my broken div/mod logic.  Unfortunately, this means that the
175    inline code for div/mod now has one more comparison than before.
176    Fast paths (quotient > 0 or remainder = 0) are not affected, though.
177    The problem was with quotient = 0, because that alone does not tell
178    us which way the rounding went.  One then has to look at whether
179    remainder and divisor have the same sign...  :(
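
The quotient = 0 problem can be checked with a small model. The sketch below is not the compiler's inline code: `quot_rem`, `div_mod_quotient_test`, and `div_mod` are hypothetical names. The first decision rule is the one listed in the blume-20020325-divmod entry of this file; the second looks at the signs of remainder and divisor, as described above.

```python
def quot_rem(x, y):
    """Round-to-zero division, as hardware usually provides it."""
    q = abs(x) // abs(y)
    if (x < 0) != (y < 0):
        q = -q
    return q, x - q * y


def div_mod_quotient_test(x, y):
    """Earlier rule: decide from the sign of the quotient alone."""
    q, r = quot_rem(x, y)
    d = q if (q >= 0 or x == q * y) else q - 1
    m = r if (q >= 0 or r == 0) else r + y
    return d, m


def div_mod(x, y):
    """Adjust when the remainder is nonzero and its sign differs from the divisor's."""
    q, r = quot_rem(x, y)
    if r != 0 and ((r < 0) != (y < 0)):
        return q - 1, r + y
    return q, r
```

For x = ~1 and y = 2 the hardware quotient is 0 and the remainder is ~1. The earlier rule keeps (0, ~1), while rounding toward negative infinity requires (~1, 1). Whenever the quotient is nonzero, the two rules agree.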
180    
181    Anyway, I replaced the bootfiles with fresh ones...
182    
183    ----------------------------------------------------------------------
184    Name: Matthias Blume
185    Date: 2002/03/29 14:10:00 EST
186    Tag: blume-20020329-inlprims
187    Description:
188    
189    NEW BOOTFILES!!!    Version number bumped to 110.39.3.
190    
191    Primops have changed. This means that the bin/boot-file formats have
192    changed as well.
193    
194    To make sure that there is no confusion, I made a new version.
195    
196    
197    CHANGES:
198    
199    * removed REMT from mltree (remainder should never overflow).
200    
201    * added primops to deal with divisions of all flavors to the frontend
202    
203    * handled these primops all the way through so they map to their respective
204      MLRISC support
205    
206    * used these primops in the implementation of Int, Int32, Word, Word32
207    
208    * removed INLDIV, INLMOD, and INLREM as they are no longer necessary
209    
210    * parameterized INLMIN, INLMAX, and INLABS by a numkind
211    
212    * translate.sml now deals with all flavors of INL{MIN,MAX,ABS}, including
213      floating point
214    
215    * used INL{MIN,MAX,ABS} in the implementation of Int, Int32, Word, Word32,
216      and Real (but Real.abs maps to a separate floating-point-only primop)
217    
218    
219    TODO items:
220    
221    * Hacked Alpha32 instruction selection, disabling the selection of REMx
222      instructions because the machine instruction encoder cannot handle
223      them.  (Hppa, PPC, and Sparc instruction selection did not handle
224      REM in the first place, and REM is supported by the x86 machine coder.)
225    
226    * Handle DIV and MOD with DIV_TO_NEGINF directly in the x86 instruction
227      selection phase.  (The two can be streamlined because the hardware
228      delivers both quotient and remainder at the same time anyway.)
229    
230    * Think about what to do with "valOf(Int32.minInt) div ~1" and friends.
231      (Currently the behavior is inconsistent both across architectures and
232      wrt. the draft Basis spec.)
233    
234    * Word8 should eventually be handled natively, too.
235    
236    * There seems to be one serious bug in mltree-gen.sml.  It appears, though,
237      as if there currently is no execution path that could trigger it in
238      SML/NJ.  (The assumptions underlying functions arith and promotable do not
239      hold for things like multiplication and division.)
240    
241    ----------------------------------------------------------------------
242    Name: Matthias Blume
243    Date: 2002/03/27 16:27:00 EST
244    Tag: blume-20020327-mlrisc-divisions
245    Description:
246    
247    Added support for all four division operations (ML's div, mod, quot,
248    and rem) to MLRISC.  In the course of doing so, I also rationalized
249    the naming (no more annoying switch-around of DIV and QUOT), by
250    parameterizing the operation by div_rounding_mode (which can be either
251    DIV_TO_ZERO or DIV_TO_NEGINF).
252    
253    The generic MLTreeGen functor takes care of compiling all four
254    operations down to only round-to-zero div.
255    
256    Missing pieces:
257    
258      * Doing something smarter than relying on MLTreeGen on architectures
259        like, e.g., the x86 where hardware division delivers both quotient and
260        remainder at the same time.  With this, the implementation of the
261        round-to-neginf operations could be further streamlined.
262    
263      * Remove inlining support for div/mod/rem from the frontend and replace it
264        with primops that get carried through to the backend.  Do this for all
265        int and word types.
266    
267    ----------------------------------------------------------------------
268    Name: Matthias Blume
269    Date: 2002/03/25 17:25:00 EST
270    Tag: blume-20020325-divmod
271    Description:
272    
273    I improved (hopefully without breaking them) the implementation of Int.div,
274    Int.mod, and Int.rem.   For this, the code in translate.sml now takes
275    advantage of the following observations:
276    
277      Let  q = x quot y      r = x rem y
278           d = x div  y      m = x mod y
279    
280    where "quot" is the round-to-zero version of integer division that
281    hardware usually provides.  Then we have:
282    
283         r = x - q * y        where neither the * nor the - will overflow
284         d = if q >= 0 orelse x = q * y then q else q - 1
285                              where neither the * nor the - will overflow
286         m = if q >= 0 orelse r = 0 then r else r + y
287                              where the + will not overflow
288    
289    This results in substantial simplification of the generated code.
290    The following table shows the number of CFG nodes and edges generated
291    for
292            fun f (x, y) = x OPER y
293            (* with OPER \in div, mod, quot, rem *)
294    
295    
296        OPER | nodes(old) | edges(old) | nodes(new) | edges(new)
297        --------------------------------------------------------
298         div |         24 |         39 |         12 |         16
299         mod |         41 |         71 |         12 |         16
300        quot |          8 |         10 |          8 |         10
301         rem |         10 |         14 |          8 |         10
302    
303    
304    ----------------------------------------------------------------------
305    Name: Matthias Blume
306    Date: 2002/03/25 22:06:00 EST
307    Tag: blume-20020325-cprotobug
308    Description:
309    
310    Fixed a bug in cproto (c prototype decoder).
311    
312    ----------------------------------------------------------------------
313    Name: Matthias Blume
314    Date: 2002/03/25 16:00:00 EST
315    Tag: blume-20020325-raw-primops
316    Description:
317    
318    I did some cleanup to Allen's new primop code and
319    replaced yesterday's bootfiles with new ones.
320    (But they are stored in the same place.)
321    
322    ----------------------------------------------------------------------
323    Name: Matthias Blume
324    Date: 2002/03/24 22:40:00 EST
325    Tag: blume-20020324-bootfiles
326    Description:
327    
328    Made the bootfiles that Allen asked for.
329    
330    ----------------------------------------------------------------------
331    Name: Allen Leung
332    Date: 2002/03/23 15:50:00 EST
333    Tag: leunga-20020323-flint-cps-rcc-primops
334    Description:
335    
336      1. Changes to FLINT primops:
337    
338        (* make a call to a C-function;
339         * The primop carries C function prototype information and specifies
340         * which of its (ML-) arguments are floating point. C prototype
341         * information is for use by the backend, ML information is for
342         * use by the CPS converter. *)
343      | RAW_CCALL of { c_proto: CTypes.c_proto,
344                       ml_args: ccall_type list,
345                       ml_res_opt: ccall_type option,
346                       reentrant : bool
347                     } option
348       (* Allocate uninitialized storage on the heap.
349        * The record is meant to hold short-lived C objects, i.e., they
350        * are not ML pointers.  With the tag, the representation is
351        * the same as RECORD with tag tag_raw32 (sz=4), or tag_fblock (sz=8)
352        *)
353      | RAW_RECORD of {tag:bool,sz:int}
354      and ccall_type = CCALL_INT32 | CCALL_REAL64 | CCALL_ML_PTR
355    
356      2.  These CPS primops are now overloaded:
357    
358           rawload of {kind:numkind}
359           rawstore of {kind:numkind}
360    
361          The one argument form is:
362    
363             rawload {kind} address
364    
365          The two argument form is:
366    
367             rawload {kind} [ml object, byte-offset]
368    
369      3. RAW_CCALL/RCC now takes two extra arguments:
370    
371         a. The first is whether the C call is reentrant, i.e., whether
372            ML state should be saved and restored.
373         b. The second argument is a string argument specifying the name of
374            library and the C function.
375    
376         These things are not yet handled in the code generator.
377    
378      4. In CProto,
379    
380         An encoding type of "bool" means "ml object" and is mapped into
381         C prototype of PTR.  Note that "bool" is different than "string",
382         even though "string" is also mapped into PTR, because "bool"
383         is assigned a CPS type of BOGt, while "string" is assigned INT32t.
384    
385      5. Pickler/unpickler
386    
387         Changed to handle RAW_RECORD and newest RAW_CCALL
388    
389      6. MLRiscGen,
390    
391         1. Changed to handle the new rawload/rawstore/rawrecord operators.
392         2. Code for handling C Calls has been moved to a new module CPSCCalls,
393            in the file CodeGen/cpscompile/cps-c-calls.sml
394    
395      7. Added the conditional move operator
396    
397             condmove of branch
398    
399         to cps.  Generation of this is still buggy so it is currently
400         disabled.
401    
402    ----------------------------------------------------------------------
403    Name: Lal George
404    Date: 2002/03/22 14:18:25 EST
405    Tag: george-20020322-cps-branch-prob
406    Description:
407    
408    Implemented the Ball-Larus branch prediction-heuristics, and
409    incorporated graphical viewers for control flow graphs.
410    
411    Ball-Larus Heuristics:
412    ---------------------
413    See the file compiler/CodeGen/cpscompile/cpsBranchProb.sml.
414    
415    By design it uses the Dempster-Shafer theory for combining
416    probabilities.  For example, in the function:
417    
418        fun f(n,acc) = if n = 0 then acc else f(n-1, n*acc)
419    
420    the Ball-Larus heuristics predict that n=0 is unlikely
421    (OH-heuristic), and the 'then' branch is unlikely because of the
422    RH-heuristic -- giving the 'then' branch an even lower combined
423    probability using the Dempster-Shafer theory.
424    
425    Finally, John Reppy's loop analysis in MLRISC further lowers the
426    probability of the 'then' branch because of the loop in the else
427    branch.
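
The way two "unlikely" predictions reinforce each other can be sketched with the standard Dempster-Shafer combination rule for two independent estimates of the same branch. The function name and the probabilities used below are illustrative assumptions, not the values in cpsBranchProb.sml.

```python
def combine(p1, p2):
    """Dempster-Shafer combination of two estimates that a branch is taken.

    Both arguments are probabilities for the same branch; the result is
    undefined (division by zero) if one is 0 and the other is 1.
    """
    taken = p1 * p2
    not_taken = (1.0 - p1) * (1.0 - p2)
    return taken / (taken + not_taken)
```

If one heuristic gives the 'then' branch a probability of 0.16 and another gives it 0.28 (made-up numbers), the combined estimate is lower than either input, while an estimate of 0.5 carries no information and leaves the other unchanged.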
428    
429    
430    Graphical Viewing:
431    ------------------
432    I merely plugged Allen's graphical viewers into the compiler. The
433    additional code is not much. At the top level, saying:
434    
435            Control.MLRISC.getFlag "cfg-graphical-view" := true;
436    
437    will display the graphical view of the control flow graph just before
438    back-patching.  daVinci must be in your path for this to work. If
439    daVinci is not available, then the default viewer can be changed
440    using:
441    
442            Control.MLRISC.getString "viewer"
443    
444    which can be set to "dot" or "vcg" for the corresponding viewers. Of
445    course, these viewers must be in your path.
446    
447    The above will display the compilation unit at the level of clusters,
448    many of which are small, boring, and un-interesting. Also setting:
449    
450            Control.MLRISC.getInt "cfg-graphical-view_size"
451    
452    will display only clusters that are larger than that value.
453    
454    
455    ----------------------------------------------------------------------
456    Name: Matthias Blume
457    Date: 2002/03/21 22:20:00 EST
458    Tag: blume-20020321-kmp-bugfix
459    Description:
460    
461    Changed the interface to the KMP routine in PreString and fixed
462    a minor bug in one place where it was used.
463    
464    ----------------------------------------------------------------------
465    Name: Allen Leung
466    Date: 2002/03/21 20:30:00 EST
467    Tag: leunga-20020321-cfg
468    Description:
469    
470      Fixed a potential problem in cfg edge splitting.
471    
472    ----------------------------------------------------------------------
473    Name: Allen Leung
474    Date: 2002/03/21 17:15:00 EST
475    Tag: leunga-20020321-x86-fp-cfg
476    Description:
477    
478      1. Recoded the buggy parts of x86-fp.
479    
480         a. All the block reordering code has been removed.
481            We now depend on the block placement phases to do this work.
482    
483         b. Critical edge splitting code has been simplified and moved into the
484            CFG modules, where it belongs.
485    
486         Both of these were quite buggy and complex.  The code is now much, much
487         simpler.
488    
489      2. X86 backend.
490    
491         a. Added instructions for 64-bit support.  Instruction selection for
492            64-bit has not been committed, however, since that
493            requires changes to MLTREE which haven't been approved by
494            Lal and John.
495    
496         b. Added support for FUCOMI and FUCOMIP when generating code for
497            PentiumPro and above.  We only generate these instructions in
498            the fast-fp mode.
499    
500         c. Added cases for JP and JNP in X86FreqProps.
501    
502      3. CFG
503    
504         CFG now has a bunch of methods for edge splitting and merging.
505    
506      4. Machine description.
507    
508         John's simplification of MLTREE_BASIS.fcond broke a few machine
509         description things:
510    
511         rtl-build.{sig,sml} and hppa.mdl fixed.
512    
513         NOTE: the machine description stuff in the repository is still broken.
514               Again, I can't put my fixes in because that involves
515               changes to MLTREE.
516    
517    ----------------------------------------------------------------------
518    Name: Matthias Blume
519    Date: 2002/03/20 15:55:00 EST
520    Tag: blume-20020320-kmp
521    Description:
522    
523    Implemented Knuth-Morris-Pratt string matching in PreString and used
524    it for String.isSubstring, Substring.isSubstring, and
525    Substring.position.
526    
527    (Might need some stress-testing.  Simple examples worked fine.)
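
For reference, the textbook form of the algorithm looks as follows. This is a generic sketch, not the PreString code; `failure_table`, `kmp_find`, and `is_substring` are hypothetical names, and the interface (returning an index or -1) is an assumption made for the example.

```python
def failure_table(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_find(pattern, text):
    """Index of the first occurrence of pattern in text, or -1 if there is none."""
    if not pattern:
        return 0
    fail = failure_table(pattern)
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]  # fall back without re-reading text
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1


def is_substring(pattern, text):
    """True if pattern occurs somewhere in text."""
    return kmp_find(pattern, text) != -1
```

The text is scanned once from left to right; on a mismatch only the pattern position falls back, which is what keeps the running time linear in the combined length of pattern and text.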
528    
529    ----------------------------------------------------------------------
530    Name: Matthias Blume
531    Date: 2002/03/19 16:37:00 EST
532    Tag: blume-20020319-witnesses
533    Description:
534    
535    Added a structure C.W and functions convert/Ptr.convert to ml-nlffi-lib.
536    
537    This implements a generic mechanism for changing constness qualifiers
538    anywhere within big C types without resorting to outright "casts".
539    (So far, functions such as C.rw/C.ro or C.Ptr.rw/C.Ptr.ro only let you
540    modify the constness at the outermost level.)
541    The implementation of "convert" is based on the idea of "witness"
542    values -- values that are not used by the operation but whose types
543    "testify" to their applicability.  On the implementation side, "convert"
544    is simply a projection (returning its second curried argument).  With
545    cross-module inlining, it should not result in any machine code being
546    generated.
547    
548    ----------------------------------------------------------------------
549    Name: Matthias Blume
550    Date: 2002/03/15 16:40:00 EST
551    Tag: blume-20020315-basis
552    Description:
553    
554    Provided (preliminary?) implementations for
555    
556      {String,Substring}.{concatWith,isSuffix,isSubstring}
557    
558    and
559    
560      Substring.full
561    
562    Those are in the Basis spec but they were missing in SML/NJ.
563    
564    ----------------------------------------------------------------------
565    Name: Matthias Blume
566    Date: 2002/03/14 21:30:00 EST
567    Tag: blume-20020314-controls
568    Description:
569    
570    Controls:
571    ---------
572    
573    1. Factored out the recently-added Controls : CONTROLS stuff and put
574       it into its own library $/controls-lib.cm.  The source tree for
575       this is under src/smlnj-lib/Controls.
576    
577    2. Changed the names of types and functions in this interface, so they
578       make a bit more "sense":
579    
580          module -> registry
581          'a registry -> 'a group
582    
583    3. The interface now deals in ref cells only.  The getter/setter interface
584       is (mostly) gone.
585    
586    4. Added a function that lets one register an already-existing ref cell.
587    
588    5. Made the corresponding modifications to the rest of the code so that
589       everything compiles again.
590    
591    6. Changed the implementation of Controls.MLRISC back to something closer
592       to the original.  In particular, this module (and therefore MLRISC)
593       does not depend on Controls.  There now is some link-time code in
594       int-sys.sml that registers the MLRISC controls with the Controls
595       module.
596    
597    CM:
598    ---
599    
600      * One can now specify the lambda-split aggressiveness in init.cmi.
601    
602    ----------------------------------------------------------------------
603    Name: Allen Leung
604    Date: 2002/03/13 17:30:00 EST
605    Tag: leunga-20020313-x86-fp-unary
606    Description:
607    
608    Bug fix for:
609    
610    > leunga@weaselbane:~/Yale/tmp/sml-dist{21} bin/sml
611    > Standard ML of New Jersey v110.39.1 [FLINT v1.5], March 08, 2002
612    > - fun f(x,(y,z)) = Real.~ y;
613    > [autoloading]
614    > [autoloading done]
615    >       fchsl   (%eax), 184(%esp)
616    > Error: MLRisc bug: X86MCEmitter.emitInstr
617    >
618    > uncaught exception Error
619    >   raised at: ../MLRISC/control/mlriscErrormsg.sml:16.14-16.19
620    
621    The problem was that the code generator did not generate any fp registers
622    in this case, and the register allocator didn't know that it needed to run the X86FP phase to
623    translate the pseudo fp instruction.   This only happened with unary fp
624    operators in certain situations.
625    
626    ----------------------------------------------------------------------
627    Name: Matthias Blume
628    Date: 2002/03/13 14:00:00 EST
629    Tag: blume-20020313-overload-etc
630    Description:
631    
632    1. Added _overload as a synonym for overload for backward compatibility.
633       (Control.overloadKW must be true for either version to be accepted.)
634    
635    2. Fixed bug in install script that caused more things to be installed
636       than what was requested in config/targets.
637    
638    3. Made CM aware of the (_)overload construct so that autoloading
639       works.
640    
641    ----------------------------------------------------------------------
642    Name: Matthias Blume
643    Date: 2002/03/12 22:03:00 EST
644    Tag: blume-20020312-url
645    Description:
646    
647    Forgot to update BOOT and srcarchiveurl.
648    
649    ----------------------------------------------------------------------
650    Name: Matthias Blume
651    Date: 2002/03/12 17:30:00 EST
652    Tag: blume-20020312-version110392
653    Description:
654    
655    Yet another version number bump (because of small changes to the
656    binfile format).  Version number is now 110.39.2.  NEW BOOTFILES!
657    
658    Changes:
659    
660      The new pid generation scheme described a few weeks ago was overly
661      complicated.  I implemented a new mechanism that is simpler and
662      provides a bit more "stability":  Once CM has seen a compilation
663      unit, it keeps its identity constant (as long as you do not delete
664      those crucial CM/GUID/* files).  This means that when you change
665      an interface, compile, then go back to the old interface, and
666      compile again, you arrive at the original pid.
667    
668      There now also is a mechanism that instructs CM to use the plain
669      environment hash as a module's pid (effectively making its GUID
670      the empty string).  For this, "noguid" must be specified as an
671      option to the .sml file in question within its .cm file.
672      This is most useful for code that is being generated by tools such
673      as ml-nlffigen (because during development programmers tend to
674      erase the tool's entire output directory tree including CM's cached
675      GUIDs).  "noguid" is somewhat dangerous (since it can be used to locally
676      revert to the old, broken behavior of SML/NJ), but in specific cases
677      where there is no danger of interface confusion, its use is ok
678      (I think).
679    
680      ml-nlffigen by default generates "noguid" annotations.  They can be
681      turned off by specifying -guid in its command line.
682    
683    ----------------------------------------------------------------------
684    Name: Lal George
685    Date: 2002/03/12 14:42:36 EST
686    Tag: george-20020312-frequency-computation
687    Description:
688    
689    Integrated jump chaining and static block frequency into the
690    compiler. More details and numbers later.
691    
692    ----------------------------------------------------------------------
693    Name: Lal George
694    Date: 2002/03/11 22:38:53 EST
695    Tag: george-20020311-jump-chain-elim
696    Description:
697    
698    Tested the jump chain elimination on all architectures (except the
699    hppa).  This is on by default right now and is profitable for the
700    alpha and x86; however, it may not be profitable for the sparc and ppc
701    when compiling the compiler.
702    
703    The gc test will typically jump to a label at the end of the cluster,
704    where there is another jump to an external cluster containing the actual
705    code to invoke gc. This is to allow factoring of common gc invocation
706    sequences. That is to say, we generate:
707    
708            f:
709               testgc
710               ja   L1      % jump if above to L1
711    
712            L1:
713               jmp L2
714    
715    
716    After jump chain elimination the 'ja L1' instruction is converted to
717    'ja L2'. On the sparc and ppc, many of the 'ja L2' instructions may end
718    up being implemented in their long form (if L2 is far away) using:
719    
720            jbe     L3      % jump if below or equal to L3
721            jmp     L2
722         L3:
723            ...
724    
725    
726    For large compilation units L2  may be far away.
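
The transformation itself can be sketched on a toy representation. Everything below is an assumption made for illustration: blocks are a dict from label to a list of (opcode, operand) pairs, `resolve` and `eliminate_jump_chains` are hypothetical names, and `callgc` is a made-up opcode. It is not the MLRISC implementation.

```python
BRANCHES = ("jmp", "ja", "jbe")


def resolve(label, jumps):
    """Follow a chain of jump-only blocks to its final target."""
    seen = set()
    while label in jumps and label not in seen:
        seen.add(label)  # guards against a cycle of jump-only blocks
        label = jumps[label]
    return label


def eliminate_jump_chains(blocks):
    """Retarget every branch that points at a block containing only a jmp."""
    # Blocks that consist solely of one unconditional jump.
    jumps = {lab: ins[0][1] for lab, ins in blocks.items()
             if len(ins) == 1 and ins[0][0] == "jmp"}
    result = {}
    for lab, ins in blocks.items():
        result[lab] = [(op, resolve(arg, jumps)) if op in BRANCHES else (op, arg)
                       for op, arg in ins]
    return result
```

Applied to the example above, the 'ja L1' in f becomes 'ja L2'. Whether that pays off depends on how far away L2 is, which is the sparc and ppc concern described here.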
727    
728    
729    ----------------------------------------------------------------------
730    Name: Matthias Blume
731    Date: 2002/03/11 13:30:00 EST
732    Tag: blume-20020311-mltreeeval
733    Description:
734    
735    A functor parameter was missing.
736    
737    ----------------------------------------------------------------------
738    Name: Allen Leung
739    Date: 2002/03/11 10:30:00 EST
740    Tag: leunga-20020311-runtime-string0
741    Description:
742    
743       The representation of the empty string now points to a
744    legal null-terminated C string instead of unit.  It is now possible
745    to convert an ML string into a C string with InlineT.CharVector.getData.
746    This compiles into one single machine instruction.
747    
748    ----------------------------------------------------------------------
749    Name: Allen Leung
750    Date: 2002/03/10 23:55:00 EST
751    Tag: leunga-20020310-x86-call
752    Description:
753    
754       Added machine code generation for the CALL instruction (relative displacement mode)
755    
756    ----------------------------------------------------------------------
757    Name: Matthias Blume
758    Date: 2002/03/08 16:05:00
759    Tag: blume-20020308-entrypoints
760    Description:
761    
762    Version number bumped to 110.39.1.  NEW BOOTFILES!
763    
764    Entrypoints: non-zero offset into a code object where execution should begin.
765    
766    - Added the notion of an entrypoint to CodeObj.
767    - Added reading/writing of entrypoint info to Binfile.
768    - Made runtime system bootloader aware of entrypoints.
769    - Use the address of the label of the first function given to mlriscGen
770      as the entrypoint.  This address is currently always 0, but it will
771      not be 0 once we turn on block placement.
772    - Removed the linkage cluster code (which was The Other Way(tm) of dealing
773      with entry points) from mlriscGen.
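
A rough illustration of the entrypoint idea, written in Python rather than SML: a code object carries an offset, and the loader begins execution at base + offset instead of always at the first byte.  The names CodeObject and start_address are hypothetical; they are not the CodeObj or Binfile interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeObject:
    """Hypothetical stand-in for a loaded code object."""
    base: int            # address where the code bytes were loaded
    size: int            # number of code bytes
    entrypoint: int = 0  # offset at which execution should begin


def start_address(obj: CodeObject) -> int:
    """Address at which a loader would begin executing obj."""
    if not 0 <= obj.entrypoint < obj.size:
        raise ValueError("entrypoint lies outside the code object")
    return obj.base + obj.entrypoint
```

With entrypoint left at 0 this degenerates to the old behavior, which matches the remark that the address is currently always 0.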
774    
775    ----------------------------------------------------------------------
776    Name: Allen Leung
777    Date: 2002/03/07 20:45:00 EST
778    Tag: leunga-20020307-x86-cmov
779    Description:
780    
781       Bug fixes for CMOVcc on x86.
782    
783       1. Added machine code generation for CMOVcc
784       2. CMOVcc is now generated in preference over SETcc on PentiumPro or above.
785       3. CMOVcc cannot have an immediate operand as argument.
786    
787    ----------------------------------------------------------------------
788    Name: Matthias Blume
789    Date: 2002/03/07 16:15:00 EST
790    Tag: blume-20020307-controls
791    Description:
792    
793    This is a very large but mostly boring patch which makes (almost)
794    every tuneable compiler knob (i.e., pretty much everything under
795    Control.* plus a few other things) configurable via both the command
796    line and environment variables in the style CM did its configuration
797    until now.
798    
799    Try starting sml with '-h' (or, if you are brave, '-H')
800    
801    To this end, I added a structure Controls : CONTROLS to smlnj-lib.cm which
802    implements the underlying generic mechanism.
803    
804    The interface to some of the existing such facilities has changed somewhat.
805    For example, the MLRiscControl module now provides mkFoo instead of getFoo.
806    (The getFoo interface is still there for backward-compatibility, but its
807    use is deprecated.)
808    
809    The ml-build script passes -Cxxx=yyy command-line arguments through so
810    that one can now twiddle the compiler settings when using this "batch"
811    compiler.
812    
813    TODO items:
814    
815    We should go through and throw out all controls that are no longer
816    connected to anything.  Moreover, we should go through and provide
817    meaningful (and correct!) documentation strings for those controls
818    that still are connected.
819    
820    Currently, multiple calls to Controls.new are accepted (only the first
821    has any effect).  Eventually we should make sure that every control
822    is being made (via Controls.new) exactly once.  Future access can then
823    be done using Controls.acc.
824    
825    Finally, it would probably be a good idea to use the getter-setter
826    interface to controls rather than ref cells.  For the time being, both
827    styles are provided by the Controls module, but getter-setter pairs are
828    better if thread-safety is of any concern because they can be wrapped.
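
To illustrate why getter-setter pairs can be wrapped while bare ref cells cannot, here is a minimal Python sketch.  ControlRegistry and its methods are hypothetical names and only mimic the behavior described above (the first call to new wins, later access goes through acc); this is not the signature of the Controls structure.

```python
import threading


class ControlRegistry:
    """Hypothetical registry; not the interface of the Controls structure."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def new(self, name, default):
        """Create a control; only the first call for a name has any effect."""
        with self._lock:
            self._values.setdefault(name, default)
        return self.acc(name)

    def acc(self, name):
        """Return a (get, set) pair for an already existing control."""
        with self._lock:
            if name not in self._values:
                raise KeyError(name)

        def get():
            with self._lock:       # every read goes through the wrapper
                return self._values[name]

        def set_(value):
            with self._lock:       # every write goes through the wrapper
                self._values[name] = value

        return get, set_
```

Because clients only ever hold the two functions, the locking (or logging, or validation) can be changed in one place, which is not possible once a raw mutable cell has been handed out.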
829    
830    *****************************************
831    
832    One bug fix: The function blockPlacement in three of the MLRISC
833    backpatch files used to be hard-wired to one of two possibilities at
834    link time (according to the value of the placementFlag).  But (I
835    think) it should rather sense the flag every time.
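
The difference between the two behaviors can be sketched in Python as follows.  All names here are hypothetical, and the "weighted" strategy is just a sort by frequency standing in for the real MLRISC algorithm.

```python
# Hypothetical stand-in for the placement flag.
placement_flag = {"weighted": False}


def default_placement(blocks):
    """Keep blocks in their original order."""
    return list(blocks)


def weighted_placement(blocks):
    """Toy strategy: hottest block first (blocks are (name, freq) pairs)."""
    return sorted(blocks, key=lambda b: -b[1])


# Old behavior: the choice is made once, when this line runs ("link time").
block_placement_fixed = (
    weighted_placement if placement_flag["weighted"] else default_placement
)


def block_placement(blocks):
    """New behavior: sense the flag on every call."""
    chosen = weighted_placement if placement_flag["weighted"] else default_placement
    return chosen(blocks)
```

Flipping the flag after the module has been loaded changes what block_placement does, but block_placement_fixed keeps using whatever was selected at load time.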
836    
837    *****************************************
838    
839    Other assorted changes (by other people who did not supply a HISTORY entry):
840    
841    1. the cross-module inliner now works much better (Monnier)
842    2. representation of weights, frequencies, and probabilities in MLRISC
843       changed in preparation of using those for weighted block placement
844       (Reppy, George)
845    
846    ----------------------------------------------------------------------
847    Name: Lal George
848    Date: 2002/03/07 14:44:24 EST
849    Tag: george-20020307-weighted-block-placement
850    
851    Tested the weighted block placement optimization on all architectures
852    (except the hppa) using AMPL to generate the block and edge frequencies.
853    Changes were required in the machine properties to correctly
854    categorize trap instructions. There is an MLRISC flag
855    "weighted-block-placement" that can be used to enable weighted block
856    placement, but this will be ineffective without block/edge
857    frequencies (coming soon).
858    
859    
860    ----------------------------------------------------------------------
861    Name: Lal George
862    Date: 2002/03/05 17:24:48 EST
863    Tag: george-20020305-linkage-cluster
864    
865    In order to support the block placement optimization, a new cluster
866    is generated as the very first cluster (called the linkage cluster).
867    It contains a single jump to the 'real' entry point for the compilation
868    unit. Block placement has no effect on the linkage cluster itself, but
869    all the other clusters  have full freedom in the manner in which they
870    reorder blocks or functions.
871    
872    On the x86 the typical linkage code that is generated is:
873       ----------------------
874            .align 2
875       L0:
876            addl    $L1-L0, 72(%esp)
877            jmp     L1
878    
879    
880            .align  2
881       L1:
882       ----------------------
883    
884    72(%esp) is the memory location for the stdlink register. This
885    must contain the address of the CPS function being called. In the
886    above example, it contains the address of  L0; before
887    calling L1 (the real entry point for the compilation unit), it
888    must contain the address for L1, and hence
889    
890            addl $L1-L0, 72(%esp)
891    
892    I have tested this on all architectures except the hppa.  The increase
893    in code size is of course negligible.
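
The stdlink adjustment is plain address arithmetic, modeled below in Python.  The function name and the concrete addresses are made up for illustration; only the difference L1-L0 matters, which is why the same instruction works wherever the code is loaded.

```python
def relink(stdlink: int, addr_l0: int, addr_l1: int) -> int:
    """Model of `addl $L1-L0, 72(%esp)` applied to the stdlink slot."""
    displacement = addr_l1 - addr_l0   # the assembly-time constant $L1-L0
    return stdlink + displacement


# Hypothetical load addresses; only their difference is significant.
L0 = 0x4000
L1 = 0x4008
```

On entry stdlink holds the address of L0, so relink(L0, L0, L1) yields the address of L1, as the real entry point requires.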
894    
895    ----------------------------------------------------------------------
896    Name: Allen Leung
897    Date: 2002/03/03 13:20:00 EST
898    Tag: leunga-20020303-mlrisc-tools
899    
900      Added #[ ... ] expressions to mlrisc tools
901    
902    ----------------------------------------------------------------------
903    Name: Matthias Blume
904    Date: 2002/02/27 12:29:00 EST
905    Tag: blume-20020227-cdebug
906    Description:
907    
908    - made types in structure C and C_Debug to be equal
909    - got rid of code duplication (c-int.sml vs. c-int-debug.sml)
910    - there no longer is a C_Int_Debug (C_Debug is directly derived from C)
911    
912    ----------------------------------------------------------------------
913    Name: Matthias Blume
914    Date: 2002/02/26 12:00:00 EST
915    Tag: blume-20020226-ffi
916    Description:
917    
918    1. Fixed a minor bug in CM's "noweb" tool:
919       If numbering is turned off, then truly don't number (i.e., do not
920       supply the -L option to noweb).  The previous behavior was to supply
921       -L'' -- which caused noweb to use the "default" line numbering scheme.
922       Thanks to Chris Richards for pointing this out (and supplying the fix).
923    
924    2. Once again, I reworked some aspects of the FFI:
925    
926       A. The incomplete/complete type business:
927    
928       - Signatures POINTER_TO_INCOMPLETE_TYPE and accompanying functors are
929         gone!
930       - ML types representing an incomplete type are now *equal* to
931         ML types representing their corresponding complete types (just like
932         in C).  This is still safe because ml-nlffigen will not generate
933         RTTI for incomplete types, nor will it generate functions that
934         require access to such RTTI.   But when ML code generated from both
935         incomplete and complete versions of the C type meet, the ML types
936         are trivially interoperable.
937    
938         NOTE:  These changes restore the full generality of the translation
939         (which was previously lost when I eliminated functorization)!
940    
941       B. Enum types:
942    
943       - Structure C now has a type constructor "enum" that is similar to
944         how the "su" constructor works.  However, "enum" is not a phantom
945         type because each "T enum" has values (and is isomorphic to
946         MLRep.Signed.int).
947       - There are generic access operations for enum objects (using
948         MLRep.Signed.int).
949       - ml-nlffigen will generate a structure E_foo for each "enum foo".
950         * The structure contains the definition of type "mlrep" (the ML-side
951         representation type of the enum).  Normally, mlrep is the same
952         as "MLRep.Signed.int", but if ml-nlffigen was invoked with "-ec",
953         then mlrep will be defined as a datatype -- thus facilitating
954         pattern matching on mlrep values.
955         ("-ec" will be suppressed if there are duplicate values in an
956          enumeration.)
957         * Constructors ("-ec") or values (no "-ec") e_xxx of type mlrep
958         will be generated for each C enum constant xxx.
959         * Conversion functions m2i and i2m convert between mlrep and
960         MLRep.Signed.int.  (Without "-ec", these functions are identities.)
961         * Conversion functions c and ml convert between mlrep and "tag enum".
962         * Access functions (get/set) fetch and store mlrep values.
963       - By default (unless ml-nlffigen was invoked with "-nocollect"), unnamed
964         enumerations are merged into one single enumeration represented by
965         structure E_'.
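
As a loose analogy for the two flavors of mlrep, the following Python sketch builds either a closed enumeration (the "-ec" case) or plain integers (the default), and falls back to integers when the enumeration has duplicate values.  The function make_mlrep and the sample constants are hypothetical; this is not what ml-nlffigen emits.

```python
from enum import Enum

# Hypothetical C declaration: enum color { red, green = 5, blue };
C_ENUM_COLOR = {"red": 0, "green": 5, "blue": 6}


def has_duplicates(constants):
    """True if two enum constants share the same integer value."""
    return len(set(constants.values())) != len(constants)


def make_mlrep(constants, ec):
    """Return (mlrep, m2i, i2m) for one C enum.

    With ec and no duplicate values, mlrep is a closed enumeration with
    members e_xxx; otherwise mlrep is int and both conversions are
    identities.
    """
    if ec and not has_duplicates(constants):
        mlrep = Enum("mlrep", {"e_" + n: v for n, v in constants.items()})
        return mlrep, (lambda m: m.value), (lambda i: mlrep(i))
    identity = lambda i: i
    return int, identity, identity
```

In the enumeration case i2m raises ValueError for an integer that names no constant, which is the price of getting something one can match on exhaustively.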
966    
967    ----------------------------------------------------------------------
968    Name: Allen Leung
969    Date: 2002/02/25 04:45:00 EST
970    Tag: leunga-20020225-cps-spill
971    
972    This is a new implementation of the CPS spill phase.
973    The new phase is in the new file compiler/CodeGen/cpscompile/spill-new.sml
974    In case of problems, replace it with the old file spill.sml
975    
976    The current compiler runs into some serious performance problems when
977    constructing a large record.  This can happen when we try to compile a
978    structure with many items.  Even a very simple structure like the following
979    makes the compiler slow down.
980    
981        structure Foo = struct
982           val x_1 = 0w1 : Word32.int
983           val x_2 = 0w2 : Word32.int
984           val x_3 = 0w3 : Word32.int
985           ...
986           val x_N = 0wN : Word32.int
987        end
988    
989    The following table shows the compile time, from N=1000 to N=4000,
990    with the old compiler:
991    
992    N
993    1000   CPS 100 spill                           0.04u  0.00s  0.00g
994           MLRISC ra                               0.06u  0.00s  0.05g
995              (spills = 0 reloads = 0)
996           TOTAL                                   0.63u  0.07s  0.21g
997    
998    1100   CPS 100 spill                           8.25u  0.32s  0.64g
999           MLRISC ra                               5.68u  0.59s  3.93g
1000              (spills = 0 reloads = 0)
1001           TOTAL                                   14.71u  0.99s  4.81g
1002    
1003    1500   CPS 100 spill                           58.55u  2.34s  1.74g
1004           MLRISC ra                               5.54u  0.65s  3.91g
1005              (spills = 543 reloads = 1082)
1006           TOTAL                                   65.40u  3.13s  6.00g
1007    
1008    2000   CPS 100 spill                           126.69u  4.84s  3.08g
1009           MLRISC ra                               0.80u  0.10s  0.55g
1010              (spills = 42 reloads = 84)
1011           TOTAL                                   129.42u  5.10s  4.13g
1012    
1013    3000   CPS 100 spill                           675.59u  19.03s  11.64g
1014           MLRISC ra                               2.69u  0.27s  1.38g
1015              (spills = 62 reloads = 124)
1016           TOTAL                                   682.48u  19.61s  13.99g
1017    
1018    4000   CPS 100 spill                           2362.82u  56.28s  43.60g
1019           MLRISC ra                               4.96u  0.27s  2.72g
1020              (spills = 85 reloads = 170)
1021           TOTAL                                   2375.26u  57.21s  48.00g
1022    
1023    As you can see, the old cps spill module suffers from some serious
1024    performance problems.  But since I cannot decipher the old code fully,
1025    instead of patching the problems up, I'm reimplementing it
1026    with a different algorithm.  The new code is more modular,
1027    smaller when compiled, and substantially faster
1028    (O(n log n) time and O(n) space).  Timing of the new spill module:
1029    
1030    4000  CPS 100 spill                           0.02u  0.00s  0.00g
1031          MLRISC ra                               0.25u  0.02s  0.15g
1032             (spills=1 reloads=3)
1033          TOTAL                                   7.74u  0.34s  1.62g
1034    
1035    Implementation details:
1036    
1037    As far as I can tell, the purpose of the CPS spill module is to make sure the
1038    number of live variables at any program point (the bandwidth)
1039    does not exceed a certain limit, which is determined by the
1040    size of the spill area.
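
For straight-line code the bandwidth can be computed with one backward liveness scan, as in the Python sketch below.  This is only a simplified model under the assumption of a single basic block; real CPS terms are trees, and the function name and instruction encoding are made up for illustration.

```python
def bandwidth(instrs, live_out=()):
    """Maximum number of simultaneously live variables.

    instrs is a list of (defined_var_or_None, [used_vars]) in program
    order; live_out names the variables still needed afterwards.
    """
    live = set(live_out)
    widest = len(live)
    for defined, used in reversed(instrs):
        live.discard(defined)      # a variable is dead before its definition
        live.update(used)          # and live before each of its uses
        widest = max(widest, len(live))
    return widest
```

For the sequence a = ..., b = ..., c = a + b with c live at the end, the scan reports a bandwidth of 2, because a and b are both live just before c is computed.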
1041    
1042    When the bandwidth is too large, we decrease the register pressure by
1043    packing live variables into spill records.  How we achieve this is
1044    completely different than what we did in the old code.
1045    
1046    First, there is something about the MLRiscGen code generator
1047    that we should be aware of:
1048    
1049    o MLRiscGen performs code motion!
1050    
1051       In particular, it will move floating point computations and
1052       address computations involving only the heap pointer to
1053       their use sites (if there is only a single use).
1054       What this means is that if we have a CPS record construction
1055       statement
1056    
1057           RECORD(k,vl,w,e)
1058    
1059       we should never count the new record address w as live if w
1060       has only one use (which is often the case).
1061    
1062       We should do something similar to floating point, but the transformation
1063       there is much more complex, so I won't deal with that.
1064    
1065    Secondly, there are now two new cps primops at our disposal:
1066    
1067     1. rawrecord of record_kind option
1068        This pure operator allocates some uninitialized storage from the heap.
1069        There are two forms:
1070    
1071         rawrecord NONE [INT n]  allocates a tagless record of length n
1072         rawrecord (SOME rk) [INT n] allocates a tagged record of length n
1073                                     and initializes the tag.
1074    
1075     2. rawupdate of cty
1076          rawupdate cty (v,i,x)
1077          Assigns x to the ith component of record v.
1078          The storelist is not updated.
1079    
1080    We use these new primops for both spilling and incremental record construction.
1081    
1082     1. Spilling.
1083    
1084        This is implemented with a linear scan algorithm (but generalized
1085        to trees).  The algorithm will create a single spill record at the
1086        beginning of the cps function and use rawupdate to spill to it,
1087        and SELECT or SELp to reload from it.  So both spills and reloads
1088        are fine-grain operations.  In contrast, in the old algorithm
1089        "spills" have to be bundled together in records.
1090    
1091        Ideally, we should sink the spill record construction to where
1092        it is needed.  We can even split the spill record into multiple ones
1093        at the places where they are needed.  But CPS is not a good
1094        representation for global code motion, so I'll keep it simple and
1095        am not attempting this.
1096    
1097     2. Incremental record construction (aka record splitting).
1098    
1099        Long records with many component values which are simultaneously live
1100        (recall that single use record addresses are not considered to
1101         be live) are constructed with rawrecord and rawupdate.
1102        We allocate space on the heap with rawrecord first, then gradually
1103        fill it in with rawupdate.  This is the technique suggested to me
1104        by Matthias.
1105    
1106        Some restrictions on when this is applicable:
1107        1. It is not a VECTOR record.  The code generator currently does not handle
1108           this case. VECTOR record uses double indirection like arrays.
1109        2. All the record component values are defined in the same "basic block"
1110           as the record constructor.  This is to prevent speculative
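
The effect of incremental record construction on register pressure can be modeled in Python as below.  The functions rawrecord and rawupdate here are toy stand-ins that only borrow the primop names; the live-variable counts are a simplified model, not measurements of the compiler.

```python
def rawrecord(n):
    """Toy stand-in: allocate uninitialized storage for n components."""
    return [None] * n


def rawupdate(record, i, x):
    """Toy stand-in: store x into the ith component of record."""
    record[i] = x


def build_all_at_once(n, compute):
    """Compute every component first, then allocate: n values live at once."""
    values = [compute(i) for i in range(n)]
    peak_live = len(values)
    return tuple(values), peak_live


def build_incrementally(n, compute):
    """Allocate first, then fill in: only the record and one value are live."""
    record = rawrecord(n)
    peak_live = 0
    for i in range(n):
        x = compute(i)
        peak_live = max(peak_live, 2)   # the record address plus x
        rawupdate(record, i, x)
    return tuple(record), peak_live
```

Both functions produce the same record; the difference is that the first keeps all n component values alive until the allocation, while the second never holds more than the record address and the value currently being stored.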
1111           record construction.
1112    
1113    ----------------------------------------------------------------------
1114    Name: Allen Leung
1115    Date: 2002/02/22 01:02:00 EST
1116    Tag: leunga-20020222-mlrisc-tools
1117    
1118    Minor bug fixes in the parser and rewriter
1119    
1120    ----------------------------------------------------------------------
1121    Name: Allen Leung
1122    Date: 2002/02/21 20:20:00 EST
1123    Tag: leunga-20020221-peephole
1124    
1125    Regenerated the peephole files.  Some contained typos in the specification
1126    and some didn't compile because of pretty printing bugs in the old version
1127    of 'nowhere'.
1128    
1129    ----------------------------------------------------------------------
1130    Name: Allen Leung
1131    Date: 2002/02/19 20:20:00 EST
1132    Tag: leunga-20020219-mlrisc-tools
1133    Description:
1134    
1135       Minor bug fixes to the mlrisc-tools library:
1136    
1137       1.  Fixed up parsing colon suffixed keywords
1138       2.  Added the ability to shut the error messages up
1139       3.  Reimplemented the pretty printer and fixed up/improved
1140           the pretty printing of handle and -> types.
1141       4.  Fixed up generation of literal symbols in the nowhere tool.
1142       5.  Added some SML keywords to sml.sty
1143    
1144    ----------------------------------------------------------------------
1145    Name: Matthias Blume
1146    Date: 2002/02/19 16:20:00 EST
1147    Tag: blume-20020219-cmffi
1148    Description:
1149    
1150    A wild mix of changes, some minor, some major:
1151    
1152    * All C FFI-related libraries are now anchored under $c:
1153        $/c.cm      --> $c/c.cm
1154        $/c-int.cm  --> $c/internals/c-int.cm
1155        $/memory.cm --> $c/memory/memory.cm
1156    
1157    * "make" tool (in CM) now treats its argument pathname slightly
1158      differently:
1159        1. If the native expansion is an absolute name, then before invoking
1160           the "make" command on it, CM will apply OS.Path.mkRelative
1161           (with relativeTo = OS.FileSys.getDir()) to it.
1162        2. The argument will be passed through to subsequent phases of CM
1163           processing without "going native".  In particular, if the argument
1164           was an anchored path, then "make" will not lose track of that anchor.
1165    
1166    * Compiler backends now "know" their respective C calling conventions
1167      instead of having to be told about it by ml-nlffigen.  This relieves
1168      ml-nlffigen from one of its burdens.
1169    
1170    * The X86Backend has been split into X86CCallBackend and X86StdCallBackend.
1171    
1172    * Export C_DEBUG and C_Debug from $c/c.cm.
1173    
1174    * C type encoding in ml-nlffi-lib has been improved to model the conceptual
1175      subtyping relationship between incomplete pointers and their complete
1176      counterparts.  For this, ('t, 'c) ptr has been changed to 'o ptr --
1177      with the convention of instantiating 'o with ('t, 'c) obj whenever
1178      the pointer target type is complete.  In the incomplete case, 'o
1179      will be instantiated with some "'c iobj" -- a type obtained by
1180      using one of the functors PointerToIncompleteType or PointerToCompleteType.
1181    
1182      Operations that work on both incomplete and complete pointer types are
1183      typed as taking an 'o ptr while operations that require the target to
1184      be known are typed as taking some ('t, 'c) obj ptr.
1185    
1186      voidptr is now a bit "more concrete", namely "type voidptr = void ptr'"
1187      where void is an eqtype without any values.  This makes it possible
1188      to work on voidptr values using functions meant to operate on light
1189      incomplete pointers.
1190    
1191    * As a result of the above, signature POINTER_TO_INCOMPLETE_TYPE has
1192      been vastly simplified.
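
The path handling described for the "make" tool above (absolute names are made relative to the current directory before the command is invoked) has a close analogue in Python's posixpath.relpath, which is used here as a stand-in for OS.Path.mkRelative.  The function make_argument and the sample paths are hypothetical.

```python
import posixpath


def make_argument(path: str, cwd: str) -> str:
    """Name to hand to the external command for a given target path.

    Absolute names are rewritten relative to cwd; relative names are
    passed through unchanged.
    """
    if posixpath.isabs(path):
        return posixpath.relpath(path, start=cwd)
    return path
```

For example, with cwd = "/home/u/proj" the absolute name "/home/u/proj/sub/Makefile" becomes "sub/Makefile", while an already relative name is left alone.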
1193    
1194    ----------------------------------------------------------------------
1195    Name: Matthias Blume
1196    Date: 2002/02/19 10:48:00 EST
1197    Tag: blume-20020219-pqfix
1198    Description:
1199    
1200    Applied Chris Okasaki's bug fix for priority queues.
1201    
1202    ----------------------------------------------------------------------
1203    Name: Matthias Blume
1204    Date: 2002/02/15 17:05:00
1205    Tag: Release_110_39
1206    Description:
1207    
1208    Last-minute retagging is becoming a tradition... :-(
1209    
1210    This is the working release 110.39.
1211    
1212    ----------------------------------------------------------------------
1213    Name: Matthias Blume
1214    Date: 2002/02/15 16:00:00 EST
1215    Tag: Release_110_39-orig
1216    Description:
1217    
1218    Working release 110.39.  New bootfiles.
1219    
1220    (Update: There was a small bug in the installer so it wouldn't work
1221    with all shells.  So I retagged. -Matthias)
1222    
1223    ----------------------------------------------------------------------
1224    Name: Matthias Blume
1225    Date: 2002/02/15 14:17:00 EST
