[smlnj] Diff of /sml/trunk/HISTORY

revision 1080, Wed Feb 20 01:28:00 2002 UTC revision 1282, Thu Oct 17 13:07:53 2002 UTC
13  Description:
14    
15  ----------------------------------------------------------------------
16    Name: Matthias Blume
17    Date: 2002/10/17 09:10:00 EDT
18    Tag: Release_110_42
19    Description:
20    
21    In good old tradition, there has been a slight hiccup so that we have
22    to patch 110.42 after the fact.   The old release tag has been replaced
23    (see below).
24    
25    The change solves a problem with two competing approaches to the
26    configuration problem regarding MacOS 10.1 vs. MacOS 10.2, which got in
27    each other's way.
28    
29    This change only affects the runtime system code and the installer script.
30    (No new bootfiles.)
31    
32    ----------------------------------------------------------------------
33    Name: Matthias Blume
34    Date: 2002/10/16 12:00:00 EDT
35    Tag: Release_110_42_removed
36    Description:
37    
38    New working release.  New bootfiles.
39    
40    ----------------------------------------------------------------------
41    Name: Matthias Blume
42    Date: 2002/10/10 13:10:00 EDT
43    Tag: blume-20021010-ppc-divs
44    Description:
45    
46    The mltree operator DIVS must be implemented with an overflow check on
47    the PPC because the hardware indicates divide-by-zero using "overflow" as
48    well.
49    
50    ----------------------------------------------------------------------
51    Name: Matthias Blume
52    Date: 2002/07/23 11:45:00
53    Tag: blume-20020723-smlnj-home
54    Description:
55    
56    Sml now senses the SMLNJ_HOME environment variable.  If this is set,
57    then the bin dir is assumed to be in $SMLNJ_HOME/bin and (unless
58    CM_PATHCONFIG is also set), the path configuration file is assumed
59    to be in $SMLNJ_HOME/lib/pathconfig.  This way one can easily move
60    the entire tree to some other place and everything will "just work".
61    
62    (Companion commands such as ml-build and ml-makedepend also sense this
63    variable.)
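
The lookup order described above can be sketched as follows.  This is only
an illustration, not the code in the sml script: the function name locate
and the getEnv parameter (a stand-in for OS.Process.getEnv) are
hypothetical, and what happens when SMLNJ_HOME is unset is not modelled.

```sml
(* Hypothetical helper: getEnv stands in for OS.Process.getEnv so that the
 * lookup order can be exercised without touching the real environment. *)
fun locate (getEnv : string -> string option) =
  case getEnv "SMLNJ_HOME" of
      NONE => NONE   (* not modelled: the pre-existing defaults apply *)
    | SOME home =>
        let
          val binDir = OS.Path.concat (home, "bin")
          val pathconfig =
            case getEnv "CM_PATHCONFIG" of
                SOME p => p   (* an explicit CM_PATHCONFIG takes precedence *)
              | NONE =>
                  OS.Path.concat (home, OS.Path.concat ("lib", "pathconfig"))
        in
          SOME {binDir = binDir, pathconfig = pathconfig}
        end
```

Calling locate OS.Process.getEnv would apply the same order to the real
environment.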
64    
65    ----------------------------------------------------------------------
66    Name: Matthias Blume
67    Date: 2002/07/12 21:19:00 EDT
68    Tag: blume-20020712-liveness
69    Description:
70    
71    Exported two useful "step" functions from liveness module (MLRISC).
72    
73    ----------------------------------------------------------------------
74    Name: Matthias Blume
75    Date: 2002/07/05 16:00 EDT
76    Tag: Release_110_41
77    Description:
78    
79    New working release.  New bootfiles.
80    
81    ----------------------------------------------------------------------
82    Name: Matthias Blume
83    Date: 2002/07/05 10:25:00 EDT
84    Tag: blume-20020705-btimp
85    Description:
86    
87    Exported structure BTImp from $smlnj/viscomp/debugprof.cm so that
88    other clients can set up backtracing support.
89    
90    ----------------------------------------------------------------------
91    Name: Matthias Blume
92    Date: 2002/06/25 17:23:00 EDT
93    Tag: blume-20020625-fpmax
94    Description:
95    
96    Fixed a bug in translation of INLMAX (and INLMIN) for the floating-point
97    case.  (The sense of the isNaN test was reversed -- which made min and
98    max always return their first argument.)
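
To illustrate how a reversed NaN test can produce this symptom, here is a
source-level sketch.  It is a guess at the shape of the problem, not the
code from translate.sml; the names fmax and fmaxReversed are hypothetical.

```sml
(* Intended behaviour, following the Basis convention that a single NaN
 * argument is ignored in favour of the other argument. *)
fun fmax (x : real, y : real) =
  if x > y then x
  else if Real.isNan y then x
  else y

(* Hypothetical reconstruction of the bug: the NaN test is negated, so for
 * ordinary (non-NaN) inputs the first argument always comes back. *)
fun fmaxReversed (x : real, y : real) =
  if x > y then x
  else if not (Real.isNan y) then x
  else y
```

With these definitions fmax (1.0, 2.0) evaluates to 2.0, while
fmaxReversed (1.0, 2.0) evaluates to 1.0.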
99    
100    ----------------------------------------------------------------------
101    Name: Matthias Blume
102    Date: 2002/06/11
103    Tag: blume-20020611-unixpath
104    Description:
105    
106    Back-ported OS.Path.{from,to}UnixPath from idlbasis-devel branch.
107    
108    ----------------------------------------------------------------------
109    Name: Matthias Blume
110    Date: 2002/06/10 16:35:00 EDT
111    Tag: blume-20020610-ieeereal
112    Description:
113    
114    I back-ported my implementation of IEEEReal.fromString from the
115    idlbasis-devel branch so that we can test it.
116    
117    Another small change is that ppDec tries to give more information
118    than just "<sig>" in the case of functors.  However, this code is
119    broken in some mysterious way if the functor's body's signature
120    has not been declared by ascription but gets inferred from the
121    implementation.  This needs fixing...
122    
123    ----------------------------------------------------------------------
124    Name: Matthias Blume
125    Date: 2002/05/31
126    Tag: blume-20020531-btrace-mode
127    Description:
128    
129    Resurrected SMLofNJ.Internals.BTrace.mode.  (It accidentally fell by
130    the wayside when I switched over to using Controls everywhere.)
131    
132    ----------------------------------------------------------------------
133    Name: Lal George
134    Date: 2002/05/23 12:21:40 EDT
135    Tag: george-20020523-visual-labels
136    Description:
137    
138    Labels are now displayed in the graphical output to make
139    the fall-through and target blocks obvious.
140    
141    ----------------------------------------------------------------------
142    Name: Matthias Blume
143    Date: 2002/05/22 11:03:00 EDT
144    Tag: blume-20020522-shrink
145    Description:
146    
147    John tweaked yesterday's fix for 1131 to handle an out-of-memory
148    situation that comes up when allocating huge arrays.
149    
150    ----------------------------------------------------------------------
151    Name: Matthias Blume
152    Date: 2002/05/21 16:00:00 EDT
153    Tag: Release_110_40
154    Description:
155    
156    New working release (110.40).  New bootfiles.
157    
158    [Also: John Reppy fixed GC bug 1131.]
159    
160    ----------------------------------------------------------------------
161    Name: Matthias Blume
162    Date: 2002/05/21 12:35:00 EDT
163    Tag: blume-20020521-cmdoc
164    Description:
165    
166    CM documentation update.
167    
168    ----------------------------------------------------------------------
169    Name: Matthias Blume
170    Date: 2002/05/21 10:55:00 EDT
171    Tag: blume-20020521-misc
172    Description:
173    
174    - John tweaked runtime to be silent on heap export (except when
175      GC messages are on).
176    - I added a few more things (cross-compiling versions of CMB) to
177      config/preloads (as suggestions).
178    
179    ----------------------------------------------------------------------
180    Name: Matthias Blume
181    Date: 2002/05/20 22:25:00 EDT
182    Tag: blume-20020520-controls
183    Description:
184    
185    - Added ControlUtil structure to control-lib.cm.
186    - Use it throughout.
187    - Used Controls facility to define MLRISC controls (as opposed to
188      registering MLRISC control ref cells with Controls after the fact)
189    - Fixed messed-up controls priorities.
190    
191    * Removed again all the stuff from config/preloads that one wouldn't
192      be able to preload at the time the initial heap image is built.
193      (Many libraries, e.g., CML, do not exist yet at this time.  The
194       only libraries that can be preloaded via config/preloads are those
195       that come bundled with the bootfiles.)
196    
197    ----------------------------------------------------------------------
198    Name: Matthias Blume
199    Date: 2002/05/20 10:59:00 EDT
200    Tag: blume-20020520-preloads
201    Description:
202    
203    Added a lot of commented-out suggestions for things to be included
204    in config/preloads.
205    
206    ----------------------------------------------------------------------
207    Name: Allen Leung
208    Date: 2002/05/18 14:20:00 EDT
209    Tag: leunga-20020518-mdl
210    Description:
211    
212    o Made the mdl tool stuff compile and run again.
213    o I've disabled all the stuff that depends on RTL specifications; they
214      are all badly broken anyway.
215    
216    ----------------------------------------------------------------------
217    Name: Matthias Blume
218    Date: 2002/05/17 16:49:00 EDT
219    Tag: blume-20020517-controls
220    Description:
221    
222    0. John Reppy made several modifications to the SML/NJ library.
223       In particular, there is a shiny new controls-lib.cm.
224    
225    1. Pushed new controls interface through compiler so that everything
226       compiles again.
227    
228    2. Added FormatComb and FORMAT_COMB to the CML version of the
229       SML/NJ library (so that CML compiles again).
230    
231    3. Modified init scripts because XXX_DEFAULT environment variables
232       are no longer with us.  (Boot-time initialization is now done
233       using the same environment variables that are also used for
234       startup-time initialization of controls.)
235    
236    ----------------------------------------------------------------------
237    Name: Lal George
238    Date: 2002/05/15 09:20:10 EDT
239    Tag: george-20020515-pseudo-op-decls
240    Description:
241    
242            All pseudo-ops emitted before the first segment declaration
243    such as TEXT, DATA, and BSS directives are assumed to be global
244    declarations and are emitted first in the assembly file. This is
245    useful in a number of situations where one has pseudo-ops that are not
246    specific to any segment, and also works around the constraint that one
247    cannot have client pseudo-ops in the TEXT segment.
248    
249            Because no segment is associated with these declarations it is
250    an error to allocate any space or objects before the first segment
251    directive and an exception will be raised. However, we cannot make
252    this check for client pseudo-ops.
253    
254            These top level declarations are a field in the CFG graph_info.
255    In theory you can continue to add to this field after the CFG has been
256    built -- provided you know what you are doing;-)
257    
258    ----------------------------------------------------------------------
259    Name: Matthias Blume
260    Date: 2002/05/13 16:40:00 EDT
261    Tag: blume-20020513-pp-etc
262    Description:
263    
264    A few minor bugfixes:
265    
266      - Stopgap measure for bug recently reported by Elsa Gunter (ppDec).
267        (Bogus printouts for redefined bindings still occur.  Compiler
268        bug should no longer occur now.  We need to redo the prettyprinter
269        from scratch.)
270    
271      - CM pathname printer now also adds escape sequences for ( and )
272    
273      - comment and documentation fixes for ml-nlffi
274    
275    ----------------------------------------------------------------------
276    Name: Matthias Blume
277    Date: 2002/05/10 16:40:00 EDT
278    Tag: blume-20020510-erg-textio
279    Description:
280    
281    Applied the following bugfix provided by Emden Gansner:
282    
283        Output is corrupted when outputSubstr is used rather than output.
284    
285        The problem occurs when a substring
286    
287            ss = (s, dataStart, dataLen)
288    
289        where dataStart > 0, fills a stream buffer with avail bytes left.
290        avail bytes of s, starting at index dataStart, are copied into the
291        buffer, the buffer is flushed, and then the remaining dataLen-avail
292        bytes of ss are copied into the beginning of the buffer. Instead of
293        starting this copy at index dataStart+avail in s, the current code
294        starts the copy at index avail.
295    
296        Fix:
297        In text-io-fn.sml, change line 695 from
298             val needsFlush = copyVec(v, avail, dataLen-avail, buf, 0)
299        to
300             val needsFlush = copyVec(v, dataStart+avail, dataLen-avail, buf, 0)
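
The effect of the wrong start index can be reproduced with a toy model of
the two-step copy.  This is not the code from text-io-fn.sml; the names
twoStepCopy, buggy, and fixed are hypothetical, and strings stand in for
the stream buffer.

```sml
(* Toy model: avail bytes fit before the flush, the remaining
 * dataLen-avail bytes go to the start of the emptied buffer.
 * restStart is the index in s at which the second copy begins. *)
fun twoStepCopy restStart (s, dataStart, dataLen, avail) =
  String.substring (s, dataStart, avail) ^
  String.substring (s, restStart, dataLen - avail)

(* Old behaviour: second copy starts at index avail. *)
fun buggy (arg as (_, _, _, avail)) = twoStepCopy avail arg

(* Fixed behaviour: second copy starts at index dataStart+avail. *)
fun fixed (arg as (_, dataStart, _, avail)) =
  twoStepCopy (dataStart + avail) arg
```

For s = "0123456789", dataStart = 3, dataLen = 5, and avail = 2, fixed
yields "34567" (the intended substring) while buggy yields "34234".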
301    
302    ----------------------------------------------------------------------
303    Name: Matthias Blume
304    Date: 2002/04/12 13:55:00 EDT
305    Tag: blume-20020412-assyntax
306    Description:
307    
308    1. Grabbed newer assyntax.h from the XFree86 project.
309    2. Fiddled with how to compile X86.prim.asm without warnings.
310    3. (Very) Minor cleanup in CM.
311    
312    ----------------------------------------------------------------------
313    Name: Matthias Blume
314    Date: 2002/04/01 (no joke!) 17:07:00 EST
315    Tag: blume-20020401-x86div
316    Description:
317    
318    Added full support for div/mod/rem/quot on the x86, using the machine
319    instruction's two results (without clumsily recomputing the remainder)
320    directly where appropriate.
321    
322    Some more extensive power-of-two support was added to the x86 instruction
323    selector (avoiding expensive divs, mods, and muls where they can be
324    replaced with cheaper shifts and masks).  However, this sort of thing
325    ought to be done earlier, e.g., within the CPS optimizer so that
326    all architectures benefit from it.
327    
328    The compiler compiles to a fixed point, but changes might be somewhat
329    fragile nevertheless.  Please, report any strange things that you might
330    see wrt. div/mod/quot/rem...
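
As a source-level illustration of the power-of-two replacement mentioned
above (not the instruction selector's code), div and mod by 2^k can be
written with a shift and a mask.  The names divPow2 and modPow2 are
hypothetical, and the sketch assumes Int and Word have the same width, as
SML/NJ's default types do.

```sml
(* For d = 2^k, x div d is an arithmetic right shift by k.  The
 * arithmetic shift rounds toward negative infinity, as div does. *)
fun divPow2 (x : int, k : int) =
  Word.toIntX (Word.~>> (Word.fromInt x, Word.fromInt k))

(* For d = 2^k, x mod d keeps the low k bits. *)
fun modPow2 (x : int, k : int) =
  Word.toIntX (Word.andb (Word.fromInt x, Word.<< (0w1, Word.fromInt k) - 0w1))
```

The same trick does not apply directly to quot and rem, which round toward
zero and therefore need a correction for negative dividends.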
331    
332    ----------------------------------------------------------------------
333    Name: Matthias Blume
334    Date: 2002/03/29 17:22:00
335    Tag: blume-20020329-div
336    Description:
337    
338    Fixed my broken div/mod logic.  Unfortunately, this means that the
339    inline code for div/mod now has one more comparison than before.
340    Fast paths (quotient > 0 or remainder = 0) are not affected, though.
341    The problem was with quotient = 0, because that alone does not tell
342    us which way the rounding went.  One then has to look at whether
343    remainder and divisor have the same sign...  :(
344    
345    Anyway, I replaced the bootfiles with fresh ones...
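
The case analysis described above can be sketched at the source level as
follows.  This is an illustration of the reasoning, not the inline code the
compiler generates; the name floorDivMod is hypothetical, and overflow
cases such as minInt with divisor ~1 are not considered.

```sml
(* Derive round-to-neginf (div, mod) from round-to-zero (quot, rem). *)
fun floorDivMod (x, y) =
  let
    val q = Int.quot (x, y)
    val r = Int.rem (x, y)
  in
    if q > 0 orelse r = 0 then (q, r)       (* fast paths: nothing to adjust *)
    else if q < 0 then (q - 1, r + y)       (* quot rounded up; step down *)
    else if (r < 0) = (y < 0) then (0, r)   (* q = 0, true quotient in (0,1) *)
    else (~1, r + y)                        (* q = 0, true quotient in (-1,0) *)
  end
```

For example, floorDivMod (~1, 2) has q = 0 and r = ~1; because r and the
divisor differ in sign, the result is (~1, 1), matching ~1 div 2 and
~1 mod 2.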
346    
347    ----------------------------------------------------------------------
348    Name: Matthias Blume
349    Date: 2002/03/29 14:10:00 EST
350    Tag: blume-20020329-inlprims
351    Description:
352    
353    NEW BOOTFILES!!!    Version number bumped to 110.39.3.
354    
355    Primops have changed. This means that the bin/boot-file formats have
356    changed as well.
357    
358    To make sure that there is no confusion, I made a new version.
359    
360    
361    CHANGES:
362    
363    * removed REMT from mltree (remainder should never overflow).
364    
365    * added primops to deal with divisions of all flavors to the frontend
366    
367    * handled these primops all the way through so they map to their respective
368      MLRISC support
369    
370    * used these primops in the implementation of Int, Int32, Word, Word32
371    
372    * removed INLDIV, INLMOD, and INLREM as they are no longer necessary
373    
374    * parameterized INLMIN, INLMAX, and INLABS by a numkind
375    
376    * translate.sml now deals with all flavors of INL{MIN,MAX,ABS}, including
377      floating point
378    
379    * used INL{MIN,MAX,ABS} in the implementation of Int, Int32, Word, Word32,
380      and Real (but Real.abs maps to a separate floating-point-only primop)
381    
382    
383    TODO items:
384    
385    * Hacked Alpha32 instruction selection, disabling the selection of REMx
386      instructions because the machine instruction encoder cannot handle
387      them.  (Hppa, PPC, and Sparc instruction selection did not handle
388      REM in the first place, and REM is supported by the x86 machine coder.)
389    
390    * Handle DIV and MOD with DIV_TO_NEGINF directly in the x86 instruction
391      selection phase.  (The two can be streamlined because the hardware
392      delivers both quotient and remainder at the same time anyway.)
393    
394    * Think about what to do with "valOf(Int32.minInt) div ~1" and friends.
395      (Currently the behavior is inconsistent both across architectures and
396      wrt. the draft Basis spec.)
397    
398    * Word8 should eventually be handled natively, too.
399    
400    * There seems to be one serious bug in mltree-gen.sml.  It appears, though,
401      as if there currently is no execution path that could trigger it in
402      SML/NJ.  (The assumptions underlying functions arith and promotable do not
403      hold for things like multiplication and division.)
404    
405    ----------------------------------------------------------------------
406    Name: Matthias Blume
407    Date: 2002/03/27 16:27:00 EST
408    Tag: blume-20020327-mlrisc-divisions
409    Description:
410    
411    Added support for all four division operations (ML's div, mod, quot,
412    and rem) to MLRISC.  In the course of doing so, I also rationalized
413    the naming (no more annoying switch-around of DIV and QUOT), by
414    parameterizing the operation by div_rounding_mode (which can be either
415    DIV_TO_ZERO or DIV_TO_NEGINF).
416    
417    The generic MLTreeGen functor takes care of compiling all four
418    operations down to only round-to-zero div.
419    
420    Missing pieces:
421    
422      * Doing something smarter than relying on MLTreeGen on architectures
423        like, e.g., the x86 where hardware division delivers both quotient and
424        remainder at the same time.  With this, the implementation of the
425        round-to-neginf operations could be further streamlined.
426    
427      * Remove inlining support for div/mod/rem from the frontend and replace it
428        with primops that get carried through to the backend.  Do this for all
429        int and word types.
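
The parameterization can be pictured at the source level as below.  Only
the constructor names DIV_TO_ZERO and DIV_TO_NEGINF come from this entry;
the functions divide and remainder are hypothetical and give reference
semantics in terms of the Basis Int operations, not MLRISC's own operators.

```sml
datatype div_rounding_mode = DIV_TO_ZERO | DIV_TO_NEGINF

(* ML's quot and div are the same operation under different modes. *)
fun divide DIV_TO_ZERO   (x, y) = Int.quot (x, y)
  | divide DIV_TO_NEGINF (x, y) = Int.div (x, y)

(* Likewise for rem and mod. *)
fun remainder DIV_TO_ZERO   (x, y) = Int.rem (x, y)
  | remainder DIV_TO_NEGINF (x, y) = Int.mod (x, y)
```

For the operands (~7, 2), DIV_TO_ZERO gives quotient ~3 and remainder ~1,
while DIV_TO_NEGINF gives quotient ~4 and remainder 1.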
430    
431    ----------------------------------------------------------------------
432    Name: Matthias Blume
433    Date: 2002/03/25 17:25:00 EST
434    Tag: blume-20020325-divmod
435    Description:
436    
437    I improved (hopefully without breaking them) the implementation of Int.div,
438    Int.mod, and Int.rem.   For this, the code in translate.sml now takes
439    advantage of the following observations:
440    
441      Let  q = x quot y      r = x rem y
442           d = x div  y      m = x mod y
443    
444    where "quot" is the round-to-zero version of integer division that
445    hardware usually provides.  Then we have:
446    
447         r = x - q * y        where neither the * nor the - will overflow
448         d = if q >= 0 orelse x = q * y then q else q - 1
449                              where neither the * nor the - will overflow
450         m = if q >= 0 orelse r = 0 then r else r + y
451                              where the + will not overflow
452    
453    This results in substantial simplification of the generated code.
454    The following table shows the number of CFG nodes and edges generated
455    for
456            fun f (x, y) = x OPER y
457            (* with OPER \in div, mod, quot, rem *)
458    
459    
460        OPER | nodes(old) | edges(old) | nodes(new) | edges(new)
461        --------------------------------------------------------
462         div |         24 |         39 |         12 |         16
463         mod |         41 |         71 |         12 |         16
464        quot |          8 |         10 |          8 |         10
465         rem |         10 |         14 |          8 |         10
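
For experimentation, the formulas above can be transcribed directly.  The
name asStated is hypothetical and this is not the code in translate.sml.
Note that for q = 0 with operands of opposite sign the transcription
disagrees with Int.div and Int.mod; this appears to be the quotient = 0
case that the 2002/03/29 entry (blume-20020329-div) corrects.

```sml
(* Direct transcription of the formulas in this entry. *)
fun asStated (x, y) =
  let
    val q = Int.quot (x, y)
    val r = x - q * y
    val d = if q >= 0 orelse x = q * y then q else q - 1
    val m = if q >= 0 orelse r = 0 then r else r + y
  in
    {q = q, r = r, d = d, m = m}
  end
```

For (~7, 2) this yields d = ~4 and m = 1, in agreement with Int.div and
Int.mod.  For (~1, 2) it yields d = 0 and m = ~1, whereas Int.div gives ~1
and Int.mod gives 1.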
466    
467    
468    ----------------------------------------------------------------------
469    Name: Matthias Blume
470    Date: 2002/03/25 22:06:00 EST
471    Tag: blume-20020325-cprotobug
472    Description:
473    
474    Fixed a bug in cproto (c prototype decoder).
475    
476    ----------------------------------------------------------------------
477    Name: Matthias Blume
478    Date: 2002/03/25 16:00:00 EST
479    Tag: blume-20020325-raw-primops
480    Description:
481    
482    I did some cleanup to Allen's new primop code and
483    replaced yesterday's bootfiles with new ones.
484    (But they are stored in the same place.)
485    
486    ----------------------------------------------------------------------
487    Name: Matthias Blume
488    Date: 2002/03/24 22:40:00 EST
489    Tag: blume-20020324-bootfiles
490    Description:
491    
492    Made the bootfiles that Allen asked for.
493    
494    ----------------------------------------------------------------------
495    Name: Allen Leung
496    Date: 2002/03/23 15:50:00 EST
497    Tag: leunga-20020323-flint-cps-rcc-primops
498    Description:
499    
500      1. Changes to FLINT primops:
501    
502        (* make a call to a C-function;
503         * The primop carries C function prototype information and specifies
504         * which of its (ML-) arguments are floating point. C prototype
505         * information is for use by the backend, ML information is for
506         * use by the CPS converter. *)
507      | RAW_CCALL of { c_proto: CTypes.c_proto,
508                       ml_args: ccall_type list,
509                       ml_res_opt: ccall_type option,
510                       reentrant : bool
511                     } option
512       (* Allocate uninitialized storage on the heap.
513        * The record is meant to hold short-lived C objects, i.e., they
514        * are not ML pointers.  With the tag, the representation is
515        * the same as RECORD with tag tag_raw32 (sz=4), or tag_fblock (sz=8)
516        *)
517      | RAW_RECORD of {tag:bool,sz:int}
518      and ccall_type = CCALL_INT32 | CCALL_REAL64 | CCALL_ML_PTR
519    
520      2.  These CPS primops are now overloaded:
521    
522           rawload of {kind:numkind}
523           rawstore of {kind:numkind}
524    
525          The one argument form is:
526    
527             rawload {kind} address
528    
529          The two argument form is:
530    
531             rawload {kind} [ml object, byte-offset]
532    
533      3. RAW_CCALL/RCC now takes two extra arguments:
534    
535         a. The first is whether the C call is reentrant, i.e., whether
536            ML state should be saved and restored.
537         b. The second argument is a string argument specifying the name of
538            the library and the C function.
539    
540         These things are not yet handled in the code generator.
541    
542      4. In CProto,
543    
544         An encoding type of "bool" means "ml object" and is mapped into
545         C prototype of PTR.  Note that "bool" is different than "string",
546         even though "string" is also mapped into PTR, because "bool"
547         is assigned a CPS type of BOGt, while "string" is assigned INT32t.
548    
549      5. Pickler/unpickler
550    
551         Changed to handle RAW_RECORD and newest RAW_CCALL
552    
553      6. MLRiscGen,
554    
555         1. Changed to handle the new rawload/rawstore/rawrecord operators.
556         2. Code for handling C Calls has been moved to a new module CPSCCalls,
557            in the file CodeGen/cpscompile/cps-c-calls.sml
558    
559      7. Added the conditional move operator
560    
561             condmove of branch
562    
563         to cps.  Generation of this is still buggy so it is currently
564         disabled.
565    
566    ----------------------------------------------------------------------
567    Name: Lal George
568    Date: 2002/03/22 14:18:25 EST
569    Tag: george-20020322-cps-branch-prob
570    Description:
571    
572    Implemented the Ball-Larus branch prediction-heuristics, and
573    incorporated graphical viewers for control flow graphs.
574    
575    Ball-Larus Heuristics:
576    ---------------------
577    See the file compiler/CodeGen/cpscompile/cpsBranchProb.sml.
578    
579    By design it uses the Dempster-Shafer theory for combining
580    probabilities.  For example, in the function:
581    
582        fun f(n,acc) = if n = 0 then acc else f(n-1, n*acc)
583    
584    the Ball-Larus heuristics predict that n=0 is unlikely
585    (OH-heuristic), and the 'then' branch is unlikely because of the
586    RH-heuristic -- giving the 'then' branch an even lower combined
587    probability using the Dempster-Shafer theory.
588    
589    Finally, John Reppy's loop analysis in MLRISC further lowers the
590    probability of the 'then' branch because of the loop in the else
591    branch.
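
For two independent estimates of the same two-way branch, the
Dempster-Shafer combination reduces to the formula sketched below.  This
is a minimal illustration and not the code in cpsBranchProb.sml; the name
combine is hypothetical, and the probabilities in the example are made up.

```sml
(* Combine two independent estimates p1 and p2 that a branch is taken. *)
fun combine (p1 : real, p2 : real) =
  (p1 * p2) / (p1 * p2 + (1.0 - p1) * (1.0 - p2))
```

If one heuristic rates the 'then' branch at 0.2 and another at 0.3, the
combined estimate is 0.06 / (0.06 + 0.56), which is below either input.
An estimate of 0.5 carries no information and leaves the other estimate
unchanged.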
592    
593    
594    Graphical Viewing:
595    ------------------
596    I merely plugged in Allen's graphical viewers into the compiler. The
597    additional code is not much. At the top level, saying:
598    
599            Control.MLRISC.getFlag "cfg-graphical-view" := true;
600    
601    will display the graphical view of the control flow graph just before
602    back-patching.  daVinci must be in your path for this to work. If
603    daVinci is not available, then the default viewer can be changed
604    using:
605    
606            Control.MLRISC.getString "viewer"
607    
608    which can be set to "dot" or "vcg" for the corresponding viewers. Of
609    course, these viewers must be in your path.
610    
611    The above will display the compilation unit at the level of clusters,
612    many of which are small and uninteresting.  Additionally setting:
613    
614            Control.MLRISC.getInt "cfg-graphical-view_size"
615    
616    will restrict the display to clusters larger than that value.
617    
618    
619    ----------------------------------------------------------------------
620    Name: Matthias Blume
621    Date: 2002/03/21 22:20:00 EST
622    Tag: blume-20020321-kmp-bugfix
623    Description:
624    
625    Changed the interface to the KMP routine in PreString and fixed
626    a minor bug in one place where it was used.
627    
628    ----------------------------------------------------------------------
629    Name: Allen Leung
630    Date: 2002/03/21 20:30:00 EST
631    Tag: leunga-20020321-cfg
632    Description:
633    
634      Fixed a potential problem in cfg edge splitting.
635    
636    ----------------------------------------------------------------------
637    Name: Allen Leung
638    Date: 2002/03/21 17:15:00 EST
639    Tag: leunga-20020321-x86-fp-cfg
640    Description:
641    
642      1. Recoded the buggy parts of x86-fp.
643    
644         a. All the block reordering code has been removed.
645            We now depend on the block placement phases to do this work.
646    
647         b. Critical edge splitting code has been simplified and moved into the
648            CFG modules, where it belongs.
649    
650         Both of these were quite buggy and complex.  The code is now much, much
651         simpler.
652    
653      2. X86 backend.
654    
655         a. Added instructions for 64-bit support.  Instruction selection for
656            64-bit has not been committed, however, since that
657            requires changes to MLTREE which haven't been approved by
658            Lal and John.
659    
660         b. Added support for FUCOMI and FUCOMIP when generating code for
661            PentiumPro and above.  We only generate these instructions in
662            the fast-fp mode.
663    
664         c. Added cases for JP and JNP in X86FreqProps.
665    
666      3. CFG
667    
668         CFG now has a bunch of methods for edge splitting and merging.
669    
670      4. Machine description.
671    
672         John's simplification of MLTREE_BASIS.fcond broke a few machine
673         description things:
674    
675         rtl-build.{sig,sml} and hppa.mdl fixed.
676    
677         NOTE: the machine description stuff in the repository is still broken.
678               Again, I can't put my fixes in because that involves
679               changes to MLTREE.
680    
681    ----------------------------------------------------------------------
682    Name: Matthias Blume
683    Date: 2002/03/20 15:55:00 EST
684    Tag: blume-20020320-kmp
685    Description:
686    
687    Implemented Knuth-Morris-Pratt string matching in PreString and used
688    it for String.isSubstring, Substring.isSubstring, and
689    Substring.position.
690    
691    (Might need some stress-testing.  Simple examples worked fine.)
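
For readers unfamiliar with the algorithm, a self-contained sketch of
Knuth-Morris-Pratt matching follows.  It is not the PreString code, whose
interface is not shown in this entry; the name kmpFind and its result
convention (SOME of the first match index, or NONE) are assumptions made
for illustration.

```sml
(* kmpFind (pat, text): index of the first occurrence of pat in text. *)
fun kmpFind (pat, text) =
  let
    val m = size pat
    val n = size text
    fun p i = String.sub (pat, i)
    (* fail[i] = length of the longest proper border of pat[0..i] *)
    val fail = Array.array (Int.max (m, 1), 0)
    (* Shrink the matched prefix length k until pat[k] = c or k = 0. *)
    fun fallback k c =
      if k > 0 andalso p k <> c then fallback (Array.sub (fail, k - 1)) c
      else k
    fun build (i, k) =
      if i >= m then ()
      else
        let
          val k' = fallback k (p i)
          val k'' = if p k' = p i then k' + 1 else k'
        in
          Array.update (fail, i, k''); build (i + 1, k'')
        end
    val () = build (1, 0)
    (* i = position in text, k = number of pattern characters matched *)
    fun search (i, k) =
      if k = m then SOME (i - m)
      else if i >= n then NONE
      else
        let
          val c = String.sub (text, i)
          val k' = fallback k c
        in
          if p k' = c then search (i + 1, k' + 1) else search (i + 1, k')
        end
  in
    search (0, 0)
  end
```

A substring test in the style of String.isSubstring can then be written as
fn pat => fn text => isSome (kmpFind (pat, text)).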
692    
693    ----------------------------------------------------------------------
694    Name: Matthias Blume
695    Date: 2002/03/19 16:37:00 EST
696    Tag: blume-20020319-witnesses
697    Description:
698    
699    Added a structure C.W and functions convert/Ptr.convert to ml-nlffi-lib.
700    
701    This implements a generic mechanism for changing constness qualifiers
702    anywhere within big C types without resorting to outright "casts".
703    (So far, functions such as C.rw/C.ro or C.Ptr.rw/C.Ptr.ro only let you
704    modify the constness at the outermost level.)
705    The implementation of "convert" is based on the idea of "witness"
706    values -- values that are not used by the operation but whose types
707    "testify" to their applicability.  On the implementation side, "convert"
708    is simply a projection (returning its second curried argument).  With
709    cross-module inlining, it should not result in any machine code being
710    generated.
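
The witness idea can be illustrated with a toy structure.  Everything here
(Toy, cell, asRo, inList) is hypothetical and unrelated to the actual
ml-nlffi-lib interface.  Unlike the real "convert", which is described
above as a plain projection, this sketch represents a witness as a
coercion function so that it stays within safe SML.

```sml
structure Toy :> sig
  type ro
  type rw
  type 'c cell                      (* cell tagged with a phantom constness *)
  type ('a, 'b) witness             (* evidence that 'a may be viewed as 'b *)
  val new : int -> rw cell
  val get : 'c cell -> int
  val set : rw cell * int -> unit
  val asRo : (rw cell, ro cell) witness
  val inList : ('a, 'b) witness -> ('a list, 'b list) witness
  val convert : ('a, 'b) witness -> 'a -> 'b
end = struct
  type ro = unit
  type rw = unit
  type 'c cell = int ref
  type ('a, 'b) witness = 'a -> 'b
  fun new n = ref n
  fun get c = !c
  fun set (c, n) = c := n
  val asRo = fn c => c
  fun inList w = List.map w         (* lifts a witness under a list type *)
  fun convert w x = w x
end
```

Here Toy.convert (Toy.inList Toy.asRo) changes the constness of every cell
inside a list without a cast, and Toy.set is rejected by the type checker
on the resulting ro cells.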
711    
712    ----------------------------------------------------------------------
713    Name: Matthias Blume
714    Date: 2002/03/15 16:40:00 EST
715    Tag: blume-20020315-basis
716    Description:
717    
718    Provided (preliminary?) implementations for
719    
720      {String,Substring}.{concatWith,isSuffix,isSubstring}
721    
722    and
723    
724      Substring.full
725    
726    Those are in the Basis spec but they were missing in SML/NJ.
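
A short usage sketch of these functions, following their Basis
signatures (the bound names are arbitrary):

```sml
val joined  = String.concatWith ", " ["a", "b", "c"]    (* "a, b, c" *)
val isSml   = String.isSuffix ".sml" "prestring.sml"    (* true *)
val hasStr  = String.isSubstring "str" "prestring.sml"  (* true *)
val whole   = Substring.string (Substring.full "abc")   (* "abc" *)
val ssEnds  = Substring.isSuffix "bc" (Substring.full "abc")  (* true *)
```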
727    
728    ----------------------------------------------------------------------
729    Name: Matthias Blume
730    Date: 2002/03/14 21:30:00 EST
731    Tag: blume-20020314-controls
732    Description:
733    
734    Controls:
735    ---------
736    
737    1. Factored out the recently-added Controls : CONTROLS stuff and put
738       it into its own library $/controls-lib.cm.  The source tree for
739       this is under src/smlnj-lib/Controls.
740    
741    2. Changed the names of types and functions in this interface, so they
742       make a bit more "sense":
743    
744          module -> registry
745          'a registry -> 'a group
746    
747    3. The interface now deals in ref cells only.  The getter/setter interface
748       is (mostly) gone.
749    
750    4. Added a function that lets one register an already-existing ref cell.
751    
752    5. Made the corresponding modifications to the rest of the code so that
753       everything compiles again.
754    
755    6. Changed the implementation of Controls.MLRISC back to something closer
756       to the original.  In particular, this module (and therefore MLRISC)
757       does not depend on Controls.  There now is some link-time code in
758       int-sys.sml that registers the MLRISC controls with the Controls
759       module.
760    
761    CM:
762    ---
763    
764      * One can now specify the lambda-split aggressiveness in init.cmi.
765    
766    ----------------------------------------------------------------------
767    Name: Allen Leung
768    Date: 2002/03/13 17:30:00 EST
769    Tag: leunga-20020313-x86-fp-unary
770    Description:
771    
772    Bug fix for:
773    
774    > leunga@weaselbane:~/Yale/tmp/sml-dist{21} bin/sml
775    > Standard ML of New Jersey v110.39.1 [FLINT v1.5], March 08, 2002
776    > - fun f(x,(y,z)) = Real.~ y;
777    > [autoloading]
778    > [autoloading done]
779    >       fchsl   (%eax), 184(%esp)
780    > Error: MLRisc bug: X86MCEmitter.emitInstr
781    >
782    > uncaught exception Error
783    >   raised at: ../MLRISC/control/mlriscErrormsg.sml:16.14-16.19
784    
785    The problem was that the code generator did not generate any fp registers
786    in this case, and the register allocator didn't know that it needed to run
787    the X86FP phase to translate the pseudo fp instruction.  This only happened
788    with unary fp operators in certain situations.
789    
790    ----------------------------------------------------------------------
791    Name: Matthias Blume
792    Date: 2002/03/13 14:00:00 EST
793    Tag: blume-20020313-overload-etc
794    Description:
795    
796    1. Added _overload as a synonym for overload for backward compatibility.
797       (Control.overloadKW must be true for either version to be accepted.)
798    
799    2. Fixed bug in install script that caused more things to be installed
800       than what was requested in config/targets.
801    
802    3. Made CM aware of the (_)overload construct so that autoloading
803       works.
804    
805    ----------------------------------------------------------------------
806    Name: Matthias Blume
807    Date: 2002/03/12 22:03:00 EST
808    Tag: blume-20020312-url
809    Description:
810    
811    Forgot to update BOOT and srcarchiveurl.
812    
813    ----------------------------------------------------------------------
814    Name: Matthias Blume
815    Date: 2002/03/12 17:30:00 EST
816    Tag: blume-20020312-version110392
817    Description:
818    
819    Yet another version number bump (because of small changes to the
820    binfile format).  Version number is now 110.39.2.  NEW BOOTFILES!
821    
822    Changes:
823    
824      The new pid generation scheme described a few weeks ago was overly
825      complicated.  I implemented a new mechanism that is simpler and
826      provides a bit more "stability":  Once CM has seen a compilation
827      unit, it keeps its identity constant (as long as you do not delete
828      those crucial CM/GUID/* files).  This means that when you change
829      an interface, compile, then go back to the old interface, and
830      compile again, you arrive at the original pid.
831    
832      There now also is a mechanism that instructs CM to use the plain
833      environment hash as a module's pid (effectively making its GUID
834      the empty string).  For this, "noguid" must be specified as an
835      option to the .sml file in question within its .cm file.
836      This is most useful for code that is being generated by tools such
837      as ml-nlffigen (because during development programmers tend to
838      erase the tool's entire output directory tree including CM's cached
839      GUIDs).  "noguid" is somewhat dangerous (since it can be used to locally
840      revert to the old, broken behavior of SML/NJ), but in specific cases
841      where there is no danger of interface confusion, its use is ok
842      (I think).
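The stability property and the effect of "noguid" can be illustrated with a small sketch. It is written in Python purely for illustration and is not CM's implementation: the SHA-1 hashing, the in-memory guid_store (a stand-in for the CM/GUID/* files), and all function names are assumptions.

```python
import hashlib
import uuid

# Hypothetical stand-in for the CM/GUID/* files: unit name -> cached GUID.
guid_store = {}

def guid_for(unit, noguid=False):
    """Return the unit's GUID; it is created once and then kept constant."""
    if noguid:
        return ""                      # "noguid": the GUID is the empty string
    if unit not in guid_store:
        guid_store[unit] = uuid.uuid4().hex
    return guid_store[unit]

def env_hash(interface):
    """Hash of the unit's exported environment (here: just its text)."""
    return hashlib.sha1(interface.encode()).hexdigest()

def pid(unit, interface, noguid=False):
    """Combine environment hash and GUID; an empty GUID leaves the plain hash."""
    h = env_hash(interface)
    g = guid_for(unit, noguid)
    return h if g == "" else hashlib.sha1((h + g).encode()).hexdigest()
```

In this model, changing a unit's interface and then changing it back yields the original pid as long as guid_store is not cleared, while two units with identical interfaces still get distinct pids unless both are marked noguid.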
843    
844      ml-nlffigen by default generates "noguid" annotations.  They can be
845      turned off by specifying -guid in its command line.
846    
847    ----------------------------------------------------------------------
848    Name: Lal George
849    Date: 2002/03/12 14:42:36 EST
850    Tag: george-20020312-frequency-computation
851    Description:
852    
853    Integrated jump chaining and static block frequency into the
854    compiler. More details and numbers later.
855    
856    ----------------------------------------------------------------------
857    Name: Lal George
858    Date: 2002/03/11 22:38:53 EST
859    Tag: george-20020311-jump-chain-elim
860    Description:
861    
862    Tested the jump chain elimination on all architectures (except the
863    hppa).  This is on by default right now and is profitable for the
864    alpha and x86; however, it may not be profitable for the sparc and ppc
865    when compiling the compiler.
866    
867    The gc test will typically jump to a label at the end of the cluster,
868    where there is another jump to an external cluster containing the actual
869    code to invoke gc. This is to allow factoring of common gc invocation
870    sequences. That is to say, we generate:
871    
872            f:
873               testgc
874               ja   L1      % jump if above to L1
875    
876            L1:
877               jmp L2
878    
879    
880    After jump chain elimination the 'ja L1' instruction is converted to
881    'ja L2'. On the sparc and ppc, many of the 'ja L2' instructions may end
882    up being implemented in their long form (if L2 is far away) using:
883    
884            jbe     L3      % jump if below or equal to L3
885            jmp     L2
886         L3:
887            ...
888    
889    
890    For large compilation units L2  may be far away.
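The retargeting step described above can be sketched as follows. This is a minimal model in Python, not the MLRISC code: the representation of blocks as a label-to-target map and the names eliminate_jump_chains and final_target are assumptions made for illustration.

```python
def eliminate_jump_chains(blocks, branches):
    """Retarget branches that point at blocks consisting only of a jump.

    blocks:   label -> target label if the block is just 'jmp target',
              otherwise None.
    branches: list of (opcode, target_label) pairs.
    """
    def final_target(label):
        seen = set()
        # Follow jump-only blocks; 'seen' guards against cycles of jumps.
        while blocks.get(label) is not None and label not in seen:
            seen.add(label)
            label = blocks[label]
        return label

    return [(op, final_target(target)) for op, target in branches]


# The example from the text: 'ja L1' where L1 is just 'jmp L2'.
blocks = {"f": None, "L1": "L2", "L2": None}
retargeted = eliminate_jump_chains(blocks, [("ja", "L1")])
```

Here retargeted holds the single branch ('ja', 'L2'). Whether that is a win then depends on whether the target is within short-branch range, which is the sparc/ppc concern raised above.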
891    
892    
893    ----------------------------------------------------------------------
894    Name: Matthias Blume
895    Date: 2002/03/11 13:30:00 EST
896    Tag: blume-20020311-mltreeeval
897    Description:
898    
899    A functor parameter was missing.
900    
901    ----------------------------------------------------------------------
902    Name: Allen Leung
903    Date: 2002/03/11 10:30:00 EST
904    Tag: leunga-20020311-runtime-string0
905    Description:
906    
907       The representation of the empty string now points to a
908    legal null-terminated C string instead of unit.  It is now possible
909    to convert an ML string into a C string with InlineT.CharVector.getData.
910    This compiles into one single machine instruction.
911    
912    ----------------------------------------------------------------------
913    Name: Allen Leung
914    Date: 2002/03/10 23:55:00 EST
915    Tag: leunga-20020310-x86-call
916    Description:
917    
918       Added machine code generation for the CALL instruction (relative displacement mode)
919    
920    ----------------------------------------------------------------------
921    Name: Matthias Blume
922    Date: 2002/03/08 16:05:00
923    Tag: blume-20020308-entrypoints
924    Description:
925    
926    Version number bumped to 110.39.1.  NEW BOOTFILES!
927    
928    Entrypoints: non-zero offset into a code object where execution should begin.
929    
930    - Added the notion of an entrypoint to CodeObj.
931    - Added reading/writing of entrypoint info to Binfile.
932    - Made runtime system bootloader aware of entrypoints.
933    - Use the address of the label of the first function given to mlriscGen
934      as the entrypoint.  This address is currently always 0, but it will
935      not be 0 once we turn on block placement.
936    - Removed the linkage cluster code (which was The Other Way(tm) of dealing
937      with entry points) from mlriscGen.
938    
939    ----------------------------------------------------------------------
940    Name: Allen Leung
941    Date: 2002/03/07 20:45:00 EST
942    Tag: leunga-20020307-x86-cmov
943    Description:
944    
945       Bug fixes for CMOVcc on x86.
946    
947       1. Added machine code generation for CMOVcc
948       2. CMOVcc is now generated in preference over SETcc on PentiumPro or above.
949       3. CMOVcc cannot have an immediate operand as argument.
950    
951    ----------------------------------------------------------------------
952    Name: Matthias Blume
953    Date: 2002/03/07 16:15:00 EST
954    Tag: blume-20020307-controls
955    Description:
956    
957    This is a very large but mostly boring patch which makes (almost)
958    every tuneable compiler knob (i.e., pretty much everything under
959    Control.* plus a few other things) configurable via both the command
960    line and environment variables in the style CM did its configuration
961    until now.
962    
963    Try starting sml with '-h' (or, if you are brave, '-H').
964    
965    To this end, I added a structure Controls : CONTROLS to smlnj-lib.cm which
966    implements the underlying generic mechanism.
967    
968    The interface to some of the existing such facilities has changed somewhat.
969    For example, the MLRiscControl module now provides mkFoo instead of getFoo.
970    (The getFoo interface is still there for backward-compatibility, but its
971    use is deprecated.)
972    
973    The ml-build script passes -Cxxx=yyy command-line arguments through so
974    that one can now twiddle the compiler settings when using this "batch"
975    compiler.
976    
977    TODO items:
978    
979    We should go through and throw out all controls that are no longer
980    connected to anything.  Moreover, we should go through and provide
981    meaningful (and correct!) documentation strings for those controls
982    that still are connected.
983    
984    Currently, multiple calls to Controls.new are accepted (only the first
985    has any effect).  Eventually we should make sure that every control
986    is being made (via Controls.new) exactly once.  Future access can then
987    be done using Controls.acc.
988    
989    Finally, it would probably be a good idea to use the getter-setter
990    interface to controls rather than ref cells.  For the time being, both
991    styles are provided by the Controls module, but getter-setter pairs are
992    better if thread-safety is of any concern because they can be wrapped.
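The point about wrapping can be made concrete with a small sketch. It is in Python for illustration only and does not reflect the Controls module's actual interface; make_control and synchronized are hypothetical names.

```python
import threading

def make_control(initial):
    """Return a (get, set) pair over a private value."""
    cell = [initial]

    def get():
        return cell[0]

    def set_(value):
        cell[0] = value

    return get, set_

def synchronized(get, set_):
    """Wrap a getter/setter pair so that every access holds a lock."""
    lock = threading.Lock()

    def safe_get():
        with lock:
            return get()

    def safe_set(value):
        with lock:
            set_(value)

    return safe_get, safe_set
```

A client that only ever sees the pair returned by synchronized cannot bypass the lock, whereas a bare ref cell handed out to clients offers no place to interpose such a wrapper.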
993    
994    *****************************************
995    
996    One bug fix: The function blockPlacement in three of the MLRISC
997    backpatch files used to be hard-wired to one of two possibilities at
998    link time (according to the value of the placementFlag).  But (I
999    think) it should rather sense the flag every time.
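The difference between the two behaviors can be sketched like this. It is a Python model under stated assumptions, not the MLRISC backpatch code: placement_flag, the two placement functions, and the (name, weight) block representation are all hypothetical.

```python
# Hypothetical stand-in for the placementFlag control.
placement_flag = {"weighted": False}

def default_placement(blocks):
    """Keep blocks in their original order."""
    return list(blocks)

def weighted_placement(blocks):
    """Order (name, weight) blocks by decreasing weight."""
    return sorted(blocks, key=lambda b: -b[1])

# Old behavior: the choice is made once, when this line runs ("link time").
hard_wired = weighted_placement if placement_flag["weighted"] else default_placement

# Fixed behavior: consult the flag on every call.
def block_placement(blocks):
    chosen = weighted_placement if placement_flag["weighted"] else default_placement
    return chosen(blocks)
```

After the flag is flipped at run time, hard_wired still performs the default placement, while block_placement picks up the new setting.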
1000    
1001    *****************************************
1002    
1003    Other assorted changes (by other people who did not supply a HISTORY entry):
1004    
1005    1. the cross-module inliner now works much better (Monnier)
1006    2. representation of weights, frequencies, and probabilities in MLRISC
1007       changed in preparation for using those for weighted block placement
1008       (Reppy, George)
1009    
1010    ----------------------------------------------------------------------
1011    Name: Lal George
1012    Date: 2002/03/07 14:44:24 EST
1013    Tag: george-20020307-weighted-block-placement
1014    
1015    Tested the weighted block placement optimization on all architectures
1016    (except the hppa) using AMPL to generate the block and edge frequencies.
1017    Changes were required in the machine properties to correctly
1018    categorize trap instructions. There is an MLRISC flag
1019    "weighted-block-placement" that can be used to enable weighted block
1020    placement, but this will be ineffective without block/edge
1021    frequencies (coming soon).
1022    
1023    
1024    ----------------------------------------------------------------------
1025    Name: Lal George
1026    Date: 2002/03/05 17:24:48 EST
1027    Tag: george-20020305-linkage-cluster
1028    
1029    In order to support the block placement optimization, a new cluster
1030    is generated as the very first cluster (called the linkage cluster).
1031    It contains a single jump to the 'real' entry point for the compilation
1032    unit. Block placement has no effect on the linkage cluster itself, but
1033    all the other clusters  have full freedom in the manner in which they
1034    reorder blocks or functions.
1035    
1036    On the x86 the typical linkage code that is generated is:
1037       ----------------------
1038            .align 2
1039       L0:
1040            addl    $L1-L0, 72(%esp)
1041            jmp     L1
1042    
1043    
1044            .align  2
1045       L1:
1046       ----------------------
1047    
1048    72(%esp) is the memory location for the stdlink register. This
1049    must contain the address of the CPS function being called. In the
1050    above example, it contains the address of  L0; before
1051    calling L1 (the real entry point for the compilation unit), it
1052    must contain the address for L1, and hence
1053    
1054            addl $L1-L0, 72(%esp)
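The arithmetic behind this adjustment can be checked with a tiny model. It is written in Python for illustration; the addresses are made up and adjust_stdlink is a hypothetical name.

```python
def adjust_stdlink(stdlink, addr_l0, addr_l1):
    """Model 'addl $L1-L0, 72(%esp)': add the label distance to stdlink."""
    return stdlink + (addr_l1 - addr_l0)

# Made-up load addresses for the two labels.
L0, L1 = 0x1000, 0x1010
new_stdlink = adjust_stdlink(L0, L0, L1)

# Loading the same code elsewhere shifts both labels by the same amount,
# so the added constant L1-L0 does not change.
base = 0x40000
relocated = adjust_stdlink(L0 + base, L0 + base, L1 + base)
```

Because only the difference L1-L0 is encoded in the instruction, the adjustment stays correct wherever the code object is placed in memory.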
1055    
1056    I have tested this on all architectures except the hppa.  The increase
1057    in code size is of course negligible.
1058    
1059    ----------------------------------------------------------------------
1060    Name: Allen Leung
1061    Date: 2002/03/03 13:20:00 EST
1062    Tag: leunga-20020303-mlrisc-tools
1063    
1064      Added #[ ... ] expressions to mlrisc tools
1065    
1066    ----------------------------------------------------------------------
1067    Name: Matthias Blume
1068    Date: 2002/02/27 12:29:00 EST
1069    Tag: blume-20020227-cdebug
1070    Description:
1071    
1072    - made the types in structures C and C_Debug equal
1073    - got rid of code duplication (c-int.sml vs. c-int-debug.sml)
1074    - there no longer is a C_Int_Debug (C_Debug is directly derived from C)
1075    
1076    ----------------------------------------------------------------------
1077    Name: Matthias Blume
1078    Date: 2002/02/26 12:00:00 EST
1079    Tag: blume-20020226-ffi
1080    Description:
1081    
1082    1. Fixed a minor bug in CM's "noweb" tool:
1083       If numbering is turned off, then truly don't number (i.e., do not
1084       supply the -L option to noweb).  The previous behavior was to supply
1085       -L'' -- which caused noweb to use the "default" line numbering scheme.
1086       Thanks to Chris Richards for pointing this out (and supplying the fix).
1087    
1088    2. Once again, I reworked some aspects of the FFI:
1089    
1090       A. The incomplete/complete type business:
1091    
1092       - Signatures POINTER_TO_INCOMPLETE_TYPE and accompanying functors are
1093         gone!
1094       - ML types representing an incomplete type are now *equal* to
1095         ML types representing their corresponding complete types (just like
1096         in C).  This is still safe because ml-nlffigen will not generate
1097         RTTI for incomplete types, nor will it generate functions that
1098         require access to such RTTI.   But when ML code generated from both
1099         incomplete and complete versions of the C type meet, the ML types
1100         are trivially interoperable.
1101    
1102         NOTE:  These changes restore the full generality of the translation
1103         (which was previously lost when I eliminated functorization)!
1104    
1105       B. Enum types:
1106    
1107       - Structure C now has a type constructor "enum" that is similar to
1108         how the "su" constructor works.  However, "enum" is not a phantom
1109         type because each "T enum" has values (and is isomorphic to
1110         MLRep.Signed.int).
1111       - There are generic access operations for enum objects (using
1112         MLRep.Signed.int).
1113       - ml-nlffigen will generate a structure E_foo for each "enum foo".
1114         * The structure contains the definition of type "mlrep" (the ML-side
1115         representation type of the enum).  Normally, mlrep is the same
1116         as "MLRep.Signed.int", but if ml-nlffigen was invoked with "-ec",
1117         then mlrep will be defined as a datatype -- thus facilitating
1118         pattern matching on mlrep values.
1119         ("-ec" will be suppressed if there are duplicate values in an
1120          enumeration.)
1121         * Constructors ("-ec") or values (no "-ec") e_xxx of type mlrep
1122         will be generated for each C enum constant xxx.
1123         * Conversion functions m2i and i2m convert between mlrep and
1124         MLRep.Signed.int.  (Without "-ec", these functions are identities.)
1125         * Conversion functions c and ml convert between mlrep and "tag enum".
1126         * Access functions (get/set) fetch and store mlrep values.
1127       - By default (unless ml-nlffigen was invoked with "-nocollect"), unnamed
1128         enumerations are merged into one single enumeration represented by
1129         structure E_'.
1130    
1131    ----------------------------------------------------------------------
1132    Name: Allen Leung
1133    Date: 2002/02/25 04:45:00 EST
1134    Tag: leunga-20020225-cps-spill
1135    
1136    This is a new implementation of the CPS spill phase.
1137    The new phase is in the new file compiler/CodeGen/cpscompile/spill-new.sml
1138    In case of problems, replace it with the old file spill.sml
1139    
1140    The current compiler runs into some serious performance problems when
1141    constructing a large record.  This can happen when we try to compile a
1142    structure with many items.  Even a very simple structure like the following
1143    makes the compiler slow down.
1144    
1145        structure Foo = struct
1146           val x_1 = 0w1 : Word32.int
1147           val x_2 = 0w2 : Word32.int
1148           val x_3 = 0w3 : Word32.int
1149           ...
1150           val x_N = 0wN : Word32.int
1151        end
1152    
1153    The following table shows the compile time, from N=1000 to N=4000,
1154    with the old compiler:
1155    
1156    N
1157    1000   CPS 100 spill                           0.04u  0.00s  0.00g
1158           MLRISC ra                               0.06u  0.00s  0.05g
1159              (spills = 0 reloads = 0)
1160           TOTAL                                   0.63u  0.07s  0.21g
1161    
1162    1100   CPS 100 spill                           8.25u  0.32s  0.64g
1163           MLRISC ra                               5.68u  0.59s  3.93g
1164              (spills = 0 reloads = 0)
1165           TOTAL                                   14.71u  0.99s  4.81g
1166    
1167    1500   CPS 100 spill                           58.55u  2.34s  1.74g
1168           MLRISC ra                               5.54u  0.65s  3.91g
1169              (spills = 543 reloads = 1082)
1170           TOTAL                                   65.40u  3.13s  6.00g
1171    
1172    2000   CPS 100 spill                           126.69u  4.84s  3.08g
1173           MLRISC ra                               0.80u  0.10s  0.55g
1174              (spills = 42 reloads = 84)
1175           TOTAL                                   129.42u  5.10s  4.13g
1176    
1177    3000   CPS 100 spill                           675.59u  19.03s  11.64g
1178           MLRISC ra                               2.69u  0.27s  1.38g
1179              (spills = 62 reloads = 124)
1180           TOTAL                                   682.48u  19.61s  13.99g
1181    
1182    4000   CPS 100 spill                           2362.82u  56.28s  43.60g
1183           MLRISC ra                               4.96u  0.27s  2.72g
1184              (spills = 85 reloads = 170)
1185           TOTAL                                   2375.26u  57.21s  48.00g
1186    
1187    As you can see, the old cps spill module suffers from some serious
1188    performance problems.  But since I cannot decipher the old code fully,
1189    instead of patching the problems up, I'm reimplementing it
1190    with a different algorithm.  The new code is more modular,
1191    smaller when compiled, and substantially faster
1192    (O(n log n) time and O(n) space).  Timing of the new spill module:
1193    
1194    4000  CPS 100 spill                           0.02u  0.00s  0.00g
1195          MLRISC ra                               0.25u  0.02s  0.15g
1196             (spills=1 reloads=3)
1197          TOTAL                                   7.74u  0.34s  1.62g
1198    
1199    Implementation details:
1200    
1201    As far as I can tell, the purpose of the CPS spill module is to make sure the
1202    number of live variables at any program point (the bandwidth)
1203    does not exceed a certain limit, which is determined by the
1204    size of the spill area.
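As a rough illustration of what "bandwidth" measures, the following sketch computes the maximum number of simultaneously live variables with a backward scan. It is a simplification written in Python: it handles only a straight-line sequence rather than a CPS tree, and the (defined, used) instruction encoding is an assumption.

```python
def bandwidth(instrs, live_out=frozenset()):
    """Maximum number of variables live at any program point.

    instrs:   list of (defined_vars, used_vars) in program order.
    live_out: variables live after the last instruction.
    """
    live = set(live_out)
    widest = len(live)
    for defs, uses in reversed(instrs):
        live -= set(defs)   # going backward, a definition ends the live range
        live |= set(uses)   # a use makes the variable live before this point
        widest = max(widest, len(live))
    return widest


# a and b are both live until c is computed from them.
example = [(["a"], []), (["b"], []), (["c"], ["a", "b"]), ([], ["c"])]
width = bandwidth(example)
```

In this model a spill would be needed whenever the result exceeds the limit imposed by the size of the spill area.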
1205    
1206    When the bandwidth is too large, we decrease the register pressure by
1207    packing live variables into spill records.  How we achieve this is
1208    completely different than what we did in the old code.
1209    
1210    First, there is something about the MLRiscGen code generator
1211    that we should be aware of:
1212    
1213    o MLRiscGen performs code motion!
1214    
1215       In particular, it will move floating point computations and
1216       address computations involving only the heap pointer to
1217       their use sites (if there is only a single use).
1218       What this means is that if we have a CPS record construction
1219       statement
1220    
1221           RECORD(k,vl,w,e)
1222    
1223       we should never count the new record address w as live if w
1224       has only one use (which is often the case).
1225    
1226       We should do something similar for floating point, but the transformation
1227       there is much more complex, so I won't deal with that.
1228    
1229    Secondly, there are now two new cps primops at our disposal:
1230    
1231     1. rawrecord of record_kind option
1232        This pure operator allocates some uninitialized storage from the heap.
1233        There are two forms:
1234    
1235         rawrecord NONE [INT n]  allocates a tagless record of length n
1236         rawrecord (SOME rk) [INT n] allocates a tagged record of length n
1237                                     and initializes the tag.
1238    
1239     2. rawupdate of cty
1240          rawupdate cty (v,i,x)
1241          Assigns x to the ith component of record v.
1242          The storelist is not updated.
1243    
1244    We use these new primops for both spilling and incremental record construction.
1245    
1246     1. Spilling.
1247    
1248        This is implemented with a linear scan algorithm (but generalized
1249        to trees).  The algorithm will create a single spill record at the
1250        beginning of the cps function and use rawupdate to spill to it,
1251        and SELECT or SELp to reload from it.  So both spills and reloads
1252        are fine-grain operations.  In contrast, in the old algorithm
1253        "spills" have to be bundled together in records.
1254    
1255        Ideally, we should sink the spill record construction to where
1256        it is needed.  We can even split the spill record into multiple ones
1257        at the places where they are needed.  But CPS is not a good
1258        representation for global code motion, so I'll keep it simple and
1259        am not attempting this.
1260    
1261     2. Incremental record construction (aka record splitting).
1262    
1263        Long records with many component values which are simultaneously live
1264        (recall that single use record addresses are not considered to
1265         be live) are constructed with rawrecord and rawupdate.
1266        We allocate space on the heap with rawrecord first, then gradually
1267        fill it in with rawupdate.  This is the technique suggested to me
1268        by Matthias.
1269    
1270        Some restrictions on when this is applicable:
1271        1. It is not a VECTOR record.  The code generator currently does not handle
1272           this case.  VECTOR records use double indirection like arrays.
1273        2. All the record component values are defined in the same "basic block"
1274           as the record constructor.  This is to prevent speculative
1275           record construction.
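The allocate-then-fill idea behind incremental record construction can be modeled as follows. This is a Python sketch, not the CPS transformation itself: the functions only mimic the described behavior of the rawrecord and rawupdate primops, and build_incrementally is a hypothetical name.

```python
def rawrecord(n):
    """Model of rawrecord with [INT n]: allocate n uninitialized slots."""
    return [None] * n

def rawupdate(record, i, x):
    """Model of rawupdate cty (v,i,x): store x into slot i of record v."""
    record[i] = x

def build_incrementally(n, compute):
    """Fill a record slot by slot.

    Each component is stored as soon as it has been computed, so at most
    one component value is held in a temporary at a time instead of all n
    being live at the point of a single record construction.
    """
    record = rawrecord(n)
    for i in range(n):
        rawupdate(record, i, compute(i))
    return record
```

The same two operations serve the spilling case: a spill record is allocated once at the start of the function and individual variables are stored into it as needed.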
1276    
1277    ----------------------------------------------------------------------
1278    Name: Allen Leung
1279    Date: 2002/02/22 01:02:00 EST
1280    Tag: leunga-20020222-mlrisc-tools
1281    
1282    Minor bug fixes in the parser and rewriter
1283    
1284    ----------------------------------------------------------------------
1285    Name: Allen Leung
1286    Date: 2002/02/21 20:20:00 EST
1287    Tag: leunga-20020221-peephole
1288    
1289    Regenerated the peephole files.  Some contained typos in the specification
1290    and some didn't compile because of pretty printing bugs in the old version
1291    of 'nowhere'.
1292    
1293    ----------------------------------------------------------------------
1294  Name: Allen Leung  Name: Allen Leung
1295  Date: 2002/02/19 20:20:00 EST  Date: 2002/02/19 20:20:00 EST
1296  Tag: leunga-20020219-mlrisc-tools  Tag: leunga-20020219-mlrisc-tools
