Home My Page Projects Code Snippets Project Openings SML/NJ
Summary Activity Forums Tracker Lists Tasks Docs Surveys News SCM Files

SCM Repository

[smlnj] Diff of /sml/trunk/HISTORY
ViewVC logotype

Diff of /sml/trunk/HISTORY

Parent Directory Parent Directory | Revision Log Revision Log | View Patch Patch

revision 903, Mon Aug 20 19:34:06 2001 UTC revision 1179, Tue Mar 26 03:04:46 2002 UTC
# Line 8  Line 8 
8  The form of an entry should be:  The form of an entry should be:
9    
10  Name:  Name:
11  Date:  Date: yyyy/mm/dd
12  Tag: <post-commit CVS tag>  Tag: <post-commit CVS tag>
13  Description:  Description:
14    
15  ----------------------------------------------------------------------  ----------------------------------------------------------------------
16  Name: Matthias Blume  Name: Matthias Blume
17    Date: 2002/03/25 22:06:00 EST
18    Tag: blume-20020325-cprotobug
19    Description:
20    
21    Fixed a bug in cproto (c prototype decoder).
22    
23    ----------------------------------------------------------------------
24    Name: Matthias Blume
25    Date: 2002/03/25 16:00:00 EST
26    Tag: blume-20020325-raw-primops
27    Description:
28    
29    I did some cleanup to Allen's new primop code and
30    replaced yesterday's bootfiles with new ones.
31    (But they are stored in the same place.)
32    
33    ----------------------------------------------------------------------
34    Name: Matthias Blume
35    Date: 2002/03/24 22:40:00 EST
36    Tag: blume-20020324-bootfiles
37    Description:
38    
39    Made the bootfiles that Allen asked for.
40    
41    ----------------------------------------------------------------------
42    Name: Allen Leung
43    Date: 2002/03/23 15:50:00 EST
44    Tag: leunga-20020323-flint-cps-rcc-primops
45    Description:
46    
47      1. Changes to FLINT primops:
48    
49        (* make a call to a C-function;
50         * The primop carries C function prototype information and specifies
51         * which of its (ML-) arguments are floating point. C prototype
52         * information is for use by the backend, ML information is for
53         * use by the CPS converter. *)
54      | RAW_CCALL of { c_proto: CTypes.c_proto,
55                       ml_args: ccall_type list,
56                       ml_res_opt: ccall_type option,
57                       reentrant : bool
58                     } option
59       (* Allocate uninitialized storage on the heap.
60        * The record is meant to hold short-lived C objects, i.e., they
61        * are not ML pointers.  With the tag, the representation is
62        * the same as RECORD with tag tag_raw32 (sz=4), or tag_fblock (sz=8)
63        *)
64      | RAW_RECORD of {tag:bool,sz:int}
65      and ccall_type = CCALL_INT32 | CCALL_REAL64 | CCALL_ML_PTR
66    
67      2.  These CPS primops are now overloaded:
68    
69           rawload of {kind:numkind}
70           rawstore of {kind:numkind}
71    
72          The one argument form is:
73    
74             rawload {kind} address
75    
76          The two argument form is:
77    
78             rawload {kind} [ml object, byte-offset]
79    
80      3. RAW_CCALL/RCC now takes two extra arguments:
81    
82         a. The first is whether the C call is reentrant, i.e., whether
83            ML state should be saved and restored.
84         b. The second argument is a string argument specifying the name of
85            library and the C function.
86    
87         These things are currently not handled in the code generator, yet.
88    
89      4. In CProto,
90    
91         An encoding type of "bool" means "ml object" and is mapped into
92         C prototype of PTR.  Note that "bool" is different than "string",
93         even though "string" is also mapped into PTR, because "bool"
94         is assigned an CPS type of BOGt, while "string" is assigned INT32t.
95    
96      5. Pickler/unpicker
97    
98         Changed to handle RAW_RECORD and newest RAW_CCALL
99    
100      6. MLRiscGen,
101    
102         1. Changed to handle the new rawload/rawstore/rawrecord operators.
103         2. Code for handling C Calls has been moved to a new module CPSCCalls,
104            in the file CodeGen/cpscompile/cps-c-calls.sml
105    
106      7. Added the conditional move operator
107    
108             condmove of branch
109    
110         to cps.  Generation of this is still buggy so it is currently
111         disabled.
112    
113    ----------------------------------------------------------------------
114    Name: Lal George
115    Date: 2002/03/22 14:18:25 EST
116    Tag: george-20020322-cps-branch-prob
117    Description:
118    
119    Implemented the Ball-Larus branch prediction-heuristics, and
120    incorporated graphical viewers for control flow graphs.
121    
122    Ball-Larus Heuristics:
123    ---------------------
124    See the file compiler/CodeGen/cpscompile/cpsBranchProb.sml.
125    
126    By design it uses the Dempster-Shafer theory for combining
127    probabilities.  For example, in the function:
128    
129        fun f(n,acc) = if n = 0 then acc else f(n-1, n*acc)
130    
131    the ball-larus heuristics predicts that the n=0 is unlikely
132    (OH-heuristic), and the 'then' branch is unlikely because of the
133    RH-heuristic -- giving the 'then' branch an even lower combined
134    probability using the Dempster-Shafer theory.
135    
136    Finally, John Reppy's loop analysis in MLRISC, further lowers the
137    probability of the 'then' branch because of the loop in the else
138    branch.
139    
140    
141    Graphical Viewing:
142    ------------------
143    I merely plugged in Allen's graphical viewers into the compiler. The
144    additional code is not much. At the top level, saying:
145    
146            Control.MLRISC.getFlag "cfg-graphical-view" := true;
147    
148    will display the graphical view of the control flow graph just before
149    back-patching.  daVinci must be in your path for this to work. If
150    daVinci is not available, then the default viewer can be changed
151    using:
152    
153            Control.MLRISC.getString "viewer"
154    
155    which can be set to "dot" or "vcg" for the corresponding viewers. Of
156    course, these viewers must be in your path.
157    
158    The above will display the compilation unit at the level of clusters,
159    many of which are small, boring, and un-interesting. Also setting:
160    
161            Control.MLRISC.getInt "cfg-graphical-view_size"
162    
163    will display clusters that are larger than the value set by the above.
164    
165    
166    ----------------------------------------------------------------------
167    Name: Matthias Blume
168    Date: 2002/03/21 22:20:00 EST
169    Tag: blume-20020321-kmp-bugfix
170    Description:
171    
172    Changed the interface to the KMP routine in PreString and fixed
173    a minor bug in one place where it was used.
174    
175    ----------------------------------------------------------------------
176    Name: Allen Leung
177    Date: 2002/03/21 20:30:00 EST
178    Tag: leunga-20020321-cfg
179    Description:
180    
181      Fixed a potential problem in cfg edge splitting.
182    
183    ----------------------------------------------------------------------
184    Name: Allen Leung
185    Date: 2002/03/21 17:15:00 EST
186    Tag: leunga-20020321-x86-fp-cfg
187    Description:
188    
189      1. Recoded the buggy parts of x86-fp.
190    
191         a. All the block reordering code has been removed.
192            We now depend on the block placement phases to do this work.
193    
194         b. Critical edge splitting code has been simplified and moved into the
195            CFG modules, as where they belong.
196    
197         Both of these were quite buggy and complex.  The code is now much, much
198         simpler.
199    
200      2. X86 backend.
201    
202         a. Added instructions for 64-bit support.  Instruction selection for
203            64-bit has not been committed, however, since that
204            requires changes to MLTREE which haven't been approved by
205            Lal and John.
206    
207         b. Added support for FUCOMI and FUCOMIP when generating code for
208            PentiumPro and above.  We only generate these instructions in
209            the fast-fp mode.
210    
211         c. Added cases for JP and JNP in X86FreqProps.
212    
213      3. CFG
214    
215         CFG now has a bunch of methods for edge splitting and merging.
216    
217      4. Machine description.
218    
219         John's simplification of MLTREE_BASIS.fcond broke a few machine
220         description things:
221    
222         rtl-build.{sig,sml} and hppa.mdl fixed.
223    
224         NOTE: the machine description stuff in the repository is still broken.
225               Again, I can't put my fixes in because that involves
226               changes to MLTREE.
227    
228    ----------------------------------------------------------------------
229    Name: Matthias Blume
230    Date: 2002/03/20 15:55:00 EST
231    Tag: blume-20020320-kmp
232    Description:
233    
234    Implemented Knuth-Morris-Pratt string matching in PreString and used
235    it for String.isSubstring, Substring.isSubstring, and
236    Substring.position.
237    
238    (Might need some stress-testing.  Simple examples worked fine.)
239    
240    ----------------------------------------------------------------------
241    Name: Matthias Blume
242    Date: 2002/03/19 16:37:00 EST
243    Tag: blume-20020319-witnesses
244    Description:
245    
246    Added a structure C.W and functions convert/Ptr.convert to ml-nlffi-lib.
247    
248    This implements a generic mechanism for changing constness qualifiers
249    anywhere within big C types without resorting to outright "casts".
250    (So far, functions such as C.rw/C.ro or C.Ptr.rw/C.Ptr.ro only let you
251    modify the constness at the outermost level.)
252    The implementation of "convert" is based on the idea of "witness"
253    values -- values that are not used by the operation but whose types
254    "testify" to their applicability.  On the implementation side, "convert"
255    is simply a projection (returning its second curried argument).  With
256    cross-module inlining, it should not result in any machine code being
257    generated.
258    
259    ----------------------------------------------------------------------
260    Name: Matthias Blume
261    Date: 2002/03/15 16:40:00 EST
262    Tag: blume-20020315-basis
263    Description:
264    
265    Provided (preliminary?) implementations for
266    
267      {String,Substring}.{concatWith,isSuffix,isSubstring}
268    
269    and
270    
271      Substring.full
272    
273    Those are in the Basis spec but they were missing in SML/NJ.
274    
275    ----------------------------------------------------------------------
276    Name: Matthias Blume
277    Date: 2002/03/14 21:30:00 EST
278    Tag: blume-20020314-controls
279    Description:
280    
281    Controls:
282    ---------
283    
284    1. Factored out the recently-added Controls : CONTROLS stuff and put
285       it into its own library $/controls-lib.cm.  The source tree for
286       this is under src/smlnj-lib/Controls.
287    
288    2. Changed the names of types and functions in this interface, so they
289       make a bit more "sense":
290    
291          module -> registry
292          'a registry -> 'a group
293    
294    3. The interface now deals in ref cells only.  The getter/setter interface
295       is (mostly) gone.
296    
297    4. Added a function that lets one register an already-existing ref cell.
298    
299    5. Made the corresponding modifications to the rest of the code so that
300       everything compiles again.
301    
302    6. Changed the implementation of Controls.MLRISC back to something closer
303       to the original.  In particular, this module (and therefore MLRISC)
304       does not depend on Controls.  There now is some link-time code in
305       int-sys.sml that registers the MLRISC controls with the Controls
306       module.
307    
308    CM:
309    ---
310    
311      * One can now specify the lambda-split aggressiveness in init.cmi.
312    
313    ----------------------------------------------------------------------
314    Name: Allen Leung
315    Date: 2002/03/13 17:30:00 EST
316    Tag: leunga-20020313-x86-fp-unary
317    Description:
318    
319    Bug fix for:
320    
321    > leunga@weaselbane:~/Yale/tmp/sml-dist{21} bin/sml
322    > Standard ML of New Jersey v110.39.1 [FLINT v1.5], March 08, 2002
323    > - fun f(x,(y,z)) = Real.~ y;
324    > [autoloading]
325    > [autoloading done]
326    >       fchsl   (%eax), 184(%esp)
327    > Error: MLRisc bug: X86MCEmitter.emitInstr
328    >
329    > uncaught exception Error
330    >   raised at: ../MLRISC/control/mlriscErrormsg.sml:16.14-16.19
331    
332    The problem was that the code generator did not generate any fp registers
333    in this case, and the ra didn't know that it needed to run the X86FP phase to
334    translate the pseudo fp instruction.   This only happened with unary fp
335    operators in certain situations.
336    
337    ----------------------------------------------------------------------
338    Name: Matthias Blume
339    Date: 2002/03/13 14:00:00 EST
340    Tag: blume-20020313-overload-etc
341    Description:
342    
343    1. Added _overload as a synonym for overload for backward compatibility.
344       (Control.overloadKW must be true for either version to be accepted.)
345    
346    2. Fixed bug in install script that caused more things to be installed
347       than what was requested in config/targets.
348    
349    3. Made CM aware of the (_)overload construct so that autoloading
350       works.
351    
352    ----------------------------------------------------------------------
353    Name: Matthias Blume
354    Date: 2002/03/12 22:03:00 EST
355    Tag: blume-20020312-url
356    Description:
357    
358    Forgot to update BOOT and srcarchiveurl.
359    
360    ----------------------------------------------------------------------
361    Name: Matthias Blume
362    Date: 2002/03/12 17:30:00 EST
363    Tag: blume-20020312-version110392
364    Description:
365    
366    Yet another version number bump (because of small changes to the
367    binfile format).  Version number is now 110.39.2.  NEW BOOTFILES!
368    
369    Changes:
370    
371      The new pid generation scheme described a few weeks ago was overly
372      complicated.  I implemented a new mechanism that is simpler and
373      provides a bit more "stability":  Once CM has seen a compilation
374      unit, it keeps its identity constant (as long as you do not delete
375      those crucial CM/GUID/* files).  This means that when you change
376      an interface, compile, then go back to the old interface, and
377      compile again, you arrive at the original pid.
378    
379      There now also is a mechanism that instructs CM to use the plain
380      environment hash as a module's pid (effectively making its GUID
381      the empty string).  For this, "noguid" must be specified as an
382      option to the .sml file in question within its .cm file.
383      This is most useful for code that is being generated by tools such
384      as ml-nlffigen (because during development programmers tend to
385      erase the tool's entire output directory tree including CM's cached
386      GUIDs).  "noguid" is somewhat dangerous (since it can be used to locally
387      revert to the old, broken behavior of SML/NJ, but in specific cases
388      where there is no danger of interface confusion, its use is ok
389      (I think).
390    
391      ml-nlffigen by default generates "noguid" annotations.  They can be
392      turned off by specifying -guid in its command line.
393    
394    ----------------------------------------------------------------------
395    Name: Lal George
396    Date: 2002/03/12 12 14:42:36 EST
397    Tag: george-20020312-frequency-computation
398    Description:
399    
400    Integrated jump chaining and static block frequency into the
401    compiler. More details and numbers later.
402    
403    ----------------------------------------------------------------------
404    Name: Lal George
405    Date: 2002/03/11 11 22:38:53 EST
406    Tag: george-20020311-jump-chain-elim
407    Description:
408    
409    Tested the jump chain elimination on all architectures (except the
410    hppa).  This is on by default right now and is profitable for the
411    alpha and x86, however, it may not be profitable for the sparc and ppc
412    when compiling the compiler.
413    
414    The gc test will typically jump to a label at the end of the cluster,
415    where there is another jump to an external cluster containing the actual
416    code to invoke gc. This is to allow factoring of common gc invocation
417    sequences. That is to say, we generate:
418    
419            f:
420               testgc
421               ja   L1      % jump if above to L1
422    
423            L1:
424               jmp L2
425    
426    
427    After jump chain elimination the 'ja L1' instructions is converted to
428    'ja L2'. On the sparc and ppc, many of the 'ja L2' instructions may end
429    up being implemented in their long form (if L2 is far away) using:
430    
431            jbe     L3      % jump if below or equal to L3
432            jmp     L2
433         L3:
434            ...
435    
436    
437    For large compilation units L2  may be far away.
438    
439    
440    ----------------------------------------------------------------------
441    Name: Matthias Blume
442    Date: 2002/03/11 13:30:00 EST
443    Tag: blume-20020311-mltreeeval
444    Description:
445    
446    A functor parameter was missing.
447    
448    ----------------------------------------------------------------------
449    Name: Allen Leung
450    Date: 2002/03/11 10:30:00 EST
451    Tag: leunga-20020311-runtime-string0
452    Description:
453    
454       The representation of the empty string now points to a
455    legal null terminated C string instead of unit.  It is now possible
456    to convert an ML string into C string with InlineT.CharVector.getData.
457    This compiles into one single machine instruction.
458    
459    ----------------------------------------------------------------------
460    Name: Allen Leung
461    Date: 2002/03/10 23:55:00 EST
462    Tag: leunga-20020310-x86-call
463    Description:
464    
465       Added machine generation for CALL instruction (relative displacement mode)
466    
467    ----------------------------------------------------------------------
468    Name: Matthias Blume
469    Date: 2002/03/08 16:05:00
470    Tag: blume-20020308-entrypoints
471    Description:
472    
473    Version number bumped to 110.39.1.  NEW BOOTFILES!
474    
475    Entrypoints: non-zero offset into a code object where execution should begin.
476    
477    - Added the notion of an entrypoint to CodeObj.
478    - Added reading/writing of entrypoint info to Binfile.
479    - Made runtime system bootloader aware of entrypoints.
480    - Use the address of the label of the first function given to mlriscGen
481      as the entrypoint.  This address is currently always 0, but it will
482      not be 0 once we turn on block placement.
483    - Removed the linkage cluster code (which was The Other Way(tm) of dealing
484      with entry points) from mlriscGen.
485    
486    ----------------------------------------------------------------------
487    Name: Allen Leung
488    Date: 2002/03/07 20:45:00 EST
489    Tag: leunga-20020307-x86-cmov
490    Description:
491    
492       Bug fixes for CMOVcc on x86.
493    
494       1. Added machine code generation for CMOVcc
495       2. CMOVcc is now generated in preference over SETcc on PentiumPro or above.
496       3. CMOVcc cannot have an immediate operand as argument.
497    
498    ----------------------------------------------------------------------
499    Name: Matthias Blume
500    Date: 2002/03/07 16:15:00 EST
501    Tag: blume-20020307-controls
502    Description:
503    
504    This is a very large but mostly boring patch which makes (almost)
505    every tuneable compiler knob (i.e., pretty much everything under
506    Control.* plus a few other things) configurable via both the command
507    line and environment variables in the style CM did its configuration
508    until now.
509    
510    Try starting sml with '-h' (or, if you are brave, '-H')
511    
512    To this end, I added a structure Controls : CONTROLS to smlnj-lib.cm which
513    implements the underlying generic mechanism.
514    
515    The interface to some of the existing such facilities has changed somewhat.
516    For example, the MLRiscControl module now provides mkFoo instead of getFoo.
517    (The getFoo interface is still there for backward-compatibility, but its
518    use is deprecated.)
519    
520    The ml-build script passes -Cxxx=yyy command-line arguments through so
521    that one can now twiddle the compiler settings when using this "batch"
522    compiler.
523    
524    TODO items:
525    
526    We should go through and throw out all controls that are no longer
527    connected to anything.  Moreover, we should go through and provide
528    meaningful (and correct!) documentation strings for those controls
529    that still are connected.
530    
531    Currently, multiple calls to Controls.new are accepted (only the first
532    has any effect).  Eventually we should make sure that every control
533    is being made (via Controls.new) exactly once.  Future access can then
534    be done using Controls.acc.
535    
536    Finally, it would probably be a good idea to use the getter-setter
537    interface to controls rather than ref cells.  For the time being, both
538    styles are provided by the Controls module, but getter-setter pairs are
539    better if thread-safety is of any concern because they can be wrapped.
540    
541    *****************************************
542    
543    One bug fix: The function blockPlacement in three of the MLRISC
544    backpatch files used to be hard-wired to one of two possibilities at
545    link time (according to the value of the placementFlag).  But (I
546    think) it should rather sense the flag every time.
547    
548    *****************************************
549    
550    Other assorted changes (by other people who did not supply a HISTORY entry):
551    
552    1. the cross-module inliner now works much better (Monnier)
553    2. representation of weights, frequencies, and probabilities in MLRISC
554       changed in preparation of using those for weighted block placement
555       (Reppy, George)
556    
557    ----------------------------------------------------------------------
558    Name: Lal George
559    Date: 2002/03/07 14:44:24 EST 2002
560    Tag: george-20020307-weighted-block-placement
561    
562    Tested the weighted block placement optimization on all architectures
563    (except the hppa) using AMPL to generate the block and edge frequencies.
564    Changes were required in the machine properties to correctly
565    categorize trap instructions. There is an MLRISC flag
566    "weighted-block-placement" that can be used to enable weighted block
567    placement, but this will be ineffective without block/edge
568    frequencies (coming soon).
569    
570    
571    ----------------------------------------------------------------------
572    Name: Lal George
573    Date: 2002/03/05 17:24:48 EST
574    Tag: george-20020305-linkage-cluster
575    
576    In order to support the block placement optimization, a new cluster
577    is generated as the very first cluster (called the linkage cluster).
578    It contains a single jump to the 'real' entry point for the compilation
579    unit. Block placement has no effect on the linkage cluster itself, but
580    all the other clusters  have full freedom in the manner in which they
581    reorder blocks or functions.
582    
583    On the x86 the typical linkage code that is generated is:
584       ----------------------
585            .align 2
586       L0:
587            addl    $L1-L0, 72(%esp)
588            jmp     L1
589    
590    
591            .align  2
592       L1:
593       ----------------------
594    
595    72(%esp) is the memory location for the stdlink register. This
596    must contain the address of the CPS function being called. In the
597    above example, it contains the address of  L0; before
598    calling L1 (the real entry point for the compilation unit), it
599    must contain the address for L1, and hence
600    
601            addl $L1-L0, 72(%esp)
602    
603    I have tested this on all architectures except the hppa.The increase
604    in code size is of course negligible
605    
606    ----------------------------------------------------------------------
607    Name: Allen Leung
608    Date: 2002/03/03 13:20:00 EST
609    Tag: leunga-20020303-mlrisc-tools
610    
611      Added #[ ... ] expressions to mlrisc tools
612    
613    ----------------------------------------------------------------------
614    Name: Matthias Blume
615    Date: 2002/02/27 12:29:00 EST
616    Tag: blume-20020227-cdebug
617    Description:
618    
619    - made types in structure C and C_Debug to be equal
620    - got rid of code duplication (c-int.sml vs. c-int-debug.sml)
621    - there no longer is a C_Int_Debug (C_Debug is directly derived from C)
622    
623    ----------------------------------------------------------------------
624    Name: Matthias Blume
625    Date: 2002/02/26 12:00:00 EST
626    Tag: blume-20020226-ffi
627    Description:
628    
629    1. Fixed a minor bug in CM's "noweb" tool:
630       If numbering is turned off, then truly don't number (i.e., do not
631       supply the -L option to noweb).  The previous behavior was to supply
632       -L'' -- which caused noweb to use the "default" line numbering scheme.
633       Thanks to Chris Richards for pointing this out (and supplying the fix).
634    
635    2. Once again, I reworked some aspects of the FFI:
636    
637       A. The incomplete/complete type business:
638    
639       - Signatures POINTER_TO_INCOMPLETE_TYPE and accompanying functors are
640         gone!
641       - ML types representing an incomplete type are now *equal* to
642         ML types representing their corresponding complete types (just like
643         in C).  This is still safe because ml-nlffigen will not generate
644         RTTI for incomplete types, nor will it generate functions that
645         require access to such RTTI.   But when ML code generated from both
646         incomplete and complete versions of the C type meet, the ML types
647         are trivially interoperable.
648    
649         NOTE:  These changes restore the full generality of the translation
650         (which was previously lost when I eliminated functorization)!
651    
652       B. Enum types:
653    
654       - Structure C now has a type constructor "enum" that is similar to
655         how the "su" constructor works.  However, "enum" is not a phantom
656         type because each "T enum" has values (and is isomorphic to
657         MLRep.Signed.int).
658       - There are generic access operations for enum objects (using
659         MLRep.Signed.int).
660       - ml-nlffigen will generate a structure E_foo for each "enum foo".
661         * The structure contains the definition of type "mlrep" (the ML-side
662         representation type of the enum).  Normally, mlrep is the same
663         as "MLRep.Signed.int", but if ml-nlffigen was invoked with "-ec",
664         then mlrep will be defined as a datatype -- thus facilitating
665         pattern matching on mlrep values.
666         ("-ec" will be suppressed if there are duplicate values in an
667          enumeration.)
668         * Constructors ("-ec") or values (no "-ec") e_xxx of type mlrep
669         will be generated for each C enum constant xxx.
670         * Conversion functions m2i and i2m convert between mlrep and
671         MLRep.Signed.int.  (Without "-ec", these functions are identities.)
672         * Coversion functions c and ml convert between mlrep and "tag enum".
673         * Access functions (get/set) fetch and store mlrep values.
674       - By default (unless ml-nlffigen was invoked with "-nocollect"), unnamed
675         enumerations are merged into one single enumeration represented by
676         structure E_'.
677    
678    ----------------------------------------------------------------------
679    Name: Allen Leung
680    Date: 2002/02/25 04:45:00 EST
681    Tag: leunga-20020225-cps-spill
682    
683    This is a new implementation of the CPS spill phase.
684    The new phase is in the new file compiler/CodeGen/cpscompile/spill-new.sml
685    In case of problems, replace it with the old file spill.sml
686    
687    The current compiler runs into some serious performance problems when
688    constructing a large record.  This can happen when we try to compile a
689    structure with many items.  Even a very simple structure like the following
690    makes the compiler slow down.
691    
692        structure Foo = struct
693           val x_1 = 0w1 : Word32.int
694           val x_2 = 0w2 : Word32.int
695           val x_3 = 0w3 : Word32.int
696           ...
697           val x_N = 0wN : Word32.int
698        end
699    
700    The following table shows the compile time, from N=1000 to N=4000,
701    with the old compiler:
702    
703    N
704    1000   CPS 100 spill                           0.04u  0.00s  0.00g
705           MLRISC ra                               0.06u  0.00s  0.05g
706              (spills = 0 reloads = 0)
707           TOTAL                                   0.63u  0.07s  0.21g
708    
709    1100   CPS 100 spill                           8.25u  0.32s  0.64g
710           MLRISC ra                               5.68u  0.59s  3.93g
711              (spills = 0 reloads = 0)
712           TOTAL                                   14.71u  0.99s  4.81g
713    
714    1500   CPS 100 spill                           58.55u  2.34s  1.74g
715           MLRISC ra                               5.54u  0.65s  3.91g
716              (spills = 543 reloads = 1082)
717           TOTAL                                   65.40u  3.13s  6.00g
718    
719    2000   CPS 100 spill                           126.69u  4.84s  3.08g
720           MLRISC ra                               0.80u  0.10s  0.55g
721              (spills = 42 reloads = 84)
722           TOTAL                                   129.42u  5.10s  4.13g
723    
724    3000   CPS 100 spill                           675.59u  19.03s  11.64g
725           MLRISC ra                               2.69u  0.27s  1.38g
726              (spills = 62 reloads = 124)
727           TOTAL                                   682.48u  19.61s  13.99g
728    
729    4000   CPS 100 spill                           2362.82u  56.28s  43.60g
730           MLRISC ra                               4.96u  0.27s  2.72g
731              (spills = 85 reloads = 170)
732           TOTAL                                   2375.26u  57.21s  48.00g
733    
734    As you can see the old cps spill module suffers from some serious
735    performance problem.  But since I cannot decipher the old code fully,
736    instead of patching the problems up, I'm reimplementing it
737    with a different algorithm.  The new code is more modular,
738    smaller when compiled, and substantially faster
739    (O(n log n) time and O(n) space).  Timing of the new spill module:
740    
741    4000  CPS 100 spill                           0.02u  0.00s  0.00g
742          MLRISC ra                               0.25u  0.02s  0.15g
743             (spills=1 reloads=3)
744          TOTAL                                   7.74u  0.34s  1.62g
745    
746    Implementation details:
747    
748    As far as I can tell, the purpose of the CPS spill module is to make sure the
749    number of live variables at any program point (the bandwidth)
750    does not exceed a certain limit, which is determined by the
751    size of the spill area.
752    
753    When the bandwidth is too large, we decrease the register pressure by
754    packing live variables into spill records.  How we achieve this is
755    completely different than what we did in the old code.
756    
757    First, there is something about the MLRiscGen code generator
758    that we should be aware of:
759    
760    o MLRiscGen performs code motion!
761    
762       In particular, it will move floating point computations and
763       address computations involving only the heap pointer to
764       their use sites (if there is only a single use).
765       What this means is that if we have a CPS record construction
766       statement
767    
768           RECORD(k,vl,w,e)
769    
770       we should never count the new record address w as live if w
771       has only one use (which is often the case).
772    
773       We should do something similar to floating point, but the transformation
774       there is much more complex, so I won't deal with that.
775    
776    Secondly, there are now two new cps primops at our disposal:
777    
778     1. rawrecord of record_kind option
779        This pure operator allocates some uninitialized storage from the heap.
780        There are two forms:
781    
782         rawrecord NONE [INT n]  allocates a tagless record of length n
783         rawrecord (SOME rk) [INT n] allocates a tagged record of length n
784                                     and initializes the tag.
785    
786     2. rawupdate of cty
787          rawupdate cty (v,i,x)
788          Assigns to x to the ith component of record v.
789          The storelist is not updated.
790    
791    We use these new primops for both spilling and increment record construction.
792    
793     1. Spilling.
794    
795        This is implemented with a linear scan algorithm (but generalized
796        to trees).  The algorithm will create a single spill record at the
797        beginning of the cps function and use rawupdate to spill to it,
798        and SELECT or SELp to reload from it.  So both spills and reloads
799        are fine-grain operations.  In contrast, in the old algorithm
800        "spills" have to be bundled together in records.
801    
802        Ideally, we should sink the spill record construction to where
803        it is needed.  We can even split the spill record into multiple ones
804        at the places where they are needed.  But CPS is not a good
805        representation for global code motion, so I'll keep it simple and
806        am not attempting this.
807    
808     2. Incremental record construction (aka record splitting).
809    
810        Long records with many component values which are simulatenously live
811        (recall that single use record addresses are not considered to
812         be live) are constructed with rawrecord and rawupdate.
813        We allocate space on the heap with rawrecord first, then gradually
814        fill it in with rawupdate.  This is the technique suggested to me
815        by Matthias.
816    
817        Some restrictions on when this is applicable:
818        1. It is not a VECTOR record.  The code generator currently does not handle
819           this case. VECTOR record uses double indirection like arrays.
820        2. All the record component values are defined in the same "basic block"
821           as the record constructor.  This is to prevent speculative
822           record construction.
823    
824    ----------------------------------------------------------------------
825    Name: Allen Leung
826    Date: 2002/02/22 01:02:00 EST
827    Tag: leunga-20020222-mlrisc-tools
828    
829    Minor bug fixes in the parser and rewriter
830    
831    ----------------------------------------------------------------------
832    Name: Allen Leung
833    Date: 2002/02/21 20:20:00 EST
834    Tag: leunga-20020221-peephole
835    
836    Regenerated the peephole files.  Some contained typos in the specification
837    and some didn't compile because of pretty printing bugs in the old version
838    of 'nowhere'.
839    
840    ----------------------------------------------------------------------
841    Name: Allen Leung
842    Date: 2002/02/19 20:20:00 EST
843    Tag: leunga-20020219-mlrisc-tools
844    Description:
845    
846       Minor bug fixes to the mlrisc-tools library:
847    
848       1.  Fixed up parsing colon suffixed keywords
849       2.  Added the ability to shut the error messages up
850       3.  Reimplemented the pretty printer and fixed up/improved
851           the pretty printing of handle and -> types.
852       4.  Fixed up generation of literal symbols in the nowhere tool.
853       5.  Added some SML keywords to to sml.sty
854    
855    ----------------------------------------------------------------------
856    Name: Matthias Blume
857    Date: 2002/02/19 16:20:00 EST
858    Tag: blume-20020219-cmffi
859    Description:
860    
861    A wild mix of changes, some minor, some major:
862    
863    * All C FFI-related libraries are now anchored under $c:
864        $/c.cm      --> $c/c.cm
865        $/c-int.cm  --> $c/internals/c-int.cm
866        $/memory.cm --> $c/memory/memory.cm
867    
868    * "make" tool (in CM) now treats its argument pathname slightly
869      differently:
870        1. If the native expansion is an absolute name, then before invoking
871           the "make" command on it, CM will apply OS.Path.mkRelative
872           (with relativeTo = OS.FileSys.getDir()) to it.
873        2. The argument will be passed through to subsequent phases of CM
874           processing without "going native".  In particular, if the argument
875           was an anchored path, then "make" will not lose track of that anchor.
876    
877    * Compiler backends now "know" their respective C calling conventions
878      instead of having to be told about it by ml-nlffigen.  This relieves
879      ml-nlffigen from one of its burdens.
880    
881    * The X86Backend has been split into X86CCallBackend and X86StdCallBackend.
882    
883    * Export C_DEBUG and C_Debug from $c/c.cm.
884    
885    * C type encoding in ml-nlffi-lib has been improved to model the conceptual
886      subtyping relationship between incomplete pointers and their complete
887      counterparts.  For this, ('t, 'c) ptr has been changed to 'o ptr --
888      with the convention of instantiating 'o with ('t, 'c) obj whenever
889      the pointer target type is complete.  In the incomplete case, 'o
890      will be instantiated with some "'c iobj" -- a type obtained by
891      using one of the functors PointerToIncompleteType or PointerToCompleteType.
892    
893      Operations that work on both incomplete and complete pointer types are
894      typed as taking an 'o ptr while operations that require the target to
895      be known are typed as taking some ('t, 'c) obj ptr.
896    
897      voidptr is now a bit "more concrete", namely "type voidptr = void ptr'"
898      where void is an eqtype without any values.  This makes it possible
899      to work on voidptr values using functions meant to operate on light
900      incomplete pointers.
901    
902    * As a result of the above, signature POINTER_TO_INCOMPLETE_TYPE has
903      been vastly simplified.
904    
905    ----------------------------------------------------------------------
906    Name: Matthias Blume
907    Date: 2002/02/19 10:48:00 EST
908    Tag: blume-20020219-pqfix
909    Description:
910    
911    Applied Chris Okasaki's bug fix for priority queues.
912    
913    ----------------------------------------------------------------------
914    Name: Matthias Blume
915    Date: 2002/02/15 17:05:00
916    Tag: Release_110_39
917    Description:
918    
919    Last-minute retagging is becoming a tradition... :-(
920    
921    This is the working release 110.39.
922    
923    ----------------------------------------------------------------------
924    Name: Matthias Blume
925    Date: 2002/02/15 16:00:00 EST
926    Tag: Release_110_39-orig
927    Description:
928    
929    Working release 110.39.  New bootfiles.
930    
931    (Update: There was a small bug in the installer so it wouldn't work
932    with all shells.  So I retagged. -Matthias)
933    
934    ----------------------------------------------------------------------
935    Name: Matthias Blume
936    Date: 2002/02/15 14:17:00 EST
937    Tag: blume-20020215-showbindings
938    Description:
939    
940    Added EnvRef.listBoundSymbols and CM.State.showBindings.  Especially
941    the latter can be useful for exploring what bindings are available at
942    the interactive prompt.  (The first function returns only the list
943    of symbols that are really bound, the second prints those but also the
944    ones that CM's autoloading mechanism knows about.)
945    
946    ----------------------------------------------------------------------
947    Name: Matthias Blume
948    Date: 2002/02/15 12:08:00 EST
949    Tag: blume-20020215-iptrs
950    Description:
951    
952    Two improvements to ml-nlffigen:
953    
954      1. Write files only if they do not exist or if their current contents
955         do not coincide with what's being written.  (That is, avoid messing
956         with the time stamps unless absolutely necessary.)
957    
958      2. Implement a "repository" mechanism for generated files related
959         to "incomplete pointer types".   See the README file for details.
960    
961    ----------------------------------------------------------------------
962    Name: Matthias Blume
963    Date: 2002/02/14 11:50:00 EST
964    Tag: blume-20020214-quote
965    Description:
966    
967    Added a type 't t_' to tag.sml (in ml-nlffi-lib.cm).  This is required
968    because of the new and improved tag generation scheme.  (Thanks to Allen
969    Leung for pointing it out.)
970    
971    ----------------------------------------------------------------------
972    Name: Lal George
973    Date: 2002/02/14 09:55:27 EST 2002
974    Tag: george-20020214-isabelle-bug
975    Description:
976    
977    Fixed the MLRISC bug sent by Markus Wenzel regarding the compilation
978    of Isabelle on the x86.
979    
980    From Allen:
981    -----------
982     I've found the problem:
983    
984         in ra-core.sml, I use the counter "blocked" to keep track of the
985         true number of elements in the freeze queue.  When the counter goes
986         to zero, I skip examining the queue.  But I've messed up the
987         bookkeeping in combine():
988    
989             else ();
990             case !ucol of
991               PSEUDO => (if !cntv > 0 then
992                     (if !cntu > 0 then blocked := !blocked - 1 else ();
993                                        ^^^^^^^^^^^^^^^^^^^^^^^
994                      moveu := mergeMoveList(!movev, !moveu)
995                     )
996                  else ();
997    
998         combine() is called to coalesce two nodes u and v.
999         I think I was thinking that if the move counts of u and v are both
1000         greater than zero then after they are coalesced then one node is
1001         removed from the freeze queue.  Apparently I was thinking that
1002         both u and v are of low degree, but that's clearly not necessarily true.
1003    
1004    
1005    02/12/2002:
1006        Here's the patch.  HOL now compiles.
1007    
1008        I don't know how this impact on performance (compile
1009        time or runtime).  This bug caused the RA (especially on the x86)
1010        to go thru the potential spill phase when there are still nodes on the
1011        freeze queue.
1012    
1013    
1014    
1015    
1016    ----------------------------------------------------------------------
1017    Name: Matthias Blume
1018    Date: 2002/02/13 22:40:00 EST
1019    Tag: blume-20020213-fptr-rtti
1020    Description:
1021    
1022    Fixed a bug in ml-nlffigen that was introduced with one of the previous
1023    updates.
1024    
1025    ----------------------------------------------------------------------
1026    Name: Matthias Blume
1027    Date: 2002/02/13 16:41:00 EST
1028    Tag: blume-20020213-cmlpq
1029    Description:
1030    
1031    Added new priority queue export symbols (which have just been added to
1032    smlnj-lib.cm) to CML's version of smlnj-lib.cm.  (Otherwise CML would
1033    not compile and the installer would choke.)
1034    
1035    ----------------------------------------------------------------------
1036    Name: Matthias Blume
1037    Date: 2002/02/13 16:15:00 EST
1038    Tag: blume-20020213-various
1039    Description:
1040    
1041    1. More tweaks to ml-nlffigen:
1042    
1043       - better internal datastructures (resulting in slight speedup)
1044       - "-match" option requires exact match
1045       - "localized" gensym counters (untagged structs/unions nested within
1046         other structs/unions or within typedefs get a fresh counter; their
1047         tag will be prefixed by a concatenation of their parents' tags)
1048       - bug fixes (related to calculation of transitive closure of types
1049         to be included in the output)
1050    
1051    2. Minor Basis updates:
1052    
1053       - added implementations for List.collate and Option.app
1054    
1055    ----------------------------------------------------------------------
1056    Name: Matthias Blume
1057    Date: 2002/02/11 15:55:00 EST
1058    Tag: blume-20020211-gensym
1059    Description:
1060    
1061    Added a "-gensym" option to command line of ml-nlffigen.  This can be
1062    used to specify a "stem" -- a string that is inserted in all "gensym'd"
1063    names (ML structure names that correspond to unnamed C structs, unions,
1064    and enums), so that separate runs of ml-nlffigen do not clash.
1065    
1066    ----------------------------------------------------------------------
1067    Name: Matthias Blume
1068    Date: 2002/02/11 12:05:00 EST
1069    Tag: blume-20020211-gensml
1070    Description:
1071    
1072    A quick fix for a problem with GenSML (in the pgraph-util library):
1073    Make generation of toplevel "local" optional.  (Strictly speaking,
1074    signature definitions within "local" are not legal SML.)
1075    
1076    Other than that: updates to INSTALL and cm/TODO.
1077    
1078    ----------------------------------------------------------------------
1079    Name: Matthias Blume
1080    Date: 2002/02/08 15:00:00 EST
1081    Tag: blume-20020208-uniquepid
1082    Description:
1083    
1084    0. Version number has been bumped to 110.38.1.  NEW BOOTFILES!!!
1085    
1086    1. The installer (config/install.sh) has gotten smarter:
1087    
1088         - Configuration options are a bit easier to specify now
1089           (in config/targets).
1090         - Bug in recognizing .tar.bz2 files fixed.
1091         - Installer automatically resolves dependencies between
1092           configuration options (e.g., if you ask for eXene, you will
1093           also get cml -- regardless whether you asked for it or not).
1094         - Installer can run in "quieter mode" by setting the environment
1095           variable INSTALL_QUIETLY to "true".  "Quieter" does not mean
1096           "completely silent", though.
1097         - Build HashCons library as part of smlnj-lib.
1098    
1099    2. A new scheme for assigning persistent identifiers to compilation
1100       units (and, by extension, to types etc.) has been put into place.
1101       This fixes a long-standing bug where types and even dynamic values
1102       can get internally confused, thereby compromising type safety
1103       (abstraction) and dynamic correctness.  See
1104    
1105         http://cm.bell-labs.com/cm/cs/who/blume/pid-confusion.tgz
1106    
1107       for an example of how things could go wrong until now.
1108    
1109       The downside of the new scheme is that pids are not quite as
1110       persistent as they used to be: CM will generate a fresh pid
1111       for every compilation unit that it thinks it sees for the first
1112       time.  That means that if you compile starting from a clean, fresh
1113       source tree at two different times, you end up with different
1114       binaries.
1115    
1116       Cutoff recompilation, however, has not been compromised because
1117       CM keeps pid information in special caches between runs.
1118    
1119    ----------------------------------------------------------------------
1120    Name: Lal George
1121    Date: 2002/02/07 15:34:13 EST 2002
1122    Tag: <none>
1123    Description:
1124    
1125    Compilers that generate assembly code may produce  global labels
1126    whose value is resolved at link time. The various peephole optimization
1127    modules did not take this in account.
1128    
1129    TODO. The Labels.addrOf function should really return an option
1130    type so that clients are forced to deal with this issue, rather
1131    than an exception being raised.
1132    
1133    ----------------------------------------------------------------------
1134    Name: Lal George
1135    Date: 2002/02/06 13:55:02 EST
1136    Tag: george-20020206-ra-breakup
1137    Description:
1138    
1139    1. A bug fix from Allen.
1140    
1141        A typo causes extra fstp %st(0)'s to be generated at compensation
1142        edges, which might cause stack underflow traps at runtime.  This
1143        occurs in fft where there are extraneous fstps right before the 'into'
1144        trap instruction (in this case they are harmless since none of the
1145        integers overflow.)
1146    
1147    2. Pulled out various utility modules that were embedded in the modules
1148       of the register allocator. I need these modules for other purposes, but
1149       they are not complete enough to put into a library (just yet).
1150    ----------------------------------------------------------------------
1151    Name: Matthias Blume
1152    Date: 2002/01/31 16:05:00 EST
1153    Tag: blume-20020131-sparc-ccalls
1154    Description:
1155    
1156    1. C-calls on Sparc needlessly allocated a huge chunk (96 bytes)
1157       of extra stack space by mistake.  Fixed.
1158    
1159    2. Bug in logic of handling of command-line options in ml-nlffigen fixed.
1160    
1161    ----------------------------------------------------------------------
1162    Name: Allen Leung
1163    Date: 2002/01/30
1164    Tag: leunga-20020130-nowhere-bug-fix
1165    Description:
1166    
1167       MLRISC bug fixes:
1168       1. Fixed a bindings computation bug in the 'nowhere' program generator tool.
1169       2. MachineInt.fromString was negating its value.
1170    
1171    ----------------------------------------------------------------------
1172    Name: Matthias Blume
1173    Date: 2002/01/29
1174    Tag: blume-20020129-INSTALL
1175    Description:
1176    
1177    - Added somewhat detailed installation instructions (file INSTALL).
1178    - Fixed curl-detection bug in config/install.sh.
1179    - It is now possible to select the URL getter using the URLGETTER
1180      environment variable:
1181    
1182          not set / "unknown"      --> automatic detection (script tries wget,
1183                                       curl, and lynx)
1184          "wget" / "curl" / "lynx" --> use the specified program (script "knows"
1185                                       how to properly invoke them)
1186          other                    --> use $URLGETTER directly, it must take
1187                                       precisely two command-line arguments
1188                                       (source URL and destination file name)
1189    
1190    ----------------------------------------------------------------------
1191    Name: Matthias Blume
1192    Date: 2002/01/28
1193    Tag: blume-20020128-sparc-ccalls
1194    Description:
1195    
1196    - Fixed problem with calculation of "used" registers in sparc-c-calls.
1197    - Make use of the allocParam argument in sparc-c-calls.
1198    
1199    ----------------------------------------------------------------------
1200    Name: Matthias Blume
1201    Date: 2002/01/28
1202    Tag: blume-20020128-allocParam
1203    Description:
1204    
1205    John Reppy:  Changes c-calls API to accept client-callback for
1206    allocating extra stack space.
1207    me: Corresponding changes to mlriscGen (using a dummy argument that
1208        does not change the current behavior).
1209    
1210    ----------------------------------------------------------------------
1211    Name: Matthias Blume
1212    Date: 2002/01/28 12:00:00
1213    Tag: Release_110_38
1214    Description:
1215    
1216    This time for real!!!
1217    
1218    ----------------------------------------------------------------------
1219    Name: Matthias Blume
1220    Date: 2002/01/28 10:56:00 EST
1221    Tag: blume-20020128-retraction
1222    Description:
1223    
1224    0. Retracted earlier 110.38.  (The Release_110_38 tag has been replaced
1225       with blume-Release_110_38-retracted.)
1226    
1227    1. Fixed a problem with incorrect rounding modes in real64.sml.
1228       (Thanks to Andrew Mccreight <andrew.mccreight@yale.edu>.)
1229    
1230    2. A bug in ml-nlffigen related to the handling of unnamed structs, unions,
1231       and enums fixed.  The naming of corresponding ML identifiers should
1232       now be consistent again.
1233    
1234    ----------------------------------------------------------------------
1235    Name: Allen Leung
1236    Date: 2002/01/27
1237    Tag: leunga-20020127-nowhere
1238    Description:
1239    
1240       Added a target called nowhere in the configuration scripts.
1241       Enabling this will build the MLRISC 'nowhere' tool (for translating
1242       programs with where-clauses into legal SML code) during installation.
1243    
1244    ----------------------------------------------------------------------
1245    Name: Matthias Blume
1246    Date: 2002/01/25 21:27:00 EST
1247    Tag: blume-Release_110_38-retracted
1248    Description:
1249    
1250    Call it a (working) release!  Version is 110.38. Bootfiles are ready.
1251    
1252    README will be added later.
1253    
1254    !!! NOTE:  Re-tagged as blume-Release_110_38-retracted. Original tag
1255    (Release_110_38) removed.  Reason: Last-minute bug fixes.
1256    
1257    ----------------------------------------------------------------------
1258    Name: Matthias Blume
1259    Date: 2002/01/25
1260    Tag: blume-20020125-ffi
1261    Description:
1262    
1263    A large number of tweaks and improvements to ml-nlffi-lib and
1264    ml-nlffigen:
1265    
1266       - ML represenation types have been streamlined
1267       - getter and setter functions work with concrete values, not abstract
1268         ones where possible
1269       - ml-nlffigen command line more flexible (see README file there)
1270       - some bugs have been fixed (hopefully)
1271    
1272    ----------------------------------------------------------------------
1273    Name: Lal George
1274    Date: 2002/01/24
1275    Tag: george-20020124-risc-ra-interface
1276    Description:
1277    
1278       There is a dramatic simplification in the interface to the
1279       register allocator for RISC architectures as a result of making
1280       parallel copy instructions explicit.
1281    
1282    ----------------------------------------------------------------------
1283    Name: Matthias Blume
1284    Date: 2002/01/22
1285    Tag: blume-20020122-x86-ccalls
1286    Description:
1287    
1288    Bug fix for c-calls on x86 (having to do with how char- and
1289    short-arguments are being handled).
1290    
1291    ----------------------------------------------------------------------
1292    Name: Matthias Blume
1293    Date: 2002/01/21
1294    Tag: blume-20020121-ff
1295    Description:
1296    
1297    Another day of fiddling with the FFI...
1298    
1299    1. Bug fix/workaround:  CKIT does not complain about negative array
1300       dimensions, so ml-nlffigen has to guard itself against this possibility.
1301       (Otherwise a negative dimension would send it into an infinite loop.)
1302    
1303    2. Some of the abstract types (light objects, light pointers, most "base"
1304       types) in structure C are now eqtypes.
1305    
1306    3. Added constructors and test functions for NULL function pointers.
1307    
1308    ----------------------------------------------------------------------
1309    Name: Matthias Blume
1310    Date: 2002/01/18
1311    Tag: blume-20020118-ready-for-new-release
1312    Description:
1313    
1314    Made config/srcarchiveurl point to a new place.  (Will provide boot
1315    files shortly.)
1316    
1317    Maybe we christen this to be 110.38?
1318    
1319    ----------------------------------------------------------------------
1320    Name: Matthias Blume
1321    Date: 2002/01/18
1322    Tag: blume-20020118-more-ffifiddle
1323    Description:
1324    
1325    Today's FFI fiddling:
1326    
1327      - Provided a structure CGetSet with "convenient" versions of C.Get.* and
1328        C.Set.* that use concrete (MLRep.*) arguments and results instead
1329        of abstract ones.
1330    
1331      - Provided word-style bit operations etc. for "int" representation
1332        types in MLRep.S<Foo>Bitops where <Foo> ranges over Char, Int, Short,
1333        and Long.
1334    
1335    ----------------------------------------------------------------------
1336    Name: Matthias Blume
1337    Date: 2002/01/18
1338    Tag: blume-20020118-use-x86-fp
1339    Description:
1340    
1341    Now that x86-fast-fp seems to be working, I turned it back on again
1342    by default.  (Seems to work fine now, even with the FFI.)
1343    
1344    Other than that, I added some documentation about the FFI to
1345    src/ml-nlffigen/README and updated the FFI test examples in
1346    src/ml-nlffi-lib/Tests/*.
1347    
1348    ----------------------------------------------------------------------
1349    Name: Allen Leung
1350    Date: 2002/01/17
1351    Tag: leunga-20020117-x86-fast-fp-call
1352    Description:
1353    
1354       1. Fixed a problem with handling return fp values when x86's fast fp
1355          mode is turned on.
1356    
1357       2. Minor pretty printing fix for cellset.  Print %st(0) as %st(0) instead
1358          of %f32.
1359    
1360       3. Added a constructor INT32lit to the ast of MLRISC tools.
1361    
1362    ----------------------------------------------------------------------
1363    Name: Matthias Blume
1364    Date: 2002/01/16
1365    Tag: blume-20020116-ffifiddle
1366    Description:
1367    
1368    More fiddling with the FFI interface:
1369    
1370     - Make constness 'c instead of rw wherever possible.  This eliminates
1371       the need for certain explicit coercions.  (However, due to ML's
1372       value polymorphism, there will still be many cases where explicit
1373       coercions are necessary.  Phantom types are not the whole answer
1374       to modeling a subtyping relationship in ML.)
1375    
1376     - ro/rw coersions for pointers added.  (Avoids the detour through */&.)
1377    
1378     - "printf" test example added to src/ml-nlffi-lib/Tests.  (Demonstrates
1379       clumsy workaround for varargs problem.)
1380    
1381    ----------------------------------------------------------------------
1382    Name: Lal George
1383    Date: 2002/01/15
1384    Tag: <none>
1385    Description:
1386    
1387    1. Since COPY instructions are no longer native to the architecture,
1388       a generic functor can be used to implement the expandCopies function.
1389    
1390    2. Allowed EXPORT and IMPORT pseudo-op declarations to appear inside a
1391       TEXT segment.
1392    
1393    ----------------------------------------------------------------------
1394    Name: Matthias Blume
1395    Date: 2002/01/15
1396    Tag: blume-20020115-ffiupdates
1397    Description:
1398    
1399    1. Fix for bug resulting in single-precision float values being returned
1400       incorrectly from FFI calls.
1401    
1402    2. Small modifications to C FFI API:
1403    
1404        - memory-allocation routines return straight objects (no options)
1405          and raise an exception in out-of-memory situations
1406        - unsafe extensions to cast between function pointers and pointers
1407          from/to ints
1408        - added structure C_Debug as an alternative to structure C where
1409          pointer-dereferencing (|*| and |*!) always check for null-pointers
1410        - added open_lib' to DynLinkage;  open_lib' works like open_lib
1411          but also takes a (possibly empty) list of existing library handles
1412          that the current library depends on
1413    
1414    ----------------------------------------------------------------------
1415    Name: Matthias Blume
1416    Date: 2002/01/10
1417    Tag: blume-20020110-newffigen
1418    Description:
1419    
1420    1. Updates to portable graph code.
1421    
1422    2. Major update to ml-nlffigen and ml-nlffi-lib.  Things are much
1423       more scalable now so that even huge interfaces such as the one
1424       for GTK compile in finite time and space. :-)
1425       See src/ml-nlffigen/README for details on what's new.
1426    
1427    ----------------------------------------------------------------------
1428    Name: Lal George
1429    Date: 2001/01/09 14:31:35 EST 2002
1430    Tag: george-20011206-rm-native-copy
1431    Description:
1432    
1433            Removed the native COPY and FCOPY instructions
1434            from all the architectures and replaced it with the
1435            explicit COPY instruction from the previous commit.
1436    
1437            It is now possible to simplify many of the optimizations
1438            modules that manipulate copies. This has not been
1439            done in this change.
1440    
1441    ----------------------------------------------------------------------
1442    Name: Lal George
1443    Date: 2001/12/06 16:50:13 EST 2001
1444    Tag: george-20011206-mlrisc-instruction
1445    Description:
1446    
1447    Changed the representation of instructions from being fully abstract
1448    to being partially concrete. That is to say:
1449    
1450      from
1451            type instruction
1452    
1453      to
1454            type instr                              (* machine instruction *)
1455    
1456            datatype instruction =
1457                LIVE of {regs: C.cellset, spilled: C.cellset}
1458              | KILL of {regs: C.cellset, spilled: C.cellset}
1459              | COPYXXX of {k: CB.cellkind, dst: CB.cell list, src: CB.cell list}
1460              | ANNOTATION of {i: instruction, a: Annotations.annotation}
1461              | INSTR of instr
1462    
1463    This makes the handling of certain special instructions that appear on
1464    all architectures easier and uniform.
1465    
1466    LIVE and KILL say that a list of registers are live or killed at the
1467    program point where they appear. No spill code is generated when an
1468    element of the 'regs' field is spilled, but the register is moved to
1469    the 'spilled' (which is present, more for debugging than anything else).
1470    
1471    LIVE replaces the (now deprecated) DEFFREG instruction on the alpha.
1472    We used to generate:
1473    
1474            DEFFREG f1
1475            f1 := f2 + f3
1476            trapb
1477    
1478    but now generate:
1479    
1480            f1 := f2 + f3
1481            trapb
1482            LIVE {regs=[f1,f2,f3], spilled=[]}
1483    
1484    Furthermore, the DEFFREG (hack) required that all floating point instruction
1485    use all registers mentioned in the instruction. Therefore f1 := f2 + f3,
1486    defines f1 and uses [f1,f2,f3]! This hack is no longer required resulting
1487    in a cleaner alpha implementation. (Hopefully, intel will not get rid of
1488    this architecture).
1489    
1490    COPYXXX is intended to replace the parallel COPY and FCOPY  available on
1491    all the architectures. This will result in further simplification of the
1492    register allocator that must be aware of them for coalescing purposes, and
1493    will also simplify certain aspects of the machine description that provides
1494    callbacks related to parallel copies.
1495    
1496    ANNOTATION should be obvious, and now INSTR represents the honest to God
1497    machine instruction set!
1498    
1499    The <arch>/instructions/<arch>Instr.sml files define certain utility
1500    functions for making porting easier -- essentially converting upper case
1501    to lower case. All machine instructions (of type instr) are in upper case,
1502    and the lower case form generates an MLRISC instruction. For example on
1503    the alpha we have:
1504    
1505      datatype instr =
1506         LDA of {r:cell, b:cell, d:operand}
1507       | ...
1508    
1509      val lda : {r:cell, b:cell, d:operand} -> instruction
1510        ...
1511    
1512    where lda is just (INSTR o LDA), etc.
1513    
1514    ----------------------------------------------------------------------
1515    Name: Matthias Blume
1516    Date: 2001/11/22 21:40:00 EST
1517    Tag: Release_110_37
1518    Description:
1519    
1520    Release 110.37.  This time for real.
1521    
1522    ----------------------------------------------------------------------
1523    Name: Matthias Blume
1524    Date: 2001/11/21 16:35:00 EST
1525    Tag: blume-20011121-foot-in-mouth
1526    Description:
1527    
1528    Removed the "Release_110_37" tag because of a serious bug.
1529    This will be re-tagged once the bug is fixed.
1530    
1531    ----------------------------------------------------------------------
1532    Name: Matthias Blume
1533    Date: 2001/11/21 16:14:00 EST
1534    Tag: blume-20011121-forgottenfile
1535    Description:
1536    
1537    Forgot to add a file.  (Just a .tex-file -- part of
1538    the CM manual source.)
1539    
1540    ----------------------------------------------------------------------
1541    Name: Matthias Blume
1542    Date: 2001/11/21 16:10:00 EST
1543    Tag: blume-20011121-invalid_110_37
1544    Description:
1545    
1546    Note: I removed the original tag "Release_110_37" from this commit
1547          because we found a serious bug in all non-x86 backends.
1548          - Matthias
1549    
1550    1. Modifications to the SML/NJ code generator and to the runtime system
1551       so that code object name strings are directly inserted into code
1552       objects at code generation time.  The only business the runtime system
1553       has with this is now to read the name strings on occasions.
1554       (The encoding of the name string has also changed somewhat.)
1555    
1556    2. CM now implements a simple "set calculus" for specifying export lists.
1557       In particular, it is now possible to refer to the export lists of
1558       other libraries/groups/sources and form unions as well as differences.
1559       See the latest CM manual for details.
1560    
1561    3. An separate notion of "proxy" libraries has again be eliminated from
1562       CM's model.  (Proxy libraries are now simply a special case of using
1563       the export list calculus.)
1564    
1565    4. Some of the existing libraries now take advantage of the new set
1566       calculus.
1567       (Notice that not all libraries have been converted because some
1568       of the existing .cm-files are supposed to be backward compatible
1569       with 110.0.x.)
1570    
1571    5. Some cleanup in stand-alone programs.  (Don't use "exnMessage" -- use
1572       "General.exnMessage"!  The former relies on a certain hook to be
1573       initialized, and that often does not happen in the stand-alone case.)
1574    
1575    ----------------------------------------------------------------------
1576    Name: Lal George
1577    Date: 2001/11/21  13:56:18 EST
1578    Tag: george-2001121-pseudo-ops
1579    Description:
1580    
1581      Implemented a complete redesign of MLRISC pseudo-ops. Now there
1582      ought to never be any question of incompatabilities with
1583      pseudo-op syntax expected by host assemblers.
1584    
1585      For now, only modules supporting GAS syntax are implemented
1586      but more should follow, such as MASM, and vendor assembler
1587      syntax, e.g. IBM as, Sun as, etc.
1588    
1589    ----------------------------------------------------------------------
1590    Name: Matthias Blume
1591    Date: 2001/11/14 11:52:00 EST
1592    Tag: blume-20011114-srcname
1593    Description:
1594    
1595    1. Routed the name of the current source file to mlriscgen where it
1596       should be directly emitted into the code object.  (This last part
1597       is yet to be done.)
1598    
1599    2. Some cleanup of the pgraph code to make it match the proposal that
1600       I put out the other day.  (The proposal notwithstanding, things are
1601       still in flux here.)
1602    
1603    ----------------------------------------------------------------------
1604    Name: Lal George
1605    Date: 2001/11/14 09:44:04 EST
1606    Tag:
1607    Description:
1608    
1609      Fix for a backpatching bug reported by Allen.
1610    
1611      Because the boundary between short and long span-dependent
1612      instructions is +/- 128, there are an astounding number of
1613      span-dependent instructions whose size is over estimated.
1614    
1615      Allen came up with the idea of letting the size of span
1616      dependent instructions be non-monotonic, for a maxIter
1617      number of times, after which the size must be monotonically
1618      increasing.
1619    
1620      This table shows the number of span-dependent instructions
1621      whose size was over-estimated as a function of maxIter, for the
1622      file Parse/parse/ml.grm.sml:
1623    
1624         maxIter            # of instructions:
1625            10                      687
1626            20                      438
1627            30                      198
1628            40                        0
1629    
1630      In compiling the compiler, there is no significant difference in
1631      compilation speed between maxIter=10 and maxIter=40. Actually,
1632      my measurements showed that maxIter=40 was a tad faster than
1633      maxIter=10! Also 96% of the  files in the compiler reach a fix
1634      point within 13 iterations, so fixing maxIter at 40, while high,
1635      is okay.
1636    
1637    ----------------------------------------------------------------------
1638    Name: Matthias Blume
1639    Date: 2001/10/31 15:25:00 EST
1640    Tag: blume-20011031-pgraph
1641    Description:
1642    
1643    CKIT:
1644    * Changed the "Function" constructor of type Ast.ctype to carry optional
1645      argument identifiers.
1646    * Changed the return type of TypeUtil.getFunction accordingly.
1647    * Type equality ignores the argument names.
1648    * TypeUtil.composite tries to preserve argument names but gives up quickly
1649      if there is a mismatch.
1650    
1651    installation script:
1652    * attempts to use "curl" if available (unless "wget" is available as well)
1653    
1654    CM:
1655    * has an experimental implementation of "portable graphs" which I will
1656      soon propose as an implementation-independent library format
1657    * there are also new libraries $/pgraph.cm and $/pgraph-util.cm
1658    
1659    NLFFI-LIB:
1660    * some cleanup (all cosmetic)
1661    
1662    NLFFIGEN:
1663    * temporarily disabled the mechanism that suppresses ML output for
1664      C definitions whose identifiers start with an underscore character
1665    * generate val bindings for enum constants
1666    * user can request that only one style (light or heavy) is being used;
1667      default is to use both (command-line arguments: -heavy and -light)
1668    * fixed bug in handling of function types involving incomplete pointers
1669    * generate ML entry points that take record arguments (i.e., using
1670      named arguments) for C functions that have a prototype with named
1671      arguments
1672      (see changes to CKIT)
1673    
1674    ----------------------------------------------------------------------
1675    Name: Allen Leung
1676    Date: 2001/10/27 20:34:00 EDT
1677    Tag: leunga-20011027-x86-fast-fp-call
1678    Description:
1679    
1680       Fixed the bug described in blume-20010920-slowfp.
1681    
1682       The fix involves
1683          1. generating FCOPYs in FSTP in ia32-svid
1684          2. marking a CALL with the appropriate annotation
1685    
1686    ----------------------------------------------------------------------
1687    Name: Matthias Blume
1688    Date: 2001/10/16 11:32:00 EDT
1689    Tag: blume-20011016-netbsd
1690    Description:
1691    
1692    Underscore patch from Chris Richards (fixing problem with compiling
1693    runtime system under recent NetBSD).
1694    
1695    ----------------------------------------------------------------------
1696    Name: Allen Leung
1697    Date: 2001/10/12 17:18:32 EDT 2001
1698    Tag: leung-20011012-x86-printflowgraph
1699    Description:
1700    
1701    X86RA now uses a valid (instead of dummy) PrintFlowgraph module.
1702    
1703    ----------------------------------------------------------------------
1704    Name: Lal George
1705    Date: 2001/10/11 23:51:34 EDT
1706    Tag: george-20011011-too-many-instrs
1707    Description:
1708    
1709    The representation of a program point never expected to see more
1710    than 65536 instructions in a basic block!
1711    
1712    ----------------------------------------------------------------------
1713    Name: Lal George
1714    Date: 2001/10/09 09:41:37 EDT
1715    Tag: george-20011008-mlrisc-labels
1716    Description:
1717    
1718    Changed the machine description files to support printing of
1719    local and global labels in assembly code, based on host assembler
1720    conventions.
1721    
1722    ----------------------------------------------------------------------
1723    Name: Matthias Blume
1724    Date: 2001/09/25 15:25:00 EDT
1725    Tag: blume-20010925-exninfo
1726    Description:
1727    
1728    I provided a non-hook implementation of exnName (at the toplevel) and
1729    made the "dummy" implementation of exnMessage (at the toplevel) more
1730    useful: if nothing gets "hooked in", then at least you are going to
1731    see the exception name and a message indicating why you don't see more.
1732    
1733    [For the time being, programs that need exnMessage and want to use
1734    ml-build should either use General.exnMessage (strongly recommended) or
1735    refer to structure General at some other point so that CM sees a
1736    static dependency.]
1737    
1738    [Similar remarks go for "print" and "use":  If you want to use their
1739    functionality in stand-alone programs generated by ml-build, then use
1740    TextIO.output and Backend.Interact.useFile (from $smlnj/compiler.cm).]
1741    
1742    ----------------------------------------------------------------------
1743    Name: Matthias Blume
1744    Date: 2001/09/20 17:28:00 EDT
1745    Tag: blume-20010920-slowfp
1746    Description:
1747    
1748    Allen says that x86-fast-fp is not safe yet, so I turned it off again...
1749    
1750    ----------------------------------------------------------------------
1751    Name: Matthias Blume
1752    Date: 2001/09/20 17:20:00 EDT
1753    Tag: blume-20010920-canonicalpaths
1754    Description:
1755    
1756    0. Updated the BOOT file (something that I forgot to do earlier).
1757    
1758    1. Small internal change to CM so that it avoids "/../" in filenames
1759       as much as possible (but only where it is safe).
1760    
1761    2. Changed config/_run-sml (resulting in a changed bin/.run-sml) so
1762       that arguments that contain delimiters are passed through correctly.
1763       This change also means that all "special" arguments of the form
1764       @SMLxxx... must come first.
1765    
1766    3. Changed install script to put relative anchor names for tool commands
1767       into pathconfig.
1768    
1769    ----------------------------------------------------------------------
1770    Name: Matthias Blume
1771    Date: 2001/09/18 15:35:00 EDT
1772    Tag: blume-20010918-readme11036
1773    Description:
1774    
1775    Added README files.
1776    
1777    ----------------------------------------------------------------------
1778    Name: Matthias Blume
1779    Date: 2001/09/18 11:45:00 EDT
1780    Tag: Release_110_36 (retag)
1781    Description:
1782    
1783    Fixed mistake in config/preloads. Retagged as 110.36.
1784    
1785    ----------------------------------------------------------------------
1786    Name: Matthias Blume
1787    Date: 2001/09/18 09:40:00 EDT
1788    Tag: Release_110_36_orig (tag changed)
1789    Description:
1790    
1791    New version (110.36).  New bootfiles.
1792    
1793    ----------------------------------------------------------------------
1794    Name: Matthias Blume
1795    Date: 2001/09/14 16:15:00 EDT
1796    Tag: blume-20010914-x86fastfp
1797    Description:
1798    
1799    John committed some changes that Allen made, in particular a (hopefully)
1800    correctly working version of the x86-fp module.
1801    
1802    I changed the default setting of the Control.MLRISC.getFlag "x86-fast-fp"
1803    flag to "true".  Everything seems to compile to a fixpoint ok, and
1804    "mandelbrot" speeds up by about 15%.
1805    
1806    ----------------------------------------------------------------------
1807    Name: Matthias Blume
1808    Date: 2001/09/13 11:20:00 EDT
1809    Tag: blume-20010913-minimal
1810    Description:
1811    
1812    1. Stefan Monnier's patch to fix a miscompilation problem that
1813       was brought to light by John Reppy's work on Moby.
1814    
1815    2. Implemented a minimal "structure Compiler" that contains just
1816       "version" and "architecture".  The minimal version will be
1817       available when the full version is not.  This is for backward-
1818       compatibility with code that wants to test Compiler.version.
1819    
1820    ----------------------------------------------------------------------
1821    Name: Matthias Blume
1822    Date: 2001/08/28 14:03:00 EDT
1823    Tag: blume-20010828-ml-lex
1824    Description:
1825    
1826    Fix for bug 1581, received from Neophytos Michael.
1827    
1828    ----------------------------------------------------------------------
1829    Name: Matthias Blume
1830    Date: 2001/08/27 11:20:00 EDT
1831    Tag: blume-20010827-readme11035
1832    Description:
1833    
1834    Fleshed out the README file for 110.35.
1835    
1836    ----------------------------------------------------------------------
1837    Name: Matthias Blume
1838    Date: 2001/08/24 17:10:00 EDT
1839    Tag: Release_110_35
1840    Description:
1841    
1842    New version number (110.35).  New bootfiles.
1843    
1844    ----------------------------------------------------------------------
1845    Name: Lal George
1846    Date: 2001/08/24 13:47:18 EDT 2001
1847    Tag: george-20010824-MLRISC-graphs
1848    Description:
1849    
1850     removed clusters from MLRISC completely and replaced with graphs.
1851    
1852    ----------------------------------------------------------------------
1853    Name: Matthias Blume
1854    Date: 2001/08/23 17:50:00 EDT
1855    Tag: blume-20010823-toplevel
1856    Description:
1857    
1858    - some reorganization of the code that implements various kinds of
1859      environments in the compiler (static, dynamic, symbolic, combined)
1860    - re-implemented the EnvRef module so that evalStream works properly
1861      (if the stream contains references to "use", "CM.make", etc.)
1862    - cleaned up evalloop.sml and interact.sml (but they need more cleaning)
1863    
1864    ----------------------------------------------------------------------
1865    Name: Matthias Blume
1866    Date: 2001/08/20 15:50 EDT
1867    Tag: blume20010820-slipup
1868    Description:
1869    
1870    I forgot to commit a few files.  Here they are...
1871    
1872    ----------------------------------------------------------------------
1873    Name: Matthias Blume
1874  Date: 2001/08/20 15:35:00 EDT  Date: 2001/08/20 15:35:00 EDT
1875  Tag: blume-20010820-debugprof  Tag: blume-20010820-debugprof
1876  Description:  Description:
# Line 2887  Line 4744 
4744       elaborator). There were a lot of changes during my "linkpath" trials       elaborator). There were a lot of changes during my "linkpath" trials
4745       that could have been reverted to their original state but weren't.       that could have been reverted to their original state but weren't.
4746       Please, don't be too harsh on me for messing with this code a bit more       Please, don't be too harsh on me for messing with this code a bit more
4747       than what was strictly necessary...  (I _did_ resist the tempation       than what was strictly necessary...  (I _did_ resist the temptation
4748       of doing any "global reformatting" to avoid an untimely death at       of doing any "global reformatting" to avoid an untimely death at
4749       Dave's hands. :)       Dave's hands. :)
4750    

Legend:
Removed from v.903  
changed lines
  Added in v.1179

root@smlnj-gforge.cs.uchicago.edu
ViewVC Help
Powered by ViewVC 1.0.0