[smlnj] Diff of /sml/trunk/HISTORY

revision 1066, Thu Feb 14 16:50:02 2002 UTC revision 1138, Wed Mar 13 03:03:50 2002 UTC
# Line 11
11  Date: yyyy/mm/dd
12  Tag: <post-commit CVS tag>
13  Description:
14    ----------------------------------------------------------------------
15    Name: Matthias Blume
16    Date: 2002/03/12 22:03:00 EST
17    Tag: blume-20020312-url
18    Description:
19    
20    Forgot to update BOOT and srcarchiveurl.
21    
22    ----------------------------------------------------------------------
23    Name: Matthias Blume
24    Date: 2002/03/12 17:30:00 EST
25    Tag: blume-20020312-version110392
26    Description:
27    
28    Yet another version number bump (because of small changes to the
29    binfile format).  Version number is now 110.39.2.  NEW BOOTFILES!
30    
31    Changes:
32    
33      The new pid generation scheme described a few weeks ago was overly
34      complicated.  I implemented a new mechanism that is simpler and
35      provides a bit more "stability":  Once CM has seen a compilation
36      unit, it keeps its identity constant (as long as you do not delete
37      those crucial CM/GUID/* files).  This means that when you change
38      an interface, compile, then go back to the old interface, and
39      compile again, you arrive at the original pid.
40    
41      There now also is a mechanism that instructs CM to use the plain
42      environment hash as a module's pid (effectively making its GUID
43      the empty string).  For this, "noguid" must be specified as an
44      option to the .sml file in question within its .cm file.
45      This is most useful for code that is being generated by tools such
46      as ml-nlffigen (because during development programmers tend to
47      erase the tool's entire output directory tree including CM's cached
48      GUIDs).  "noguid" is somewhat dangerous (since it can be used to locally
49      revert to the old, broken behavior of SML/NJ), but in specific cases
50      where there is no danger of interface confusion, its use is ok
51      (I think).
52    
53      ml-nlffigen by default generates "noguid" annotations.  They can be
54      turned off by specifying -guid in its command line.
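
A minimal Python sketch of the stability property described above. The
entry does not say how CM combines the GUID with the environment hash, so
the SHA-1 concatenation, the function names, and the sample GUID below are
assumptions made purely for illustration.

```python
import hashlib

def env_hash(interface: str) -> str:
    """Stand-in for the hash of a unit's static environment (assumption)."""
    return hashlib.sha1(interface.encode()).hexdigest()

def make_pid(ehash: str, guid: str = "") -> str:
    """Hypothetical pid: the environment hash mixed with the unit's GUID."""
    if guid == "":
        # "noguid": the plain environment hash serves as the pid.
        return ehash
    return hashlib.sha1((guid + ":" + ehash).encode()).hexdigest()

guid = "unit-0001"  # stands in for the value cached under CM/GUID/*

old = make_pid(env_hash("val f : int -> int"), guid)
new = make_pid(env_hash("val f : int -> bool"), guid)
back = make_pid(env_hash("val f : int -> int"), guid)

# Same GUID and same interface give the same pid again.
print(old == back, old != new)
```

As long as the cached GUID survives, reverting the interface reproduces the
original pid; with an empty GUID the pid depends on the interface alone.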
55    
56    ----------------------------------------------------------------------
57    Name: Lal George
58    Date: 2002/03/12 14:42:36 EST
59    Tag: george-20020312-frequency-computation
60    Description:
61    
62    Integrated jump chaining and static block frequency into the
63    compiler. More details and numbers later.
64    
65    ----------------------------------------------------------------------
66    Name: Lal George
67    Date: 2002/03/11 22:38:53 EST
68    Tag: george-20020311-jump-chain-elim
69    Description:
70    
71    Tested the jump chain elimination on all architectures (except the
72    hppa).  This is on by default right now and is profitable for the
73    alpha and x86; however, it may not be profitable for the sparc and ppc
74    when compiling the compiler.
75    
76    The gc test will typically jump to a label at the end of the cluster,
77    where there is another jump to an external cluster containing the actual
78    code to invoke gc. This is to allow factoring of common gc invocation
79    sequences. That is to say, we generate:
80    
81            f:
82               testgc
83               ja   L1      % jump if above to L1
84    
85            L1:
86               jmp L2
87    
88    
89    After jump chain elimination the 'ja L1' instruction is converted to
90    'ja L2'. On the sparc and ppc, many of the 'ja L2' instructions may end
91    up being implemented in their long form (if L2 is far away) using:
92    
93            jbe     L3      % jump if below or equal to L3
94            jmp     L2
95         L3:
96            ...
97    
98    
99    For large compilation units L2  may be far away.
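
The retargeting step can be sketched in Python as follows. This is only an
illustration of the idea, not the MLRISC implementation: the instruction
tuples, the BRANCHES set, and the "trampolines" map (labels whose block is
a single unconditional jump) are hypothetical representations.

```python
BRANCHES = {"ja", "jbe", "jmp"}

def resolve(label, trampolines):
    """Follow label -> label links until a real block (or a cycle) is hit."""
    seen = set()
    while label in trampolines and label not in seen:
        seen.add(label)
        label = trampolines[label]
    return label

def eliminate_jump_chains(instrs, trampolines):
    """Retarget every branch to the final destination of its jump chain."""
    return [(op, resolve(target, trampolines)) if op in BRANCHES else (op, target)
            for op, target in instrs]

# The example from the entry: L1 contains nothing but 'jmp L2'.
code = [("testgc", None), ("ja", "L1")]
trampolines = {"L1": "L2"}
rewritten = eliminate_jump_chains(code, trampolines)
print(rewritten)
```

The sketch ignores branch distance; whether 'ja L2' then needs the long
form depends on how far away L2 ends up, which is the cost discussed above.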
100    
101    
102    ----------------------------------------------------------------------
103    Name: Matthias Blume
104    Date: 2002/03/11 13:30:00 EST
105    Tag: blume-20020311-mltreeeval
106    Description:
107    
108    A functor parameter was missing.
109    
110    ----------------------------------------------------------------------
111    Name: Allen Leung
112    Date: 2002/03/11 10:30:00 EST
113    Tag: leunga-20020310-runtime-string0
114    Description:
115    
116       The representation of the empty string now points to a
117    legal null terminated C string instead of unit.  It is now possible
118    to convert an ML string into a C string with InlineT.CharVector.getData.
119    This compiles into one single machine instruction.
120    
121    ----------------------------------------------------------------------
122    Name: Allen Leung
123    Date: 2002/03/10 23:55:00 EST
124    Tag: leunga-20020310-x86-call
125    Description:
126    
127       Added machine code generation for the CALL instruction (relative displacement mode)
128    
129    ----------------------------------------------------------------------
130    Name: Matthias Blume
131    Date: 2002/03/08 16:05:00
132    Tag: blume-20020308-entrypoints
133    Description:
134    
135    Version number bumped to 110.39.1.  NEW BOOTFILES!
136    
137    Entrypoints: non-zero offset into a code object where execution should begin.
138    
139    - Added the notion of an entrypoint to CodeObj.
140    - Added reading/writing of entrypoint info to Binfile.
141    - Made runtime system bootloader aware of entrypoints.
142    - Use the address of the label of the first function given to mlriscGen
143      as the entrypoint.  This address is currently always 0, but it will
144      not be 0 once we turn on block placement.
145    - Removed the linkage cluster code (which was The Other Way(tm) of dealing
146      with entry points) from mlriscGen.
147    
148    ----------------------------------------------------------------------
149    Name: Allen Leung
150    Date: 2002/03/07 20:45:00 EST
151    Tag: leunga-20020307-x86-cmov
152    Description:
153    
154       Bug fixes for CMOVcc on x86.
155    
156       1. Added machine code generation for CMOVcc
157       2. CMOVcc is now generated in preference over SETcc on PentiumPro or above.
158       3. CMOVcc cannot have an immediate operand as argument.
159    
160    ----------------------------------------------------------------------
161    Name: Matthias Blume
162    Date: 2002/03/07 16:15:00 EST
163    Tag: blume-20020307-controls
164    Description:
165    
166    This is a very large but mostly boring patch which makes (almost)
167    every tuneable compiler knob (i.e., pretty much everything under
168    Control.* plus a few other things) configurable via both the command
169    line and environment variables in the style CM did its configuration
170    until now.
171    
172    Try starting sml with '-h' (or, if you are brave, '-H')
173    
174    To this end, I added a structure Controls : CONTROLS to smlnj-lib.cm which
175    implements the underlying generic mechanism.
176    
177    The interface to some of the existing such facilities has changed somewhat.
178    For example, the MLRiscControl module now provides mkFoo instead of getFoo.
179    (The getFoo interface is still there for backward-compatibility, but its
180    use is deprecated.)
181    
182    The ml-build script passes -Cxxx=yyy command-line arguments through so
183    that one can now twiddle the compiler settings when using this "batch"
184    compiler.
185    
186    TODO items:
187    
188    We should go through and throw out all controls that are no longer
189    connected to anything.  Moreover, we should go through and provide
190    meaningful (and correct!) documentation strings for those controls
191    that still are connected.
192    
193    Currently, multiple calls to Controls.new are accepted (only the first
194    has any effect).  Eventually we should make sure that every control
195    is being made (via Controls.new) exactly once.  Future access can then
196    be done using Controls.acc.
197    
198    Finally, it would probably be a good idea to use the getter-setter
199    interface to controls rather than ref cells.  For the time being, both
200    styles are provided by the Controls module, but getter-setter pairs are
201    better if thread-safety is of any concern because they can be wrapped.
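
To illustrate why getter-setter pairs are easier to make thread-safe than
bare ref cells, here is a small Python sketch. It borrows the names "new"
and "acc" from the text, but the signatures, the Registry class, and the
control name used below are invented for this example and are not the
Controls API.

```python
import threading

class Registry:
    """Hypothetical control registry; only the first 'new' has any effect."""
    def __init__(self):
        self._cells = {}

    def new(self, name, default):
        if name not in self._cells:
            self._cells[name] = [default]  # one-element list acts as a ref cell
        return self.acc(name)

    def acc(self, name):
        cell = self._cells[name]
        def get():
            return cell[0]
        def put(value):
            cell[0] = value
        return get, put

def synchronized(get, put):
    """Wrap a getter-setter pair so every access holds a lock."""
    lock = threading.Lock()
    def safe_get():
        with lock:
            return get()
    def safe_put(value):
        with lock:
            put(value)
    return safe_get, safe_put

reg = Registry()
get, put = synchronized(*reg.new("hypothetical-flag", False))
reg.new("hypothetical-flag", True)   # second registration is ignored
put(True)
print(get(), reg.acc("hypothetical-flag")[0]())
```

A raw ref cell offers no place to hook in the lock, whereas the pair of
closures can be wrapped without changing any caller.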
202    
203    *****************************************
204    
205    One bug fix: The function blockPlacement in three of the MLRISC
206    backpatch files used to be hard-wired to one of two possibilities at
207    link time (according to the value of the placementFlag).  But (I
208    think) it should rather sense the flag every time.
209    
210    *****************************************
211    
212    Other assorted changes (by other people who did not supply a HISTORY entry):
213    
214    1. the cross-module inliner now works much better (Monnier)
215    2. representation of weights, frequencies, and probabilities in MLRISC
216       changed in preparation of using those for weighted block placement
217       (Reppy, George)
218    
219    ----------------------------------------------------------------------
220    Name: Lal George
221    Date: 2002/03/07 14:44:24 EST
222    Tag: george-20020307-weighted-block-placement
223    
224    Tested the weighted block placement optimization on all architectures
225    (except the hppa) using AMPL to generate the block and edge frequencies.
226    Changes were required in the machine properties to correctly
227    categorize trap instructions. There is an MLRISC flag
228    "weighted-block-placement" that can be used to enable weighted block
229    placement, but this will be ineffective without block/edge
230    frequencies (coming soon).
231    
232    
233    ----------------------------------------------------------------------
234    Name: Lal George
235    Date: 2002/03/05 17:24:48 EST
236    Tag: george-20020305-linkage-cluster
237    
238    In order to support the block placement optimization, a new cluster
239    is generated as the very first cluster (called the linkage cluster).
240    It contains a single jump to the 'real' entry point for the compilation
241    unit. Block placement has no effect on the linkage cluster itself, but
242    all the other clusters  have full freedom in the manner in which they
243    reorder blocks or functions.
244    
245    On the x86 the typical linkage code that is generated is:
246       ----------------------
247            .align 2
248       L0:
249            addl    $L1-L0, 72(%esp)
250            jmp     L1
251    
252    
253            .align  2
254       L1:
255       ----------------------
256    
257    72(%esp) is the memory location for the stdlink register. This
258    must contain the address of the CPS function being called. In the
259    above example, it contains the address of  L0; before
260    calling L1 (the real entry point for the compilation unit), it
261    must contain the address for L1, and hence
262    
263            addl $L1-L0, 72(%esp)
264    
265    I have tested this on all architectures except the hppa.  The increase
266    in code size is of course negligible.
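
The effect of the addl on stdlink is plain address arithmetic, shown here
as a Python sketch. The load addresses and the 12-byte distance between L0
and L1 are made-up values for illustration only.

```python
def enter_linkage_cluster(stdlink, addr_l0, addr_l1):
    """Model of 'addl $L1-L0, 72(%esp)': rebias stdlink from L0 to L1."""
    assert stdlink == addr_l0, "on entry stdlink holds the address of L0"
    return stdlink + (addr_l1 - addr_l0)

# The displacement L1-L0 is a constant, so the result is correct wherever
# the code object happens to be loaded (both bases below are hypothetical).
for base in (0x1000, 0x7F000000):
    l0, l1 = base, base + 12
    print(hex(enter_linkage_cluster(l0, l0, l1)))
```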
267    
268    ----------------------------------------------------------------------
269    Name: Allen Leung
270    Date: 2002/03/03 13:20:00 EST
271    Tag: leunga-20020303-mlrisc-tools
272    
273      Added #[ ... ] expressions to mlrisc tools
274    
275    ----------------------------------------------------------------------
276    Name: Matthias Blume
277    Date: 2002/02/27 12:29:00 EST
278    Tag: blume-20020227-cdebug
279    Description:
280    
281    - made the types in structures C and C_Debug equal
282    - got rid of code duplication (c-int.sml vs. c-int-debug.sml)
283    - there no longer is a C_Int_Debug (C_Debug is directly derived from C)
284    
285    ----------------------------------------------------------------------
286    Name: Matthias Blume
287    Date: 2002/02/26 12:00:00 EST
288    Tag: blume-20020226-ffi
289    Description:
290    
291    1. Fixed a minor bug in CM's "noweb" tool:
292       If numbering is turned off, then truly don't number (i.e., do not
293       supply the -L option to noweb).  The previous behavior was to supply
294       -L'' -- which caused noweb to use the "default" line numbering scheme.
295       Thanks to Chris Richards for pointing this out (and supplying the fix).
296    
297    2. Once again, I reworked some aspects of the FFI:
298    
299       A. The incomplete/complete type business:
300    
301       - Signatures POINTER_TO_INCOMPLETE_TYPE and accompanying functors are
302         gone!
303       - ML types representing an incomplete type are now *equal* to
304         ML types representing their corresponding complete types (just like
305         in C).  This is still safe because ml-nlffigen will not generate
306         RTTI for incomplete types, nor will it generate functions that
307         require access to such RTTI.   But when ML code generated from both
308         incomplete and complete versions of the C type meet, the ML types
309         are trivially interoperable.
310    
311         NOTE:  These changes restore the full generality of the translation
312         (which was previously lost when I eliminated functorization)!
313    
314       B. Enum types:
315    
316       - Structure C now has a type constructor "enum" that is similar to
317         how the "su" constructor works.  However, "enum" is not a phantom
318         type because each "T enum" has values (and is isomorphic to
319         MLRep.Signed.int).
320       - There are generic access operations for enum objects (using
321         MLRep.Signed.int).
322       - ml-nlffigen will generate a structure E_foo for each "enum foo".
323         * The structure contains the definition of type "mlrep" (the ML-side
324         representation type of the enum).  Normally, mlrep is the same
325         as "MLRep.Signed.int", but if ml-nlffigen was invoked with "-ec",
326         then mlrep will be defined as a datatype -- thus facilitating
327         pattern matching on mlrep values.
328         ("-ec" will be suppressed if there are duplicate values in an
329          enumeration.)
330         * Constructors ("-ec") or values (no "-ec") e_xxx of type mlrep
331         will be generated for each C enum constant xxx.
332         * Conversion functions m2i and i2m convert between mlrep and
333         MLRep.Signed.int.  (Without "-ec", these functions are identities.)
334         * Conversion functions c and ml convert between mlrep and "tag enum".
335         * Access functions (get/set) fetch and store mlrep values.
336       - By default (unless ml-nlffigen was invoked with "-nocollect"), unnamed
337         enumerations are merged into one single enumeration represented by
338         structure E_'.
339    
340    ----------------------------------------------------------------------
341    Name: Allen Leung
342    Date: 2002/02/25 04:45:00 EST
343    Tag: leunga-20020225-cps-spill
344    
345    This is a new implementation of the CPS spill phase.
346    The new phase is in the new file compiler/CodeGen/cpscompile/spill-new.sml
347    In case of problems, replace it with the old file spill.sml
348    
349    The current compiler runs into some serious performance problems when
350    constructing a large record.  This can happen when we try to compile a
351    structure with many items.  Even a very simple structure like the following
352    makes the compiler slow down.
353    
354        structure Foo = struct
355           val x_1 = 0w1 : Word32.word
356           val x_2 = 0w2 : Word32.word
357           val x_3 = 0w3 : Word32.word
358           ...
359           val x_N = 0wN : Word32.word
360        end
361    
362    The following table shows the compile time, from N=1000 to N=4000,
363    with the old compiler:
364    
365    N
366    1000   CPS 100 spill                           0.04u  0.00s  0.00g
367           MLRISC ra                               0.06u  0.00s  0.05g
368              (spills = 0 reloads = 0)
369           TOTAL                                   0.63u  0.07s  0.21g
370    
371    1100   CPS 100 spill                           8.25u  0.32s  0.64g
372           MLRISC ra                               5.68u  0.59s  3.93g
373              (spills = 0 reloads = 0)
374           TOTAL                                   14.71u  0.99s  4.81g
375    
376    1500   CPS 100 spill                           58.55u  2.34s  1.74g
377           MLRISC ra                               5.54u  0.65s  3.91g
378              (spills = 543 reloads = 1082)
379           TOTAL                                   65.40u  3.13s  6.00g
380    
381    2000   CPS 100 spill                           126.69u  4.84s  3.08g
382           MLRISC ra                               0.80u  0.10s  0.55g
383              (spills = 42 reloads = 84)
384           TOTAL                                   129.42u  5.10s  4.13g
385    
386    3000   CPS 100 spill                           675.59u  19.03s  11.64g
387           MLRISC ra                               2.69u  0.27s  1.38g
388              (spills = 62 reloads = 124)
389           TOTAL                                   682.48u  19.61s  13.99g
390    
391    4000   CPS 100 spill                           2362.82u  56.28s  43.60g
392           MLRISC ra                               4.96u  0.27s  2.72g
393              (spills = 85 reloads = 170)
394           TOTAL                                   2375.26u  57.21s  48.00g
395    
396    As you can see the old cps spill module suffers from some serious
397    performance problems.  But since I cannot decipher the old code fully,
398    instead of patching the problems up, I'm reimplementing it
399    with a different algorithm.  The new code is more modular,
400    smaller when compiled, and substantially faster
401    (O(n log n) time and O(n) space).  Timing of the new spill module:
402    
403    4000  CPS 100 spill                           0.02u  0.00s  0.00g
404          MLRISC ra                               0.25u  0.02s  0.15g
405             (spills=1 reloads=3)
406          TOTAL                                   7.74u  0.34s  1.62g
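
As a rough check on how badly the old spill phase scales, the growth
exponent can be estimated from the figures in the first table. The Python
sketch below uses only the "CPS 100 spill" user times listed there; fitting
a power law between two data points is a simplification chosen for this
illustration.

```python
import math

# CPS spill times (user seconds) copied from the old-compiler table.
old_spill_seconds = {1100: 8.25, 1500: 58.55, 2000: 126.69,
                     3000: 675.59, 4000: 2362.82}

def growth_exponent(times, n1, n2):
    """Exponent k such that time ~ N**k between sizes n1 and n2."""
    return math.log(times[n2] / times[n1]) / math.log(n2 / n1)

k = growth_exponent(old_spill_seconds, 2000, 4000)
print(f"old spill phase: time ~ N^{k:.2f} between N=2000 and N=4000")
```

Doubling N from 2000 to 4000 gives an exponent a little above 4, far from
the O(n log n) behavior claimed for the new module.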
407    
408    Implementation details:
409    
410    As far as I can tell, the purpose of the CPS spill module is to make sure the
411    number of live variables at any program point (the bandwidth)
412    does not exceed a certain limit, which is determined by the
413    size of the spill area.
414    
415    When the bandwidth is too large, we decrease the register pressure by
416    packing live variables into spill records.  How we achieve this is
417    completely different than what we did in the old code.
418    
419    First, there is something about the MLRiscGen code generator
420    that we should be aware of:
421    
422    o MLRiscGen performs code motion!
423    
424       In particular, it will move floating point computations and
425       address computations involving only the heap pointer to
426       their use sites (if there is only a single use).
427       What this means is that if we have a CPS record construction
428       statement
429    
430           RECORD(k,vl,w,e)
431    
432       we should never count the new record address w as live if w
433       has only one use (which is often the case).
434    
435       We should do something similar to floating point, but the transformation
436       there is much more complex, so I won't deal with that.
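
The counting rule can be sketched in Python for a straight-line statement
sequence. The real phase works on CPS trees; the statement tuples and the
exempt_single_use_records flag are assumptions introduced for this example.

```python
from collections import Counter

def bandwidth(stmts, live_out=(), exempt_single_use_records=True):
    """Maximum number of simultaneously live variables.

    stmts: list of (dst, uses, is_record) in program order.
    """
    use_count = Counter(u for _, uses, _ in stmts for u in uses)
    use_count.update(live_out)
    free = set()
    if exempt_single_use_records:
        # Record addresses with a single use are rematerialized at the use
        # site, so they are never counted as live.
        free = {dst for dst, _, is_rec in stmts
                if is_rec and use_count[dst] == 1}
    live = set(live_out)
    widest = len(live - free)
    for dst, uses, _ in reversed(stmts):   # backward liveness scan
        live.discard(dst)
        live.update(uses)
        widest = max(widest, len(live - free))
    return widest

stmts = [
    ("a", [], False),
    ("w", ["a"], True),             # w = RECORD(..., [a], ...)
    ("x", [], False),
    ("y", [], False),
    ("r", ["w", "x", "y"], False),  # the only use of w
]
print(bandwidth(stmts, live_out=("r",)))
print(bandwidth(stmts, live_out=("r",), exempt_single_use_records=False))
```

With the exemption the widest point holds two counted variables (x and y);
without it, w would be counted as a third.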
437    
438    Secondly, there are now two new cps primops at our disposal:
439    
440     1. rawrecord of record_kind option
441        This pure operator allocates some uninitialized storage from the heap.
442        There are two forms:
443    
444         rawrecord NONE [INT n]  allocates a tagless record of length n
445         rawrecord (SOME rk) [INT n] allocates a tagged record of length n
446                                     and initializes the tag.
447    
448     2. rawupdate of cty
449          rawupdate cty (v,i,x)
450          Assigns to x to the ith component of record v.
451          The storelist is not updated.
452    
453    We use these new primops for both spilling and increment record construction.
454    
455     1. Spilling.
456    
457        This is implemented with a linear scan algorithm (but generalized
458        to trees).  The algorithm will create a single spill record at the
459        beginning of the cps function and use rawupdate to spill to it,
460        and SELECT or SELp to reload from it.  So both spills and reloads
461        are fine-grain operations.  In contrast, in the old algorithm
462        "spills" have to be bundled together in records.
463    
464        Ideally, we should sink the spill record construction to where
465        it is needed.  We can even split the spill record into multiple ones
466        at the places where they are needed.  But CPS is not a good
467        representation for global code motion, so I'll keep it simple and
468        am not attempting this.
469    
470     2. Incremental record construction (aka record splitting).
471    
472        Long records with many component values which are simultaneously live
473        (recall that single use record addresses are not considered to
474         be live) are constructed with rawrecord and rawupdate.
475        We allocate space on the heap with rawrecord first, then gradually
476        fill it in with rawupdate.  This is the technique suggested to me
477        by Matthias.
478    
479        Some restrictions on when this is applicable:
480        1. It is not a VECTOR record.  The code generator currently does not handle
481           this case. VECTOR record uses double indirection like arrays.
482        2. All the record component values are defined in the same "basic block"
483           as the record constructor.  This is to prevent speculative
484           record construction.
485    
486    ----------------------------------------------------------------------
487    Name: Allen Leung
488    Date: 2002/02/22 01:02:00 EST
489    Tag: leunga-20020222-mlrisc-tools
490    
491    Minor bug fixes in the parser and rewriter
492    
493    ----------------------------------------------------------------------
494    Name: Allen Leung
495    Date: 2002/02/21 20:20:00 EST
496    Tag: leunga-20020221-peephole
497    
498    Regenerated the peephole files.  Some contained typos in the specification
499    and some didn't compile because of pretty printing bugs in the old version
500    of 'nowhere'.
501    
502    ----------------------------------------------------------------------
503    Name: Allen Leung
504    Date: 2002/02/19 20:20:00 EST
505    Tag: leunga-20020219-mlrisc-tools
506    Description:
507    
508       Minor bug fixes to the mlrisc-tools library:
509    
510       1.  Fixed up parsing colon suffixed keywords
511       2.  Added the ability to shut the error messages up
512       3.  Reimplemented the pretty printer and fixed up/improved
513           the pretty printing of handle and -> types.
514       4.  Fixed up generation of literal symbols in the nowhere tool.
515       5.  Added some SML keywords to sml.sty
516    
517    ----------------------------------------------------------------------
518    Name: Matthias Blume
519    Date: 2002/02/19 16:20:00 EST
520    Tag: blume-20020219-cmffi
521    Description:
522    
523    A wild mix of changes, some minor, some major:
524    
525    * All C FFI-related libraries are now anchored under $c:
526        $/c.cm      --> $c/c.cm
527        $/c-int.cm  --> $c/internals/c-int.cm
528        $/memory.cm --> $c/memory/memory.cm
529    
530    * "make" tool (in CM) now treats its argument pathname slightly
531      differently:
532        1. If the native expansion is an absolute name, then before invoking
533           the "make" command on it, CM will apply OS.Path.mkRelative
534           (with relativeTo = OS.FileSys.getDir()) to it.
535        2. The argument will be passed through to subsequent phases of CM
536           processing without "going native".  In particular, if the argument
537           was an anchored path, then "make" will not lose track of that anchor.
538    
539    * Compiler backends now "know" their respective C calling conventions
540      instead of having to be told about it by ml-nlffigen.  This relieves
541      ml-nlffigen from one of its burdens.
542    
543    * The X86Backend has been split into X86CCallBackend and X86StdCallBackend.
544    
545    * Export C_DEBUG and C_Debug from $c/c.cm.
546    
547    * C type encoding in ml-nlffi-lib has been improved to model the conceptual
548      subtyping relationship between incomplete pointers and their complete
549      counterparts.  For this, ('t, 'c) ptr has been changed to 'o ptr --
550      with the convention of instantiating 'o with ('t, 'c) obj whenever
551      the pointer target type is complete.  In the incomplete case, 'o
552      will be instantiated with some "'c iobj" -- a type obtained by
553      using one of the functors PointerToIncompleteType or PointerToCompleteType.
554    
555      Operations that work on both incomplete and complete pointer types are
556      typed as taking an 'o ptr while operations that require the target to
557      be known are typed as taking some ('t, 'c) obj ptr.
558    
559      voidptr is now a bit "more concrete", namely "type voidptr = void ptr'"
560      where void is an eqtype without any values.  This makes it possible
561      to work on voidptr values using functions meant to operate on light
562      incomplete pointers.
563    
564    * As a result of the above, signature POINTER_TO_INCOMPLETE_TYPE has
565      been vastly simplified.
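
For the "make" tool change listed above (its item 1), the path handling can
be sketched in Python, with posixpath.relpath playing the role of
OS.Path.mkRelative. The function name and the sample directories are
hypothetical.

```python
import posixpath

def make_argument(native_path, cwd):
    """Relativize an absolute native path against the working directory."""
    if posixpath.isabs(native_path):
        return posixpath.relpath(native_path, start=cwd)
    return native_path   # already relative: pass through unchanged

cwd = "/home/user/project"   # stands in for OS.FileSys.getDir()
print(make_argument("/home/user/project/tools/gen", cwd))
print(make_argument("/home/user/other", cwd))
print(make_argument("tools/gen", cwd))
```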
566    
567    ----------------------------------------------------------------------
568    Name: Matthias Blume
569    Date: 2002/02/19 10:48:00 EST
570    Tag: blume-20020219-pqfix
571    Description:
572    
573    Applied Chris Okasaki's bug fix for priority queues.
574    
575    ----------------------------------------------------------------------
576    Name: Matthias Blume
577    Date: 2002/02/15 17:05:00
578    Tag: Release_110_39
579    Description:
580    
581    Last-minute retagging is becoming a tradition... :-(
582    
583    This is the working release 110.39.
584    
585    ----------------------------------------------------------------------
586    Name: Matthias Blume
587    Date: 2002/02/15 16:00:00 EST
588    Tag: Release_110_39-orig
589    Description:
590    
591    Working release 110.39.  New bootfiles.
592    
593    (Update: There was a small bug in the installer so it wouldn't work
594    with all shells.  So I retagged. -Matthias)
595    
596    ----------------------------------------------------------------------
597    Name: Matthias Blume
598    Date: 2002/02/15 14:17:00 EST
599    Tag: blume-20020215-showbindings
600    Description:
601    
602    Added EnvRef.listBoundSymbols and CM.State.showBindings.  Especially
603    the latter can be useful for exploring what bindings are available at
604    the interactive prompt.  (The first function returns only the list
605    of symbols that are really bound, the second prints those but also the
606    ones that CM's autoloading mechanism knows about.)
607    
608    ----------------------------------------------------------------------
609    Name: Matthias Blume
610    Date: 2002/02/15 12:08:00 EST
611    Tag: blume-20020215-iptrs
612    Description:
613    
614    Two improvements to ml-nlffigen:
615    
616      1. Write files only if they do not exist or if their current contents
617         do not coincide with what's being written.  (That is, avoid messing
618         with the time stamps unless absolutely necessary.)
619    
620      2. Implement a "repository" mechanism for generated files related
621         to "incomplete pointer types".   See the README file for details.
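
The first improvement is a common pattern for code generators. A minimal
Python sketch, assuming text output and a hypothetical file name (this is
not ml-nlffigen's actual code):

```python
import os
import tempfile

def write_if_changed(path, contents):
    """Write contents to path unless the file already holds exactly that.

    Returns True if the file was (re)written, False if it was left alone,
    in which case its time stamp is untouched.
    """
    try:
        with open(path, "r", encoding="utf-8") as f:
            if f.read() == contents:
                return False
    except FileNotFoundError:
        pass
    with open(path, "w", encoding="utf-8") as f:
        f.write(contents)
    return True

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "generated.sml")
    results = [
        write_if_changed(target, "val x = 1\n"),   # file does not exist yet
        write_if_changed(target, "val x = 1\n"),   # identical: skipped
        write_if_changed(target, "val x = 2\n"),   # differs: rewritten
    ]
print(results)
```

Leaving unchanged files alone keeps time-stamp-based build tools from
recompiling everything after each regeneration.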
622    
623  ----------------------------------------------------------------------
624  Name: Matthias Blume
