Diff of /sml/trunk/HISTORY

Changes from revision 572 (Thu Mar 9 02:43:06 2000 UTC) to revision 591 (Mon Apr 3 01:19:20 2000 UTC)
# Line 11
11  Date:
12  Tag: <post-commit CVS tag>
13  Description:
14    ----------------------------------------------------------------------
15    Name: Allen Leung
16    Date: 2000/04/02 21:17:00 EST
17    Tag: leunga-20000402-mltree
18    Description:
19    
20       1. Renamed the constructor CALL in MLTREE by popular demand.
21       2. Added a bunch of files from my repository.  These are currently
22          used by other non-SMLNJ backends.
23    
24    ----------------------------------------------------------------------
25    Name: Allen Leung
26    Date: 2000/03/31 21:15:00 EST
27    Tag: leunga-20000331-aliasing
28    Description:
29    
30    This update contains a rewritten (and hopefully more correct) module
31    for extracting aliasing information from CPS.
32    
33       To turn on this feature:
34    
35            Compiler.Control.CG.memDisambiguate := true
36    
37       To pretty print the region information with assembly
38    
39           Compiler.Control.MLRISC.getFlag "asm-show-region" := true;
40    
41       To control how many levels of aliasing information are printed, use:
42    
43           Compiler.Control.MLRISC.getInt "points-to-show-level" := n
44    
45       The default of n is 3.
46    
47    ----------------------------------------------------------------------
48    Name: David MacQueen
49    Date: 2000/03/31 11:15:00 EST
50    Tag: dbm-20000331-runtime_fix
51    Description:
52    
53    This update contains:
54    
55    1. runtime/c-lib/c-libraries.c
56       includes added in revision 1.2 caused compilation errors on hppa-hpux
57    
58    2. fix for bug 1556
59       system/Basis/Implementation/NJ/internal-signals.sml
60    
61    ----------------------------------------------------------------------
62    Name: Matthias Blume
63    Date: 2000/03/31 18:00:00 JST
64    Tag: blume_main_v110p26p2_1
65    Description:
66    
67    This update contains:
68    
69    1. A small change to CM's handling of stable libraries:
70       CM now maintains one "global" modmap that is used for all stable
71       libraries.  The use of such a global modmap maximizes sharing and
72       minimizes the need for re-traversing parts of environments during
73       modmap construction.  (However, this has minor impact since modmap
74       construction seems to account for just one percent or less of total
75       compile time.)
76    
77    2. I added a "genmap" phase to the statistics.  This is where I got the
78       "one percent" number (see above).
79    
80    3. CM's new tool parameter mechanism just became _even_ better. :)
81       - The parser understands named parameters and recursive options.
82       - The "make" and "shell" tools use these new features.
83         (This makes it a lot easier to cascade these tools.)
84       - There is a small syntax change: named parameters use a
85    
86           <name> : ( <option> ... )            or
87           <name> : <string>
88    
89         syntax.  Previously, named parameters were implemented in an
90         ad-hoc fashion by each tool individually (by parsing strings)
91         and had the form
92    
93           <name>=<string>
94    
95       See the CM manual for a full description of these issues.
96    
97    ----------------------------------------------------------------------
98    Name: Matthias Blume
99    Date: 2000/03/30 18:00:00 JST
100    Tag: blume_main_v110p26p2_0
101    Description:
102    
103    !!!!! WARNING !!!!!!
104    !!  New binfiles  !!
105    !!!!!!!!!!!!!!!!!!!!
106    
107    This update contains:
108    
109    1. Moderate changes to CM:
110    
111       - Changes to CM's tools mechanism.  In particular, it is now possible
112       to have tools that accept additional "command line" parameters
113       (specified in the .cm file at each instance where the tool's class is
114       used).
115    
116       This was done to accommodate the new "make" and "shell" tools which
117       facilitate fairly seamless hookup to portions of code managed using
118       Makefiles or Shell scripts.
119    
120       There are no classes "shared" or "private" anymore.  Instead, the
121       sharing annotation is now a parameter to the "sml" class.
122    
123       There is a bit of generic machinery for implementing one's own
124       tools that accept command-line parameters.  However, I am not yet fully
125       satisfied with that part, so expect changes here in the future.
126    
127       All existing tools are described in the CM manual.
128    
129       - Slightly better error handling.  (CM now suppresses many followup
130       error messages that tended to be more annoying than helpful.)
131    
132    2. Major changes to the compiler's static environment data structures.
133    
134       - no CMStaticEnv anymore.
135            - no CMEnv, no "BareEnvironment" (actually, _only_ BareEnvironment,
136              but it is called Environment), no conversions between different
137              kinds of static environments
138    
139       - There is still a notion of a "modmap", but such modmaps are generated
140         on demand at the time when they are needed.  This sounds slow, but I
141         sped up the code that generates modmaps enough for this not to lead to
142         a slowdown of the compiler (at least I didn't detect any).
143    
144       - To facilitate rapid modmap generation, static environments now
145         contain an (optional) "modtree" structure.  Modtree annotations are
146         constructed by the unpickler during unpickling.  (This means that
147         the elaborator does not have to worry about modtrees at all.)
148         Modtrees have the advantage that they are compositional in the same
149         way as the environment data structure itself is compositional.
150         As a result, modtrees never hang on to parts of an environment that
151         has already been rendered "stale" by filtering or rebinding.
152    
153       - I went through many, many trials and errors before arriving at the
154         current solution.  (The initial idea of "linkpaths" did not work.)
155         But the result of all this is that I have touched a lot of files that
156         depend on the "modules" and "types" data structures (most of the
157         elaborator). There were a lot of changes during my "linkpath" trials
158         that could have been reverted to their original state but weren't.
159         Please, don't be too harsh on me for messing with this code a bit more
160         than what was strictly necessary...  (I _did_ resist the temptation
161         of doing any "global reformatting" to avoid an untimely death at
162         Dave's hands. :)
163    
164       - One positive aspect of the previous point:  At least I made sure that
165         all files that I touched now compile without warnings (other than
166         "polyEqual").
167    
168       - compiler now tends to run "leaner" (i.e., ties up less memory in
169         redundant modmaps)
170    
171    ----------------------------------------------------------------------
172    Name: Allen Leung
173    Date: 2000/03/29 18:00:00
174    Tag: leunga-20000327-mlriscGen_hppa_alpha_x86
175    Boot files (optional): ftp://react-ilp.cs.nyu.edu/leunga/110.26.1-sml.boot.x86-unix-20000330.tar.gz
176    Description:
177    
178       This update contains *MAJOR* changes to the way code is generated from CPS
179    in the module mlriscGen, and in various backend modules.
180    
181    CHANGES
182    =======
183    
184    1. MLRiscGen: forward propagation fix.
185    
186       There was a bug in forward propagation introduced at about the same time
187       as the MLRISC x86 backend, which prevented coalescing from being
188       performed effectively in loops.
189    
190       Effect: speed up of loops in RISC architectures.
191               By itself, this actually slowed down certain benchmarks on the x86.
192    
193    2. MLRiscGen:  forward propagating addresses from consing.
194    
195       I've changed the way consing code is generated.  Basically I separated
196       out the initialization part:
197    
198            store tag,   offset(allocptr)
199            store elem1, offset+4(allocptr)
200            store elem2, offset+8(allocptr)
201            ...
202            store elemn, offset+4n(allocptr)
203    
204       and the address computation part:
205    
206            celladdr <- offset+4+allocptr
207    
208       and now forward propagate the address computation part.
209    
210       Effect:  register pressure is generally lower as a result.  This
211                makes compilation of certain expressions much faster, such as
212                long lists with non-trivial elements.
213    
214                 [(0,0), (0,0), .... (0,0)]
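
The split described in item 2 can be illustrated with a small Python sketch. This is a hypothetical model, not the MLRiscGen code: the helper `cons_sequence` and its pseudo-assembly output are invented for illustration, and a 4-byte word size is assumed from the `offset+4` pattern in the listing. It returns the initialization stores separately from the address computation, which is what allows the latter to be emitted at a later point.

```python
def cons_sequence(tag, elems, offset):
    """Model the two parts of the consing code as pseudo-assembly strings.

    Hypothetical helper: names and output format are illustrative only.
    Assumes 4-byte words, as in the listing above.
    """
    # Initialization part: the tag word, then one store per element.
    stores = [f"store {tag}, {offset}(allocptr)"]
    for i, elem in enumerate(elems, start=1):
        stores.append(f"store {elem}, {offset + 4 * i}(allocptr)")
    # Address computation part: the cell address points just past the tag.
    address = f"celladdr <- {offset + 4}+allocptr"
    return stores, address


stores, address = cons_sequence("tag", ["elem1", "elem2"], 0)
print("\n".join(stores))
print(address)
```

Because the two parts are returned separately, a caller is free to place the address computation wherever the cell address is first needed.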
215    
216    3. MLRiscGen: base pointer elimination.
217    
218        As part of the linkage mechanism, we generate the sequence:
219    
220         L:  ...  <- start of the code fragment
221    
222         L1:
223             base pointer <- linkreg - L1 + L
224    
225         The base pointer was then used for computing relocatable addresses
226       in the code fragment.  Frequently (such as in lots of continuations)
227       this is not needed.  We now eliminate this sequence whenever possible.
228    
229         For compile time efficiency, I'm using a very stupid local heuristic.
230       But in general, this should be done as a control flow analysis.
231    
232       Effect:  Smaller code size.  Speed up of most programs.
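
As a worked example of the sequence in item 3, the following Python sketch evaluates `base pointer <- linkreg - L1 + L` on plain integers. It assumes (this is not stated above) that linkreg holds the run-time address of L1 and that L and L1 are offsets within the code fragment; the numeric values are arbitrary.

```python
def base_pointer(linkreg, label_l1, label_l):
    """base pointer <- linkreg - L1 + L, modelled on plain integers."""
    return linkreg - label_l1 + label_l


# Assumed example values: an arbitrary load address and two label offsets.
load_address = 0x40000
label_l, label_l1 = 0x100, 0x120

# Assumption: at L1 the link register holds the run-time address of L1.
linkreg = load_address + label_l1

# Under that assumption the base pointer is the run-time address of L.
print(hex(base_pointer(linkreg, label_l1, label_l)))
```

Under these assumptions the result is the run-time address of the start of the code fragment, which is why the sequence is only needed when relocatable addresses are actually computed.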
233    
234    4. Hppa back end
235    
236         Long jumps in span dependence resolution used to depend on the existence
237      of the base pointer.
238    
239         A jump to a long label L was expanded into the following sequence:
240    
241          LDIL %hi(L-8192), %r29
242          LDO  %lo(L-8192)(%r29), %r29
243          ADD  %r29, baseptr, %r29
244          BV,n %r0(%r29)
245    
246         In the presence of change (3) above, this will not work.  I've changed
247       it so that the following sequence of instructions is generated, which
248       doesn't mention the base pointer at all:
249    
250             BL,n  L', %r29           /* branch and link, L' + 4 -> %r29 */
251        L':  ADDIL L-(L'+4), %r29     /* Compute address of L */
252             BV,n  %r0(%r29)          /* Jump */
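
The reason the new sequence needs no base pointer can be sketched in Python. The model below follows only the comments in the listing above (BL leaves L' + 4 in %r29, and ADDIL adds the constant L-(L'+4) to it); it is not a model of the actual PA-RISC instruction semantics, and the function name and values are hypothetical.

```python
def long_jump_target(load_address, off_l, off_lprime):
    """Follow the three-instruction sequence using integer arithmetic.

    load_address is where the code happens to be loaded; off_l and
    off_lprime are the offsets of L and L' within the code.
    """
    # BL,n L', %r29: per the listing's comment, %r29 <- address of L' + 4.
    r29 = load_address + off_lprime + 4
    # ADDIL L-(L'+4), %r29: add a constant that does not depend on
    # load_address, so it can be fixed when the code is generated.
    r29 += off_l - (off_lprime + 4)
    # BV,n %r0(%r29): jump to the address now held in %r29.
    return r29


# The result is the run-time address of L for any load address.
for base in (0x10000, 0x7F000):
    print(hex(long_jump_target(base, 0x9000, 0x40)))
```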
253    
254    5. Alpha back end
255    
256          New alpha instructions LDB/LDW have been added, as per Fermin's
257       suggestions.   This is unrelated to all other changes.
258    
259    6. X86 back end
260    
261         I've changed andl to testl in the floating point test sequence
262         whenever appropriate.  The Intel optimization guide states that
263         testl is preferable to andl.
264    
265    7. RA (x86 only)
266    
267         I've improved the spill propagation algorithm, using an approximation
268       of maximal weighted independent sets.   This seems to be necessary to
269       alleviate the slowdown on the x86 introduced by (1).
270    
271         I'll write down the algorithm one of these days.
272    
273    8. MLRiscGen: frequencies
274    
275         I've added an annotation that states that all call gc blocks have zero
276       execution frequencies.  This improves register allocation on the x86.
277    
278    BENCHMARKS
279    ==========
280    
281       I've only performed the comparison against 110.25.
282    
283       The platforms are:
284    
285        HPPA  A four processor HP machine (E9000) with 5G of memory.
286        X86   A 300MHz Pentium II with 128M of memory, and
287        SPARC An Ultra sparc 2 with 512M of memory.
288    
289       I used the following parameters for the SML benchmarks:
290    
291                 @SMLalloc
292         HPPA    256k
293         SPARC   512k
294         X86     256k
295    
296    COMPILATION TIME
297    ----------------
298       Here are the numbers comparing the compilation times of the compilers.
299       I've only compared 110.25 compiling the new sources versus
300       a fixpoint version of the new compiler compiling the same.
301    
302                     110.25                                  New
303               Total  Time in RA  Spill+Reload   Total  Time In RA Spill+Reload
304         HPPA   627s    116s        2684+3584     599s    95s       1003+1879
305         SPARC  892s    173s        2891+3870     708s    116s      1004+1880
306         X86    999s    315s       94006+130691   987s    296s    108877+141957
307    
308                   110.25         New
309                Code Size      Code Size
310         HPPA   8596736         8561421
311         SPARC  8974299         8785143
312         X86    9029180         8716783
313    
314       So in summary, things are at least as good as before.   A dramatic
315       reduction in compilation time is obtained on the Sparc; I can't explain it,
316       but it is reproducible.  Perhaps someone should try to reproduce this
317       on their own machines.
318    
319    SML BENCHMARKS
320    --------------
321    
322        On the average, all benchmarks perform at least as well as before.
323    
324          HPPA         Compilation Time     Spill+Reload      Run Time
325                     110.25  New            110.25    New   110.25  New
326    
327          barnesHut  3.158  3.015  4.75%    1+1       0+0   2.980  2.922   2.00%
328              boyer  6.152  5.708  7.77%    0+0       0+0   0.218  0.213   2.34%
329       count-graphs  1.168  1.120  4.32%    0+0       0+0  22.705 23.073  -1.60%
330                fft  0.877  0.792 10.74%    1+3       1+3   0.602  0.587   2.56%
331        knuthBendix  3.180  2.857 11.32%    0+0       0+0   0.675  0.662   2.02%
332             lexgen  6.190  5.290 17.01%    0+0       0+0   0.913  0.788  15.86%
333               life  0.803  0.703 14.22%   25+25      0+0   0.153  0.140   9.52%
334              logic  2.048  2.007  2.08%    6+6       1+1   4.133  4.008   3.12%
335         mandelbrot  0.077  0.080 -4.17%    0+0       0+0   0.765  0.712   7.49%
336             mlyacc 22.932 20.937  9.53%  154+181    32+57  0.468  0.430   8.91%
337            nucleic  5.183  5.060  2.44%    2+2       0+0   0.125  0.120   4.17%
338      ratio-regions  3.357  3.142  6.84%    0+0       0+0  116.225 113.173 2.70%
339                ray  1.283  1.290 -0.52%    0+0       0+0   2.887  2.855   1.11%
340             simple  6.307  6.032  4.56%   28+30      5+7   3.705  3.658   1.28%
341                tsp  0.888  0.862  3.09%    0+0       0+0   7.040  6.893   2.13%
342               vliw 24.378 23.455  3.94%  106+127    25+45  2.758  2.707   1.91%
343      --------------------------------------------------------------------------
344       Average                     6.12%                                   4.09%
345    
346          SPARC        Compilation Time     Spill+Reload      Run Time
347                     110.25  New            110.25    New   110.25  New
348    
349          barnesHut  3.778  3.592  5.20%    2+2       0+0   3.648  3.453    5.65%
350              boyer  6.632  6.110  8.54%    0+0       0+0   0.258  0.242    6.90%
351       count-graphs  1.435  1.325  8.30%    0+0       0+0  33.672 34.737   -3.07%
352                fft  0.980  0.940  4.26%    3+9       2+6   0.838  0.827    1.41%
353        knuthBendix  3.590  3.138 14.39%    0+0       0+0   0.962  0.967   -0.52%
354             lexgen  6.593  6.072  8.59%    1+1       0+0   1.077  1.078   -0.15%
355               life  0.972  0.868 11.90%   26+26      0+0   0.143  0.140    2.38%
356              logic  2.525  2.387  5.80%    7+7       1+1   5.625  5.158    9.05%
357         mandelbrot  0.090  0.093 -3.57%    0+0       0+0   0.855  0.728   17.39%
358             mlyacc 26.732 23.827 12.19%  162+189    32+57  0.550  0.560   -1.79%
359            nucleic  6.233  6.197  0.59%    3+3       0+0   0.163  0.173   -5.77%
360      ratio-regions  3.780  3.507  7.79%    0+0       0+0 133.993 131.035   2.26%
361                ray  1.595  1.550  2.90%    1+1       0+0   3.440  3.418    0.63%
362             simple  6.972  6.487  7.48%   29+32      5+7   3.523  3.525   -0.05%
363                tsp  1.115  1.063  4.86%    0+0       0+0   7.393  7.265    1.77%
364               vliw 27.765 24.818 11.87%  110+135    25+45  2.265  2.135    6.09%
365      ----------------------------------------------------------------------------
366       Average                     6.94%                                    2.64%
367    
368          X86          Compilation Time     Spill+Reload      Run Time
369                     110.25  New            110.25    New   110.25  New
370    
371          barnesHut  5.530  5.420  2.03%  593+893   597+915   3.532  3.440   2.66%
372              boyer  8.768  7.747 13.19%  493+199   301+289   0.327  0.297  10.11%
373       count-graphs  2.040  2.010  1.49%  298+394   315+457  26.578 28.660  -7.26%
374                fft  1.327  1.302  1.92%  112+209   115+210   1.055  0.962   9.71%
375        knuthBendix  5.218  5.475 -4.69%  451+598   510+650   0.928  0.932  -0.36%
376             lexgen  9.970  9.623  3.60% 1014+841  1157+885   0.947  0.928   1.97%
377               life  1.183  1.183  0.00%  162+182   145+148   0.127  0.103  22.58%
378              logic  3.285  3.512 -6.45%  514+684   591+836   5.682  5.577   1.88%
379         mandelbrot  0.147  0.143  2.33%   38+41     33+54    0.703  0.690   1.93%
380             mlyacc 35.457 32.763  8.22% 3496+4564 3611+4860  0.552  0.550   0.30%
381            nucleic  7.100  6.888  3.07%  239+168   201+158   0.175  0.173   0.96%
382      ratio-regions  6.388  6.843 -6.65% 1182+257   981+300  120.142 120.345 -0.17%
383                ray  2.332  2.338 -0.29%  346+398   402+494   3.593  3.540   1.51%
384             simple  9.912  9.903  0.08% 1475+941  1579+1168  3.057  3.178  -3.83%
385                tsp  1.623  1.532  5.98%  266+200   250+211   8.045  7.878   2.12%
386               vliw 33.947 35.470 -4.29% 2629+2774 2877+3171  2.072  1.890   9.61%
387      ----------------------------------------------------------------------------
388       Average                     1.22%                                     3.36%
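
The percentage columns in the three tables above appear to be consistent with (old / new - 1) * 100, and the "Average" rows with the arithmetic mean of the per-benchmark percentages. This is inferred from the numbers and is not stated in the text; rows with very small times deviate slightly, presumably because the tables print rounded times. A minimal Python sketch of that reading, with hypothetical function names:

```python
def improvement_pct(old, new):
    """Percentage by which `old` exceeds `new`; positive means New is faster."""
    return (old / new - 1.0) * 100.0


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# HPPA lexgen, compilation time: 6.190 (110.25) vs 5.290 (New).
print(round(improvement_pct(6.190, 5.290), 2))

# HPPA compilation-time percentages, copied from the table above.
hppa_compile_pct = [4.75, 7.77, 4.32, 10.74, 11.32, 17.01, 14.22, 2.08,
                    -4.17, 9.53, 2.44, 6.84, -0.52, 4.56, 3.09, 3.94]
print(round(mean(hppa_compile_pct), 2))
```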
389    
390    ----------------------------------------------------------------------
391    Name: Allen Leung
392    Date: 2000/03/23 16:25:00
393    Tag: leunga-20000323-fix_x86_alpha
394    Description:
395    
396    1. X86 fixes/changes
397    
398       a.  The old code generated for SETcc was completely wrong.
399           The Intel optimization guide is VERY misleading.
400    
401    2. ALPHA fixes/changes
402    
403       a.  Added the instructions LDBU, LDWU, STB, STW as per Fermin's suggestion.
404       b.  Added a new mode byteWordLoadStores to the functor parameter to Alpha()
405       c.  Added reassociation code for address computation.
406    
407    ----------------------------------------------------------------------
408    Name: Allen Leung
409    Date: 2000/03/22 01:23:00
410    Tag: leunga-20000322-fix_x86_hppa_ra
411    Description:
412    
413    1. X86 fixes/changes
414    
415       a.  x86Rewrite bug with MUL3 (found by Lal)
416       b.  Added the instructions FSTS, FSTL
417    
418    2. PA-RISC fixes/changes
419    
420       a.  B label should not be a delay slot candidate!  Why did this work?
421       b.  ADDT(32, REG(32, r), LI n) now generates one instruction instead of two,
422           as it should be.
423       c.  The assembly syntax for fstds and fstdd was wrong.
424       d.  Added the composite instruction COMICLR/LDO, which is the immediate
425           operand variant of COMCLR/LDO.
426    
427    3. Generic MLRISC
428    
429       a.  shuffle.sml rewritten to be slightly more efficient
430       b.  DIV bug in mltree-simplify fixed (found by Fermin)
431    
432    4. Register Allocator
433    
434       a.  I now release the interference graph earlier during spilling.
435           May improve memory usage.
436    
437    ----------------------------------------------------------------------
438    Name: Matthias Blume
439    Date: 2000/03/14 14:15:32
440    Tag: blume_main_v110p26p1_2
441    Description:
442    
443    1. Tools.registerStdShellCmdTool (from smlnj/cm/tool.cm) takes an
444    additional argument called "template" which is an optional string that
445    specifies the layout of the tool command line.  See the CM manual for
446    explanation.
447    
448    2. A special-purpose tool can be "registered" by simply dropping the
449    corresponding <...>-tool.cm (and/or <...>-ext.cm) into the same
450    directory where the .cm file lives that uses this tool.  (The
451    behavior/misfeature until now was to look for the tool description
452    files in the current working directory.)  As before, tool description
453    files could also be anchored -- in which case they can live anywhere
454    they like.  Following the recent e-mail discussion, this change should
455    make it easier to have special-purpose tools that are shipped together
456    with the sources of the program that uses them.
457    
458    ----------------------------------------------------------------------
459    Name: Matthias Blume
460    Date: 2000/03/10 07:48:34
461    Tag: blume_main_v110p26p1_1
462    Description:
463    
464    I added a re-written version of Dave's fixpt script to src/system.
465    Changes relative to the original version:
466      - sh-ified (not everybody has ksh)
467      - automatically figures out which architecture it runs on
468      - uses ./makeml a bit more cleverly
469      - never invokes ./installml (and, thus, does not clobber your
470        good and working installation of sml in case something goes wrong)
471      - accepts max iteration count using option "-iter <n>"
472      - accepts a "base" name using option "-base <base>"
473    
474    It does not build any extraneous heap images but directly rebuilds
475    bin- and boot-hierarchies using makeml's "-rebuild" switch. Finally,
476    it can incorporate existing bin- and boot- hierarchies.  For example,
477    suppose the base is set to "sml" (which is the default).  Then it
478    successively builds
479    
480            sml.bin.<arch>-unix and sml.boot.<arch>-unix
481    then    sml1.bin.<arch>-unix and sml1.boot.<arch>-unix
482    then    sml2.bin.<arch>-unix and sml2.boot.<arch>-unix
483    ...
484    then    sml<n>.bin.<arch>-unix and sml<n>.boot.<arch>-unix
485    
486    and so on.  If any of these already exist, it will just use what's
487    there.  In particular, many people will have the initial set of bin
488    and boot files around, so this saves time for at least one full
489    rebuild.  Having sets of the form <base><k>.{bin,boot}.<arch>-unix for
490    <k>=1,2,... is normally not a good idea when invoking fixpt.  However,
491    they might be the result of an earlier partial run of fixpt (which
492    perhaps got accidentally killed).  In this case, fixpt will quickly
493    move through what exists before continuing where it left off earlier,
494    and, thus, saves a lot of time.
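
The naming scheme and the resume behavior described above can be modelled with a short Python sketch. It is not the fixpt script itself (which is a sh script); the function names are hypothetical, "x86" is only an example architecture string, and the set of existing directories is passed in explicitly rather than read from disk.

```python
def hierarchy_names(base, arch, iterations):
    """List the (bin, boot) directory names for iterations 0..iterations."""
    names = []
    for k in range(iterations + 1):
        # Iteration 0 uses the bare base name; later ones append the count.
        stem = base if k == 0 else f"{base}{k}"
        names.append((f"{stem}.bin.{arch}-unix", f"{stem}.boot.{arch}-unix"))
    return names


def first_missing(existing, base, arch, iterations):
    """Index of the first iteration whose bin/boot pair is not fully present.

    Returns None when every pair up to `iterations` already exists.
    """
    for k, pair in enumerate(hierarchy_names(base, arch, iterations)):
        if not all(name in existing for name in pair):
            return k
    return None


existing = {"sml.bin.x86-unix", "sml.boot.x86-unix"}
print(hierarchy_names("sml", "x86", 2))
print(first_missing(existing, "sml", "x86", 3))
```

With only the initial bin and boot sets present, the model resumes at iteration 1, which corresponds to the saved full rebuild mentioned above.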
495    
496    ----------------------------------------------------------------------
497    Name: Allen Leung
498    Date: 00/03/10 02:20:00
499    Tag: leunga-20000310-fix_x86_asm_ra
500    Description:
501    
502    More assembly output problems involving the indexed addressing mode
503    on the x86 have been found and corrected. Thanks to Fermin Reig for the
504    fix.
505    
506    The interface and implementation of the register allocator have been changed
507    slightly to make it possible to skip the register allocation
508    phases completely and go directly to memory allocation.  This is needed
509    for C-- use.
510    
511    ----------------------------------------------------------------------
512    Name: Matthias Blume
513    Date: 00/03/09 10:23:53
514    Tag: blume_main_v110p26p1_0
515    Description:
516    
517    * Complete re-organization of library names.  Many libraries have been
518    consolidated so that they share the same path anchor.  For example,
519    all MLRISC-related libraries are anchored at MLRISC, most libraries that
520    are SML/NJ-specific are under "smlnj".  Notice that names like
521    host-cmb.cm or host-compiler.cm no longer exist.  See system/README
522    for a complete description of the new naming scheme.  Quick reference:
523    
524       host-cmb.cm        -> smlnj/cmb.cm
525       host-compiler.cm   -> smlnj/compiler.cm
526       full-cm.cm         -> smlnj/cm.cm
527       <arch>-<os>.cm     -> smlnj/cmb/<arch>-<os>.cm
528       <arch>-compiler.cm -> smlnj/compiler/<arch>.cm
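
For updating old .cm files, the quick reference above can be turned into a small Python sketch. The helper `new_library_name` is hypothetical, and its last rule is a simplification: it treats any name of the form `<x>-<y>.cm` as `<arch>-<os>.cm` without checking that the parts are real architecture or OS names. The full naming scheme is the one described in system/README.

```python
import re

# The three fixed renames from the quick reference.
EXACT = {
    "host-cmb.cm": "smlnj/cmb.cm",
    "host-compiler.cm": "smlnj/compiler.cm",
    "full-cm.cm": "smlnj/cm.cm",
}


def new_library_name(old):
    """Map an old library name to its new anchored name, or None if unknown."""
    if old in EXACT:
        return EXACT[old]
    # <arch>-compiler.cm -> smlnj/compiler/<arch>.cm
    m = re.fullmatch(r"(?P<arch>[^-/]+)-compiler\.cm", old)
    if m:
        return f"smlnj/compiler/{m.group('arch')}.cm"
    # <arch>-<os>.cm -> smlnj/cmb/<arch>-<os>.cm (simplified: any x-y.cm)
    if re.fullmatch(r"[^-/]+-[^-/]+\.cm", old):
        return f"smlnj/cmb/{old}"
    return None


for name in ("host-cmb.cm", "sparc-compiler.cm", "x86-unix.cm"):
    print(name, "->", new_library_name(name))
```

The exact names are checked first because `host-compiler.cm` and `full-cm.cm` would otherwise be caught by the pattern rules.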
529    
530    * Bug fixes in CM.
531        - exceptions in user code are being passed through (i.e., reach top level)
532        - more bugs in paranoia mode fixed
533        - bug related to checking group owners fixed
534    
535    * New install.sh script that automagically fetches archive files:
536      The new file config/srcarchiveurl must contain the URL of the
537      (remote) directory that contains bin files (or other source archives).
538      If install.sh does not find the archive locally, it tries to get
539      it from that remote directory.
540      This should simplify installation further:  For machines that have
541      access to the internet, just fetch <version>-config.tgz, unpack it,
542      edit config/targets, and go (run config/install.sh).  The script will
543      fetch everything else that it might need all by itself.
544    
545      For CVS users, this mechanism is not relevant for source archives, but
546      it is convenient for getting new sets of binfiles.
547    
548      Archives should be tar files compressed with either gzip, compress, or
549      bzip2.  The script recognizes .tgz, .tar, .tar.gz, .tz, .tar.Z, and .tar.bz2.
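
A minimal Python sketch of the suffix check, for illustration only: install.sh is a shell script and its actual logic is not shown here. The helper name is hypothetical, and the suffix list assumes that every entry in the sentence above is meant with a leading dot.

```python
# Suffixes listed above, all assumed to carry a leading dot.
RECOGNIZED_SUFFIXES = (".tgz", ".tar", ".tar.gz", ".tz", ".tar.Z", ".tar.bz2")


def archive_suffix(filename):
    """Return the recognized archive suffix of `filename`, or None."""
    for suffix in RECOGNIZED_SUFFIXES:
        if filename.endswith(suffix):
            return suffix
    return None


for name in ("example.tar.bz2", "example.tgz", "example.zip"):
    print(name, "->", archive_suffix(name))
```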
550    
551  ----------------------------------------------------------------------
552  Name: Matthias Blume
