[smlnj] Diff of /sml/trunk/NOTES/HISTORY

revision 583, Thu Mar 23 21:52:30 2000 UTC revision 615, Wed Apr 12 04:52:09 2000 UTC
# Line 12  Line 12 
12  Tag: <post-commit CVS tag>  Tag: <post-commit CVS tag>
13  Description:  Description:
14  ----------------------------------------------------------------------  ----------------------------------------------------------------------
15    Name: Matthias Blume
16    Date: 2000/04/12 13:52:00 JST
17    Tag: blume_main_v110p27_1
18    Description:
19    
20    Changed install.sh script to handle archive files without version number
21    and to use "boot.<arch>-<os>" instead of "sml.boot.<arch>-<os>" for the
22    name of the boot file archive.
23    
24    ----------------------------------------------------------------------
25    Name: Dave MacQueen
26    Date: 2000/04/09 14:00 EDT
27    Tag: dbm-20000410-Version_110_27
28    Description:
29    
30    1. Updated src/compiler/TopLevel/main/version.sml to version 110.27
31    
32    2. Updated src/config/version to 110.27
33    
34    3. New boot files!
35    
36    ----------------------------------------------------------------------
37    Name: Allen Leung
38    Date: 2000/04/09 19:09:00 EST
39    Tag: leunga-20000409-misc
40    Description:
41    
42    1.  Yet another fix for x86 assembly for idivl, imull, mull and friends.
43    
44    2.  Miscellaneous improvements to MLRISC (unused in sml/nj)
45    
46    ----------------------------------------------------------------------
47    Name: Stefan
48    Date: 2000/04/07 10:00:00 EDT
49    Tag: monnier-20000406-branch-handling
50    Description:
51    
52    Improved handling of branches (mostly those generated from
53    polymorphic equality), removed switchoff and changed the
54    default optimization settings (more cpsopt and less flintopt).
55    
56    ----------------------------------------------------------------------
57    Name: Allen Leung
58    Date: 2000/04/06 01:30:00 EST
59    Tag: leunga-20000406-peephole-x86-SSA-2
60    Description:
61    
62       Forgot a few files.
63    
64    ----------------------------------------------------------------------
65    Name: Allen Leung
66    Date: 2000/04/06 00:36:00 EST
67    Tag: leunga-20000406-peephole-x86-SSA
68    Description:
69    
70    1.  New Peephole code
71    
72    2.  Minor improvement to X86 instruction selection
73    
74    3.  Various fixes to SSA and machine description -> code translator
75    
76    ----------------------------------------------------------------------
77    Name: Matthias Blume
78    Date: 2000/04/05 12:30:00 JST
79    Tag: blume_main_v110p26p2_3
80    Description:
81    
82    This update just merges three minor cosmetic updates to CM's sources
83    to get ready for the 110.27 code freeze on Friday.  No functionality
84    has changed.
85    
86    ----------------------------------------------------------------------
87    Name: Allen Leung
88    Date: 2000/04/04 19:39:00 EST
89    Tag: leunga-20000404-x86-asm
90    Description:
91    
92    1.  Fixed a problem in X86 assembly.
93    
94        Things like
95    
96           jmp %eax
97           jmp (%eax)
98    
99        should be output as
100    
101           jmp *%eax
102           jmp *(%eax)
103    
104    2.  Assembly output
105    
106          Added a new flag
107    
108              "asm-indent-copies" (defaults to false)
109    
110          When this flag is on, parallel copies will be indented an extra level.
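
The indirect-jump fix in item 1 above can be illustrated with a minimal sketch. The function name `format_jmp` and the operand classification rule are assumptions made for illustration; this is not the MLRISC assembly emitter itself.

```python
def format_jmp(operand: str) -> str:
    """Format a jmp instruction in AT&T syntax.

    Assumption: an operand is treated as indirect when it is a register
    (starts with '%') or a memory reference (contains '('). Plain labels
    are direct jumps and are left unchanged.
    """
    op = operand.strip()
    is_indirect = op.startswith("%") or "(" in op
    if is_indirect and not op.startswith("*"):
        op = "*" + op  # indirect targets need the '*' prefix
    return "jmp " + op


if __name__ == "__main__":
    for operand in ["%eax", "(%eax)", "L1"]:
        print(format_jmp(operand))
```

Labels such as `L1` (a hypothetical label name) pass through unchanged, while register and memory operands gain the `*` prefix shown in the entry.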
111    
112    ----------------------------------------------------------------------
113    Name: Allen Leung
114    Date: 2000/04/04 03:18:00 EST
115    Tag: leunga-20000404-C--Moby
116    Description:
117    
118        All of these fixes are related to C--, Moby, and my own optimization
119        stuff; so they shouldn't affect SML/NJ.
120    
121    1.  X86
122    
123        Various fixes related to floating point, and extensions.
124    
125    2.  Alpha
126    
127        Some extra patterns related to loads with signed/zero extension
128        provided by Fermin.
129    
130    3.  Assembly
131    
132        When generating assembly, resolve the value of client defined constants,
133        instead of generating symbolic values.  This is controlled by the
134        new flag "asm-resolve-constants", which defaults to true.
135    
136    4.  Machine Descriptions
137    
138        a. The precedence parser was slightly broken when parsing infixr symbols.
139        b. The type generalizing code had the bound variables reversed, resulting
140           in a problem during arity raising.
141        c. Various fixes in machine descriptions.
142    
143    ----------------------------------------------------------------------
144    Name: Matthias Blume
145    Date: 2000/04/03 16:05:00 JST
146    Tag: blume_main_v110p26p2_2
147    Description:
148    
149    I eliminated coreEnv from compInfo.  Access to the "Core" structure is
150    now done via the ordinary static environment that is context to each
151    compilation unit.
152    
153    To this end, I arranged that "structure _Core" is bound in the
154    pervasive environment instead of "structure Core".  Core access is done via
155    _Core (which can never be accidentally rebound because _Core is not a
156    legal surface-syntax symbol).
157    
158    The current solution is much cleaner because the core environment is
159    now simply part of the pervasive environment which is part of every
160    compilation unit's context anyway.  In particular, this eliminates all
161    special-case handling that was necessary until now in order to deal
162    with dynamic and symbolic parts of the core environment.
163    
164    Remaining hackery (to bind the "magic" symbol _Core) is localized in the
165    compilation manager's bootstrap compiler (actually: in the "init group"
166    handling).  See the comments in src/system/smlnj/init/init.cmi for
167    more details.
168    
169    I also tried to track down all mentions of "Core" (as string argument
170    to Symbol.strSymbol) in the compiler and replaced them with a
171    reference to the new CoreSym.coreSym.  Seems cleaner since the actual
172    name appears in one place only.
173    
174    Binfile and bootfile format have not changed, but the switchover from
175    the old "init.cmi" to the new one is a bit tricky, so I supplied new
176    bootfiles anyway.
177    
178    ----------------------------------------------------------------------
179    Name: Allen Leung
180    Date: 2000/04/02 21:17:00 EST
181    Tag: leunga-20000402-mltree
182    Description:
183    
184       1. Renamed the constructor CALL in MLTREE by popular demand.
185       2. Added a bunch of files from my repository.  These are currently
186          used by other non-SMLNJ backends.
187    
188    ----------------------------------------------------------------------
189    Name: Allen Leung
190    Date: 2000/03/31 21:15:00 EST
191    Tag: leunga-20000331-aliasing
192    Description:
193    
194    This update contains a rewritten (and hopefully more correct) module
195    for extracting aliasing information from CPS.
196    
197       To turn on this feature:
198    
199            Compiler.Control.CG.memDisambiguate := true
200    
201       To pretty print the region information with assembly
202    
203           Compiler.Control.MLRISC.getFlag "asm-show-region" := true;
204    
205       To control how many levels of aliasing information are printed, use:
206    
207           Compiler.Control.MLRISC.getInt "points-to-show-level" := n
208    
209       The default of n is 3.
210    
211    ----------------------------------------------------------------------
212    Name: David MacQueen
213    Date: 2000/03/31 11:15:00 EST
214    Tag: dbm-20000331-runtime_fix
215    Description:
216    
217    This update contains:
218    
219    1. runtime/c-lib/c-libraries.c
220       includes added in revision 1.2 caused compilation errors on hppa-hpux
221    
222    2. fix for bug 1556
223       system/Basis/Implementation/NJ/internal-signals.sml
224    
225    ----------------------------------------------------------------------
226    Name: Matthias Blume
227    Date: 2000/03/31 18:00:00 JST
228    Tag: blume_main_v110p26p2_1
229    Description:
230    
231    This update contains:
232    
233    1. A small change to CM's handling of stable libraries:
234       CM now maintains one "global" modmap that is used for all stable
235       libraries.  The use of such a global modmap maximizes sharing and
236       minimizes the need for re-traversing parts of environments during
237       modmap construction.  (However, this has minor impact since modmap
238       construction seems to account for just one percent or less of total
239       compile time.)
240    
241    2. I added a "genmap" phase to the statistics.  This is where I got the
242       "one percent" number (see above).
243    
244    3. CM's new tool parameter mechanism just became _even_ better. :)
245       - The parser understands named parameters and recursive options.
246       - The "make" and "shell" tools use these new features.
247         (This makes it a lot easier to cascade these tools.)
248       - There is a small syntax change: named parameters use a
249    
250           <name> : ( <option> ... )            or
251           <name> : <string>
252    
253         syntax.  Previously, named parameters were implemented in an
254         ad-hoc fashion by each tool individually (by parsing strings)
255         and had the form
256    
257           <name>=<string>
258    
259       See the CM manual for a full description of these issues.
260    
261    ----------------------------------------------------------------------
262    Name: Matthias Blume
263    Date: 2000/03/30 18:00:00 JST
264    Tag: blume_main_v110p26p2_0
265    Description:
266    
267    !!!!! WARNING !!!!!!
268    !!  New binfiles  !!
269    !!!!!!!!!!!!!!!!!!!!
270    
271    This update contains:
272    
273    1. Moderate changes to CM:
274    
275       - Changes to CM's tools mechanism.  In particular, it is now possible
276       to have tools that accept additional "command line" parameters
277       (specified in the .cm file at each instance where the tool's class is
278       used).
279    
280       This was done to accommodate the new "make" and "shell" tools which
281       facilitate fairly seamless hookup to portions of code managed using
282       Makefiles or Shell scripts.
283    
284       There are no classes "shared" or "private" anymore.  Instead, the
285       sharing annotation is now a parameter to the "sml" class.
286    
287       There is a bit of generic machinery for implementing one's own
288       tools that accept command-line parameters.  However, I am not yet fully
289       satisfied with that part, so expect changes here in the future.
290    
291       All existing tools are described in the CM manual.
292    
293       - Slightly better error handling.  (CM now suppresses many followup
294       error messages that tended to be more annoying than helpful.)
295    
296    2. Major changes to the compiler's static environment data structures.
297    
298       - no CMStaticEnv anymore.
299            - no CMEnv, no "BareEnvironment" (actually, _only_ BareEnvironment,
300              but it is called Environment), no conversions between different
301              kinds of static environments
302    
303       - There is still a notion of a "modmap", but such modmaps are generated
304         on demand at the time when they are needed.  This sounds slow, but I
305         sped up the code that generates modmaps enough for this not to lead to
306         a slowdown of the compiler (at least I didn't detect any).
307    
308       - To facilitate rapid modmap generation, static environments now
309         contain an (optional) "modtree" structure.  Modtree annotations are
310         constructed by the unpickler during unpickling.  (This means that
311         the elaborator does not have to worry about modtrees at all.)
312         Modtrees have the advantage that they are compositional in the same
313         way as the environment data structure itself is compositional.
314         As a result, modtrees never hang on to parts of an environment that
315         has already been rendered "stale" by filtering or rebinding.
316    
317       - I went through many, many trials and errors before arriving at the
318         current solution.  (The initial idea of "linkpaths" did not work.)
319         But the result of all this is that I have touched a lot of files that
320         depend on the "modules" and "types" data structures (most of the
321         elaborator). There were a lot of changes during my "linkpath" trials
322         that could have been reverted to their original state but weren't.
323         Please, don't be too harsh on me for messing with this code a bit more
324         than what was strictly necessary...  (I _did_ resist the temptation
325         of doing any "global reformatting" to avoid an untimely death at
326         Dave's hands. :)
327    
328       - One positive aspect of the previous point:  At least I made sure that
329         all files that I touched now compile without warnings (other than
330         "polyEqual").
331    
332       - compiler now tends to run "leaner" (i.e., ties up less memory in
333         redundant modmaps)
334    
335    ----------------------------------------------------------------------
336    Name: Allen Leung
337    Date: 2000/03/29 18:00:00
338    Tag: leunga-20000327-mlriscGen_hppa_alpha_x86
339    Boot files (optional): ftp://react-ilp.cs.nyu.edu/leunga/110.26.1-sml.boot.x86-unix-20000330.tar.gz
340    Description:
341    
342       This update contains *MAJOR* changes to the way code is generated from CPS
343    in the module mlriscGen, and in various backend modules.
344    
345    CHANGES
346    =======
347    
348    1. MLRiscGen: forward propagation fix.
349    
350       There was a bug in forward propagation introduced at about the same time
351       as the MLRISC x86 backend, which prevented coalescing from being
352       performed effectively in loops.
353    
354       Effect: speed up of loops in RISC architectures.
355               By itself, this actually slowed down certain benchmarks on the x86.
356    
357    2. MLRiscGen:  forward propagating addresses from consing.
358    
359       I've changed the way consing code is generated.  Basically I separated
360       out the initialization part:
361    
362            store tag,   offset(allocptr)
363            store elem1, offset+4(allocptr)
364            store elem2, offset+8(allocptr)
365            ...
366            store elemn, offset+4n(allocptr)
367    
368       and the address computation part:
369    
370            celladdr <- offset+4+allocptr
371    
372       and move the address computation part
373    
374       Effect:  register pressure is generally lower as a result.  This
375                makes compilation of certain expressions much faster, such as
376                long lists with non-trivial elements.
377    
378                 [(0,0), (0,0), .... (0,0)]
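
The separation described in item 2 can be sketched as follows. This is only an illustration of the two parts named in the entry (initialization stores and the address computation); the function name `cons_sequence` and the string form of the pseudo-instructions are assumptions, not MLRiscGen code.

```python
def cons_sequence(tag, elems, offset=0, word=4):
    """Return (initialization stores, address computation) for one record.

    Assumption: fields are `word` bytes wide and the tag sits at `offset`
    from the allocation pointer, as in the pseudo-code of the entry.
    """
    init = [f"store {tag}, {offset}(allocptr)"]
    for i, elem in enumerate(elems, start=1):
        init.append(f"store {elem}, {offset + word * i}(allocptr)")
    # The cell address points just past the tag word.
    addr = f"celladdr <- {offset + word}+allocptr"
    return init, addr


if __name__ == "__main__":
    init, addr = cons_sequence("tag", ["elem1", "elem2"], offset=8)
    print("\n".join(init))
    print(addr)
```

Because the address computation is a separate step, a code generator is free to place it independently of the stores, which is the property the entry relies on for lower register pressure.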
379    
380    3. MLRiscGen: base pointer elimination.
381    
382        As part of the linkage mechanism, we generate the sequence:
383    
384         L:  ...  <- start of the code fragment
385    
386         L1:
387             base pointer <- linkreg - L1 + L
388    
389         The base pointer was then used for computing relocatable addresses
390       in the code fragment.  Frequently (such as in lots of continuations)
391       this is not needed.  We now eliminate this sequence whenever possible.
392    
393         For compile time efficiency, I'm using a very stupid local heuristic.
394       But in general, this should be done as a control flow analysis.
395    
396       Effect:  Smaller code size.  Speed up of most programs.
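
A small worked example of the base pointer formula in item 3, assuming (as the formula suggests, though the entry does not state it outright) that the link register holds the run-time address of L1. The addresses below are made-up values for illustration.

```python
def base_pointer(linkreg: int, offset_l1: int, offset_l: int) -> int:
    """base pointer <- linkreg - L1 + L, with L and L1 as fragment offsets."""
    return linkreg - offset_l1 + offset_l


if __name__ == "__main__":
    load_address = 0x4000          # hypothetical run-time start of the fragment
    offset_l, offset_l1 = 0x0, 0x10
    linkreg = load_address + offset_l1
    print(hex(base_pointer(linkreg, offset_l1, offset_l)))
```

Under that assumption the result is the run-time address of L, wherever the fragment happens to be loaded.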
397    
398    4. Hppa back end
399    
400         Long jumps in span dependence resolution used to depend on the existence
401      of the base pointer.
402    
403         A jump to a long label L was expanded into the following sequence:
404    
405          LDIL %hi(L-8192), %r29
406          LDO  %lo(L-8192)(%r29), %r29
407          ADD  %r29, baseptr, %r29
408          BV,n %r0(%r29)
409    
410         In the presence of change (3) above, this will not work.  I've changed
411       it so that the following sequence of instructions is generated, which
412       doesn't mention the base pointer at all:
413    
414             BL,n  L', %r29           /* branch and link, L' + 4 -> %r29 */
415        L':  ADDIL L-(L'+4), %r29     /* Compute address of L */
416             BV,n  %r0(%r29)          /* Jump */
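
The arithmetic behind the new sequence can be checked with a minimal sketch. It models only the comments in the listing above (the link value is L' + 4, then L - (L' + 4) is added); it is not a model of actual PA-RISC instruction semantics, and the addresses are made-up values.

```python
def long_jump_target(addr_l_prime: int, addr_l: int) -> int:
    """Follow the commented steps of the base-pointer-free long jump."""
    r29 = addr_l_prime + 4               # BL,n: L' + 4 -> %r29
    r29 += addr_l - (addr_l_prime + 4)   # ADDIL: add the constant L-(L'+4)
    return r29                           # BV,n jumps to this address


if __name__ == "__main__":
    # The displacement L-(L'+4) does not depend on where the code is loaded.
    for load_base in (0x0, 0x10000):
        target = long_jump_target(load_base + 0x100, load_base + 0x90000)
        print(hex(target))
```

Since both labels move together when the code is relocated, the displacement is a constant and no base pointer is needed.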
417    
418    5. Alpha back end
419    
420          New alpha instructions LDB/LDW have been added, as per Fermin's
421       suggestions.   This is unrelated to all other changes.
422    
423    6. X86 back end
424    
425         I've changed andl to testl in the floating point test sequence
426         whenever appropriate.  The Intel optimization guide states that
427         testl is preferable to andl.
428    
429    7. RA (x86 only)
430    
431         I've improved the spill propagation algorithm, using an approximation
432       of maximal weighted independent sets.   This seems to be necessary to
433       alleviate the negative effect of the slowdown in (1).
434    
435         I'll write down the algorithm one of these days.
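
The entry does not describe the algorithm, so the following is not the author's method. It is only a generic greedy heuristic for approximating a maximal weighted independent set, shown to make the term concrete; how spill locations and weights map onto nodes and edges is left open here.

```python
def greedy_weighted_independent_set(weights, edges):
    """Pick nodes in order of decreasing weight, skipping any node that
    is adjacent to an already chosen node. The result is a maximal
    independent set, but its total weight is not guaranteed optimal."""
    neighbors = {v: set() for v in weights}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    chosen = set()
    # Ties are broken by node name so the result is deterministic.
    for v in sorted(weights, key=lambda v: (-weights[v], str(v))):
        if neighbors[v].isdisjoint(chosen):
            chosen.add(v)
    return chosen


if __name__ == "__main__":
    # Hypothetical three-node path a - b - c.
    weights = {"a": 2, "b": 3, "c": 2}
    edges = [("a", "b"), ("b", "c")]
    print(greedy_weighted_independent_set(weights, edges))
```

On this path the heuristic picks only `b` (weight 3) although `a` and `c` together weigh 4, which shows why such methods are approximations.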
436    
437    8. MLRiscGen: frequencies
438    
439         I've added an annotation that states that all call gc blocks have zero
440       execution frequencies.  This improves register allocation on the x86.
441    
442    BENCHMARKS
443    ==========
444    
445       I've only performed the comparison against 110.25.
446    
447       The platforms are:
448    
449        HPPA  A four processor HP machine (E9000) with 5G of memory.
450        X86   A 300MHz Pentium II with 128M of memory, and
451        SPARC An Ultra sparc 2 with 512M of memory.
452    
453       I used the following parameters for the SML benchmarks:
454    
455                 @SMLalloc
456         HPPA    256k
457         SPARC   512k
458         X86     256k
459    
460    COMPILATION TIME
461    ----------------
462       Here are the numbers comparing the compilation times of the compilers.
463       I've only compared 110.25 compiling the new sources versus
464       a fixpoint version of the new compiler compiling the same.
465    
466                     110.25                                  New
467               Total  Time in RA  Spill+Reload   Total  Time In RA Spill+Reload
468         HPPA   627s    116s        2684+3584     599s    95s       1003+1879
469         SPARC  892s    173s        2891+3870     708s    116s      1004+1880
470         X86    999s    315s       94006+130691   987s    296s    108877+141957
471    
472                   110.25         New
473                Code Size      Code Size
474         HPPA   8596736         8561421
475         SPARC  8974299         8785143
476         X86    9029180         8716783
477    
478       So in summary, things are at least as good as before.   A dramatic
479       reduction in compilation time is obtained on the Sparc; I can't explain it,
480       but it is reproducible.  Perhaps someone should try to reproduce this
481       on their own machines.
482    
483    SML BENCHMARKS
484    --------------
485    
486        On the average, all benchmarks perform at least as well as before.
487    
488          HPPA         Compilation Time     Spill+Reload      Run Time
489                     110.25  New            110.25    New   110.25  New
490    
491          barnesHut  3.158  3.015  4.75%    1+1       0+0   2.980  2.922   2.00%
492              boyer  6.152  5.708  7.77%    0+0       0+0   0.218  0.213   2.34%
493       count-graphs  1.168  1.120  4.32%    0+0       0+0  22.705 23.073  -1.60%
494                fft  0.877  0.792 10.74%    1+3       1+3   0.602  0.587   2.56%
495        knuthBendix  3.180  2.857 11.32%    0+0       0+0   0.675  0.662   2.02%
496             lexgen  6.190  5.290 17.01%    0+0       0+0   0.913  0.788  15.86%
497               life  0.803  0.703 14.22%   25+25      0+0   0.153  0.140   9.52%
498              logic  2.048  2.007  2.08%    6+6       1+1   4.133  4.008   3.12%
499         mandelbrot  0.077  0.080 -4.17%    0+0       0+0   0.765  0.712   7.49%
500             mlyacc 22.932 20.937  9.53%  154+181    32+57  0.468  0.430   8.91%
501            nucleic  5.183  5.060  2.44%    2+2       0+0   0.125  0.120   4.17%
502      ratio-regions  3.357  3.142  6.84%    0+0       0+0  116.225 113.173 2.70%
503                ray  1.283  1.290 -0.52%    0+0       0+0   2.887  2.855   1.11%
504             simple  6.307  6.032  4.56%   28+30      5+7   3.705  3.658   1.28%
505                tsp  0.888  0.862  3.09%    0+0       0+0   7.040  6.893   2.13%
506               vliw 24.378 23.455  3.94%  106+127    25+45  2.758  2.707   1.91%
507      --------------------------------------------------------------------------
508       Average                     6.12%                                   4.09%
509    
510          SPARC        Compilation Time     Spill+Reload      Run Time
511                     110.25  New            110.25    New   110.25  New
512    
513          barnesHut  3.778  3.592  5.20%    2+2       0+0   3.648  3.453    5.65%
514              boyer  6.632  6.110  8.54%    0+0       0+0   0.258  0.242    6.90%
515       count-graphs  1.435  1.325  8.30%    0+0       0+0  33.672 34.737   -3.07%
516                fft  0.980  0.940  4.26%    3+9       2+6   0.838  0.827    1.41%
517        knuthBendix  3.590  3.138 14.39%    0+0       0+0   0.962  0.967   -0.52%
518             lexgen  6.593  6.072  8.59%    1+1       0+0   1.077  1.078   -0.15%
519               life  0.972  0.868 11.90%   26+26      0+0   0.143  0.140    2.38%
520              logic  2.525  2.387  5.80%    7+7       1+1   5.625  5.158    9.05%
521         mandelbrot  0.090  0.093 -3.57%    0+0       0+0   0.855  0.728   17.39%
522             mlyacc 26.732 23.827 12.19%  162+189    32+57  0.550  0.560   -1.79%
523            nucleic  6.233  6.197  0.59%    3+3       0+0   0.163  0.173   -5.77%
524      ratio-regions  3.780  3.507  7.79%    0+0       0+0 133.993 131.035   2.26%
525                ray  1.595  1.550  2.90%    1+1       0+0   3.440  3.418    0.63%
526             simple  6.972  6.487  7.48%   29+32      5+7   3.523  3.525   -0.05%
527                tsp  1.115  1.063  4.86%    0+0       0+0   7.393  7.265    1.77%
528               vliw 27.765 24.818 11.87%  110+135    25+45  2.265  2.135    6.09%
529      ----------------------------------------------------------------------------
530       Average                     6.94%                                    2.64%
531    
532          X86          Compilation Time     Spill+Reload      Run Time
533                     110.25  New            110.25    New   110.25  New
534    
535          barnesHut  5.530  5.420  2.03%  593+893   597+915   3.532  3.440   2.66%
536              boyer  8.768  7.747 13.19%  493+199   301+289   0.327  0.297  10.11%
537       count-graphs  2.040  2.010  1.49%  298+394   315+457  26.578 28.660  -7.26%
538                fft  1.327  1.302  1.92%  112+209   115+210   1.055  0.962   9.71%
539        knuthBendix  5.218  5.475 -4.69%  451+598   510+650   0.928  0.932  -0.36%
540             lexgen  9.970  9.623  3.60% 1014+841  1157+885   0.947  0.928   1.97%
541               life  1.183  1.183  0.00%  162+182   145+148   0.127  0.103  22.58%
542              logic  3.285  3.512 -6.45%  514+684   591+836   5.682  5.577   1.88%
543         mandelbrot  0.147  0.143  2.33%   38+41     33+54    0.703  0.690   1.93%
544             mlyacc 35.457 32.763  8.22% 3496+4564 3611+4860  0.552  0.550   0.30%
545            nucleic  7.100  6.888  3.07%  239+168   201+158   0.175  0.173   0.96%
546      ratio-regions  6.388  6.843 -6.65% 1182+257   981+300  120.142 120.345 -0.17%
547                ray  2.332  2.338 -0.29%  346+398   402+494   3.593  3.540   1.51%
548             simple  9.912  9.903  0.08% 1475+941  1579+1168  3.057  3.178  -3.83%
549                tsp  1.623  1.532  5.98%  266+200   250+211   8.045  7.878   2.12%
550               vliw 33.947 35.470 -4.29% 2629+2774 2877+3171  2.072  1.890   9.61%
551      ----------------------------------------------------------------------------
552       Average                     1.22%                                     3.36%
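
The percentage columns in the tables above are not defined in the text. The printed values appear consistent with (old / new - 1) * 100, presumably computed from unrounded timings; that formula is an inference from the numbers, not something the entry states. A minimal sketch:

```python
def percent_gain(old: float, new: float) -> float:
    """Improvement of `new` over `old` in percent, relative to `new`.

    Positive means the new compiler is faster; negative means slower.
    """
    return (old / new - 1.0) * 100.0


if __name__ == "__main__":
    # HPPA lexgen compilation time, 110.25 vs. new (values from the table).
    print(round(percent_gain(6.190, 5.290), 2))  # 17.01
```

Small differences in the last digit against some table rows are expected, since the table shows timings rounded to three decimals.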
553    
554    ----------------------------------------------------------------------
555  Name: Allen Leung  Name: Allen Leung
556  Date: 2000/03/23 16:25:00  Date: 2000/03/23 16:25:00
557  Tag: leunga-20000323-fix_x86_alpha  Tag: leunga-20000323-fix_x86_alpha
