[smlnj] Diff of /sml/trunk/HISTORY

revision 583, Thu Mar 23 21:52:30 2000 UTC revision 594, Tue Apr 4 23:41:47 2000 UTC
# Line 13  Line 13 
13  Description:  Description:
14  ----------------------------------------------------------------------  ----------------------------------------------------------------------
15  Name: Allen Leung  Name: Allen Leung
16    Date: 2000/04/04 19:39:00 EST
17    Tag: leunga-20000404-x86-asm
18    Description:
19    
20    1.  Fixed a problem in X86 assembly.
21    
22        Things like
23    
24           jmp %eax
25           jmp (%eax)
26    
27        should be output as
28    
29           jmp *%eax
30           jmp *(%eax)
31    
32    2.  Assembly output
33    
34          Added a new flag
35    
36              "asm-indent-copies" (defaults to false)
37    
38          When this flag is on, parallel copies will be indented an extra level.
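
Item 1 above follows the AT&T-syntax convention that an indirect jump target carries a leading `*`. A minimal Python sketch of that formatting rule is given here for illustration; the function name `format_jmp` and the way operands are classified are hypothetical and are not taken from the MLRISC assembly emitter.

```python
def format_jmp(operand: str) -> str:
    """Format an x86 jmp in AT&T syntax.

    Assumption for this sketch: an operand starting with '%' is a register
    and one containing '(' is a memory reference. Both are indirect targets
    and get a leading '*'. Anything else is treated as a direct label.
    """
    indirect = operand.startswith("%") or "(" in operand
    prefix = "*" if indirect else ""
    return f"jmp {prefix}{operand}"


print(format_jmp("%eax"))    # register-indirect jump
print(format_jmp("(%eax)"))  # memory-indirect jump
print(format_jmp("L1"))      # direct jump to a label
```

A direct jump to a label keeps its plain form; only register and memory operands change.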
39    
40    ----------------------------------------------------------------------
41    Name: Allen Leung
42    Date: 2000/04/04 03:18:00 EST
43    Tag: leunga-20000404-C--Moby
44    Description:
45    
46        All of these fixes are related to C--, Moby, and my own optimization
47        stuff; so they shouldn't affect SML/NJ.
48    
49    1.  X86
50    
51        Various fixes related to floating point, and extensions.
52    
53    2.  Alpha
54    
55        Some extra patterns related to loads with signed/zero extension
56        provided by Fermin.
57    
58    3.  Assembly
59    
60        When generating assembly, resolve the value of client defined constants,
61        instead of generating symbolic values.  This is controlled by the
62        new flag "asm-resolve-constants", which defaults to true.
63    
64    4.  Machine Descriptions
65    
66        a. The precedence parser was slightly broken when parsing infixr symbols.
67        b. The type generalizing code had the bound variables reversed, resulting
68           in a problem during arity raising.
69        c. Various fixes in machine descriptions.
70    
71    ----------------------------------------------------------------------
72    Name: Matthias Blume
73    Date: 2000/04/03 16:05:00 JST
74    Tag: blume_main_v110p26p2_2
75    Description:
76    
77    I eliminated coreEnv from compInfo.  Access to the "Core" structure is
78    now done via the ordinary static environment that is context to each
79    compilation unit.
80    
81    To this end, I arranged that "structure _Core" is bound in the pervasive
82    environment instead of "structure Core".  Core access is done via
83    _Core (which can never be accidentally rebound because _Core is not a
84    legal surface-syntax symbol).
85    
86    The current solution is much cleaner because the core environment is
87    now simply part of the pervasive environment which is part of every
88    compilation unit's context anyway.  In particular, this eliminates all
89    special-case handling that was necessary until now in order to deal
90    with dynamic and symbolic parts of the core environment.
91    
92    Remaining hackery (to bind the "magic" symbol _Core) is localized in the
93    compilation manager's bootstrap compiler (actually: in the "init group"
94    handling).  See the comments in src/system/smlnj/init/init.cmi for
95    more details.
96    
97    I also tried to track down all mentions of "Core" (as string argument
98    to Symbol.strSymbol) in the compiler and replaced them with a
99    reference to the new CoreSym.coreSym.  Seems cleaner since the actual
100    name appears in one place only.
101    
102    Binfile and bootfile format have not changed, but the switchover from
103    the old "init.cmi" to the new one is a bit tricky, so I supplied new
104    bootfiles anyway.
105    
106    ----------------------------------------------------------------------
107    Name: Allen Leung
108    Date: 2000/04/02 21:17:00 EST
109    Tag: leunga-20000402-mltree
110    Description:
111    
112       1. Renamed the constructor CALL in MLTREE by popular demand.
113       2. Added a bunch of files from my repository.  These are currently
114          used by other non-SMLNJ backends.
115    
116    ----------------------------------------------------------------------
117    Name: Allen Leung
118    Date: 2000/03/31 21:15:00 EST
119    Tag: leunga-20000331-aliasing
120    Description:
121    
122    This update contains a rewritten (and hopefully more correct) module
123    for extracting aliasing information from CPS.
124    
125       To turn on this feature:
126    
127            Compiler.Control.CG.memDisambiguate := true
128    
129       To pretty print the region information with assembly
130    
131           Compiler.Control.MLRISC.getFlag "asm-show-region" := true;
132    
133       To control how many levels of aliasing information are printed, use:
134    
135           Compiler.Control.MLRISC.getInt "points-to-show-level" := n
136    
137       The default of n is 3.
138    
139    ----------------------------------------------------------------------
140    Name: David MacQueen
141    Date: 2000/03/31 11:15:00 EST
142    Tag: dbm-20000331-runtime_fix
143    Description:
144    
145    This update contains:
146    
147    1. runtime/c-lib/c-libraries.c
148       includes added in revision 1.2 caused compilation errors on hppa-hpux
149    
150    2. fix for bug 1556
151       system/Basis/Implementation/NJ/internal-signals.sml
152    
153    ----------------------------------------------------------------------
154    Name: Matthias Blume
155    Date: 2000/03/31 18:00:00 JST
156    Tag: blume_main_v110p26p2_1
157    Description:
158    
159    This update contains:
160    
161    1. A small change to CM's handling of stable libraries:
162       CM now maintains one "global" modmap that is used for all stable
163       libraries.  The use of such a global modmap maximizes sharing and
164       minimizes the need for re-traversing parts of environments during
165       modmap construction.  (However, this has minor impact since modmap
166       construction seems to account for just one percent or less of total
167       compile time.)
168    
169    2. I added a "genmap" phase to the statistics.  This is where I got the
170       "one percent" number (see above).
171    
172    3. CM's new tool parameter mechanism just became _even_ better. :)
173       - The parser understands named parameters and recursive options.
174       - The "make" and "shell" tools use these new features.
175         (This makes it a lot easier to cascade these tools.)
176       - There is a small syntax change: named parameters use a
177    
178           <name> : ( <option> ... )            or
179           <name> : <string>
180    
181         syntax.  Previously, named parameters were implemented in an
182         ad-hoc fashion by each tool individually (by parsing strings)
183         and had the form
184    
185           <name>=<string>
186    
187       See the CM manual for a full description of these issues.
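
The named-parameter forms described in item 3 can be sketched as a small recursive-descent parser. This is only an illustration in Python: the tokenization rules (double-quoted strings, bare words, and the punctuation `:`, `(`, `)`) and the option names in the usage example are assumptions made for the sketch, not CM's actual lexer or real tool options. The CM manual remains the authority on the syntax.

```python
import re

# Assumed token shapes: a double-quoted string, one of ( ) :, or a bare word.
TOKEN = re.compile(r'\s*("(?:[^"\\]|\\.)*"|[():]|[^\s():"]+)')


def tokenize(text):
    """Split the option text into tokens; raise SyntaxError on bad input."""
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"cannot tokenize at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_options(tokens, i=0, closing=None):
    """Parse options up to `closing` (or end of input).

    Returns (options, next_index). A plain option is a string; a named
    parameter is a (name, value) pair whose value is either a string or a
    nested option list, mirroring the two forms in the text.
    """
    options = []
    while i < len(tokens) and tokens[i] != closing:
        word = tokens[i]
        if i + 1 < len(tokens) and tokens[i + 1] == ":":
            if i + 2 >= len(tokens):
                raise SyntaxError(f"missing value after '{word} :'")
            if tokens[i + 2] == "(":
                # <name> : ( <option> ... )
                sub, i = parse_options(tokens, i + 3, ")")
                if i >= len(tokens):
                    raise SyntaxError("missing ')'")
                options.append((word, sub))
                i += 1  # skip the closing parenthesis
            else:
                # <name> : <string>
                options.append((word, tokens[i + 2]))
                i += 3
        else:
            options.append(word)
            i += 1
    return options, i


# Hypothetical option names, for illustration only.
example = 'target : "all" options : ( verbose level : "2" )'
print(parse_options(tokenize(example))[0])
```

Nesting falls out of the recursion: the parenthesized form simply parses another option list, which is what makes cascading tools straightforward to express.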
188    
189    ----------------------------------------------------------------------
190    Name: Matthias Blume
191    Date: 2000/03/30 18:00:00 JST
192    Tag: blume_main_v110p26p2_0
193    Description:
194    
195    !!!!! WARNING !!!!!!
196    !!  New binfiles  !!
197    !!!!!!!!!!!!!!!!!!!!
198    
199    This update contains:
200    
201    1. Moderate changes to CM:
202    
203       - Changes to CM's tools mechanism.  In particular, it is now possible
204       to have tools that accept additional "command line" parameters
205       (specified in the .cm file at each instance where the tool's class is
206       used).
207    
208       This was done to accommodate the new "make" and "shell" tools which
209       facilitate fairly seamless hookup to portions of code managed using
210       Makefiles or Shell scripts.
211    
212       There are no classes "shared" or "private" anymore.  Instead, the
213       sharing annotation is now a parameter to the "sml" class.
214    
215       There is a bit of generic machinery for implementing one's own
216       tools that accept command-line parameters.  However, I am not yet fully
217       satisfied with that part, so expect changes here in the future.
218    
219       All existing tools are described in the CM manual.
220    
221       - Slightly better error handling.  (CM now suppresses many followup
222       error messages that tended to be more annoying than helpful.)
223    
224    2. Major changes to the compiler's static environment data structures.
225    
226       - no CMStaticEnv anymore.
227            - no CMEnv, no "BareEnvironment" (actually, _only_ BareEnvironment,
228              but it is called Environment), no conversions between different
229              kinds of static environments
230    
231       - There is still a notion of a "modmap", but such modmaps are generated
232         on demand at the time when they are needed.  This sounds slow, but I
233         sped up the code that generates modmaps enough for this not to lead to
234         a slowdown of the compiler (at least I didn't detect any).
235    
236       - To facilitate rapid modmap generation, static environments now
237         contain an (optional) "modtree" structure.  Modtree annotations are
238         constructed by the unpickler during unpickling.  (This means that
239         the elaborator does not have to worry about modtrees at all.)
240         Modtrees have the advantage that they are compositional in the same
241         way as the environment data structure itself is compositional.
242         As a result, modtrees never hang on to parts of an environment that
243         has already been rendered "stale" by filtering or rebinding.
244    
245       - I went through many, many trials and errors before arriving at the
246         current solution.  (The initial idea of "linkpaths" did not work.)
247         But the result of all this is that I have touched a lot of files that
248         depend on the "modules" and "types" data structures (most of the
249         elaborator). There were a lot of changes during my "linkpath" trials
250         that could have been reverted to their original state but weren't.
251         Please, don't be too harsh on me for messing with this code a bit more
252         than what was strictly necessary...  (I _did_ resist the temptation
253         of doing any "global reformatting" to avoid an untimely death at
254         Dave's hands. :)
255    
256       - One positive aspect of the previous point:  At least I made sure that
257         all files that I touched now compile without warnings (other than
258         "polyEqual").
259    
260       - compiler now tends to run "leaner" (i.e., ties up less memory in
261         redundant modmaps)
262    
263    ----------------------------------------------------------------------
264    Name: Allen Leung
265    Date: 2000/03/29 18:00:00
266    Tag: leunga-20000327-mlriscGen_hppa_alpha_x86
267    Boot files (optional): ftp://react-ilp.cs.nyu.edu/leunga/110.26.1-sml.boot.x86-unix-20000330.tar.gz
268    Description:
269    
270       This update contains *MAJOR* changes to the way code is generated from CPS
271    in the module mlriscGen, and in various backend modules.
272    
273    CHANGES
274    =======
275    
276    1. MLRiscGen: forward propagation fix.
277    
278       There was a bug in forward propagation introduced at about the same time
279       as the MLRISC x86 backend, which prevented coalescing from being
280       performed effectively in loops.
281    
282       Effect: speed up of loops in RISC architectures.
283               By itself, this actually slowed down certain benchmarks on the x86.
284    
285    2. MLRiscGen:  forward propagating addresses from consing.
286    
287       I've changed the way consing code is generated.  Basically I separated
288       out the initialization part:
289    
290            store tag,   offset(allocptr)
291            store elem1, offset+4(allocptr)
292            store elem2, offset+8(allocptr)
293            ...
294            store elemn, offset+4n(allocptr)
295    
296       and the address computation part:
297    
298            celladdr <- offset+4+allocptr
299    
300       and move the address computation part
301    
302       Effect:  register pressure is generally lower as a result.  This
303                makes compilation of certain expressions much faster, such as
304                long lists with non-trivial elements.
305    
306                 [(0,0), (0,0), .... (0,0)]
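
The separation described in item 2 can be pictured with a short Python sketch that emits the two parts independently. The function name `cons_code`, the pseudo-instruction strings, and the 4-byte word size are assumptions made for illustration; this is not the code in mlriscGen.

```python
def cons_code(tag, elems, offset, word=4):
    """Return (stores, address_computation) as pseudo-instruction strings.

    Mirrors the shape shown above: the tag is stored at `offset`, element i
    (counting from 1) at offset + word*i, and the cell address is
    offset + word past the allocation pointer, i.e. the first element.
    """
    stores = [f"store {tag}, {offset}(allocptr)"]
    stores += [
        f"store {elem}, {offset + word * i}(allocptr)"
        for i, elem in enumerate(elems, start=1)
    ]
    address = f"celladdr <- {offset + word}+allocptr"
    return stores, address


stores, address = cons_code("tag", ["elem1", "elem2"], offset=8)
for line in stores:
    print(line)
print(address)
```

Because the address computation is a separate value here, a code generator is free to place it wherever the cell address is first needed rather than directly after the stores.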
307    
308    3. MLRiscGen: base pointer elimination.
309    
310        As part of the linkage mechanism, we generate the sequence:
311    
312         L:  ...  <- start of the code fragment
313    
314         L1:
315             base pointer <- linkreg - L1 + L
316    
317         The base pointer was then used for computing relocatable addresses
318       in the code fragment.  Frequently (such as in lots of continuations)
319       this is not needed.  We now eliminate this sequence whenever possible.
320    
321         For compile time efficiency, I'm using a very stupid local heuristic.
322       But in general, this should be done as a control flow analysis.
323    
324       Effect:  Smaller code size.  Speed up of most programs.
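
The arithmetic behind the base pointer sequence in item 3 can be checked with a small Python sketch. It assumes that linkreg holds the run-time address of L1 and that L and L1 stand for offsets within the code fragment; the load address and offsets below are made-up numbers.

```python
def base_pointer(linkreg, off_L1, off_L):
    """base pointer <- linkreg - L1 + L, with L and L1 as fragment offsets."""
    return linkreg - off_L1 + off_L


load_addr = 0x40000            # hypothetical address where the fragment is loaded
off_L, off_L1 = 0x100, 0x120   # hypothetical offsets of labels L and L1

linkreg = load_addr + off_L1   # assumption: linkreg points at L1 at run time
print(hex(base_pointer(linkreg, off_L1, off_L)))
```

Under that assumption the result is always the run-time address of L, whatever the load address, which is why the base pointer can serve relocatable address computations and why the sequence is dead code when no such computation follows.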
325    
326    4. Hppa back end
327    
328         Long jumps in span dependence resolution used to depend on the existence
329      of the base pointer.
330    
331         A jump to a long label L was expanded into the following sequence:
332    
333          LDIL %hi(L-8192), %r29
334          LDO  %lo(L-8192)(%r29), %r29
335          ADD  %r29, baseptr, %r29
336          BV,n %r0(%r29)
337    
338         In the presence of change (3) above, this will not work.  I've changed
339       it so that the following sequence of instructions is generated, which
340       doesn't mention the base pointer at all:
341    
342             BL,n  L', %r29           /* branch and link, L' + 4 -> %r29 */
343        L':  ADDIL L-(L'+4), %r29     /* Compute address of L */
344             BV,n  %r0(%r29)          /* Jump */
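
A quick way to see that the new sequence in item 4 needs no base pointer is to model it at the level of the comments in the listing. The Python sketch below does only that; it is not a model of actual PA-RISC instruction semantics, and the offsets and load addresses are made-up numbers.

```python
def long_jump_target(addr_L_prime, off_L, off_L_prime):
    """Follow the three-instruction sequence using the listing's comments."""
    r29 = addr_L_prime + 4              # BL,n L', %r29: L' + 4 -> %r29
    r29 += off_L - (off_L_prime + 4)    # ADDIL L-(L'+4), %r29: address of L
    return r29                          # BV,n %r0(%r29): jump to %r29


off_L, off_L_prime = 0x90000, 0x2004    # hypothetical offsets in the fragment
for load_addr in (0x10000, 0x250000):   # hypothetical load addresses
    target = long_jump_target(load_addr + off_L_prime, off_L, off_L_prime)
    print(hex(target), target == load_addr + off_L)
```

The constant L-(L'+4) is a difference of two labels in the same fragment, so it does not change when the code is relocated; the run-time component comes entirely from the link value produced by the branch.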
345    
346    5. Alpha back end
347    
348          New alpha instructions LDB/LDW have been added, as per Fermin's
349       suggestions.   This is unrelated to all other changes.
350    
351    6. X86 back end
352    
353         I've changed andl to testl in the floating point test sequence
354         whenever appropriate.  The Intel optimization guide states that
355         testl is preferable to andl.
356    
357    7. RA (x86 only)
358    
359         I've improved the spill propagation algorithm, using an approximation
360       of maximal weighted independent sets.   This seems to be necessary to
361       alleviate the negative effect in light of the slow down in (1).
362    
363         I'll write down the algorithm one of these days.
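
Since the algorithm is not written down here, the following Python sketch only illustrates what an approximation of a maximum weighted independent set can look like in general. It is a standard greedy heuristic (pick the vertex with the best weight per closed-neighbourhood size, then discard its neighbours) and is not claimed to be the spill propagation algorithm used in the register allocator. The graph and weights are made up.

```python
def greedy_weighted_independent_set(weights, edges):
    """Greedy approximation of a maximum weighted independent set.

    weights: dict mapping vertex -> positive weight
    edges:   iterable of (u, v) pairs (undirected)
    Returns the chosen vertices in the order they were picked.
    """
    adj = {v: set() for v in weights}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    remaining = set(weights)
    chosen = []
    while remaining:
        # Score = weight / (number of remaining neighbours + 1).
        # Sorting first makes tie-breaking deterministic.
        best = max(
            sorted(remaining),
            key=lambda v: weights[v] / (len(adj[v] & remaining) + 1),
        )
        chosen.append(best)
        remaining -= adj[best] | {best}   # neighbours can no longer be picked
    return chosen


# Hypothetical path graph a - b - c.
print(greedy_weighted_independent_set({"a": 4, "b": 5, "c": 4},
                                      [("a", "b"), ("b", "c")]))
```

The heuristic always returns an independent set but not necessarily the heaviest one: with weights 3, 5, 3 on the same path it picks only the middle vertex (weight 5) although the two end vertices together weigh 6.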
364    
365    8. MLRiscGen: frequencies
366    
367         I've added an annotation that states that all call gc blocks have zero
368       execution frequencies.  This improves register allocation on the x86.
369    
370    BENCHMARKS
371    ==========
372    
373       I've only performed the comparison against 110.25.
374    
375       The platforms are:
376    
377        HPPA  A four processor HP machine (E9000) with 5G of memory.
378        X86   A 300MHz Pentium II with 128M of memory, and
379        SPARC An Ultra sparc 2 with 512M of memory.
380    
381       I used the following parameters for the SML benchmarks:
382    
383                 @SMLalloc
384         HPPA    256k
385         SPARC   512k
386         X86     256k
387    
388    COMPILATION TIME
389    ----------------
390       Here are the numbers comparing the compilation times of the compilers.
391       I've only compared 110.25 compiling the new sources versus
392       a fixpoint version of the new compiler compiling the same.
393    
394                     110.25                                  New
395               Total  Time in RA  Spill+Reload   Total  Time In RA Spill+Reload
396         HPPA   627s    116s        2684+3584     599s    95s       1003+1879
397         SPARC  892s    173s        2891+3870     708s    116s      1004+1880
398         X86    999s    315s       94006+130691   987s    296s    108877+141957
399    
400                   110.25         New
401                Code Size      Code Size
402         HPPA   8596736         8561421
403         SPARC  8974299         8785143
404         X86    9029180         8716783
405    
406       So in summary, things are at least as good as before.   Dramatic
407       reduction in compilation time is obtained on the Sparc; I can't explain it,
408       but it is reproducible.  Perhaps someone should try to reproduce this
409       on their own machines.
410    
411    SML BENCHMARKS
412    --------------
413    
414        On the average, all benchmarks perform at least as well as before.
415    
416          HPPA         Compilation Time     Spill+Reload      Run Time
417                     110.25  New            110.25    New   110.25  New
418    
419          barnesHut  3.158  3.015  4.75%    1+1       0+0   2.980  2.922   2.00%
420              boyer  6.152  5.708  7.77%    0+0       0+0   0.218  0.213   2.34%
421       count-graphs  1.168  1.120  4.32%    0+0       0+0  22.705 23.073  -1.60%
422                fft  0.877  0.792 10.74%    1+3       1+3   0.602  0.587   2.56%
423        knuthBendix  3.180  2.857 11.32%    0+0       0+0   0.675  0.662   2.02%
424             lexgen  6.190  5.290 17.01%    0+0       0+0   0.913  0.788  15.86%
425               life  0.803  0.703 14.22%   25+25      0+0   0.153  0.140   9.52%
426              logic  2.048  2.007  2.08%    6+6       1+1   4.133  4.008   3.12%
427         mandelbrot  0.077  0.080 -4.17%    0+0       0+0   0.765  0.712   7.49%
428             mlyacc 22.932 20.937  9.53%  154+181    32+57  0.468  0.430   8.91%
429            nucleic  5.183  5.060  2.44%    2+2       0+0   0.125  0.120   4.17%
430      ratio-regions  3.357  3.142  6.84%    0+0       0+0  116.225 113.173 2.70%
431                ray  1.283  1.290 -0.52%    0+0       0+0   2.887  2.855   1.11%
432             simple  6.307  6.032  4.56%   28+30      5+7   3.705  3.658   1.28%
433                tsp  0.888  0.862  3.09%    0+0       0+0   7.040  6.893   2.13%
434               vliw 24.378 23.455  3.94%  106+127    25+45  2.758  2.707   1.91%
435      --------------------------------------------------------------------------
436       Average                     6.12%                                   4.09%
437    
438          SPARC        Compilation Time     Spill+Reload      Run Time
439                     110.25  New            110.25    New   110.25  New
440    
441          barnesHut  3.778  3.592  5.20%    2+2       0+0   3.648  3.453    5.65%
442              boyer  6.632  6.110  8.54%    0+0       0+0   0.258  0.242    6.90%
443       count-graphs  1.435  1.325  8.30%    0+0       0+0  33.672 34.737   -3.07%
444                fft  0.980  0.940  4.26%    3+9       2+6   0.838  0.827    1.41%
445        knuthBendix  3.590  3.138 14.39%    0+0       0+0   0.962  0.967   -0.52%
446             lexgen  6.593  6.072  8.59%    1+1       0+0   1.077  1.078   -0.15%
447               life  0.972  0.868 11.90%   26+26      0+0   0.143  0.140    2.38%
448              logic  2.525  2.387  5.80%    7+7       1+1   5.625  5.158    9.05%
449         mandelbrot  0.090  0.093 -3.57%    0+0       0+0   0.855  0.728   17.39%
450             mlyacc 26.732 23.827 12.19%  162+189    32+57  0.550  0.560   -1.79%
451            nucleic  6.233  6.197  0.59%    3+3       0+0   0.163  0.173   -5.77%
452      ratio-regions  3.780  3.507  7.79%    0+0       0+0 133.993 131.035   2.26%
453                ray  1.595  1.550  2.90%    1+1       0+0   3.440  3.418    0.63%
454             simple  6.972  6.487  7.48%   29+32      5+7   3.523  3.525   -0.05%
455                tsp  1.115  1.063  4.86%    0+0       0+0   7.393  7.265    1.77%
456               vliw 27.765 24.818 11.87%  110+135    25+45  2.265  2.135    6.09%
457      ----------------------------------------------------------------------------
458       Average                     6.94%                                    2.64%
459    
460          X86          Compilation Time     Spill+Reload      Run Time
461                     110.25  New            110.25    New   110.25  New
462    
463          barnesHut  5.530  5.420  2.03%  593+893   597+915   3.532  3.440   2.66%
464              boyer  8.768  7.747 13.19%  493+199   301+289   0.327  0.297  10.11%
465       count-graphs  2.040  2.010  1.49%  298+394   315+457  26.578 28.660  -7.26%
466                fft  1.327  1.302  1.92%  112+209   115+210   1.055  0.962   9.71%
467        knuthBendix  5.218  5.475 -4.69%  451+598   510+650   0.928  0.932  -0.36%
468             lexgen  9.970  9.623  3.60% 1014+841  1157+885   0.947  0.928   1.97%
469               life  1.183  1.183  0.00%  162+182   145+148   0.127  0.103  22.58%
470              logic  3.285  3.512 -6.45%  514+684   591+836   5.682  5.577   1.88%
471         mandelbrot  0.147  0.143  2.33%   38+41     33+54    0.703  0.690   1.93%
472             mlyacc 35.457 32.763  8.22% 3496+4564 3611+4860  0.552  0.550   0.30%
473            nucleic  7.100  6.888  3.07%  239+168   201+158   0.175  0.173   0.96%
474      ratio-regions  6.388  6.843 -6.65% 1182+257   981+300  120.142 120.345 -0.17%
475                ray  2.332  2.338 -0.29%  346+398   402+494   3.593  3.540   1.51%
476             simple  9.912  9.903  0.08% 1475+941  1579+1168  3.057  3.178  -3.83%
477                tsp  1.623  1.532  5.98%  266+200   250+211   8.045  7.878   2.12%
478               vliw 33.947 35.470 -4.29% 2629+2774 2877+3171  2.072  1.890   9.61%
479      ----------------------------------------------------------------------------
480       Average                     1.22%                                     3.36%
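
The percentage columns in the tables above are not defined in the text. They appear consistent with old/new - 1, i.e. the relative speedup of the new compiler over 110.25, and the Python sketch below recomputes two entries on that assumption. The function name `improvement` is hypothetical.

```python
def improvement(old, new):
    """Relative improvement in percent, assuming the tables use old/new - 1."""
    return (old / new - 1.0) * 100.0


# lexgen on HPPA: compilation time 6.190 -> 5.290, run time 0.913 -> 0.788
print(f"{improvement(6.190, 5.290):.2f}%")
print(f"{improvement(0.913, 0.788):.2f}%")
```

Some rows differ from this formula in the last digit, which is what one would expect if the listed times were rounded after the percentages were computed. A negative value means the new compiler was slower on that benchmark.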
481    
482    ----------------------------------------------------------------------
483    Name: Allen Leung
484  Date: 2000/03/23 16:25:00  Date: 2000/03/23 16:25:00
485  Tag: leunga-20000323-fix_x86_alpha  Tag: leunga-20000323-fix_x86_alpha
486  Description:  Description:
