Diff of /sml/trunk/HISTORY


revision 583, Thu Mar 23 21:52:30 2000 UTC revision 593, Tue Apr 4 07:26:33 2000 UTC
# Line 13  Line 13 
13  Description:  Description:
14  ----------------------------------------------------------------------  ----------------------------------------------------------------------
15  Name: Allen Leung  Name: Allen Leung
16    Date: 2000/04/04 03:18:00 EST
17    Tag: leunga-20000404-C--Moby
18    Description:
19    
20        All of these fixes are related to C--, Moby, and my own optimization
21        stuff; so they shouldn't affect SML/NJ.
22    
23    1.  X86
24    
25        Various fixes related to floating point, and extensions.
26    
27    2.  Alpha
28    
29        Some extra patterns related to loads with signed/zero extension
30        provided by Fermin.
31    
32    3.  Assembly
33    
34        When generating assembly, resolve the value of client defined constants,
35        instead of generating symbolic values.  This is controlled by the
36        new flag "asm-resolve-constants", which defaults to true.
37    
38    4.  Machine Descriptions
39    
40        a. The precedence parser was slightly broken when parsing infixr symbols.
41        b. The type generalizing code had the bound variables reversed, resulting
42           in a problem during arity raising.
43        c. Various fixes in machine descriptions.
44    
45    ----------------------------------------------------------------------
46    Name: Matthias Blume
47    Date: 2000/04/03 16:05:00 JST
48    Tag: blume_main_v110p26p2_2
49    Description:
50    
51    I eliminated coreEnv from compInfo.  Access to the "Core" structure is
52    now done via the ordinary static environment that is context to each
53    compilation unit.
54    
55    To this end, I arranged that "structure _Core" (instead of "structure
56    Core") is bound in the pervasive environment.  Core access is done via
57    _Core (which can never be accidentally rebound because _Core is not a
58    legal surface-syntax symbol).
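
The reason _Core cannot be rebound from source code can be illustrated with a small check. This is a sketch in Python, not compiler code: it assumes the lexical rule from the Definition of Standard ML that a structure identifier is alphanumeric and starts with a letter, and the helper name `is_structure_identifier` is hypothetical.

```python
import re

# Assumed lexical rule: a structure identifier starts with a letter and
# continues with letters, digits, primes, and underscores.
_STRID = re.compile(r"[A-Za-z][A-Za-z0-9_']*\Z")

def is_structure_identifier(name: str) -> bool:
    """Hypothetical helper: True if `name` could appear as a structure
    name in surface syntax under the rule assumed above."""
    return _STRID.match(name) is not None

# "Core" can be written (and therefore rebound) by user code;
# "_Core" starts with an underscore, so no source program can name it.
print(is_structure_identifier("Core"), is_structure_identifier("_Core"))
```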
59    
60    The current solution is much cleaner because the core environment is
61    now simply part of the pervasive environment which is part of every
62    compilation unit's context anyway.  In particular, this eliminates all
63    special-case handling that was necessary until now in order to deal
64    with dynamic and symbolic parts of the core environment.
65    
66    Remaining hackery (to bind the "magic" symbol _Core) is localized in the
67    compilation manager's bootstrap compiler (actually: in the "init group"
68    handling).  See the comments in src/system/smlnj/init/init.cmi for
69    more details.
70    
71    I also tried to track down all mentions of "Core" (as string argument
72    to Symbol.strSymbol) in the compiler and replaced them with a
73    reference to the new CoreSym.coreSym.  Seems cleaner since the actual
74    name appears in one place only.
75    
76    Binfile and bootfile format have not changed, but the switchover from
77    the old "init.cmi" to the new one is a bit tricky, so I supplied new
78    bootfiles anyway.
79    
80    ----------------------------------------------------------------------
81    Name: Allen Leung
82    Date: 2000/04/02 21:17:00 EST
83    Tag: leunga-20000402-mltree
84    Description:
85    
86       1. Renamed the constructor CALL in MLTREE by popular demand.
87       2. Added a bunch of files from my repository.  These are currently
88          used by other non-SMLNJ backends.
89    
90    ----------------------------------------------------------------------
91    Name: Allen Leung
92    Date: 2000/03/31 21:15:00 EST
93    Tag: leunga-20000331-aliasing
94    Description:
95    
96    This update contains a rewritten (and hopefully more correct) module
97    for extracting aliasing information from CPS.
98    
99       To turn on this feature:
100    
101            Compiler.Control.CG.memDisambiguate := true
102    
103       To pretty-print the region information with the assembly, use:
104    
105           Compiler.Control.MLRISC.getFlag "asm-show-region" := true;
106    
107       To control how many levels of aliasing information are printed, use:
108    
109           Compiler.Control.MLRISC.getInt "points-to-show-level" := n
110    
111       The default of n is 3.
112    
113    ----------------------------------------------------------------------
114    Name: David MacQueen
115    Date: 2000/03/31 11:15:00 EST
116    Tag: dbm-20000331-runtime_fix
117    Description:
118    
119    This update contains:
120    
121    1. runtime/c-lib/c-libraries.c
122       includes added in revision 1.2 caused compilation errors on hppa-hpux
123    
124    2. fix for bug 1556
125       system/Basis/Implementation/NJ/internal-signals.sml
126    
127    ----------------------------------------------------------------------
128    Name: Matthias Blume
129    Date: 2000/03/31 18:00:00 JST
130    Tag: blume_main_v110p26p2_1
131    Description:
132    
133    This update contains:
134    
135    1. A small change to CM's handling of stable libraries:
136       CM now maintains one "global" modmap that is used for all stable
137       libraries.  The use of such a global modmap maximizes sharing and
138       minimizes the need for re-traversing parts of environments during
139       modmap construction.  (However, this has minor impact since modmap
140       construction seems to account for just one percent or less of total
141       compile time.)
142    
143    2. I added a "genmap" phase to the statistics.  This is where I got the
144       "one percent" number (see above).
145    
146    3. CM's new tool parameter mechanism just became _even_ better. :)
147       - The parser understands named parameters and recursive options.
148       - The "make" and "shell" tools use these new features.
149         (This makes it a lot easier to cascade these tools.)
150       - There is a small syntax change: named parameters use a
151    
152           <name> : ( <option> ... )            or
153           <name> : <string>
154    
155         syntax.  Previously, named parameters were implemented in an
156         ad-hoc fashion by each tool individually (by parsing strings)
157         and had the form
158    
159           <name>=<string>
160    
161       See the CM manual for a full description of these issues.
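
To make the shape of the new parameter syntax concrete, here is a minimal recursive-descent sketch in Python. It is not CM's parser: the tokenizer, the function names, and the example option names (`target`, `options`) are assumptions for illustration, and the real syntax (quoting, error handling) is described in the CM manual.

```python
import re

# Tokens: "(", ")", ":" or a run of other non-blank characters.
_TOKEN = re.compile(r"\s*([():]|[^\s():]+)")

def tokenize(text):
    return _TOKEN.findall(text)

def parse_options(tokens, pos=0, closing=None):
    """Parse options until `closing` (or end of input).

    Returns (options, next_pos).  A plain option is a string; a named
    parameter is a (name, value) pair whose value is either a string
    or a nested option list.
    """
    options = []
    while pos < len(tokens) and tokens[pos] != closing:
        word = tokens[pos]
        if pos + 1 < len(tokens) and tokens[pos + 1] == ":":
            if pos + 2 >= len(tokens):
                raise ValueError("missing value after ':'")
            if tokens[pos + 2] == "(":
                # <name> : ( <option> ... )   (options may nest)
                value, pos = parse_options(tokens, pos + 3, closing=")")
                if pos >= len(tokens):
                    raise ValueError("missing ')'")
                pos += 1  # skip the closing parenthesis
            else:
                # <name> : <string>
                value, pos = tokens[pos + 2], pos + 3
            options.append((word, value))
        else:
            options.append(word)
            pos += 1
    return options, pos

def parse(text):
    options, _ = parse_options(tokenize(text))
    return options
```

With this sketch, `parse("target:foo options:(a b:c)")` yields a flat pair for `target` and a nested list for `options`, whereas an old-style `name=string` parameter is just one opaque word that each tool had to pick apart itself.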
162    
163    ----------------------------------------------------------------------
164    Name: Matthias Blume
165    Date: 2000/03/30 18:00:00 JST
166    Tag: blume_main_v110p26p2_0
167    Description:
168    
169    !!!!! WARNING !!!!!!
170    !!  New binfiles  !!
171    !!!!!!!!!!!!!!!!!!!!
172    
173    This update contains:
174    
175    1. Moderate changes to CM:
176    
177       - Changes to CM's tools mechanism.  In particular, it is now possible
178       to have tools that accept additional "command line" parameters
179       (specified in the .cm file at each instance where the tool's class is
180       used).
181    
182       This was done to accommodate the new "make" and "shell" tools which
183       facilitate fairly seamless hookup to portions of code managed using
184       Makefiles or Shell scripts.
185    
186       There are no classes "shared" or "private" anymore.  Instead, the
187       sharing annotation is now a parameter to the "sml" class.
188    
189       There is a bit of generic machinery for implementing one's own
190       tools that accept command-line parameters.  However, I am not yet fully
191       satisfied with that part, so expect changes here in the future.
192    
193       All existing tools are described in the CM manual.
194    
195       - Slightly better error handling.  (CM now suppresses many followup
196       error messages that tended to be more annoying than helpful.)
197    
198    2. Major changes to the compiler's static environment data structures.
199    
200       - no CMStaticEnv anymore.
201            - no CMEnv, no "BareEnvironment" (actually, _only_ BareEnvironment,
202              but it is called Environment), no conversions between different
203              kinds of static environments
204    
205       - There is still a notion of a "modmap", but such modmaps are generated
206         on demand at the time when they are needed.  This sounds slow, but I
207         sped up the code that generates modmaps enough for this not to lead to
208         a slowdown of the compiler (at least I didn't detect any).
209    
210       - To facilitate rapid modmap generation, static environments now
211         contain an (optional) "modtree" structure.  Modtree annotations are
212         constructed by the unpickler during unpickling.  (This means that
213         the elaborator does not have to worry about modtrees at all.)
214         Modtrees have the advantage that they are compositional in the same
215         way as the environment data structure itself is compositional.
216         As a result, modtrees never hang on to parts of an environment that
217         has already been rendered "stale" by filtering or rebinding.
218    
219       - I went through many, many trials and errors before arriving at the
220         current solution.  (The initial idea of "linkpaths" did not work.)
221         But the result of all this is that I have touched a lot of files that
222         depend on the "modules" and "types" data structures (most of the
223         elaborator). There were a lot of changes during my "linkpath" trials
224         that could have been reverted to their original state but weren't.
225         Please, don't be too harsh on me for messing with this code a bit more
226         than what was strictly necessary...  (I _did_ resist the temptation
227         of doing any "global reformatting" to avoid an untimely death at
228         Dave's hands. :)
229    
230       - One positive aspect of the previous point:  At least I made sure that
231         all files that I touched now compile without warnings (other than
232         "polyEqual").
233    
234       - compiler now tends to run "leaner" (i.e., ties up less memory in
235         redundant modmaps)
236    
237    ----------------------------------------------------------------------
238    Name: Allen Leung
239    Date: 2000/03/29 18:00:00
240    Tag: leunga-20000327-mlriscGen_hppa_alpha_x86
241    Boot files (optional): ftp://react-ilp.cs.nyu.edu/leunga/110.26.1-sml.boot.x86-unix-20000330.tar.gz
242    Description:
243    
244       This update contains *MAJOR* changes to the way code is generated from CPS
245    in the module mlriscGen, and in various backend modules.
246    
247    CHANGES
248    =======
249    
250    1. MLRiscGen: forward propagation fix.
251    
252       There was a bug in forward propagation introduced at about the same time
253       as the MLRISC x86 backend, which prevented coalescing from being
254       performed effectively in loops.
255    
256       Effect: speed up of loops in RISC architectures.
257               By itself, this actually slowed down certain benchmarks on the x86.
258    
259    2. MLRiscGen:  forward propagating addresses from consing.
260    
261       I've changed the way consing code is generated.  Basically I separated
262       out the initialization part:
263    
264            store tag,   offset(allocptr)
265            store elem1, offset+4(allocptr)
266            store elem2, offset+8(allocptr)
267            ...
268            store elemn, offset+4n(allocptr)
269    
270       and the address computation part:
271    
272            celladdr <- offset+4+allocptr
273    
274       and move the address computation part
275    
276       Effect:  register pressure is generally lower as a result.  This
277                makes compilation of certain expressions much faster, such as
278                long lists with non-trivial elements.
279    
280                 [(0,0), (0,0), .... (0,0)]
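
The split described in item 2 can be sketched as two independent pieces. This Python fragment only prints the pseudo-instructions in the notation used above; the function names are hypothetical and it says nothing about where MLRiscGen actually places the address computation.

```python
def cons_init(offset, tag, elems):
    """Initialization part: one store for the tag, then one store per
    element at offset+4, offset+8, ... relative to allocptr."""
    stores = [f"store {tag}, {offset}(allocptr)"]
    for i, elem in enumerate(elems, start=1):
        stores.append(f"store {elem}, {offset + 4 * i}(allocptr)")
    return stores

def cons_address(dest, offset):
    """Address computation part: the cell address points just past the
    tag word.  It does not depend on the stores, so it can be emitted
    separately from them."""
    return f"{dest} <- {offset + 4}+allocptr"

for line in cons_init(0, "tag", ["elem1", "elem2"]):
    print(line)
print(cons_address("celladdr", 0))
```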
281    
282    3. MLRiscGen: base pointer elimination.
283    
284        As part of the linkage mechanism, we generate the sequence:
285    
286         L:  ...  <- start of the code fragment
287    
288         L1:
289             base pointer <- linkreg - L1 + L
290    
291         The base pointer was then used for computing relocatable addresses
292       in the code fragment.  Frequently (such as in lots of continuations)
293       this is not needed.  We now eliminate this sequence whenever possible.
294    
295         For compile time efficiency, I'm using a very stupid local heuristic.
296       But in general, this should be done as a control flow analysis.
297    
298       Effect:  Smaller code size.  Speed up of most programs.
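
As a worked example of the sequence in item 3: the sketch below assumes that L and L1 are offsets known at assembly time and that linkreg holds the run-time address of L1 when the sequence executes (an assumption; the entry does not spell this out). Under that assumption the result is the run-time address of L, wherever the code was loaded. The function name and the numbers are made up.

```python
def base_pointer(linkreg, offset_L1, offset_L):
    # base pointer <- linkreg - L1 + L
    return linkreg - offset_L1 + offset_L

# Hypothetical code fragment: L at offset 0x100, L1 at offset 0x120.
OFFSET_L, OFFSET_L1 = 0x100, 0x120

for load_address in (0x40000, 0x7F000):
    linkreg = load_address + OFFSET_L1      # assumed run-time address of L1
    base = base_pointer(linkreg, OFFSET_L1, OFFSET_L)
    print(hex(base), base == load_address + OFFSET_L)
```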
299    
300    4. Hppa back end
301    
302         Long jumps in span dependence resolution used to depend on the existence
303      of the base pointer.
304    
305         A jump to a long label L was expanded into the following sequence:
306    
307          LDIL %hi(L-8192), %r29
308          LDO  %lo(L-8192)(%r29), %r29
309          ADD  %r29, baseptr, %r29
310          BV,n %r0(%r29)
311    
312         In the presence of change (3) above, this will not work.  I've changed
313       it so that the following sequence of instructions is generated, which
314       doesn't mention the base pointer at all:
315    
316             BL,n  L', %r29           /* branch and link, L' + 4 -> %r29 */
317        L':  ADDIL L-(L'+4), %r29     /* Compute address of L */
318             BV,n  %r0(%r29)          /* Jump */
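
The arithmetic behind the new sequence can be checked with a small Python sketch. It models only what the comments above state (BL leaves L' + 4 in %r29, ADDIL adds the assembly-time constant L-(L'+4)); it is not a model of the actual HPPA instruction semantics, and the function name and offsets are hypothetical.

```python
def long_jump_target(addr_Lprime, addr_L):
    """Follow %r29 through the three-instruction sequence."""
    r29 = addr_Lprime + 4                # BL,n  L', %r29
    r29 += addr_L - (addr_Lprime + 4)    # ADDIL L-(L'+4), %r29
    return r29                           # BV,n  %r0(%r29) jumps here

# The displacement L-(L'+4) is the same at every load address, so no
# base pointer is needed to reach L.
for load_address in (0x10000, 0x250000):
    l_prime = load_address + 0x40
    label = load_address + 0x90000
    print(long_jump_target(l_prime, label) == label)
```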
319    
320    5. Alpha back end
321    
322          New alpha instructions LDB/LDW have been added, as per Fermin's
323       suggestions.   This is unrelated to all other changes.
324    
325    6. X86 back end
326    
327         I've changed andl to testl in the floating point test sequence
328         whenever appropriate.  The Intel optimization guide states that
329         testl is preferable to andl.
330    
331    7. RA (x86 only)
332    
333         I've improved the spill propagation algorithm, using an approximation
334       of maximal weighted independent sets.   This seems to be necessary to
335       offset the x86 slowdown caused by (1).
336    
337         I'll write down the algorithm one of these days.
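
Since the algorithm itself is not written down here, the following Python sketch is NOT the spill propagation code. It only illustrates what "an approximation of maximal weighted independent sets" can mean in general, using a textbook-style greedy heuristic (repeatedly pick the node with the best weight-to-degree ratio). The function name, the scoring rule, and the example graph are all assumptions.

```python
def greedy_weighted_independent_set(weights, edges):
    """Greedy heuristic: pick the remaining node maximizing
    weight / (remaining degree + 1), then discard its neighbors."""
    neighbors = {v: set() for v in weights}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    chosen = set()
    remaining = set(weights)
    while remaining:
        best = max(
            sorted(remaining),  # sorted only to make tie-breaking deterministic
            key=lambda v: weights[v] / (len(neighbors[v] & remaining) + 1),
        )
        chosen.add(best)
        remaining -= neighbors[best] | {best}
    return chosen

# Path graph a - b - c with made-up weights.
picked = greedy_weighted_independent_set(
    {"a": 4, "b": 5, "c": 4}, [("a", "b"), ("b", "c")]
)
print(sorted(picked))
```

The result is always an independent set, but being a heuristic it is not guaranteed to have maximum total weight.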
338    
339    8. MLRiscGen: frequencies
340    
341         I've added an annotation that states that all call gc blocks have zero
342       execution frequencies.  This improves register allocation on the x86.
343    
344    BENCHMARKS
345    ==========
346    
347       I've only performed the comparison against 110.25.
348    
349       The platforms are:
350    
351        HPPA  A four processor HP machine (E9000) with 5G of memory.
352        X86   A 300MHz Pentium II with 128M of memory, and
353        SPARC An Ultra sparc 2 with 512M of memory.
354    
355       I used the following parameters for the SML benchmarks:
356    
357                 @SMLalloc
358         HPPA    256k
359         SPARC   512k
360         X86     256k
361    
362    COMPILATION TIME
363    ----------------
364       Here are the numbers comparing the compilation times of the compilers.
365       I've only compared 110.25 compiling the new sources versus
366       a fixpoint version of the new compiler compiling the same.
367    
368                     110.25                                  New
369               Total  Time in RA  Spill+Reload   Total  Time In RA Spill+Reload
370         HPPA   627s    116s        2684+3584     599s    95s       1003+1879
371         SPARC  892s    173s        2891+3870     708s    116s      1004+1880
372         X86    999s    315s       94006+130691   987s    296s    108877+141957
373    
374                   110.25         New
375                Code Size      Code Size
376         HPPA   8596736         8561421
377         SPARC  8974299         8785143
378         X86    9029180         8716783
379    
380       So in summary, things are at least as good as before.   A dramatic
381       reduction in compilation time is obtained on the Sparc; I can't explain it,
382       but it is reproducible.  Perhaps someone should try to reproduce this
383       on their own machines.
384    
385    SML BENCHMARKS
386    --------------
387    
388        On average, the benchmarks perform at least as well as before.
389    
390          HPPA         Compilation Time     Spill+Reload      Run Time
391                     110.25  New            110.25    New   110.25  New
392    
393          barnesHut  3.158  3.015  4.75%    1+1       0+0   2.980  2.922   2.00%
394              boyer  6.152  5.708  7.77%    0+0       0+0   0.218  0.213   2.34%
395       count-graphs  1.168  1.120  4.32%    0+0       0+0  22.705 23.073  -1.60%
396                fft  0.877  0.792 10.74%    1+3       1+3   0.602  0.587   2.56%
397        knuthBendix  3.180  2.857 11.32%    0+0       0+0   0.675  0.662   2.02%
398             lexgen  6.190  5.290 17.01%    0+0       0+0   0.913  0.788  15.86%
399               life  0.803  0.703 14.22%   25+25      0+0   0.153  0.140   9.52%
400              logic  2.048  2.007  2.08%    6+6       1+1   4.133  4.008   3.12%
401         mandelbrot  0.077  0.080 -4.17%    0+0       0+0   0.765  0.712   7.49%
402             mlyacc 22.932 20.937  9.53%  154+181    32+57  0.468  0.430   8.91%
403            nucleic  5.183  5.060  2.44%    2+2       0+0   0.125  0.120   4.17%
404      ratio-regions  3.357  3.142  6.84%    0+0       0+0  116.225 113.173 2.70%
405                ray  1.283  1.290 -0.52%    0+0       0+0   2.887  2.855   1.11%
406             simple  6.307  6.032  4.56%   28+30      5+7   3.705  3.658   1.28%
407                tsp  0.888  0.862  3.09%    0+0       0+0   7.040  6.893   2.13%
408               vliw 24.378 23.455  3.94%  106+127    25+45  2.758  2.707   1.91%
409      --------------------------------------------------------------------------
410       Average                     6.12%                                   4.09%
411    
412          SPARC        Compilation Time     Spill+Reload      Run Time
413                     110.25  New            110.25    New   110.25  New
414    
415          barnesHut  3.778  3.592  5.20%    2+2       0+0   3.648  3.453    5.65%
416              boyer  6.632  6.110  8.54%    0+0       0+0   0.258  0.242    6.90%
417       count-graphs  1.435  1.325  8.30%    0+0       0+0  33.672 34.737   -3.07%
418                fft  0.980  0.940  4.26%    3+9       2+6   0.838  0.827    1.41%
419        knuthBendix  3.590  3.138 14.39%    0+0       0+0   0.962  0.967   -0.52%
420             lexgen  6.593  6.072  8.59%    1+1       0+0   1.077  1.078   -0.15%
421               life  0.972  0.868 11.90%   26+26      0+0   0.143  0.140    2.38%
422              logic  2.525  2.387  5.80%    7+7       1+1   5.625  5.158    9.05%
423         mandelbrot  0.090  0.093 -3.57%    0+0       0+0   0.855  0.728   17.39%
424             mlyacc 26.732 23.827 12.19%  162+189    32+57  0.550  0.560   -1.79%
425            nucleic  6.233  6.197  0.59%    3+3       0+0   0.163  0.173   -5.77%
426      ratio-regions  3.780  3.507  7.79%    0+0       0+0 133.993 131.035   2.26%
427                ray  1.595  1.550  2.90%    1+1       0+0   3.440  3.418    0.63%
428             simple  6.972  6.487  7.48%   29+32      5+7   3.523  3.525   -0.05%
429                tsp  1.115  1.063  4.86%    0+0       0+0   7.393  7.265    1.77%
430               vliw 27.765 24.818 11.87%  110+135    25+45  2.265  2.135    6.09%
431      ----------------------------------------------------------------------------
432       Average                     6.94%                                    2.64%
433    
434          X86          Compilation Time     Spill+Reload      Run Time
435                     110.25  New            110.25    New   110.25  New
436    
437          barnesHut  5.530  5.420  2.03%  593+893   597+915   3.532  3.440   2.66%
438              boyer  8.768  7.747 13.19%  493+199   301+289   0.327  0.297  10.11%
439       count-graphs  2.040  2.010  1.49%  298+394   315+457  26.578 28.660  -7.26%
440                fft  1.327  1.302  1.92%  112+209   115+210   1.055  0.962   9.71%
441        knuthBendix  5.218  5.475 -4.69%  451+598   510+650   0.928  0.932  -0.36%
442             lexgen  9.970  9.623  3.60% 1014+841  1157+885   0.947  0.928   1.97%
443               life  1.183  1.183  0.00%  162+182   145+148   0.127  0.103  22.58%
444              logic  3.285  3.512 -6.45%  514+684   591+836   5.682  5.577   1.88%
445         mandelbrot  0.147  0.143  2.33%   38+41     33+54    0.703  0.690   1.93%
446             mlyacc 35.457 32.763  8.22% 3496+4564 3611+4860  0.552  0.550   0.30%
447            nucleic  7.100  6.888  3.07%  239+168   201+158   0.175  0.173   0.96%
448      ratio-regions  6.388  6.843 -6.65% 1182+257   981+300  120.142 120.345 -0.17%
449                ray  2.332  2.338 -0.29%  346+398   402+494   3.593  3.540   1.51%
450             simple  9.912  9.903  0.08% 1475+941  1579+1168  3.057  3.178  -3.83%
451                tsp  1.623  1.532  5.98%  266+200   250+211   8.045  7.878   2.12%
452               vliw 33.947 35.470 -4.29% 2629+2774 2877+3171  2.072  1.890   9.61%
453      ----------------------------------------------------------------------------
454       Average                     1.22%                                     3.36%
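
The tables do not say how the percentage columns were computed. Most entries are consistent, up to rounding of the displayed times, with old/new - 1, so that a positive number means the new compiler is faster. The Python sketch below makes that inferred formula explicit; both the formula and the function name are assumptions, not something stated in this entry.

```python
def improvement_percent(old, new):
    """Inferred formula: how much longer the old time is, relative to
    the new time, as a percentage."""
    return (old / new - 1.0) * 100.0

# HPPA lexgen rows from the table above: compilation time and run time.
print(round(improvement_percent(6.190, 5.290), 2))
print(round(improvement_percent(0.913, 0.788), 2))
```

Entries with very small times (for example the x86 "life" run time) deviate more from this formula, presumably because the percentages were computed from unrounded measurements.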
455    
456    ----------------------------------------------------------------------
457    Name: Allen Leung
458  Date: 2000/03/23 16:25:00  Date: 2000/03/23 16:25:00
459  Tag: leunga-20000323-fix_x86_alpha  Tag: leunga-20000323-fix_x86_alpha
460  Description:  Description:

