Diff of /sml/trunk/HISTORY

revision 583, Thu Mar 23 21:52:30 2000 UTC revision 592, Mon Apr 3 07:04:12 2000 UTC
# Line 12  Line 12 
12  Tag: <post-commit CVS tag>  Tag: <post-commit CVS tag>
13  Description:  Description:
14  ----------------------------------------------------------------------  ----------------------------------------------------------------------
15    Name: Matthias Blume
16    Date: 2000/04/03 16:05:00 JST
17    Tag: blume_main_v110p26p2_2
18    Description:
19    
20    I eliminated coreEnv from compInfo.  Access to the "Core" structure is
21    now done via the ordinary static environment that is context to each
22    compilation unit.
23    
24    To this end, I arranged that "structure _Core" is bound in the pervasive
25    environment instead of "structure Core".  Core access is done via
26    _Core (which can never be accidentally rebound because _Core is not a
27    legal surface-syntax symbol).
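
The reason _Core cannot be rebound can be sketched with a simplified identifier check. This is an illustration only, not the compiler's lexer: it assumes the usual Standard ML rule that an alphanumeric structure identifier starts with a letter and continues with letters, digits, underscores, or primes. It ignores symbolic identifiers and reserved words, and the function name is hypothetical.

```python
import re

# Simplified rule (an assumption for illustration): a structure identifier
# written in source code starts with a letter, followed by letters, digits,
# underscores, or primes.  Reserved words are not filtered out here.
_ALNUM_ID = r"[A-Za-z][A-Za-z0-9_']*"

def is_surface_strid(name):
    """Return True if `name` could be written as a structure name in source code."""
    return re.fullmatch(_ALNUM_ID, name) is not None

print(is_surface_strid("Core"))   # a user program can rebind this name
print(is_surface_strid("_Core"))  # a user program cannot even write this name
```

Under this rule "Core" is an ordinary name that user code may shadow, while "_Core" can only be introduced by the compiler itself.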
28    
29    The current solution is much cleaner because the core environment is
30    now simply part of the pervasive environment which is part of every
31    compilation unit's context anyway.  In particular, this eliminates all
32    special-case handling that was necessary until now in order to deal
33    with dynamic and symbolic parts of the core environment.
34    
35    Remaining hackery (to bind the "magic" symbol _Core) is localized in the
36    compilation manager's bootstrap compiler (actually: in the "init group"
37    handling).  See the comments in src/system/smlnj/init/init.cmi for
38    more details.
39    
40    I also tried to track down all mentions of "Core" (as string argument
41    to Symbol.strSymbol) in the compiler and replaced them with a
42    reference to the new CoreSym.coreSym.  Seems cleaner since the actual
43    name appears in one place only.
44    
45    Binfile and bootfile format have not changed, but the switchover from
46    the old "init.cmi" to the new one is a bit tricky, so I supplied new
47    bootfiles anyway.
48    
49    ----------------------------------------------------------------------
50    Name: Allen Leung
51    Date: 2000/04/02 21:17:00 EST
52    Tag: leunga-20000402-mltree
53    Description:
54    
55       1. Renamed the constructor CALL in MLTREE by popular demand.
56       2. Added a bunch of files from my repository.  These are currently
57          used by other non-SMLNJ backends.
58    
59    ----------------------------------------------------------------------
60    Name: Allen Leung
61    Date: 2000/03/31 21:15:00 EST
62    Tag: leunga-20000331-aliasing
63    Description:
64    
65    This update contains a rewritten (and hopefully more correct) module
66    for extracting aliasing information from CPS.
67    
68       To turn on this feature:
69    
70            Compiler.Control.CG.memDisambiguate := true
71    
72       To pretty-print the region information with the assembly:
73    
74           Compiler.Control.MLRISC.getFlag "asm-show-region" := true;
75    
76       To control how many levels of aliasing information are printed, use:
77    
78           Compiler.Control.MLRISC.getInt "points-to-show-level" := n
79    
80       The default of n is 3.
81    
82    ----------------------------------------------------------------------
83    Name: David MacQueen
84    Date: 2000/03/31 11:15:00 EST
85    Tag: dbm-20000331-runtime_fix
86    Description:
87    
88    This update contains:
89    
90    1. runtime/c-lib/c-libraries.c
91       includes added in revision 1.2 caused compilation errors on hppa-hpux
92    
93    2. fix for bug 1556
94       system/Basis/Implementation/NJ/internal-signals.sml
95    
96    ----------------------------------------------------------------------
97    Name: Matthias Blume
98    Date: 2000/03/31 18:00:00 JST
99    Tag: blume_main_v110p26p2_1
100    Description:
101    
102    This update contains:
103    
104    1. A small change to CM's handling of stable libraries:
105       CM now maintains one "global" modmap that is used for all stable
106       libraries.  The use of such a global modmap maximizes sharing and
107       minimizes the need for re-traversing parts of environments during
108       modmap construction.  (However, this has minor impact since modmap
109       construction seems to account for just one percent or less of total
110       compile time.)
111    
112    2. I added a "genmap" phase to the statistics.  This is where I got the
113       "one percent" number (see above).
114    
115    3. CM's new tool parameter mechanism just became _even_ better. :)
116       - The parser understands named parameters and recursive options.
117       - The "make" and "shell" tools use these new features.
118         (This makes it a lot easier to cascade these tools.)
119       - There is a small syntax change: named parameters use a
120    
121           <name> : ( <option> ... )            or
122           <name> : <string>
123    
124         syntax.  Previously, named parameters were implemented in an
125         ad-hoc fashion by each tool individually (by parsing strings)
126         and had the form
127    
128           <name>=<string>
129    
130       See the CM manual for a full description of these issues.
131    
132    ----------------------------------------------------------------------
133    Name: Matthias Blume
134    Date: 2000/03/30 18:00:00 JST
135    Tag: blume_main_v110p26p2_0
136    Description:
137    
138    !!!!! WARNING !!!!!!
139    !!  New binfiles  !!
140    !!!!!!!!!!!!!!!!!!!!
141    
142    This update contains:
143    
144    1. Moderate changes to CM:
145    
146       - Changes to CM's tools mechanism.  In particular, it is now possible
147       to have tools that accept additional "command line" parameters
148       (specified in the .cm file at each instance where the tool's class is
149       used).
150    
151       This was done to accommodate the new "make" and "shell" tools which
152       facilitate fairly seamless hookup to portions of code managed using
153       Makefiles or Shell scripts.
154    
155       There are no classes "shared" or "private" anymore.  Instead, the
156       sharing annotation is now a parameter to the "sml" class.
157    
158       There is a bit of generic machinery for implementing one's own
159       tools that accept command-line parameters.  However, I am not yet fully
160       satisfied with that part, so expect changes here in the future.
161    
162       All existing tools are described in the CM manual.
163    
164       - Slightly better error handling.  (CM now suppresses many followup
165       error messages that tended to be more annoying than helpful.)
166    
167    2. Major changes to the compiler's static environment data structures.
168    
169       - no CMStaticEnv anymore.
170            - no CMEnv, no "BareEnvironment" (actually, _only_ BareEnvironment,
171              but it is called Environment), no conversions between different
172              kinds of static environments
173    
174       - There is still a notion of a "modmap", but such modmaps are generated
175         on demand at the time when they are needed.  This sounds slow, but I
176         sped up the code that generates modmaps enough for this not to lead to
177         a slowdown of the compiler (at least I didn't detect any).
178    
179       - To facilitate rapid modmap generation, static environments now
180         contain an (optional) "modtree" structure.  Modtree annotations are
181         constructed by the unpickler during unpickling.  (This means that
182         the elaborator does not have to worry about modtrees at all.)
183         Modtrees have the advantage that they are compositional in the same
184         way as the environment data structure itself is compositional.
185         As a result, modtrees never hang on to parts of an environment that
186         has already been rendered "stale" by filtering or rebinding.
187    
188       - I went through many, many trials and errors before arriving at the
189         current solution.  (The initial idea of "linkpaths" did not work.)
190         But the result of all this is that I have touched a lot of files that
191         depend on the "modules" and "types" data structures (most of the
192         elaborator). There were a lot of changes during my "linkpath" trials
193         that could have been reverted to their original state but weren't.
194         Please, don't be too harsh on me for messing with this code a bit more
195         than what was strictly necessary...  (I _did_ resist the temptation
196         of doing any "global reformatting" to avoid an untimely death at
197         Dave's hands. :)
198    
199       - One positive aspect of the previous point:  At least I made sure that
200         all files that I touched now compile without warnings (other than
201         "polyEqual").
202    
203       - compiler now tends to run "leaner" (i.e., ties up less memory in
204         redundant modmaps)
205    
206    ----------------------------------------------------------------------
207    Name: Allen Leung
208    Date: 2000/03/29 18:00:00
209    Tag: leunga-20000327-mlriscGen_hppa_alpha_x86
210    Boot files (optional): ftp://react-ilp.cs.nyu.edu/leunga/110.26.1-sml.boot.x86-unix-20000330.tar.gz
211    Description:
212    
213       This update contains *MAJOR* changes to the way code is generated from CPS
214    in the module mlriscGen, and in various backend modules.
215    
216    CHANGES
217    =======
218    
219    1. MLRiscGen: forward propagation fix.
220    
221       There was a bug in forward propagation, introduced at about the same
222       time as the MLRISC x86 backend, which prevented coalescing from being
223       performed effectively in loops.
224    
225       Effect: speed up of loops in RISC architectures.
226               By itself, this actually slowed down certain benchmarks on the x86.
227    
228    2. MLRiscGen:  forward propagating addresses from consing.
229    
230       I've changed the way consing code is generated.  Basically I separated
231       out the initialization part:
232    
233            store tag,   offset(allocptr)
234            store elem1, offset+4(allocptr)
235            store elem2, offset+8(allocptr)
236            ...
237            store elemn, offset+4n(allocptr)
238    
239       and the address computation part:
240    
241            celladdr <- offset+4+allocptr
242    
243       and move the address computation part
244    
245       Effect:  register pressure is generally lower as a result.  This
246                makes compilation of certain expressions much faster, such as
247                long lists with non-trivial elements.
248    
249                 [(0,0), (0,0), .... (0,0)]
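
The split described in item 2 can be sketched as follows. This is a minimal illustration, not MLRiscGen code: the function name, the textual pseudo-instructions, and the 4-byte word size are assumptions taken from the listing above. It shows only that the initialization stores and the address computation are independent pieces; where the address computation is moved to is not modelled.

```python
def consing_sequence(n_elems, offset, word=4):
    """Return (init_stores, addr_computation) for a record of n_elems fields.

    init_stores mirrors the listing above: the tag at `offset`, then each
    element one word further along.  addr_computation is the separate step
    that yields the cell address (one word past the tag).
    """
    init_stores = [f"store tag, {offset}(allocptr)"]
    for i in range(1, n_elems + 1):
        init_stores.append(f"store elem{i}, {offset + word * i}(allocptr)")
    addr_computation = f"celladdr <- {offset + word} + allocptr"
    return init_stores, addr_computation

stores, addr = consing_sequence(n_elems=2, offset=8)
for line in stores:
    print(line)
print(addr)
```

Because the stores never read celladdr, the address computation can be placed independently of them, which is what allows the lower register pressure reported above.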
250    
251    3. MLRiscGen: base pointer elimination.
252    
253        As part of the linkage mechanism, we generate the sequence:
254    
255         L:  ...  <- start of the code fragment
256    
257         L1:
258             base pointer <- linkreg - L1 + L
259    
260         The base pointer was then used for computing relocatable addresses
261       in the code fragment.  Frequently (such as in lots of continuations)
262       this is not needed.  We now eliminate this sequence whenever possible.
263    
264         For compile time efficiency, I'm using a very stupid local heuristic.
265       But in general, this should be done as a control flow analysis.
266    
267       Effect:  Smaller code size.  Speed up of most programs.
268    
269    4. Hppa back end
270    
271         Long jumps in span dependence resolution used to depend on the existence
272      of the base pointer.
273    
274         A jump to a long label L was expanded into the following sequence:
275    
276          LDIL %hi(L-8192), %r29
277          LDO  %lo(L-8192)(%r29), %r29
278          ADD  %r29, baseptr, %r29
279          BV,n %r0(%r29)
280    
281         In the presence of change (3) above, this will not work.  I've changed
282       it so that the following sequence of instructions is generated, which
283       doesn't mention the base pointer at all:
284    
285             BL,n  L', %r29           /* branch and link, L' + 4 -> %r29 */
286        L':  ADDIL L-(L'+4), %r29     /* Compute address of L */
287             BV,n  %r0(%r29)          /* Jump */
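
The arithmetic behind the new sequence can be checked with a small model. This is a sketch that follows only the comments in the listing above (the link value is L' + 4, then a constant is added); it does not model real PA-RISC instruction semantics such as delay slots or immediate field widths. The function name and the sample offsets are hypothetical.

```python
def long_jump_target(load_addr, label_l, label_lprime):
    """Model the value left in %r29 by the three-instruction sequence.

    label_l and label_lprime are assembly-time offsets of L and L' within
    the code object; load_addr is wherever the object ends up at run time.
    """
    r29 = load_addr + label_lprime + 4     # BL,n: link L' + 4 into %r29
    r29 += label_l - (label_lprime + 4)    # ADDIL: add the constant L-(L'+4)
    return r29                             # BV,n jumps to this address

# The result is the run-time address of L regardless of the load address,
# so no base pointer is required.
for load_addr in (0x10000, 0x7F3000):
    print(hex(long_jump_target(load_addr, label_l=0x9000, label_lprime=0x40)))
```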
288    
289    5. Alpha back end
290    
291          New alpha instructions LDB/LDW have been added, as per Fermin's
292       suggestions.   This is unrelated to all other changes.
293    
294    6. X86 back end
295    
296         I've changed andl to testl in the floating point test sequence
297         whenever appropriate.  The Intel optimization guide states that
298         testl is preferable to andl.
299    
300    7. RA (x86 only)
301    
302         I've improved the spill propagation algorithm, using an approximation
303       of maximal weighted independent sets.   This seems to be necessary to
304       alleviate the slowdown on the x86 caused by (1).
305    
306         I'll write down the algorithm one of these days.
307    
308    8. MLRiscGen: frequencies
309    
310         I've added an annotation that states that all call gc blocks have zero
311       execution frequencies.  This improves register allocation on the x86.
312    
313    BENCHMARKS
314    ==========
315    
316       I've only performed the comparison against 110.25.
317    
318       The platforms are:
319    
320        HPPA  A four processor HP machine (E9000) with 5G of memory.
321        X86   A 300MHz Pentium II with 128M of memory, and
322        SPARC An Ultra sparc 2 with 512M of memory.
323    
324       I used the following parameters for the SML benchmarks:
325    
326                 @SMLalloc
327         HPPA    256k
328         SPARC   512k
329         X86     256k
330    
331    COMPILATION TIME
332    ----------------
333       Here are the numbers comparing the compilation times of the compilers.
334       I've only compared 110.25 compiling the new sources versus
335       a fixpoint version of the new compiler compiling the same.
336    
337                     110.25                                  New
338               Total  Time in RA  Spill+Reload   Total  Time In RA Spill+Reload
339         HPPA   627s    116s        2684+3584     599s    95s       1003+1879
340         SPARC  892s    173s        2891+3870     708s    116s      1004+1880
341         X86    999s    315s       94006+130691   987s    296s    108877+141957
342    
343                   110.25         New
344                Code Size      Code Size
345         HPPA   8596736         8561421
346         SPARC  8974299         8785143
347         X86    9029180         8716783
348    
349       So in summary, things are at least as good as before.   A dramatic
350       reduction in compilation time is obtained on the Sparc; I can't explain it,
351       but it is reproducible.  Perhaps someone should try to reproduce this
352       on their own machines.
353    
354    SML BENCHMARKS
355    --------------
356    
357        On the average, all benchmarks perform at least as well as before.
358    
359          HPPA         Compilation Time     Spill+Reload      Run Time
360                     110.25  New            110.25    New   110.25  New
361    
362          barnesHut  3.158  3.015  4.75%    1+1       0+0   2.980  2.922   2.00%
363              boyer  6.152  5.708  7.77%    0+0       0+0   0.218  0.213   2.34%
364       count-graphs  1.168  1.120  4.32%    0+0       0+0  22.705 23.073  -1.60%
365                fft  0.877  0.792 10.74%    1+3       1+3   0.602  0.587   2.56%
366        knuthBendix  3.180  2.857 11.32%    0+0       0+0   0.675  0.662   2.02%
367             lexgen  6.190  5.290 17.01%    0+0       0+0   0.913  0.788  15.86%
368               life  0.803  0.703 14.22%   25+25      0+0   0.153  0.140   9.52%
369              logic  2.048  2.007  2.08%    6+6       1+1   4.133  4.008   3.12%
370         mandelbrot  0.077  0.080 -4.17%    0+0       0+0   0.765  0.712   7.49%
371             mlyacc 22.932 20.937  9.53%  154+181    32+57  0.468  0.430   8.91%
372            nucleic  5.183  5.060  2.44%    2+2       0+0   0.125  0.120   4.17%
373      ratio-regions  3.357  3.142  6.84%    0+0       0+0  116.225 113.173 2.70%
374                ray  1.283  1.290 -0.52%    0+0       0+0   2.887  2.855   1.11%
375             simple  6.307  6.032  4.56%   28+30      5+7   3.705  3.658   1.28%
376                tsp  0.888  0.862  3.09%    0+0       0+0   7.040  6.893   2.13%
377               vliw 24.378 23.455  3.94%  106+127    25+45  2.758  2.707   1.91%
378      --------------------------------------------------------------------------
379       Average                     6.12%                                   4.09%
380    
381          SPARC        Compilation Time     Spill+Reload      Run Time
382                     110.25  New            110.25    New   110.25  New
383    
384          barnesHut  3.778  3.592  5.20%    2+2       0+0   3.648  3.453    5.65%
385              boyer  6.632  6.110  8.54%    0+0       0+0   0.258  0.242    6.90%
386       count-graphs  1.435  1.325  8.30%    0+0       0+0  33.672 34.737   -3.07%
387                fft  0.980  0.940  4.26%    3+9       2+6   0.838  0.827    1.41%
388        knuthBendix  3.590  3.138 14.39%    0+0       0+0   0.962  0.967   -0.52%
389             lexgen  6.593  6.072  8.59%    1+1       0+0   1.077  1.078   -0.15%
390               life  0.972  0.868 11.90%   26+26      0+0   0.143  0.140    2.38%
391              logic  2.525  2.387  5.80%    7+7       1+1   5.625  5.158    9.05%
392         mandelbrot  0.090  0.093 -3.57%    0+0       0+0   0.855  0.728   17.39%
393             mlyacc 26.732 23.827 12.19%  162+189    32+57  0.550  0.560   -1.79%
394            nucleic  6.233  6.197  0.59%    3+3       0+0   0.163  0.173   -5.77%
395      ratio-regions  3.780  3.507  7.79%    0+0       0+0 133.993 131.035   2.26%
396                ray  1.595  1.550  2.90%    1+1       0+0   3.440  3.418    0.63%
397             simple  6.972  6.487  7.48%   29+32      5+7   3.523  3.525   -0.05%
398                tsp  1.115  1.063  4.86%    0+0       0+0   7.393  7.265    1.77%
399               vliw 27.765 24.818 11.87%  110+135    25+45  2.265  2.135    6.09%
400      ----------------------------------------------------------------------------
401       Average                     6.94%                                    2.64%
402    
403          X86          Compilation Time     Spill+Reload      Run Time
404                     110.25  New            110.25    New   110.25  New
405    
406          barnesHut  5.530  5.420  2.03%  593+893   597+915   3.532  3.440   2.66%
407              boyer  8.768  7.747 13.19%  493+199   301+289   0.327  0.297  10.11%
408       count-graphs  2.040  2.010  1.49%  298+394   315+457  26.578 28.660  -7.26%
409                fft  1.327  1.302  1.92%  112+209   115+210   1.055  0.962   9.71%
410        knuthBendix  5.218  5.475 -4.69%  451+598   510+650   0.928  0.932  -0.36%
411             lexgen  9.970  9.623  3.60% 1014+841  1157+885   0.947  0.928   1.97%
412               life  1.183  1.183  0.00%  162+182   145+148   0.127  0.103  22.58%
413              logic  3.285  3.512 -6.45%  514+684   591+836   5.682  5.577   1.88%
414         mandelbrot  0.147  0.143  2.33%   38+41     33+54    0.703  0.690   1.93%
415             mlyacc 35.457 32.763  8.22% 3496+4564 3611+4860  0.552  0.550   0.30%
416            nucleic  7.100  6.888  3.07%  239+168   201+158   0.175  0.173   0.96%
417      ratio-regions  6.388  6.843 -6.65% 1182+257   981+300  120.142 120.345 -0.17%
418                ray  2.332  2.338 -0.29%  346+398   402+494   3.593  3.540   1.51%
419             simple  9.912  9.903  0.08% 1475+941  1579+1168  3.057  3.178  -3.83%
420                tsp  1.623  1.532  5.98%  266+200   250+211   8.045  7.878   2.12%
421               vliw 33.947 35.470 -4.29% 2629+2774 2877+3171  2.072  1.890   9.61%
422      ----------------------------------------------------------------------------
423       Average                     1.22%                                     3.36%
424    
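
The percentage columns are not defined in the text. Judging from the numbers, they appear to be (old / new - 1) * 100, with each "Average" row being the plain arithmetic mean of its column. The sketch below checks that reading against a few HPPA entries; the formula is an inference from the tables, not something the entry states. Recomputing from the rounded timings can differ from the printed figure in the last digit (barnesHut gives 4.74 against the printed 4.75), presumably because the printed percentages were derived from unrounded timings.

```python
def improvement_pct(old, new):
    """Percentage by which `old` exceeds `new`; positive means the new compiler wins."""
    return (old / new - 1.0) * 100.0

# HPPA compilation-time percentages, copied from the table above.
HPPA_COMPILE_PCT = [4.75, 7.77, 4.32, 10.74, 11.32, 17.01, 14.22, 2.08,
                    -4.17, 9.53, 2.44, 6.84, -0.52, 4.56, 3.09, 3.94]

lexgen_compile = improvement_pct(6.190, 5.290)   # HPPA lexgen, compilation time
lexgen_run = improvement_pct(0.913, 0.788)       # HPPA lexgen, run time
hppa_average = sum(HPPA_COMPILE_PCT) / len(HPPA_COMPILE_PCT)

print(round(lexgen_compile, 2), round(lexgen_run, 2), round(hppa_average, 2))
```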
425    ----------------------------------------------------------------------
426  Name: Allen Leung  Name: Allen Leung
427  Date: 2000/03/23 16:25:00  Date: 2000/03/23 16:25:00
428  Tag: leunga-20000323-fix_x86_alpha  Tag: leunga-20000323-fix_x86_alpha
