[smlnj] Diff of /sml/trunk/HISTORY

revision 570, Wed Mar 8 17:30:13 2000 UTC revision 628, Tue Apr 25 21:03:34 2000 UTC
# Line 11  Line 11 
11  Date:  Date:
12  Tag: <post-commit CVS tag>  Tag: <post-commit CVS tag>
13  Description:  Description:
14    ----------------------------------------------------------------------
15    Name: Allen Leung
16    Date: 2000/04/25 17:00:00 EST
17    Tag: leunga-20000425-alpha-ra
18    Description:
19    
20    1. Alpha
21    
22        PSEUDOARITH was missing in AlphaRewrite.  This causes an endless loop
23    in C--.
24    
25    2. RA
26    
27       Added a flag "ra-dump-size" to print out the size of the flowgraph
28       and the interference graph.
29    
30  ----------------------------------------------------------------------  ----------------------------------------------------------------------
31  Name:  Name: Dave MacQueen
32  Date:  Date: 2000/04/25/
33  Tag:  Tag: dbm-20000425-mlyacc_doc_examples
34    Description:
35      Updated mlyacc.tex sections 5 and 7 for SML '97 and CM.
36      Updated all three examples in src/ml-yacc/examples to run
37      under 110.* using CM.make.
38    
39    ----------------------------------------------------------------------
40    Name: Allen Leung
41    Date: 2000/04/20 23:04:00 EST
42    Tag: leunga-20000420-ssa-c---stuff
43    Description:
44    
45      This update synchronizes my repository with Yale's.  Most of these
46    changes, however, do not affect SML/NJ at all (the RA is an exception).
47    
48    1. Register Allocator
49    
50       a. An improvement in the interference graph construction:
51          Given a copy
52    
53                s <- t
54    
55          no interference edge between s and t is added for this definition of s.
56    
57       b. I've added two new spill heuristic modules that Fermin and I developed
58          (in the new library RA.cm). These are unused in SML/NJ but may be
59          useful for others (Moby?)
60    
61    2. X86
62    
63       a. Various fixes in the backend provided by Fermin [C--] and Lal.
64    
65    3. Alpha
66    
67       a. Added the BSR instruction and code generation that goes with it [C--]
68       b. Other fixes too numerous to recount provided by Fermin [C--]
69    
70    4. Regmaps
71    
72       a. The regmaps are not initialized with the identity physical bindings
73          at creation time.  This is unneeded.
74    
75    5. MLRISC Optimizations
76    
77       a. The DJ-Graph module can now compute the iterated dominance frontiers
78          intersected with liveness incrementally in linear time! Woohoo!
79          This is now used in my new SSA construction algorithm.
80    
81       b. The branch reorganization module is now smarter about linear chains of
82          basic blocks.
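
The copy rule in item 1a can be sketched as follows. This is a minimal illustration in Python, not the MLRISC implementation: the instruction tuple layout and the names `build_interference`, `live_out`, and `is_copy` are assumptions made for the example. It scans one basic block backwards and, at a copy `s <- t`, adds no edge between `s` and `t` for that definition.

```python
def build_interference(instrs, live_out):
    """Build interference edges for one basic block.

    instrs   : list of (dst, srcs, is_copy) tuples in program order
               (hypothetical representation, for illustration only)
    live_out : set of names live at the end of the block
    """
    edges = set()
    live = set(live_out)
    # Walk the block backwards, tracking which names are live after each
    # instruction.
    for dst, srcs, is_copy in reversed(instrs):
        # At a copy dst <- src, the source is exempt: no edge between
        # dst and src is added for this definition of dst.
        exempt = set(srcs) if is_copy else set()
        for other in live - {dst} - exempt:
            edges.add(frozenset((dst, other)))
        # dst is defined here, so it is dead above; its sources are live.
        live.discard(dst)
        live.update(srcs)
    return edges


block = [("t", [], False), ("s", ["t"], True), ("u", ["s", "t"], False)]
print(build_interference(block, {"u"}))  # the copy adds no s--t edge
```

Leaving out the edge is what allows `s` and `t` to be coalesced into the same register later.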
83    
84    
85    ----------------------------------------------------------------------
86    Name: Matthias Blume
87    Date: 2000/04/12 13:52:00 JST
88    Tag: blume_main_v110p27_1
89    Description:
90    
91    Changed install.sh script to handle archive files without version number
92    and to use "boot.<arch>-<os>" instead of "sml.boot.<arch>-<os>" for the
93    name of the boot file archive.
94    
95    ----------------------------------------------------------------------
96    Name: Dave MacQueen
97    Date: 2000/04/09 14:00 EDT
98    Tag: dbm-20000410-Version_110_27
99    Description:
100    
101    1. Updated src/compiler/TopLevel/main/version.sml to version 110.27
102    
103    2. Updated src/config/version to 110.27
104    
105    3. New boot files!
106    
107    ----------------------------------------------------------------------
108    Name: Allen Leung
109    Date: 2000/04/09 19:09:00 EST
110    Tag: leunga-20000409-misc
111    Description:
112    
113    1.  Yet another fix for x86 assembly for idivl, imull, mull and friends.
114    
115    2.  Miscellaneous improvements to MLRISC (unused in sml/nj)
116    
117    ----------------------------------------------------------------------
118    Name: Stefan
119    Date: 2000/04/07 10:00:00 EDT
120    Tag: monnier-20000406-branch-handling
121    Description:
122    
123    Improved handling of branches (mostly those generated from
124    polymorphic equality), removed switchoff and changed the
125    default optimization settings (more cpsopt and less flintopt).
126    
127    ----------------------------------------------------------------------
128    Name: Allen Leung
129    Date: 2000/04/06 01:30:00 EST
130    Tag: leunga-20000406-peephole-x86-SSA-2
131    Description:
132    
133       Forgot a few files.
134    
135    ----------------------------------------------------------------------
136    Name: Allen Leung
137    Date: 2000/04/06 00:36:00 EST
138    Tag: leunga-20000406-peephole-x86-SSA
139    Description:
140    
141    1.  New Peephole code
142    
143    2.  Minor improvement to X86 instruction selection
144    
145    3.  Various fixes to SSA and machine description -> code translator
146    
147    ----------------------------------------------------------------------
148    Name: Matthias Blume
149    Date: 2000/04/05 12:30:00 JST
150    Tag: blume_main_v110p26p2_3
151    Description:
152    
153    This update just merges three minor cosmetic updates to CM's sources
154    to get ready for the 110.27 code freeze on Friday.  No functionality
155    has changed.
156    
157    ----------------------------------------------------------------------
158    Name: Allen Leung
159    Date: 2000/04/04 19:39:00 EST
160    Tag: leunga-20000404-x86-asm
161    Description:
162    
163    1.  Fixed a problem in X86 assembly.
164    
165        Things like
166    
167           jmp %eax
168           jmp (%eax)
169    
170        should be output as
171    
172           jmp *%eax
173           jmp *(%eax)
174    
175    2.  Assembly output
176    
177          Added a new flag
178    
179              "asm-indent-copies" (defaults to false)
180    
181          When this flag is on, parallel copies will be indented an extra level.
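
The rule behind item 1 is that, in AT&T syntax, a jump through a register or memory operand is an indirect jump and takes a `*` prefix, while a jump to a label does not. A minimal sketch of that rule in Python follows; the helper name `format_jmp` and the way it classifies operands are assumptions for illustration and are not taken from the MLRISC assembly emitter.

```python
def format_jmp(operand: str) -> str:
    """Return AT&T-syntax text for a jump to `operand` (illustrative helper)."""
    # Register operands (%eax) and memory operands ((%eax), 8(%ebx), ...)
    # are indirect targets and need a leading '*'.  Anything else is
    # treated as a plain label, i.e. a direct target.
    indirect = operand.startswith("%") or "(" in operand
    return f"jmp *{operand}" if indirect else f"jmp {operand}"


for target in ("%eax", "(%eax)", "L1"):
    print(format_jmp(target))
```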
182    
183    ----------------------------------------------------------------------
184    Name: Allen Leung
185    Date: 2000/04/04 03:18:00 EST
186    Tag: leunga-20000404-C--Moby
187    Description:
188    
189        All of these fixes are related to C--, Moby, and my own optimization
190        stuff; so they shouldn't affect SML/NJ.
191    
192    1.  X86
193    
194        Various fixes related to floating point, and extensions.
195    
196    2.  Alpha
197    
198        Some extra patterns related to loads with signed/zero extension
199        provided by Fermin.
200    
201    3.  Assembly
202    
203        When generating assembly, resolve the value of client defined constants,
204        instead of generating symbolic values.  This is controlled by the
205        new flag "asm-resolve-constants", which defaults to true.
206    
207    4.  Machine Descriptions
208    
209        a. The precedence parser was slightly broken when parsing infixr symbols.
210        b. The type generalizing code had the bound variables reversed, resulting
211           in a problem during arity raising.
212        c. Various fixes in machine descriptions.
213    
214    ----------------------------------------------------------------------
215    Name: Matthias Blume
216    Date: 2000/04/03 16:05:00 JST
217    Tag: blume_main_v110p26p2_2
218    Description:
219    
220    I eliminated coreEnv from compInfo.  Access to the "Core" structure is
221    now done via the ordinary static environment that is context to each
222    compilation unit.
223    
224    To this end, I arranged that instead of "structure Core", a "structure
225    _Core" is bound in the pervasive environment.  Core access is done via
226    _Core (which can never be accidentally rebound because _Core is not a
227    legal surface-syntax symbol).
228    
229    The current solution is much cleaner because the core environment is
230    now simply part of the pervasive environment which is part of every
231    compilation unit's context anyway.  In particular, this eliminates all
232    special-case handling that was necessary until now in order to deal
233    with dynamic and symbolic parts of the core environment.
234    
235    Remaining hackery (to bind the "magic" symbol _Core) is localized in the
236    compilation manager's bootstrap compiler (actually: in the "init group"
237    handling).  See the comments in src/system/smlnj/init/init.cmi for
238    more details.
239    
240    I also tried to track down all mentions of "Core" (as string argument
241    to Symbol.strSymbol) in the compiler and replaced them with a
242    reference to the new CoreSym.coreSym.  Seems cleaner since the actual
243    name appears in one place only.
244    
245    Binfile and bootfile format have not changed, but the switchover from
246    the old "init.cmi" to the new one is a bit tricky, so I supplied new
247    bootfiles anyway.
248    
249    ----------------------------------------------------------------------
250    Name: Allen Leung
251    Date: 2000/04/02 21:17:00 EST
252    Tag: leunga-20000402-mltree
253    Description:
254    
255       1. Renamed the constructor CALL in MLTREE by popular demand.
256       2. Added a bunch of files from my repository.  These are currently
257          used by other non-SMLNJ backends.
258    
259    ----------------------------------------------------------------------
260    Name: Allen Leung
261    Date: 2000/03/31 21:15:00 EST
262    Tag: leunga-20000331-aliasing
263    Description:
264    
265    This update contains a rewritten (and hopefully more correct) module
266    for extracting aliasing information from CPS.
267    
268       To turn on this feature:
269    
270            Compiler.Control.CG.memDisambiguate := true
271    
272       To pretty print the region information with assembly
273    
274           Compiler.Control.MLRISC.getFlag "asm-show-region" := true;
275    
276       To control how many levels of aliasing information are printed, use:
277    
278           Compiler.Control.MLRISC.getInt "points-to-show-level" := n
279    
280       The default of n is 3.
281    
282    ----------------------------------------------------------------------
283    Name: David MacQueen
284    Date: 2000/03/31 11:15:00 EST
285    Tag: dbm-20000331-runtime_fix
286    Description:
287    
288    This update contains:
289    
290    1. runtime/c-lib/c-libraries.c
291       includes added in revision 1.2 caused compilation errors on hppa-hpux
292    
293    2. fix for bug 1556
294       system/Basis/Implementation/NJ/internal-signals.sml
295    
296    ----------------------------------------------------------------------
297    Name: Matthias Blume
298    Date: 2000/03/31 18:00:00 JST
299    Tag: blume_main_v110p26p2_1
300    Description:
301    
302    This update contains:
303    
304    1. A small change to CM's handling of stable libraries:
305       CM now maintains one "global" modmap that is used for all stable
306       libraries.  The use of such a global modmap maximizes sharing and
307       minimizes the need for re-traversing parts of environments during
308       modmap construction.  (However, this has minor impact since modmap
309       construction seems to account for just one percent or less of total
310       compile time.)
311    
312    2. I added a "genmap" phase to the statistics.  This is where I got the
313       "one percent" number (see above).
314    
315    3. CM's new tool parameter mechanism just became _even_ better. :)
316       - The parser understands named parameters and recursive options.
317       - The "make" and "shell" tools use these new features.
318         (This makes it a lot easier to cascade these tools.)
319       - There is a small syntax change: named parameters use a
320    
321           <name> : ( <option> ... )            or
322           <name> : <string>
323    
324         syntax.  Previously, named parameters were implemented in an
325         ad-hoc fashion by each tool individually (by parsing strings)
326         and had the form
327    
328           <name>=<string>
329    
330       See the CM manual for a full description of these issues.
331    
332    ----------------------------------------------------------------------
333    Name: Matthias Blume
334    Date: 2000/03/30 18:00:00 JST
335    Tag: blume_main_v110p26p2_0
336    Description:
337    
338    !!!!! WARNING !!!!!!
339    !!  New binfiles  !!
340    !!!!!!!!!!!!!!!!!!!!
341    
342    This update contains:
343    
344    1. Moderate changes to CM:
345    
346       - Changes to CM's tools mechanism.  In particular, it is now possible
347       to have tools that accept additional "command line" parameters
348       (specified in the .cm file at each instance where the tool's class is
349       used).
350    
351       This was done to accommodate the new "make" and "shell" tools which
352       facilitate fairly seamless hookup to portions of code managed using
353       Makefiles or Shell scripts.
354    
355       There are no classes "shared" or "private" anymore.  Instead, the
356       sharing annotation is now a parameter to the "sml" class.
357    
358       There is a bit of generic machinery for implementing one's own
359       tools that accept command-line parameters.  However, I am not yet fully
360       satisfied with that part, so expect changes here in the future.
361    
362       All existing tools are described in the CM manual.
363    
364       - Slightly better error handling.  (CM now suppresses many followup
365       error messages that tended to be more annoying than helpful.)
366    
367    2. Major changes to the compiler's static environment data structures.
368    
369       - no CMStaticEnv anymore.
370            - no CMEnv, no "BareEnvironment" (actually, _only_ BareEnvironment,
371              but it is called Environment), no conversions between different
372              kinds of static environments
373    
374       - There is still a notion of a "modmap", but such modmaps are generated
375         on demand at the time when they are needed.  This sounds slow, but I
376         sped up the code that generates modmaps enough for this not to lead to
377         a slowdown of the compiler (at least I didn't detect any).
378    
379       - To facilitate rapid modmap generation, static environments now
380         contain an (optional) "modtree" structure.  Modtree annotations are
381         constructed by the unpickler during unpickling.  (This means that
382         the elaborator does not have to worry about modtrees at all.)
383         Modtrees have the advantage that they are compositional in the same
384         way as the environment data structure itself is compositional.
385         As a result, modtrees never hang on to parts of an environment that
386         has already been rendered "stale" by filtering or rebinding.
387    
388       - I went through many, many trials and errors before arriving at the
389         current solution.  (The initial idea of "linkpaths" did not work.)
390         But the result of all this is that I have touched a lot of files that
391         depend on the "modules" and "types" data structures (most of the
392         elaborator). There were a lot of changes during my "linkpath" trials
393         that could have been reverted to their original state but weren't.
394         Please, don't be too harsh on me for messing with this code a bit more
395         than what was strictly necessary...  (I _did_ resist the temptation
396         of doing any "global reformatting" to avoid an untimely death at
397         Dave's hands. :)
398    
399       - One positive aspect of the previous point:  At least I made sure that
400         all files that I touched now compile without warnings (other than
401         "polyEqual").
402    
403       - compiler now tends to run "leaner" (i.e., ties up less memory in
404         redundant modmaps)
405    
406    ----------------------------------------------------------------------
407    Name: Allen Leung
408    Date: 2000/03/29 18:00:00
409    Tag: leunga-20000327-mlriscGen_hppa_alpha_x86
410    Boot files (optional): ftp://react-ilp.cs.nyu.edu/leunga/110.26.1-sml.boot.x86-unix-20000330.tar.gz
411    Description:
412    
413       This update contains *MAJOR* changes to the way code is generated from CPS
414    in the module mlriscGen, and in various backend modules.
415    
416    CHANGES
417    =======
418    
419    1. MLRiscGen: forward propagation fix.
420    
421       There was a bug in forward propagation introduced at about the same time
422       as the MLRISC x86 backend, which prevented coalescing from being
423       performed effectively in loops.
424    
425       Effect: speed up of loops in RISC architectures.
426               By itself, this actually slowed down certain benchmarks on the x86.
427    
428    2. MLRiscGen:  forward propagating addresses from consing.
429    
430       I've changed the way consing code is generated.  Basically I separated
431       out the initialization part:
432    
433            store tag,   offset(allocptr)
434            store elem1, offset+4(allocptr)
435            store elem2, offset+8(allocptr)
436            ...
437            store elemn, offset+4n(allocptr)
438    
439       and the address computation part:
440    
441            celladdr <- offset+4+allocptr
442    
443       and move the address computation part
444    
445       Effect:  register pressure is generally lower as a result.  This
446                makes compilation of certain expressions much faster, such as
447                long lists with non-trivial elements.
448    
449                 [(0,0), (0,0), .... (0,0)]
450    
451    3. MLRiscGen: base pointer elimination.
452    
453        As part of the linkage mechanism, we generate the sequence:
454    
455         L:  ...  <- start of the code fragment
456    
457         L1:
458             base pointer <- linkreg - L1 + L
459    
460         The base pointer was then used for computing relocatable addresses
461       in the code fragment.  Frequently (such as in lots of continuations)
462       this is not needed.  We now eliminate this sequence whenever possible.
463    
464         For compile time efficiency, I'm using a very stupid local heuristic.
465       But in general, this should be done as a control flow analysis.
466    
467       Effect:  Smaller code size.  Speed up of most programs.
468    
469    4. Hppa back end
470    
471         Long jumps in span dependence resolution used to depend on the existence
472      of the base pointer.
473    
474         A jump to a long label L was expanded into the following sequence:
475    
476          LDIL %hi(L-8192), %r29
477          LDO  %lo(L-8192)(%r29), %r29
478          ADD  %r29, baseptr, %r29
479          BV,n %r0(%r29)
480    
481         In the presence of change (3) above, this will not work.  I've changed
482       it so that the following sequence of instructions is generated, which
483       doesn't mention the base pointer at all:
484    
485             BL,n  L', %r29           /* branch and link, L' + 4 -> %r29 */
486        L':  ADDIL L-(L'+4), %r29     /* Compute address of L */
487             BV,n  %r0(%r29)          /* Jump */
488    
489    5. Alpha back end
490    
491          New alpha instructions LDB/LDW have been added, as per Fermin's
492       suggestions.   This is unrelated to all other changes.
493    
494    6. X86 back end
495    
496         I've changed andl to testl in the floating point test sequence
497         whenever appropriate.  The Intel optimization guide states that
498         testl is preferable to andl.
499    
500    7. RA (x86 only)
501    
502         I've improved the spill propagation algorithm, using an approximation
503       of maximal weighted independent sets.   This seems to be necessary to
504       alleviate the negative effect in light of the slow down in (1).
505    
506         I'll write down the algorithm one of these days.
507    
508    8. MLRiscGen: frequencies
509    
510         I've added an annotation that states that all call gc blocks have zero
511       execution frequencies.  This improves register allocation on the x86.
512    
513    BENCHMARKS
514    ==========
515    
516       I've only performed the comparison against 110.25.
517    
518       The platforms are:
519    
520        HPPA  A four processor HP machine (E9000) with 5G of memory.
521        X86   A 300MHz Pentium II with 128M of memory, and
522        SPARC An Ultra sparc 2 with 512M of memory.
523    
524       I used the following parameters for the SML benchmarks:
525    
526                 @SMLalloc
527         HPPA    256k
528         SPARC   512k
529         X86     256k
530    
531    COMPILATION TIME
532    ----------------
533       Here are the numbers comparing the compilation times of the compilers.
534       I've only compared 110.25 compiling the new sources versus
535       a fixpoint version of the new compiler compiling the same.
536    
537                     110.25                                  New
538               Total  Time in RA  Spill+Reload   Total  Time In RA Spill+Reload
539         HPPA   627s    116s        2684+3584     599s    95s       1003+1879
540         SPARC  892s    173s        2891+3870     708s    116s      1004+1880
541         X86    999s    315s       94006+130691   987s    296s    108877+141957
542    
543                   110.25         New
544                Code Size      Code Size
545         HPPA   8596736         8561421
546         SPARC  8974299         8785143
547         X86    9029180         8716783
548    
549       So in summary, things are at least as good as before.   Dramatic
550       reduction in compilation time is obtained on the Sparc; I can't explain it,
551       but it is reproducible.  Perhaps someone should try to reproduce this
552       on their own machines.
553    
554    SML BENCHMARKS
555    --------------
556    
557        On the average, all benchmarks perform at least as well as before.
558    
559          HPPA         Compilation Time     Spill+Reload      Run Time
560                     110.25  New            110.25    New   110.25  New
561    
562          barnesHut  3.158  3.015  4.75%    1+1       0+0   2.980  2.922   2.00%
563              boyer  6.152  5.708  7.77%    0+0       0+0   0.218  0.213   2.34%
564       count-graphs  1.168  1.120  4.32%    0+0       0+0  22.705 23.073  -1.60%
565                fft  0.877  0.792 10.74%    1+3       1+3   0.602  0.587   2.56%
566        knuthBendix  3.180  2.857 11.32%    0+0       0+0   0.675  0.662   2.02%
567             lexgen  6.190  5.290 17.01%    0+0       0+0   0.913  0.788  15.86%
568               life  0.803  0.703 14.22%   25+25      0+0   0.153  0.140   9.52%
569              logic  2.048  2.007  2.08%    6+6       1+1   4.133  4.008   3.12%
570         mandelbrot  0.077  0.080 -4.17%    0+0       0+0   0.765  0.712   7.49%
571             mlyacc 22.932 20.937  9.53%  154+181    32+57  0.468  0.430   8.91%
572            nucleic  5.183  5.060  2.44%    2+2       0+0   0.125  0.120   4.17%
573      ratio-regions  3.357  3.142  6.84%    0+0       0+0  116.225 113.173 2.70%
574                ray  1.283  1.290 -0.52%    0+0       0+0   2.887  2.855   1.11%
575             simple  6.307  6.032  4.56%   28+30      5+7   3.705  3.658   1.28%
576                tsp  0.888  0.862  3.09%    0+0       0+0   7.040  6.893   2.13%
577               vliw 24.378 23.455  3.94%  106+127    25+45  2.758  2.707   1.91%
578      --------------------------------------------------------------------------
579       Average                     6.12%                                   4.09%
580    
581          SPARC        Compilation Time     Spill+Reload      Run Time
582                     110.25  New            110.25    New   110.25  New
583    
584          barnesHut  3.778  3.592  5.20%    2+2       0+0   3.648  3.453    5.65%
585              boyer  6.632  6.110  8.54%    0+0       0+0   0.258  0.242    6.90%
586       count-graphs  1.435  1.325  8.30%    0+0       0+0  33.672 34.737   -3.07%
587                fft  0.980  0.940  4.26%    3+9       2+6   0.838  0.827    1.41%
588        knuthBendix  3.590  3.138 14.39%    0+0       0+0   0.962  0.967   -0.52%
589             lexgen  6.593  6.072  8.59%    1+1       0+0   1.077  1.078   -0.15%
590               life  0.972  0.868 11.90%   26+26      0+0   0.143  0.140    2.38%
591              logic  2.525  2.387  5.80%    7+7       1+1   5.625  5.158    9.05%
592         mandelbrot  0.090  0.093 -3.57%    0+0       0+0   0.855  0.728   17.39%
593             mlyacc 26.732 23.827 12.19%  162+189    32+57  0.550  0.560   -1.79%
594            nucleic  6.233  6.197  0.59%    3+3       0+0   0.163  0.173   -5.77%
595      ratio-regions  3.780  3.507  7.79%    0+0       0+0 133.993 131.035   2.26%
596                ray  1.595  1.550  2.90%    1+1       0+0   3.440  3.418    0.63%
597             simple  6.972  6.487  7.48%   29+32      5+7   3.523  3.525   -0.05%
598                tsp  1.115  1.063  4.86%    0+0       0+0   7.393  7.265    1.77%
599               vliw 27.765 24.818 11.87%  110+135    25+45  2.265  2.135    6.09%
600      ----------------------------------------------------------------------------
601       Average                     6.94%                                    2.64%
602    
603          X86          Compilation Time     Spill+Reload      Run Time
604                     110.25  New            110.25    New   110.25  New
605    
606          barnesHut  5.530  5.420  2.03%  593+893   597+915   3.532  3.440   2.66%
607              boyer  8.768  7.747 13.19%  493+199   301+289   0.327  0.297  10.11%
608       count-graphs  2.040  2.010  1.49%  298+394   315+457  26.578 28.660  -7.26%
609                fft  1.327  1.302  1.92%  112+209   115+210   1.055  0.962   9.71%
610        knuthBendix  5.218  5.475 -4.69%  451+598   510+650   0.928  0.932  -0.36%
611             lexgen  9.970  9.623  3.60% 1014+841  1157+885   0.947  0.928   1.97%
612               life  1.183  1.183  0.00%  162+182   145+148   0.127  0.103  22.58%
613              logic  3.285  3.512 -6.45%  514+684   591+836   5.682  5.577   1.88%
614         mandelbrot  0.147  0.143  2.33%   38+41     33+54    0.703  0.690   1.93%
615             mlyacc 35.457 32.763  8.22% 3496+4564 3611+4860  0.552  0.550   0.30%
616            nucleic  7.100  6.888  3.07%  239+168   201+158   0.175  0.173   0.96%
617      ratio-regions  6.388  6.843 -6.65% 1182+257   981+300  120.142 120.345 -0.17%
618                ray  2.332  2.338 -0.29%  346+398   402+494   3.593  3.540   1.51%
619             simple  9.912  9.903  0.08% 1475+941  1579+1168  3.057  3.178  -3.83%
620                tsp  1.623  1.532  5.98%  266+200   250+211   8.045  7.878   2.12%
621               vliw 33.947 35.470 -4.29% 2629+2774 2877+3171  2.072  1.890   9.61%
622      ----------------------------------------------------------------------------
623       Average                     1.22%                                     3.36%
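
The source does not say how the percentage columns in these tables are computed. The figures appear consistent with an improvement measured relative to the new time, that is (old/new - 1) * 100, presumably evaluated on unrounded timings, since a few rows differ from this in the last digit. A small sketch under that assumption (the function name `improvement_pct` is made up for illustration):

```python
def improvement_pct(old: float, new: float) -> float:
    """Improvement of `new` over `old`, in percent, relative to the new time.

    Positive means the new compiler is faster; negative means slower.
    The formula is inferred from the tables, not stated in the source.
    """
    return (old / new - 1.0) * 100.0


# HPPA lexgen compilation time from the table above: 6.190s -> 5.290s
print(f"{improvement_pct(6.190, 5.290):.2f}%")  # prints 17.01%
```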
624    
625    ----------------------------------------------------------------------
626    Name: Allen Leung
627    Date: 2000/03/23 16:25:00
628    Tag: leunga-20000323-fix_x86_alpha
629    Description:
630    
631    1. X86 fixes/changes
632    
633       a.  The old code generated for SETcc was completely wrong.
634           The Intel optimization guide is VERY misleading.
635    
636    2. ALPHA fixes/changes
637    
638       a.  Added the instructions LDBU, LDWU, STB, STW as per Fermin's suggestion.
639       b.  Added a new mode byteWordLoadStores to the functor parameter to Alpha()
640       c.  Added reassociation code for address computation.
641    
642    ----------------------------------------------------------------------
643    Name: Allen Leung
644    Date: 2000/03/22 01:23:00
645    Tag: leunga-20000322-fix_x86_hppa_ra
646    Description:
647    
648    1. X86 fixes/changes
649    
650       a.  x86Rewrite bug with MUL3 (found by Lal)
651       b.  Added the instructions FSTS, FSTL
652    
653    2. PA-RISC fixes/changes
654    
655       a.  B label should not be a delay slot candidate!  Why did this work?
656       b.  ADDT(32, REG(32, r), LI n) now generates one instruction instead of two,
657           as it should be.
658       c.  The assembly syntax for fstds and fstdd was wrong.
659       d.  Added the composite instruction COMICLR/LDO, which is the immediate
660           operand variant of COMCLR/LDO.
661    
662    3. Generic MLRISC
663    
664       a.  shuffle.sml rewritten to be slightly more efficient
665       b.  DIV bug in mltree-simplify fixed (found by Fermin)
666    
667    4. Register Allocator
668    
669       a.  I now release the interference graph earlier during spilling.
670           May improve memory usage.
671    
672    ----------------------------------------------------------------------
673    Name: Matthias Blume
674    Date: 2000/03/14 14:15:32
675    Tag: blume_main_v110p26p1_2
676    Description:
677    
678    1. Tools.registerStdShellCmdTool (from smlnj/cm/tool.cm) takes an
679    additional argument called "template" which is an optional string that
680    specifies the layout of the tool command line.  See the CM manual for
681    explanation.
682    
683    2. A special-purpose tool can be "registered" by simply dropping the
684    corresponding <...>-tool.cm (and/or <...>-ext.cm) into the same
685    directory where the .cm file lives that uses this tool.  (The
686    behavior/misfeature until now was to look for the tool description
687    files in the current working directory.)  As before, tool description
688    files could also be anchored -- in which case they can live anywhere
689    they like.  Following the recent e-mail discussion, this change should
690    make it easier to have special-purpose tools that are shipped together
691    with the sources of the program that uses them.
692    
693    ----------------------------------------------------------------------
694    Name: Matthias Blume
695    Date: 2000/03/10 07:48:34
696    Tag: blume_main_v110p26p1_1
697    Description:
698    
699    I added a re-written version of Dave's fixpt script to src/system.
700    Changes relative to the original version:
701      - sh-ified (not everybody has ksh)
702      - automatically figures out which architecture it runs on
703      - uses ./makeml a bit more cleverly
704      - never invokes ./installml (and, thus, does not clobber your
705        good and working installation of sml in case something goes wrong)
706      - accepts max iteration count using option "-iter <n>"
707      - accepts a "base" name using option "-base <base>"
708    
709    It does not build any extraneous heap images but directly rebuilds
710    bin- and boot-hierarchies using makeml's "-rebuild" switch. Finally,
711    it can incorporate existing bin- and boot- hierarchies.  For example,
712    suppose the base is set to "sml" (which is the default).  Then it
713    successively builds
714    
715            sml.bin.<arch>-unix and sml.boot.<arch>-unix
716    then    sml1.bin.<arch>-unix and sml1.boot.<arch>-unix
717    then    sml2.bin.<arch>-unix and sml2.boot.<arch>-unix
718    ...
719    then    sml<n>.bin.<arch>-unix and sml<n>.boot.<arch>-unix
720    
721    and so on.  If any of these already exist, it will just use what's
722    there.  In particular, many people will have the initial set of bin
723    and boot files around, so this saves time for at least one full
724    rebuild.  Having sets of the form <base><k>.{bin,boot}.<arch>-unix for
725    <k>=1,2,... is normally not a good idea when invoking fixpt.  However,
726    they might be the result of an earlier partial run of fixpt (which
727    perhaps got accidentally killed).  In this case, fixpt will quickly
728    move through what exists before continuing where it left off earlier,
729    and, thus, saves a lot of time.
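
The naming scheme and the resume behavior described above can be sketched as follows. This is only an illustration in Python, not the fixpt script itself (which is a sh script). The function names `hierarchy_names` and `first_missing` and the example architecture "x86" are assumptions made for this sketch. How the real script decides when to stop iterating is not shown here.

```python
import os

def hierarchy_names(base, arch, k):
    """Return the (bin, boot) directory names for iteration k.

    k == 0 is the initial set (<base>.bin/boot.<arch>-unix); k >= 1
    gives <base><k>.bin/boot.<arch>-unix, as in the listing above.
    """
    stem = base if k == 0 else f"{base}{k}"
    return f"{stem}.bin.{arch}-unix", f"{stem}.boot.{arch}-unix"

def first_missing(base, arch, max_iter, exists=os.path.isdir):
    """Return the smallest k in 0..max_iter whose bin or boot
    directory is missing, or None if all of them are present.

    A resumed run would reuse everything before that k and start
    rebuilding from there (hypothetical helper, for illustration).
    """
    for k in range(max_iter + 1):
        bin_dir, boot_dir = hierarchy_names(base, arch, k)
        if not (exists(bin_dir) and exists(boot_dir)):
            return k
    return None
```

For example, if sml.* and sml1.* already exist from an interrupted run, `first_missing("sml", "x86", 3)` would report 2, so only sml2.* and later sets would need to be built.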
730    
731    ----------------------------------------------------------------------
732    Name: Allen Leung
733    Date: 00/03/10 02:20:00
734    Tag: leunga-20000310-fix_x86_asm_ra
735    Description:
736    
737    More assembly output problems involving the indexed addressing mode
738    on the x86 have been found and corrected. Thanks to Fermin Reig for the
739    fix.
740    
741    The interface and implementation of the register allocator have been changed
742    slightly so that the register allocation phases can be skipped
743    completely, going directly to memory allocation.  This is needed
744    for C-- use.
745    
746    ----------------------------------------------------------------------
747    Name: Matthias Blume
748    Date: 00/03/09 10:23:53
749    Tag: blume_main_v110p26p1_0
750    Description:
751    
752    * Complete re-organization of library names.  Many libraries have been
753    consolidated so that they share the same path anchor.  For example,
754    all MLRISC-related libraries are anchored at MLRISC, most libraries that
755    are SML/NJ-specific are under "smlnj".  Notice that names like
756    host-cmb.cm or host-compiler.cm no longer exist.  See system/README
757    for a complete description of the new naming scheme.  Quick reference:
758    
759       host-cmb.cm        -> smlnj/cmb.cm
760       host-compiler.cm   -> smlnj/compiler.cm
761       full-cm.cm         -> smlnj/cm.cm
762       <arch>-<os>.cm     -> smlnj/cmb/<arch>-<os>.cm
763       <arch>-compiler.cm -> smlnj/compiler/<arch>.cm
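
As an illustration of the quick reference above, the renaming can be written as a small lookup. This is a sketch in Python and not part of CM. The function name `new_library_name` is hypothetical, and the sketch assumes that any remaining name of the form x-y.cm is an <arch>-<os>.cm library. It does not check that x and y are real architecture or OS names, so unrelated hyphenated names would be mapped as well.

```python
import re

# Names that map one-to-one in the quick reference.
FIXED = {
    "host-cmb.cm": "smlnj/cmb.cm",
    "host-compiler.cm": "smlnj/compiler.cm",
    "full-cm.cm": "smlnj/cm.cm",
}

def new_library_name(old):
    """Map an old library name to its new anchored name, or None."""
    if old in FIXED:
        return FIXED[old]
    # <arch>-compiler.cm -> smlnj/compiler/<arch>.cm
    # (checked before the <arch>-<os> pattern, which it would also match)
    m = re.fullmatch(r"([^-/]+)-compiler\.cm", old)
    if m:
        return f"smlnj/compiler/{m.group(1)}.cm"
    # <arch>-<os>.cm -> smlnj/cmb/<arch>-<os>.cm (assumption: no validation)
    m = re.fullmatch(r"([^-/]+)-([^-/]+)\.cm", old)
    if m:
        return f"smlnj/cmb/{m.group(1)}-{m.group(2)}.cm"
    return None
```

The fixed names are looked up first because "host-compiler.cm" would otherwise be treated as an <arch>-compiler.cm library.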
764    
765    * Bug fixes in CM.
766        - exceptions in user code are being passed through (i.e., reach top level)
767        - more bugs in paranoia mode fixed
768        - bug related to checking group owners fixed
769    
770    * New install.sh script that automagically fetches archive files:
771      The new file config/srcarchiveurl must contain the URL of the
772      (remote) directory that contains bin files (or other source archives).
773      If install.sh does not find the archive locally, it tries to get
774      it from that remote directory.
775      This should simplify installation further:  For machines that have
776      access to the internet, just fetch <version>-config.tgz, unpack it,
777    edit config/targets, and go (run config/install.sh).  The script will
778      fetch everything else that it might need all by itself.
779    
780      For CVS users, this mechanism is not relevant for source archives, but
781      it is convenient for getting new sets of binfiles.
782    
783      Archives should be tar files compressed with either gzip, compress, or
784    bzip2.  The script recognizes .tgz, .tar, .tar.gz, .tz, .tar.Z, and .tar.bz2.
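
Recognizing an archive by its suffix, as described above, might look like the following. This is a Python sketch, not the logic of install.sh itself. The function name `classify_archive` is hypothetical. The suffixes are the ones listed above, read with a leading dot, and the pairing of .tz with compress is an assumption.

```python
# Longer suffixes come first so that ".tar.gz" is not mistaken for
# some shorter suffix.  Each entry pairs a suffix with the
# compression tool assumed to have produced it (None = plain tar).
SUFFIXES = [
    (".tar.bz2", "bzip2"),
    (".tar.gz", "gzip"),
    (".tar.Z", "compress"),
    (".tgz", "gzip"),
    (".tz", "compress"),  # assumption: .tz is a compress'ed tar file
    (".tar", None),
]

def classify_archive(filename):
    """Return (stem, compression) for a recognized archive name,
    or None if the suffix is not one of the recognized forms."""
    for suffix, compression in SUFFIXES:
        if filename.endswith(suffix):
            return filename[: -len(suffix)], compression
    return None
```

A caller could try each recognized suffix for a given archive stem, first locally and then under the URL from config/srcarchiveurl, and unpack the first match with the corresponding tool.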
785    
786    ----------------------------------------------------------------------
787    Name: Matthias Blume
788    Date: 2000/03/07 04:01:04
789    Tag: blume_main_v110_26_2
790  Description:  Description:
791    - size info in BOOTLIST
792         * no fixed upper limits for number of bootfiles or length of
793           bootfile names in runtime
794         * falling back to old behavior if no BOOTLIST size info found
795    - allocation size heuristics in .run-sml
796         * tries to read cache size from /proc/cpuinfo (this is important for
797            small-cache Celeron systems!)
798    - install.sh robustified
799    - CM manual updates
800    - paranoid mode
801         * no more CMB.deliver() (i.e., all done by CMB.make())
802         * can re-use existing sml.boot.* files
803         * init.cmi now treated as library
804         * library stamps for consistency checks
805    - sml.boot.<arch>-<os>/PIDMAP file
806         * This file is read by the CM startup code.  This is used to minimize
807           the amount of dynamic state that needs to be stowed away for the
808           purpose of sharing between interactive system and user code.
809    - CM.Anchor.anchor instead of CM.Anchor.{set,cancel}
810         * Upon request by Elsa.  Anchors now controlled by get-set-pair
811           like most other CM state variables.
812    - Compiler.CMSA eliminated
813         * No longer supported by CM anyway.
814    - fixed bugs in pickler that kept biting Stefan
815         * past refs to past refs (was caused by the possibility that
816           ad-hoc sharing is more discriminating than hash-cons sharing)
817         * integer overflow on LargeInt.minInt
818    - ml-{lex,yacc} build scripts now use new mechanism
819      for building standalone programs
820    - fixed several gcc -Wall warnings that were caused by missing header
821      files, missing initializations, etc., in runtime (not all warnings
822      eliminated, though)

