[smlnj] Diff of /sml/trunk/HISTORY

revision 575, Fri Mar 10 02:55:58 2000 UTC revision 629, Wed Apr 26 04:06:41 2000 UTC
# Line 11  Line 11 
11  Date:  Date:
12  Tag: <post-commit CVS tag>  Tag: <post-commit CVS tag>
13  Description:  Description:
14    ----------------------------------------------------------------------
15    Name: Matthias Blume
16    Date: 2000/04/26 13:10:00 JST
17    Tag: blume-20000426-fun_suffix
18    Description:
19    
20    Added ".fun" as a recognized file name suffix (for ML code).
21    
22    ----------------------------------------------------------------------
23    Name: Allen Leung
24    Date: 2000/04/25 17:00:00 EST
25    Tag: leunga-20000425-alpha-ra
26    Description:
27    
28    1. Alpha
29    
30        PSEUDOARITH was missing in AlphaRewrite.  This causes an endless loop
31    in C--.
32    
33    2. RA
34    
35       Added a flag "ra-dump-size" to print out the size of the flowgraph
36       and the interference graph.
37    
38    ----------------------------------------------------------------------
39    Name: Dave MacQueen
40    Date: 2000/04/25/
41    Tag: dbm-20000425-mlyacc_doc_examples
42    Description:
43      Updated mlyacc.tex sections 5 and 7 for SML '97 and CM.
44      Updated all three examples in src/ml-yacc/examples to run
45      under 110.* using CM.make.
46    
47    ----------------------------------------------------------------------
48    Name: Allen Leung
49    Date: 2000/04/20 23:04:00 EST
50    Tag: leunga-20000420-ssa-c---stuff
51    Description:
52    
53      This update synchronizes my repository with Yale's.  Most of these
54    changes, however, do not affect SML/NJ at all (the RA is an exception).
55    
56    1. Register Allocator
57    
58       a. An improvement in the interference graph construction:
59          Given a copy
60    
61                s <- t
62    
63          no interference edge between s and t is added for this definition of s.
64    
65       b. I've added two new spill heuristic modules that Fermin and I developed
66          (in the new library RA.cm). These are unused in SML/NJ but may be
67          useful for others (Moby?)
68    
69    2. X86
70    
71       a. Various fixes in the backend provided by Fermin [C--] and Lal.
72    
73    3. Alpha
74    
75       a. Added the BSR instruction and code generation that goes with it [C--]
76       b. Other fixes too numerous to recount provided by Fermin [C--]
77    
78    4. Regmaps
79    
80       a. The regmaps are not initialized with the identity physical bindings
81          at creation time.  This is unneeded.
82    
83    5. MLRISC Optimizations
84    
85       a. The DJ-Graph module can now compute the iterated dominance frontiers
86          intersected with liveness incrementally in linear time! Woohoo!
87          This is now used in my new SSA construction algorithm.
88    
89       b. The branch reorganization module is now smarter about linear chains of
90          basic blocks.
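
The copy rule in item 1a above can be illustrated with a small sketch. This is not the MLRISC implementation; it is a minimal backward scan over one basic block, and the instruction tuple format `(dst, srcs, is_copy)` and the function name are assumptions made for the example.

```python
from collections import defaultdict

def build_interference(instrs, live_out):
    """Build an interference graph for one basic block.

    instrs:   list of (dst, srcs, is_copy) tuples in program order
              (hypothetical format chosen for this sketch).
    live_out: registers live at the end of the block.
    """
    graph = defaultdict(set)
    live = set(live_out)
    # Walk backwards so that `live` holds the registers live after each instruction.
    for dst, srcs, is_copy in reversed(instrs):
        for r in live:
            if r == dst:
                continue
            # Copy rule: for "s <- t", this definition of s adds no edge to t.
            if is_copy and r in srcs:
                continue
            graph[dst].add(r)
            graph[r].add(dst)
        live.discard(dst)
        live.update(srcs)
    return graph
```

With the copy `s <- t` followed by a use of both registers, no edge between `s` and `t` is recorded, so the two remain candidates for coalescing; if `s` is defined by an ordinary instruction instead, the edge appears.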
91    
92    
93    ----------------------------------------------------------------------
94    Name: Matthias Blume
95    Date: 2000/04/12 13:52:00 JST
96    Tag: blume_main_v110p27_1
97    Description:
98    
99    Changed install.sh script to handle archive files without version number
100    and to use "boot.<arch>-<os>" instead of "sml.boot.<arch>-<os>" for the
101    name of the boot file archive.
102    
103    ----------------------------------------------------------------------
104    Name: Dave MacQueen
105    Date: 2000/04/09 14:00 EDT
106    Tag: dbm-20000410-Version_110_27
107    Description:
108    
109    1. Updated src/compiler/TopLevel/main/version.sml to version 110.27
110    
111    2. Updated src/config/version to 110.27
112    
113    3. New boot files!
114    
115    ----------------------------------------------------------------------
116    Name: Allen Leung
117    Date: 2000/04/09 19:09:00 EST
118    Tag: leunga-20000409-misc
119    Description:
120    
121    1.  Yet another fix for x86 assembly for idivl, imull, mull and friends.
122    
123    2.  Miscellaneous improvements to MLRISC (unused in sml/nj)
124    
125    ----------------------------------------------------------------------
126    Name: Stefan
127    Date: 2000/04/07 10:00:00 EDT
128    Tag: monnier-20000406-branch-handling
129    Description:
130    
131    Improved handling of branches (mostly those generated from
132    polymorphic equality), removed switchoff and changed the
133    default optimization settings (more cpsopt and less flintopt).
134    
135    ----------------------------------------------------------------------
136    Name: Allen Leung
137    Date: 2000/04/06 01:30:00 EST
138    Tag: leunga-20000406-peephole-x86-SSA-2
139    Description:
140    
141       Forgot a few files.
142    
143    ----------------------------------------------------------------------
144    Name: Allen Leung
145    Date: 2000/04/06 00:36:00 EST
146    Tag: leunga-20000406-peephole-x86-SSA
147    Description:
148    
149    1.  New Peephole code
150    
151    2.  Minor improvement to X86 instruction selection
152    
153    3.  Various fixes to SSA and machine description -> code translator
154    
155    ----------------------------------------------------------------------
156    Name: Matthias Blume
157    Date: 2000/04/05 12:30:00 JST
158    Tag: blume_main_v110p26p2_3
159    Description:
160    
161    This update just merges three minor cosmetic updates to CM's sources
162    to get ready for the 110.27 code freeze on Friday.  No functionality
163    has changed.
164    
165    ----------------------------------------------------------------------
166    Name: Allen Leung
167    Date: 2000/04/04 19:39:00 EST
168    Tag: leunga-20000404-x86-asm
169    Description:
170    
171    1.  Fixed a problem in X86 assembly.
172    
173        Things like
174    
175           jmp %eax
176           jmp (%eax)
177    
178        should be output as
179    
180           jmp *%eax
181           jmp *(%eax)
182    
183    2.  Assembly output
184    
185          Added a new flag
186    
187              "asm-indent-copies" (defaults to false)
188    
189          When this flag is on, parallel copies will be indented an extra level.
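
The indirect-jump fix in item 1 above follows the AT&T syntax rule that register and memory jump targets carry a `*` prefix. A minimal sketch of that rule, working on operand strings only; the function name is hypothetical, and a real emitter would decide from the operand's type rather than from its text:

```python
def format_jmp(operand: str) -> str:
    """Return AT&T-syntax text for a jump to `operand`.

    Register operands (%eax) and memory operands ((%eax), 4(%ebx)) are
    indirect targets and need a leading '*'; plain labels are direct.
    """
    is_indirect = operand.startswith("%") or "(" in operand
    return f"jmp *{operand}" if is_indirect else f"jmp {operand}"
```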
190    
191    ----------------------------------------------------------------------
192    Name: Allen Leung
193    Date: 2000/04/04 03:18:00 EST
194    Tag: leunga-20000404-C--Moby
195    Description:
196    
197        All of these fixes are related to C--, Moby, and my own optimization
198        stuff; so they shouldn't affect SML/NJ.
199    
200    1.  X86
201    
202        Various fixes related to floating point, and extensions.
203    
204    2.  Alpha
205    
206        Some extra patterns related to loads with signed/zero extension
207        provided by Fermin.
208    
209    3.  Assembly
210    
211        When generating assembly, resolve the value of client defined constants,
212        instead of generating symbolic values.  This is controlled by the
213        new flag "asm-resolve-constants", which defaults to true.
214    
215    4.  Machine Descriptions
216    
217        a. The precedence parser was slightly broken when parsing infixr symbols.
218        b. The type generalizing code had the bound variables reversed, resulting
219           in a problem during arity raising.
220        c. Various fixes in machine descriptions.
221    
222    ----------------------------------------------------------------------
223    Name: Matthias Blume
224    Date: 2000/04/03 16:05:00 JST
225    Tag: blume_main_v110p26p2_2
226    Description:
227    
228    I eliminated coreEnv from compInfo.  Access to the "Core" structure is
229    now done via the ordinary static environment that is context to each
230    compilation unit.
231    
232    To this end, I arranged that "structure _Core" is bound in the pervasive
233    environment instead of "structure Core".  Core access is done via
234    _Core (which can never be accidentally rebound because _Core is not a
235    legal surface-syntax symbol).
236    
237    The current solution is much cleaner because the core environment is
238    now simply part of the pervasive environment which is part of every
239    compilation unit's context anyway.  In particular, this eliminates all
240    special-case handling that was necessary until now in order to deal
241    with dynamic and symbolic parts of the core environment.
242    
243    Remaining hackery (to bind the "magic" symbol _Core) is localized in the
244    compilation manager's bootstrap compiler (actually: in the "init group"
245    handling).  See the comments in src/system/smlnj/init/init.cmi for
246    more details.
247    
248    I also tried to track down all mentions of "Core" (as string argument
249    to Symbol.strSymbol) in the compiler and replaced them with a
250    reference to the new CoreSym.coreSym.  Seems cleaner since the actual
251    name appears in one place only.
252    
253    Binfile and bootfile format have not changed, but the switchover from
254    the old "init.cmi" to the new one is a bit tricky, so I supplied new
255    bootfiles anyway.
256    
257    ----------------------------------------------------------------------
258    Name: Allen Leung
259    Date: 2000/04/02 21:17:00 EST
260    Tag: leunga-20000402-mltree
261    Description:
262    
263       1. Renamed the constructor CALL in MLTREE by popular demand.
264       2. Added a bunch of files from my repository.  These are currently
265          used by other non-SMLNJ backends.
266    
267    ----------------------------------------------------------------------
268    Name: Allen Leung
269    Date: 2000/03/31 21:15:00 EST
270    Tag: leunga-20000331-aliasing
271    Description:
272    
273    This update contains a rewritten (and hopefully more correct) module
274    for extracting aliasing information from CPS.
275    
276       To turn on this feature:
277    
278            Compiler.Control.CG.memDisambiguate := true
279    
280       To pretty print the region information with assembly
281    
282           Compiler.Control.MLRISC.getFlag "asm-show-region" := true;
283    
284       To control how many levels of aliasing information are printed, use:
285    
286           Compiler.Control.MLRISC.getInt "points-to-show-level" := n
287    
288       The default of n is 3.
289    
290    ----------------------------------------------------------------------
291    Name: David MacQueen
292    Date: 2000/03/31 11:15:00 EST
293    Tag: dbm-20000331-runtime_fix
294    Description:
295    
296    This update contains:
297    
298    1. runtime/c-lib/c-libraries.c
299       includes added in revision 1.2 caused compilation errors on hppa-hpux
300    
301    2. fix for bug 1556
302       system/Basis/Implementation/NJ/internal-signals.sml
303    
304    ----------------------------------------------------------------------
305    Name: Matthias Blume
306    Date: 2000/03/31 18:00:00 JST
307    Tag: blume_main_v110p26p2_1
308    Description:
309    
310    This update contains:
311    
312    1. A small change to CM's handling of stable libraries:
313       CM now maintains one "global" modmap that is used for all stable
314       libraries.  The use of such a global modmap maximizes sharing and
315       minimizes the need for re-traversing parts of environments during
316       modmap construction.  (However, this has minor impact since modmap
317       construction seems to account for just one percent or less of total
318       compile time.)
319    
320    2. I added a "genmap" phase to the statistics.  This is where I got the
321       "one percent" number (see above).
322    
323    3. CM's new tool parameter mechanism just became _even_ better. :)
324       - The parser understands named parameters and recursive options.
325       - The "make" and "shell" tools use these new features.
326         (This makes it a lot easier to cascade these tools.)
327       - There is a small syntax change: named parameters use a
328    
329           <name> : ( <option> ... )            or
330           <name> : <string>
331    
332         syntax.  Previously, named parameters were implemented in an
333         ad-hoc fashion by each tool individually (by parsing strings)
334         and had the form
335    
336           <name>=<string>
337    
338       See the CM manual for a full description of these issues.
339    
340    ----------------------------------------------------------------------
341    Name: Matthias Blume
342    Date: 2000/03/30 18:00:00 JST
343    Tag: blume_main_v110p26p2_0
344    Description:
345    
346    !!!!! WARNING !!!!!!
347    !!  New binfiles  !!
348    !!!!!!!!!!!!!!!!!!!!
349    
350    This update contains:
351    
352    1. Moderate changes to CM:
353    
354       - Changes to CM's tools mechanism.  In particular, it is now possible
355       to have tools that accept additional "command line" parameters
356       (specified in the .cm file at each instance where the tool's class is
357       used).
358    
359       This was done to accommodate the new "make" and "shell" tools which
360       facilitate fairly seamless hookup to portions of code managed using
361       Makefiles or Shell scripts.
362    
363       There are no classes "shared" or "private" anymore.  Instead, the
364       sharing annotation is now a parameter to the "sml" class.
365    
366       There is a bit of generic machinery for implementing one's own
367       tools that accept command-line parameters.  However, I am not yet fully
368       satisfied with that part, so expect changes here in the future.
369    
370       All existing tools are described in the CM manual.
371    
372       - Slightly better error handling.  (CM now suppresses many followup
373       error messages that tended to be more annoying than helpful.)
374    
375    2. Major changes to the compiler's static environment data structures.
376    
377       - no CMStaticEnv anymore.
378            - no CMEnv, no "BareEnvironment" (actually, _only_ BareEnvironment,
379              but it is called Environment), no conversions between different
380              kinds of static environments
381    
382       - There is still a notion of a "modmap", but such modmaps are generated
383         on demand at the time when they are needed.  This sounds slow, but I
384         sped up the code that generates modmaps enough for this not to lead to
385         a slowdown of the compiler (at least I didn't detect any).
386    
387       - To facilitate rapid modmap generation, static environments now
388         contain an (optional) "modtree" structure.  Modtree annotations are
389         constructed by the unpickler during unpickling.  (This means that
390         the elaborator does not have to worry about modtrees at all.)
391         Modtrees have the advantage that they are compositional in the same
392         way as the environment data structure itself is compositional.
393         As a result, modtrees never hang on to parts of an environment that
394         has already been rendered "stale" by filtering or rebinding.
395    
396       - I went through many, many trials and errors before arriving at the
397         current solution.  (The initial idea of "linkpaths" did not work.)
398         But the result of all this is that I have touched a lot of files that
399         depend on the "modules" and "types" data structures (most of the
400         elaborator). There were a lot of changes during my "linkpath" trials
401         that could have been reverted to their original state but weren't.
402         Please, don't be too harsh on me for messing with this code a bit more
403         than what was strictly necessary...  (I _did_ resist the temptation
404         of doing any "global reformatting" to avoid an untimely death at
405         Dave's hands. :)
406    
407       - One positive aspect of the previous point:  At least I made sure that
408         all files that I touched now compile without warnings (other than
409         "polyEqual").
410    
411       - compiler now tends to run "leaner" (i.e., ties up less memory in
412         redundant modmaps)
413    
414    ----------------------------------------------------------------------
415    Name: Allen Leung
416    Date: 2000/03/29 18:00:00
417    Tag: leunga-20000327-mlriscGen_hppa_alpha_x86
418    Boot files (optional): ftp://react-ilp.cs.nyu.edu/leunga/110.26.1-sml.boot.x86-unix-20000330.tar.gz
419    Description:
420    
421       This update contains *MAJOR* changes to the way code is generated from CPS
422    in the module mlriscGen, and in various backend modules.
423    
424    CHANGES
425    =======
426    
427    1. MLRiscGen: forward propagation fix.
428    
429       There was a bug in forward propagation introduced at about the same time
430       as the MLRISC x86 backend, which prohibits coalescing to be
431       performed effectively in loops.
432    
433       Effect: speed up of loops in RISC architectures.
434               By itself, this actually slowed down certain benchmarks on the x86.
435    
436    2. MLRiscGen:  forward propagating addresses from consing.
437    
438       I've changed the way consing code is generated.  Basically I separated
439       out the initialization part:
440    
441            store tag,   offset(allocptr)
442            store elem1, offset+4(allocptr)
443            store elem2, offset+8(allocptr)
444            ...
445            store elemn, offset+4n(allocptr)
446    
447       and the address computation part:
448    
449            celladdr <- offset+4+allocptr
450    
451       and move the address computation part
452    
453       Effect:  register pressure is generally lower as a result.  This
454                makes compilation of certain expressions much faster, such as
455                long lists with non-trivial elements.
456    
457                 [(0,0), (0,0), .... (0,0)]
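
The split between initialization and address computation in item 2 can be checked with a little arithmetic. The sketch below only mirrors the pseudo-assembly listing above; the function name and the concrete numbers are illustrative assumptions, and the 4-byte word size is taken from the `offset+4` pattern in that listing.

```python
WORD = 4  # bytes per word, matching the offset+4, offset+8, ... pattern above

def cons_layout(allocptr, offset, n_elems):
    """Return (tag address, element addresses, cell address) for one record.

    The stores only need allocptr and constant offsets, so the cell address
    can be computed separately from them.
    """
    tag_addr = allocptr + offset
    elem_addrs = [allocptr + offset + WORD * i for i in range(1, n_elems + 1)]
    celladdr = allocptr + offset + WORD  # points just past the tag word
    return tag_addr, elem_addrs, celladdr
```

For example, `cons_layout(1000, 8, 2)` places the tag at 1008 and the elements at 1012 and 1016, and the cell address equals the address of the first element.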
458    
459    3. MLRiscGen: base pointer elimination.
460    
461        As part of the linkage mechanism, we generate the sequence:
462    
463         L:  ...  <- start of the code fragment
464    
465         L1:
466             base pointer <- linkreg - L1 + L
467    
468         The base pointer was then used for computing relocatable addresses
469       in the code fragment.  Frequently (such as in lots of continuations)
470       this is not needed.  We now eliminate this sequence whenever possible.
471    
472         For compile time efficiency, I'm using a very stupid local heuristic.
473       But in general, this should be done as a control flow analysis.
474    
475       Effect:  Smaller code size.  Speed up of most programs.
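
The base pointer formula in item 3 can be worked through numerically. The sketch assumes that the link register holds the run-time address corresponding to label L1 and that L and L1 are offsets known when the code is assembled; both assumptions, the function name and the sample addresses are mine, not statements from the log.

```python
def base_pointer(linkreg, off_L1, off_L):
    """base pointer <- linkreg - L1 + L, with labels given as offsets."""
    return linkreg - off_L1 + off_L

# Hypothetical numbers: the fragment is loaded at 0x4000, L is at offset 0
# and L1 at offset 0x120, so the link register would hold 0x4120.
load_address = 0x4000
bp = base_pointer(load_address + 0x120, 0x120, 0x0)
```

Under these assumptions `bp` is the run-time address of L, whatever the load address is. When a fragment never needs a relocatable address, the computation is dead, which is why eliminating it saves code.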
476    
477    4. Hppa back end
478    
479         Long jumps in span dependence resolution used to depend on the existence
480      of the base pointer.
481    
482         A jump to a long label L was expanded into the following sequence:
483    
484          LDIL %hi(L-8192), %r29
485          LDO  %lo(L-8192)(%r29), %r29
486          ADD  %r29, baseptr, %r29
487          BV,n %r0(%r29)
488    
489         In the presence of change (3) above, this will not work.  I've changed
490       it so that the following sequence of instructions is generated, which
491       doesn't mention the base pointer at all:
492    
493             BL,n  L', %r29           /* branch and link, L' + 4 -> %r29 */
494        L':  ADDIL L-(L'+4), %r29     /* Compute address of L */
495             BV,n  %r0(%r29)          /* Jump */
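
The arithmetic behind the new three-instruction sequence can be sketched as follows. Labels are modelled as offsets within the code object and the load address is a free parameter; the function name is hypothetical, and the value stored by BL is taken from the comment in the listing above (L' + 4).

```python
def long_jump_target(load_base, off_L, off_Lp):
    """Simulate BL / ADDIL / BV for a jump from label L' to label L."""
    displacement = off_L - (off_Lp + 4)  # constant fixed at assembly time
    r29 = load_base + off_Lp + 4         # BL,n L', %r29
    r29 += displacement                  # ADDIL L-(L'+4), %r29
    return r29                           # BV,n %r0(%r29) jumps here
```

Because the displacement is a difference of two labels, the result equals the run-time address of L for any load address, with no reference to the base pointer.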
496    
497    5. Alpha back end
498    
499          New alpha instructions LDB/LDW have been added, as per Fermin's
500       suggestions.   This is unrelated to all other changes.
501    
502    6. X86 back end
503    
504         I've changed andl to testl in the floating point test sequence
505         whenever appropriate.  The Intel optimization guide states that
506         testl is preferable to andl.
507    
508    7. RA (x86 only)
509    
510         I've improved the spill propagation algorithm, using an approximation
511       of maximal weighted independent sets.   This seems to be necessary to
512       alleviate the negative effect in light of the slow down in (1).
513    
514         I'll write down the algorithm one of these days.
515    
516    8. MLRiscGen: frequencies
517    
518         I've added an annotation that states that all call gc blocks have zero
519       execution frequencies.  This improves register allocation on the x86.
520    
521    BENCHMARKS
522    ==========
523    
524       I've only performed the comparison against 110.25.
525    
526       The platforms are:
527    
528        HPPA  A four processor HP machine (E9000) with 5G of memory.
529        X86   A 300MHz Pentium II with 128M of memory, and
530        SPARC An Ultra sparc 2 with 512M of memory.
531    
532       I used the following parameters for the SML benchmarks:
533    
534                 @SMLalloc
535         HPPA    256k
536         SPARC   512k
537         X86     256k
538    
539    COMPILATION TIME
540    ----------------
541       Here are the numbers comparing the compilation times of the compilers.
542       I've only compared 110.25 compiling the new sources versus
543       a fixpoint version of the new compiler compiling the same.
544    
545                     110.25                                  New
546               Total  Time in RA  Spill+Reload   Total  Time In RA Spill+Reload
547         HPPA   627s    116s        2684+3584     599s    95s       1003+1879
548         SPARC  892s    173s        2891+3870     708s    116s      1004+1880
549         X86    999s    315s       94006+130691   987s    296s    108877+141957
550    
551                   110.25         New
552                Code Size      Code Size
553         HPPA   8596736         8561421
554         SPARC  8974299         8785143
555         X86    9029180         8716783
556    
557       So in summary, things are at least as good as before.   A dramatic
558       reduction in compilation time is obtained on the Sparc; I can't explain it,
559       but it is reproducible.  Perhaps someone should try to reproduce this
560       on their own machines.
561    
562    SML BENCHMARKS
563    --------------
564    
565        On the average, all benchmarks perform at least as well as before.
566    
567          HPPA         Compilation Time     Spill+Reload      Run Time
568                     110.25  New            110.25    New   110.25  New
569    
570          barnesHut  3.158  3.015  4.75%    1+1       0+0   2.980  2.922   2.00%
571              boyer  6.152  5.708  7.77%    0+0       0+0   0.218  0.213   2.34%
572       count-graphs  1.168  1.120  4.32%    0+0       0+0  22.705 23.073  -1.60%
573                fft  0.877  0.792 10.74%    1+3       1+3   0.602  0.587   2.56%
574        knuthBendix  3.180  2.857 11.32%    0+0       0+0   0.675  0.662   2.02%
575             lexgen  6.190  5.290 17.01%    0+0       0+0   0.913  0.788  15.86%
576               life  0.803  0.703 14.22%   25+25      0+0   0.153  0.140   9.52%
577              logic  2.048  2.007  2.08%    6+6       1+1   4.133  4.008   3.12%
578         mandelbrot  0.077  0.080 -4.17%    0+0       0+0   0.765  0.712   7.49%
579             mlyacc 22.932 20.937  9.53%  154+181    32+57  0.468  0.430   8.91%
580            nucleic  5.183  5.060  2.44%    2+2       0+0   0.125  0.120   4.17%
581      ratio-regions  3.357  3.142  6.84%    0+0       0+0  116.225 113.173 2.70%
582                ray  1.283  1.290 -0.52%    0+0       0+0   2.887  2.855   1.11%
583             simple  6.307  6.032  4.56%   28+30      5+7   3.705  3.658   1.28%
584                tsp  0.888  0.862  3.09%    0+0       0+0   7.040  6.893   2.13%
585               vliw 24.378 23.455  3.94%  106+127    25+45  2.758  2.707   1.91%
586      --------------------------------------------------------------------------
587       Average                     6.12%                                   4.09%
588    
589          SPARC        Compilation Time     Spill+Reload      Run Time
590                     110.25  New            110.25    New   110.25  New
591    
592          barnesHut  3.778  3.592  5.20%    2+2       0+0   3.648  3.453    5.65%
593              boyer  6.632  6.110  8.54%    0+0       0+0   0.258  0.242    6.90%
594       count-graphs  1.435  1.325  8.30%    0+0       0+0  33.672 34.737   -3.07%
595                fft  0.980  0.940  4.26%    3+9       2+6   0.838  0.827    1.41%
596        knuthBendix  3.590  3.138 14.39%    0+0       0+0   0.962  0.967   -0.52%
597             lexgen  6.593  6.072  8.59%    1+1       0+0   1.077  1.078   -0.15%
598               life  0.972  0.868 11.90%   26+26      0+0   0.143  0.140    2.38%
599              logic  2.525  2.387  5.80%    7+7       1+1   5.625  5.158    9.05%
600         mandelbrot  0.090  0.093 -3.57%    0+0       0+0   0.855  0.728   17.39%
601             mlyacc 26.732 23.827 12.19%  162+189    32+57  0.550  0.560   -1.79%
602            nucleic  6.233  6.197  0.59%    3+3       0+0   0.163  0.173   -5.77%
603      ratio-regions  3.780  3.507  7.79%    0+0       0+0 133.993 131.035   2.26%
604                ray  1.595  1.550  2.90%    1+1       0+0   3.440  3.418    0.63%
605             simple  6.972  6.487  7.48%   29+32      5+7   3.523  3.525   -0.05%
606                tsp  1.115  1.063  4.86%    0+0       0+0   7.393  7.265    1.77%
607               vliw 27.765 24.818 11.87%  110+135    25+45  2.265  2.135    6.09%
608      ----------------------------------------------------------------------------
609       Average                     6.94%                                    2.64%
610    
611          X86          Compilation Time     Spill+Reload      Run Time
612                     110.25  New            110.25    New   110.25  New
613    
614          barnesHut  5.530  5.420  2.03%  593+893   597+915   3.532  3.440   2.66%
615              boyer  8.768  7.747 13.19%  493+199   301+289   0.327  0.297  10.11%
616       count-graphs  2.040  2.010  1.49%  298+394   315+457  26.578 28.660  -7.26%
617                fft  1.327  1.302  1.92%  112+209   115+210   1.055  0.962   9.71%
618        knuthBendix  5.218  5.475 -4.69%  451+598   510+650   0.928  0.932  -0.36%
619             lexgen  9.970  9.623  3.60% 1014+841  1157+885   0.947  0.928   1.97%
620               life  1.183  1.183  0.00%  162+182   145+148   0.127  0.103  22.58%
621              logic  3.285  3.512 -6.45%  514+684   591+836   5.682  5.577   1.88%
622         mandelbrot  0.147  0.143  2.33%   38+41     33+54    0.703  0.690   1.93%
623             mlyacc 35.457 32.763  8.22% 3496+4564 3611+4860  0.552  0.550   0.30%
624            nucleic  7.100  6.888  3.07%  239+168   201+158   0.175  0.173   0.96%
625      ratio-regions  6.388  6.843 -6.65% 1182+257   981+300  120.142 120.345 -0.17%
626                ray  2.332  2.338 -0.29%  346+398   402+494   3.593  3.540   1.51%
627             simple  9.912  9.903  0.08% 1475+941  1579+1168  3.057  3.178  -3.83%
628                tsp  1.623  1.532  5.98%  266+200   250+211   8.045  7.878   2.12%
629               vliw 33.947 35.470 -4.29% 2629+2774 2877+3171  2.072  1.890   9.61%
630      ----------------------------------------------------------------------------
631       Average                     1.22%                                     3.36%
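
The tables do not say how the percentage columns were computed. The listed values appear consistent with `old / new - 1`, so the sketch below uses that formula as an assumption. Because the published times are rounded to three decimals, recomputed percentages can differ from the table in the last digit.

```python
def improvement_percent(old, new):
    """Percentage improvement of `new` over `old`; negative means slower.

    Assumed formula (not stated in the log): (old / new - 1) * 100.
    """
    return (old / new - 1.0) * 100.0

# HPPA boyer compilation time, listed as 7.77%
boyer = improvement_percent(6.152, 5.708)
# X86 knuthBendix compilation time, listed as -4.69%
knuth = improvement_percent(5.218, 5.475)
```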
632    
633    ----------------------------------------------------------------------
634    Name: Allen Leung
635    Date: 2000/03/23 16:25:00
636    Tag: leunga-20000323-fix_x86_alpha
637    Description:
638    
639    1. X86 fixes/changes
640    
641       a.  The old code generated for SETcc was completely wrong.
642           The Intel optimization guide is VERY misleading.
643    
644    2. ALPHA fixes/changes
645    
646       a.  Added the instructions LDBU, LDWU, STB, STW as per Fermin's suggestion.
647       b.  Added a new mode byteWordLoadStores to the functor parameter to Alpha()
648       c.  Added reassociation code for address computation.
649    
650    ----------------------------------------------------------------------
651    Name: Allen Leung
652    Date: 2000/03/22 01:23:00
653    Tag: leunga-20000322-fix_x86_hppa_ra
654    Description:
655    
656    1. X86 fixes/changes
657    
658       a.  x86Rewrite bug with MUL3 (found by Lal)
659       b.  Added the instructions FSTS, FSTL
660    
661    2. PA-RISC fixes/changes
662    
663       a.  B label should not be a delay slot candidate!  Why did this work?
664       b.  ADDT(32, REG(32, r), LI n) now generates one instruction instead of two,
665           as it should be.
666       c.  The assembly syntax for fstds and fstdd was wrong.
667       d.  Added the composite instruction COMICLR/LDO, which is the immediate
668           operand variant of COMCLR/LDO.
669    
670    3. Generic MLRISC
671    
672       a.  shuffle.sml rewritten to be slightly more efficient
673       b.  DIV bug in mltree-simplify fixed (found by Fermin)
674    
675    4. Register Allocator
676    
677       a.  I now release the interference graph earlier during spilling.
678           May improve memory usage.
679    
680    ----------------------------------------------------------------------
681    Name: Matthias Blume
682    Date: 2000/03/14 14:15:32
683    Tag: blume_main_v110p26p1_2
684    Description:
685    
686    1. Tools.registerStdShellCmdTool (from smlnj/cm/tool.cm) takes an
687    additional argument called "template" which is an optional string that
688    specifies the layout of the tool command line.  See the CM manual for
689    explanation.
690    
691    2. A special-purpose tool can be "registered" by simply dropping the
692    corresponding <...>-tool.cm (and/or <...>-ext.cm) into the same
693    directory where the .cm file lives that uses this tool.  (The
694    behavior/misfeature until now was to look for the tool description
695    files in the current working directory.)  As before, tool description
696    files could also be anchored -- in which case they can live anywhere
697    they like.  Following the recent e-mail discussion, this change should
698    make it easier to have special-purpose tools that are shipped together
699    with the sources of the program that uses them.
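
The lookup change in item 2 amounts to resolving tool description files against the directory of the .cm file instead of the current working directory. A minimal sketch of that idea; it is not CM's code, and the `prefix` argument stands in for the `<...>` part of the file name, which the log does not spell out.

```python
import posixpath

def tool_description_candidates(cm_file, prefix):
    """List the tool description files looked for next to `cm_file`.

    `prefix` is a hypothetical placeholder for the <...> part of the names.
    """
    directory = posixpath.dirname(cm_file)
    return [
        posixpath.join(directory, f"{prefix}-tool.cm"),
        posixpath.join(directory, f"{prefix}-ext.cm"),
    ]
```

The result depends only on where the .cm file lives, so a project can ship its special-purpose tools alongside its sources.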
700    
701    ----------------------------------------------------------------------
702    Name: Matthias Blume
703    Date: 2000/03/10 07:48:34
704    Tag: blume_main_v110p26p1_1
705    Description:
706    
707    I added a re-written version of Dave's fixpt script to src/system.
708    Changes relative to the original version:
709      - sh-ified (not everybody has ksh)
710      - automatically figures out which architecture it runs on
711      - uses ./makeml a bit more cleverly
712      - never invokes ./installml (and, thus, does not clobber your
713        good and working installation of sml in case something goes wrong)
714      - accepts max iteration count using option "-iter <n>"
715      - accepts a "base" name using option "-base <base>"
716    
717    It does not build any extraneous heap images but directly rebuilds
718    bin- and boot-hierarchies using makeml's "-rebuild" switch. Finally,
719    it can incorporate existing bin- and boot- hierarchies.  For example,
720    suppose the base is set to "sml" (which is the default).  Then it
721    successively builds
722    
723            sml.bin.<arch>-unix and sml.boot.<arch>-unix
724    then    sml1.bin.<arch>-unix and sml1.boot.<arch>-unix
725    then    sml2.bin.<arch>-unix and sml2.boot.<arch>-unix
726    ...
727    then    sml<n>.bin.<arch>-unix and sml<n>.boot.<arch>-unix
728    
729    and so on.  If any of these already exist, it will just use what's
730    there.  In particular, many people will have the initial set of bin
731    and boot files around, so this saves time for at least one full
732    rebuild.  Having sets of the form <base><k>.{bin,boot}.<arch>-unix for
733    <k>=1,2,... is normally not a good idea when invoking fixpt.  However,
734    they might be the result of an earlier partial run of fixpt (which
735    perhaps got accidentally killed).  In this case, fixpt will quickly
736    move through what exists before continuing where it left off earlier,
737    and, thus, saves a lot of time.
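
The naming scheme and the resume behaviour described above can be sketched like this. It is an illustration, not the fixpt script itself (which is a shell script); the function names and the set-based existence check are assumptions made for the example.

```python
def fixpt_sets(base, arch, iterations):
    """Return the (bin, boot) directory names for iterations 0..iterations."""
    sets = []
    for k in range(iterations + 1):
        stem = base if k == 0 else f"{base}{k}"
        sets.append((f"{stem}.bin.{arch}-unix", f"{stem}.boot.{arch}-unix"))
    return sets

def first_missing(sets, existing):
    """Index of the first set that still has to be built.

    `existing` is the collection of directory names already present.
    """
    for k, (bin_dir, boot_dir) in enumerate(sets):
        if bin_dir not in existing or boot_dir not in existing:
            return k
    return len(sets)
```

With only the initial `sml.bin.x86-unix` and `sml.boot.x86-unix` present, `first_missing` returns 1, which corresponds to the script reusing the initial set and starting its work at `sml1`.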
738    
739    ----------------------------------------------------------------------
740    Name: Allen Leung
741    Date: 00/03/10 02:20:00
742    Tag: leunga-20000310-fix_x86_asm_ra
743    Description:
744    
745    More assembly output problems involving the indexed addressing mode
746    on the x86 have been found and corrected. Thanks to Fermin Reig for the
747    fix.
748    
749    The interface and implementation of the register allocator have been changed
750    slightly to accommodate the possibility to skip the register allocation
751    phases completely and go directly to memory allocation.  This is needed
752    for C-- use.
753    
754  ----------------------------------------------------------------------  ----------------------------------------------------------------------
755  Name: Matthias Blume  Name: Matthias Blume
