Diff of /sml/trunk/HISTORY

revision 583, Thu Mar 23 21:52:30 2000 UTC revision 648, Thu May 25 21:28:21 2000 UTC
13  Description:
14  ----------------------------------------------------------------------
15  Name: Allen Leung
16    Date: 2000/05/25 17:28 EDT
17    Tag: leunga-20000525-ra
18    Description:
19    
20      Fixed a bug in freezing phase of the register allocator.
21    
22    ----------------------------------------------------------------------
23    Name: Allen Leung
24    Date: 2000/05/15 22:53 EDT
25    Tag: leunga-20000515-alpha-x86-ra
26    Description:
27    
28      1. Alpha
29    
30          Slight cleanup.  Removed the instruction SGNXL
31    
32      2. X86
33    
34          Added the following instructions to the instruction set:
35    
36            ROLx, RORx,
37            BTx, BTSx, BTLx, BTRx,
38            XCHGx, and variants with the LOCK prefix
39    
40      3. Register Allocation
41    
42          The module ra-rewrite-with-renaming has been improved.
43    
44      These have no effect on SML/NJ.
45    
46    ----------------------------------------------------------------------
47    Name: Matthias Blume
48    Date: 2000/05/15 16:20:00 JST
49    Tag: blume-20000515-lightrebuild
50    Description:
51    
52    1. I added an alternative to "-rebuild" to "makeml".  The difference is
53       that prior to calling CMB.make' the CM-variable "LIGHT" will be
54       defined.  In effect, the command will not build any cross-compiler
55       backends and therefore finish more quickly.
56    
57       The "fixpt" script also takes a "-light" switch to be able to use
58       this new facility while compiling for a fixpoint.
59    
60    2. I replaced all mentions of anchored paths in group owner specifications
61       with simple relative paths (usually starting with "..").
62       The rationale is that a library's internal workings should not be
63       compromised by the lack of some anchor.  (An anchor is necessary
64       for someone who wants to refer to the library by an anchored path,
65       but it should not be necessary to build the same library in the first
66       place.)
67    
68    3. I changed the way CM's tool mechanism determines the shell command
69       string used for things like ml-yacc etc. so that it does not break
70       when CM.Control.implicit_anchors is turned off.
71    
72    ----------------------------------------------------------------------
73    Name: Matthias Blume
74    Date: 2000/05/12 18:20:00 JST
75    Tag: blume-20000512-ml-build
76    Description:
77    
78    Fixed a bug in config/_ml-build that prevented ml-yacc and ml-lex from
79    getting installed properly (by config/install.sh).
80    
81    ----------------------------------------------------------------------
82    Name: Matthias Blume
83    Date: 2000/05/12 17:30:00 JST
84    Tag: blume-20000512-anchors
85    Description:
86    
87    !!! NEW BOOT FILES !!!
88    
89    This change is in preparation of fading out support for "implicitly
90    anchored path names".  I went through all sources and used the
91    explicit (and relatively new) $-notation.  See system/README and the
92    CM manual for more info on this.
93    
94    I also modified the anchoring scheme for some things such as "smlnj",
95    "MLRISC", "cm", etc. to take advantage of the fact that explicit
96    anchors are more expressive: anchor name and first arc do not have to
97    coincide.  This entails the following user-visible change:
98    
99    You have to write $smlnj/foo/bar instead of smlnj/foo/bar.  In
100    particular, this applies when you fire up sml with a command-line
101    argument, e.g.:
102    
103       sml '$smlnj/cmb.cm'
104    
105    or at the ML toplevel prompt:
106    
107       CM.autoload "$smlnj/cmb.cm";
108    
109    There is also a new controller in CM.Control that can be used to turn
110    off all remaining support for implicit anchors by saying:
111    
112        CM.autoload "$smlnj/
113        #set CM.Control.implicit_anchors false;
114    
115    This causes CM to reject implicitly anchored paths.  This is (for the
116    time being) less permissive than the "final" version where there will
117    be no more such implicit anchors and relative paths will be just that:
118    relative.
119    
120    The next step (version after next version?) will be to make the
121    default for CM.Control.implicit_anchors false.  After the dust has
122    settled, I can then produce the "final" version of this...
123    
124    Note: Since bootstrapping is a bit tricky, I provided new boot files.
125    
126    ----------------------------------------------------------------------
127    Name: Matthias Blume
128    Date: 2000/05/11 16:30:00 JST
129    Tag: blume-20000511-sources
130    Description:
131    
132    The main change is that I added function CM.sources as a generalized
133    version of the earlier CM.makedepend.  This entails the following
134    additional changes:
135    
136      - CM.makedepend has been dropped.
137    
138      - CM manual has been updated.
139    
140      - TOOLS signature and API have been changed.
141    
142    ----------------------------------------------------------------------
143    Name: Allen Leung
144    Date: 2000/05/10 21:17 EDT
145    Tag: leunga-20000510-moby-c--ssa
146    Description:
147    
148      Various bug fixes and new features for C--, Moby and MLRISC optimizations.
149    None of these affect SML/NJ.
150    
151    1. Register Allocation
152    
153        a. A new ra spilling module (ra/ra-spill-with-renaming) is implemented.
154           This module tries to remove local (i.e. basic block level) redundancies
155           during spilling.
156    
157        b. A new framework for performing region based register allocation.
158           Not yet entirely functional.
159    
160    2. X86
161    
162       a. DefUse for POP was missing the stack pointer [found by Lal]
163       b. Reload for CALL was incorrect in X86Spill [found by John]
164       c. Various fixes in X86Spill so that it can be used correctly for
165          the new spilling module.
166    
167    3. SSA/IR
168    
169       a. New module ir/dj-dataflow.sml implements elimination based
170          data flow analysis.
171    
172    4. MLRiscGen
173    
174       a. Fix for gc type annotation
175    
176    5. MDGen
177    
178       Various fixes for machine description -> ml code translation.  For ssa
179       only.
180    
181    ----------------------------------------------------------------------
182    Name: Allen Leung
183    Date: 2000/05/08 22:17 EDT
184    Tag: leunga-20000508-labexp
185    Description:
186    
187      Fermin has found a few assembly problems with constant expressions
188      generated in LabelExp.  Mostly, the problems involve extra parentheses,
189      which choke on dumb assemblers.  This is his fix.
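
As an illustration of the parenthesization issue (not Fermin's actual fix, which this entry does not show), a precedence-aware printer emits parentheses only where the expression tree requires them. The tuple-based expression shape and the names `PREC` and `show` are hypothetical and are not taken from LabelExp.

```python
# Hypothetical expression form: an int, a label name (str),
# or a tuple (op, left, right) with op in PREC.
PREC = {"+": 1, "-": 1, "*": 2, "/": 2}

def show(e, parent_prec=0, right_side=False):
    """Render e, adding parentheses only when precedence demands them."""
    if not isinstance(e, tuple):
        return str(e)
    op, lhs, rhs = e
    p = PREC[op]
    text = show(lhs, p, False) + op + show(rhs, p, True)
    # A child needs parentheses if it binds looser than its parent, or
    # equally tight while sitting on the right of a left-associative operator.
    if p < parent_prec or (p == parent_prec and right_side):
        return "(" + text + ")"
    return text

print(show(("+", "L1", ("*", 4, 2))))      # no parentheses needed
print(show(("-", "L2", ("-", "L1", 8))))   # right operand keeps its parentheses
```

The right-hand rule is deliberately conservative: it keeps parentheses in cases like `a+(b+c)` where dropping them would also be safe.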
190    
191    ----------------------------------------------------------------------
192    Name: Dave MacQueen
193    Date: 2000/04/09 14:00 EDT
194    Tag: dbm-20000502-Version_110_28
195    Description:
196    
197    1. Updated src/compiler/TopLevel/main/version.sml to version 110.28
198    
199    2. Updated config/version to 110.28
200    
201    3. Updated config/srcarchiveurl
202    
203    4. New boot files!
204       ftp://ftp.research.bell-labs.com/dist/smlnj/working/110.28/
205    
206    ----------------------------------------------------------------------
207    Name: Matthias Blume
208    Date: 2000/05/01 19:05:00 JST
209    Tag: blume-20000501-noweb
210    Description:
211    
212    A new noweb tool has been added.  The existing system is entirely
213    unaffected by this, but some CM users have asked for renewed noweb
214    support.  Everything is documented in the CM manual.
215    
216    New (plugin) libraries:
217    
218       noweb-tool.cm
219       nw-ext.cm
220    
221    ----------------------------------------------------------------------
222    Name: Dave MacQueen
223    Date: 2000/04/30 12:40PM EDT
224    Tag: dbm-20000430-bug_fixes
225    Description:
226    
227    1. Fix for bug 1498
228       smlnj/src/system/Basis/Implementation/Unsafe/object.sig
229       smlnj/src/system/Basis/Implementation/Unsafe/object.sml
230         added toRealArray function
231       smlnj/src/compiler/MiscUtil/print/ppobj.sml
232         added check for tag Obj.RealArray to array printing case in ppObj
233    
234    2. Fix for bug 1510
235       smlnj/src/compiler/Semant/types/typesutil.sml
236         fixed definition of dummyargs (used by equalTycon) so that
237         dummy args are distinct types
238    
239    ----------------------------------------------------------------------
240    Name: Matthias Blume
241    Date: 2000/04/30 01:00:00 JST
242    Tag: blume-20000430-versions
243    Description:
244    
245    1. CM version numbering added.  This is an implementation of Lal's
246       proposal for adding version numbers and version checking to .cm
247       files.  Lal said that his proposal was just that -- a proposal.
248       For the time being I went ahead and implemented it so that people
249       can comment on it.  Everything is completely backward-compatible
250       (except for the stable library format, i.e., new bootfiles!).
251    
252       As usual, see the CM manual for details.
253    
254    2. An alternative syntax for anchored paths has been implemented.
255       Dave has recently voiced the same concerns that I had when I did
256       this, so there should be some support.  My take is that eventually
257       I will let support for the current syntax (where anchors are
258       "implicit") fade out in favor of the new, explicit syntax.
259       In order to be backward-compatible, both old and new syntax are
260       currently supported.
261    
262       Again, see the CM manual for details.
263    
264    3. Parallel make is trying to be slightly smarter:  When the master
265       process finds a "bottleneck", i.e., when there is only one
266       compilation unit that can be compiled and everybody else is
267       waiting on it, then it will simply compile it directly instead
268       of clumsily telling one of the slaves to do it.
269    
270    4. Support for "unsharing" added.  This is necessary in order to be
271       able to have two different versions of the same library running
272       at the same time (e.g., for trying out a new MLRISC while still
273       having the old MLRISC linked into the current compiler, etc.)
274       See the CM manual.
275    
276    5. Simple "makedepend" functionality added for generating Makefile
277       dependency information.  (This is rather crude at the moment.
278       Expect some changes here in the future.)
279    
280    6. ".fun" added as a recognized suffix for ML files. Also documented
281       explicitly in the manual that the fallback behavior (unknown suffix
282       -> ML file) is not an official feature!
283    
284    7. Small changes to the pickler for stable libraries.
285    
286    8. Several internal changes to CM (for cleanup/improvement).
287    
288    
289    !!!! NEW BINFILES !!!!
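
The bottleneck rule from item 3 can be sketched as a small round-based simulation. This is only a model under stated assumptions: CM's real scheduler is not shown in this entry, and the function name `schedule`, the `deps` mapping, and the worker labels are all hypothetical.

```python
def schedule(deps, n_slaves):
    """Simulate compilation rounds; return a list of (unit, worker) pairs.

    deps maps each unit to the set of units it depends on.
    """
    if n_slaves < 1:
        raise ValueError("need at least one slave")
    done, log = set(), []
    while len(done) < len(deps):
        # Units whose dependencies have all been compiled already.
        ready = sorted(u for u in deps if u not in done and deps[u] <= done)
        if not ready:
            raise ValueError("cyclic dependencies")
        if len(ready) == 1:
            # Bottleneck: everyone else would wait on this unit anyway,
            # so the master compiles it itself.
            log.append((ready[0], "master"))
        else:
            # Otherwise hand units to slaves, at most n_slaves per round.
            ready = ready[:n_slaves]
            for i, unit in enumerate(ready):
                log.append((unit, "slave%d" % i))
        done.update(ready)
    return log

deps = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
for unit, worker in schedule(deps, 2):
    print(unit, worker)
```

In this diamond-shaped example the first and last units are bottlenecks and stay with the master, while the two independent units in the middle go to slaves.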
290    
291    ----------------------------------------------------------------------
292    Name: Matthias Blume
293    Date: 2000/04/28 17:30:00 JST
294    Tag: blume-20000428-pathconfig
295    Description:
296    
297    1. I changed config/install.sh to remove duplicate entries from the
298       lib/pathconfig file at the end.  Moreover, the final version of
299       lib/pathconfig is sorted alphabetically.  The same (sorting) is done
300       in src/system/installml.
301    
302    2. The config/install.sh script now consistently uses relative
303       pathnames in lib/pathconfig whenever the anchor is in the lib
304       directory.  (So far this was true for the libraries that come
305       pre-compiled and bundled as part of the bootfiles but not for
306       libraries that are compiled by the script itself.)
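
The clean-up step from item 1 amounts to removing duplicates and sorting. A minimal sketch follows, assuming one entry per line and treating only exact textual duplicates as duplicates; the function name `normalize_pathconfig` and the sample entries are hypothetical, and the real work is done by the shell script, not by code like this.

```python
def normalize_pathconfig(lines):
    """Drop blank and duplicate entries, then sort the rest alphabetically."""
    entries = {line.strip() for line in lines if line.strip()}
    return sorted(entries)

# Hypothetical sample entries, one "anchor path" pair per line.
sample = [
    "smlnj-lib.cm ../lib",
    "basis.cm ../lib",
    "smlnj-lib.cm ../lib",
    "",
]
for entry in normalize_pathconfig(sample):
    print(entry)
```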
307    
308    ----------------------------------------------------------------------
309    Name: Matthias Blume
310    Date: 2000/04/26 13:10:00 JST
311    Tag: blume-20000426-fun_suffix
312    Description:
313    
314    Added ".fun" as a recognized file name suffix (for ML code).
315    
316    ----------------------------------------------------------------------
317    Name: Allen Leung
318    Date: 2000/04/25 17:00:00 EST
319    Tag: leunga-20000425-alpha-ra
320    Description:
321    
322    1. Alpha
323    
324        PSEUDOARITH was missing in AlphaRewrite.  This caused an endless loop
325    in C--.
326    
327    2. RA
328    
329       Added a flag "ra-dump-size" to print out the size of the flowgraph
330       and the interference graph.
331    
332    ----------------------------------------------------------------------
333    Name: Dave MacQueen
334    Date: 2000/04/25
335    Tag: dbm-20000425-mlyacc_doc_examples
336    Description:
337      Updated mlyacc.tex sections 5 and 7 for SML '97 and CM.
338      Updated all three examples in src/ml-yacc/examples to run
339      under 110.* using CM.make.
340    
341    ----------------------------------------------------------------------
342    Name: Allen Leung
343    Date: 2000/04/20 23:04:00 EST
344    Tag: leunga-20000420-ssa-c---stuff
345    Description:
346    
347      This update synchronizes my repository with Yale's.  Most of these
348    changes, however, do not affect SML/NJ at all (the RA is an exception).
349    
350    1. Register Allocator
351    
352       a. An improvement in the interference graph construction:
353          Given a copy
354    
355                s <- t
356    
357          no interference edge between s and t is added for this definition of s.
358    
359       b. I've added two new spill heuristic modules that Fermin and I developed
360          (in the new library RA.cm). These are unused in SML/NJ but may be
361          useful for others (Moby?)
362    
363    2. X86
364    
365       a. Various fixes in the backend provided by Fermin [C--] and Lal.
366    
367    3. Alpha
368    
369       a. Added the BSR instruction and code generation that goes with it [C--]
370       b. Other fixes too numerous to recount provided by Fermin [C--]
371    
372    4. Regmaps
373    
374       a. The regmaps are not initialized with the identity physical bindings
375          at creation time.  This is unneeded.
376    
377    5. MLRISC Optimizations
378    
379       a. The DJ-Graph module can now compute the iterated dominance frontiers
380          intersected with liveness incrementally in linear time! Woohoo!
381          This is now used in my new SSA construction algorithm.
382    
383       b. The branch reorganization module is now smarter about linear chains of
384          basic blocks.
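
The copy rule from item 1a can be sketched for a single basic block as follows. This is a textbook-style backward liveness scan, not the MLRISC implementation; the instruction encoding `(dst, srcs, is_copy)` and the function name `build_interference` are assumptions made for the example.

```python
def build_interference(block, live_out):
    """Return the set of interference edges for one basic block.

    block is a list of (dst, srcs, is_copy) instructions in program order;
    live_out is the set of names live at the end of the block.
    """
    live = set(live_out)
    edges = set()
    for dst, srcs, is_copy in reversed(block):
        for other in live:
            if other == dst:
                continue
            # Copy rule: for s <- t, this definition of s adds no edge to t.
            if is_copy and other == srcs[0]:
                continue
            edges.add(frozenset((dst, other)))
        # Standard liveness update: kill the definition, add the uses.
        live.discard(dst)
        live.update(srcs)
    return edges

# s <- t followed by a use of both: no edge between s and t,
# so the two can later be coalesced.
block = [("s", ["t"], True), ("u", ["s", "t"], False)]
print(build_interference(block, {"u"}))
```

The exemption applies only to that one definition of s. If t is redefined while s is still live, the later definition adds the edge as usual.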
385    
386    
387    ----------------------------------------------------------------------
388    Name: Matthias Blume
389    Date: 2000/04/12 13:52:00 JST
390    Tag: blume_main_v110p27_1
391    Description:
392    
393    Changed install.sh script to handle archive files without version number
394    and to use "boot.<arch>-<os>" instead of "sml.boot.<arch>-<os>" for the
395    name of the boot file archive.
396    
397    ----------------------------------------------------------------------
398    Name: Dave MacQueen
399    Date: 2000/04/09 14:00 EDT
400    Tag: dbm-20000410-Version_110_27
401    Description:
402    
403    1. Updated src/compiler/TopLevel/main/version.sml to version 110.27
404    
405    2. Updated src/config/version to 110.27
406    
407    3. New boot files!
408    
409    ----------------------------------------------------------------------
410    Name: Allen Leung
411    Date: 2000/04/09 19:09:00 EST
412    Tag: leunga-20000409-misc
413    Description:
414    
415    1.  Yet another fix for x86 assembly for idivl, imull, mull and friends.
416    
417    2.  Miscellaneous improvements to MLRISC (unused in sml/nj)
418    
419    ----------------------------------------------------------------------
420    Name: Stefan
421    Date: 2000/04/07 10:00:00 EDT
422    Tag: monnier-20000406-branch-handling
423    Description:
424    
425    Improved handling of branches (mostly those generated from
426    polymorphic equality), removed switchoff and changed the
427    default optimization settings (more cpsopt and less flintopt).
428    
429    ----------------------------------------------------------------------
430    Name: Allen Leung
431    Date: 2000/04/06 01:30:00 EST
432    Tag: leunga-20000406-peephole-x86-SSA-2
433    Description:
434    
435       Forgot a few files.
436    
437    ----------------------------------------------------------------------
438    Name: Allen Leung
439    Date: 2000/04/06 00:36:00 EST
440    Tag: leunga-20000406-peephole-x86-SSA
441    Description:
442    
443    1.  New Peephole code
444    
445    2.  Minor improvement to X86 instruction selection
446    
447    3.  Various fixes to SSA and machine description -> code translator
448    
449    ----------------------------------------------------------------------
450    Name: Matthias Blume
451    Date: 2000/04/05 12:30:00 JST
452    Tag: blume_main_v110p26p2_3
453    Description:
454    
455    This update just merges three minor cosmetic updates to CM's sources
456    to get ready for the 110.27 code freeze on Friday.  No functionality
457    has changed.
458    
459    ----------------------------------------------------------------------
460    Name: Allen Leung
461    Date: 2000/04/04 19:39:00 EST
462    Tag: leunga-20000404-x86-asm
463    Description:
464    
465    1.  Fixed a problem in X86 assembly.
466    
467        Things like
468    
469           jmp %eax
470           jmp (%eax)
471    
472        should be output as
473    
474           jmp *%eax
475           jmp *(%eax)
476    
477    2.  Assembly output
478    
479          Added a new flag
480    
481              "asm-indent-copies" (defaults to false)
482    
483          When this flag is on, parallel copies will be indented an extra level.
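
The rule from item 1 (indirect jump targets take a leading `*` in AT&T syntax) can be sketched like this. The function name `format_jmp` and the string-based operand test are hypothetical simplifications; the real emitter works on instruction data structures, and absolute memory operands without parentheses are not handled here.

```python
def format_jmp(operand):
    """Format a jmp in AT&T syntax; operand is its text as in the entry above."""
    # Register ("%eax") and memory ("(%eax)", "8(%ebx)") targets are indirect
    # and need a leading '*'; anything else is treated as a direct label.
    indirect = operand.startswith("%") or "(" in operand
    return "jmp " + ("*" if indirect else "") + operand

for op in ("%eax", "(%eax)", "L1"):
    print(format_jmp(op))
```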
484    
485    ----------------------------------------------------------------------
486    Name: Allen Leung
487    Date: 2000/04/04 03:18:00 EST
488    Tag: leunga-20000404-C--Moby
489    Description:
490    
491        All of these fixes are related to C--, Moby, and my own optimization
492        stuff; so they shouldn't affect SML/NJ.
493    
494    1.  X86
495    
496        Various fixes related to floating point, and extensions.
497    
498    2.  Alpha
499    
500        Some extra patterns related to loads with signed/zero extension
501        provided by Fermin.
502    
503    3.  Assembly
504    
505        When generating assembly, resolve the value of client defined constants,
506        instead of generating symbolic values.  This is controlled by the
507        new flag "asm-resolve-constants", which defaults to true.
508    
509    4.  Machine Descriptions
510    
511        a. The precedence parser was slightly broken when parsing infixr symbols.
512        b. The type generalizing code had the bound variables reversed, resulting
513           in a problem during arity raising.
514        c. Various fixes in machine descriptions.
515    
516    ----------------------------------------------------------------------
517    Name: Matthias Blume
518    Date: 2000/04/03 16:05:00 JST
519    Tag: blume_main_v110p26p2_2
520    Description:
521    
522    I eliminated coreEnv from compInfo.  Access to the "Core" structure is
523    now done via the ordinary static environment that is context to each
524    compilation unit.
525    
526    To this end, I arranged that "structure _Core" is bound in the pervasive
527    environment instead of "structure Core".  Core access is done via
528    _Core (which can never be accidentally rebound because _Core is not a
529    legal surface-syntax symbol).
530    
531    The current solution is much cleaner because the core environment is
532    now simply part of the pervasive environment which is part of every
533    compilation unit's context anyway.  In particular, this eliminates all
534    special-case handling that was necessary until now in order to deal
535    with dynamic and symbolic parts of the core environment.
536    
537    Remaining hackery (to bind the "magic" symbol _Core) is localized in the
538    compilation manager's bootstrap compiler (actually: in the "init group"
539    handling).  See the comments in src/system/smlnj/init/init.cmi for
540    more details.
541    
542    I also tried to track down all mentions of "Core" (as string argument
543    to Symbol.strSymbol) in the compiler and replaced them with a
544    reference to the new CoreSym.coreSym.  Seems cleaner since the actual
545    name appears in one place only.
546    
547    Binfile and bootfile format have not changed, but the switchover from
548    the old "init.cmi" to the new one is a bit tricky, so I supplied new
549    bootfiles anyway.
550    
551    ----------------------------------------------------------------------
552    Name: Allen Leung
553    Date: 2000/04/02 21:17:00 EST
554    Tag: leunga-20000402-mltree
555    Description:
556    
557       1. Renamed the constructor CALL in MLTREE by popular demand.
558       2. Added a bunch of files from my repository.  These are currently
559          used by other non-SMLNJ backends.
560    
561    ----------------------------------------------------------------------
562    Name: Allen Leung
563    Date: 2000/03/31 21:15:00 EST
564    Tag: leunga-20000331-aliasing
565    Description:
566    
567    This update contains a rewritten (and hopefully more correct) module
568    for extracting aliasing information from CPS.
569    
570       To turn on this feature:
571    
572            Compiler.Control.CG.memDisambiguate := true
573    
574       To pretty print the region information with assembly
575    
576           Compiler.Control.MLRISC.getFlag "asm-show-region" := true;
577    
578       To control how many levels of aliasing information are printed, use:
579    
580           Compiler.Control.MLRISC.getInt "points-to-show-level" := n
581    
582       The default of n is 3.
583    
584    ----------------------------------------------------------------------
585    Name: David MacQueen
586    Date: 2000/03/31 11:15:00 EST
587    Tag: dbm-20000331-runtime_fix
588    Description:
589    
590    This update contains:
591    
592    1. runtime/c-lib/c-libraries.c
593       includes added in revision 1.2 caused compilation errors on hppa-hpux
594    
595    2. fix for bug 1556
596       system/Basis/Implementation/NJ/internal-signals.sml
597    
598    ----------------------------------------------------------------------
599    Name: Matthias Blume
600    Date: 2000/03/31 18:00:00 JST
601    Tag: blume_main_v110p26p2_1
602    Description:
603    
604    This update contains:
605    
606    1. A small change to CM's handling of stable libraries:
607       CM now maintains one "global" modmap that is used for all stable
608       libraries.  The use of such a global modmap maximizes sharing and
609       minimizes the need for re-traversing parts of environments during
610       modmap construction.  (However, this has minor impact since modmap
611       construction seems to account for just one percent or less of total
612       compile time.)
613    
614    2. I added a "genmap" phase to the statistics.  This is where I got the
615       "one percent" number (see above).
616    
617    3. CM's new tool parameter mechanism just became _even_ better. :)
618       - The parser understands named parameters and recursive options.
619       - The "make" and "shell" tools use these new features.
620         (This makes it a lot easier to cascade these tools.)
621       - There is a small syntax change: named parameters use a
622    
623           <name> : ( <option> ... )            or
624           <name> : <string>
625    
626         syntax.  Previously, named parameters were implemented in an
627         ad-hoc fashion by each tool individually (by parsing strings)
628         and had the form
629    
630           <name>=<string>
631    
632       See the CM manual for a full description of these issues.
633    
634    ----------------------------------------------------------------------
635    Name: Matthias Blume
636    Date: 2000/03/30 18:00:00 JST
637    Tag: blume_main_v110p26p2_0
638    Description:
639    
640    !!!!! WARNING !!!!!!
641    !!  New binfiles  !!
642    !!!!!!!!!!!!!!!!!!!!
643    
644    This update contains:
645    
646    1. Moderate changes to CM:
647    
648       - Changes to CM's tools mechanism.  In particular, it is now possible
649       to have tools that accept additional "command line" parameters
650       (specified in the .cm file at each instance where the tool's class is
651       used).
652    
653       This was done to accommodate the new "make" and "shell" tools which
654       facilitate fairly seamless hookup to portions of code managed using
655       Makefiles or Shell scripts.
656    
657       There are no classes "shared" or "private" anymore.  Instead, the
658       sharing annotation is now a parameter to the "sml" class.
659    
660       There is a bit of generic machinery for implementing one's own
661       tools that accept command-line parameters.  However, I am not yet fully
662       satisfied with that part, so expect changes here in the future.
663    
664       All existing tools are described in the CM manual.
665    
666       - Slightly better error handling.  (CM now suppresses many followup
667       error messages that tended to be more annoying than helpful.)
668    
669    2. Major changes to the compiler's static environment data structures.
670    
671       - no CMStaticEnv anymore.
672            - no CMEnv, no "BareEnvironment" (actually, _only_ BareEnvironment,
673              but it is called Environment), no conversions between different
674              kinds of static environments
675    
676       - There is still a notion of a "modmap", but such modmaps are generated
677         on demand at the time when they are needed.  This sounds slow, but I
678         sped up the code that generates modmaps enough for this not to lead to
679         a slowdown of the compiler (at least I didn't detect any).
680    
681       - To facilitate rapid modmap generation, static environments now
682         contain an (optional) "modtree" structure.  Modtree annotations are
683         constructed by the unpickler during unpickling.  (This means that
684         the elaborator does not have to worry about modtrees at all.)
685         Modtrees have the advantage that they are compositional in the same
686         way as the environment data structure itself is compositional.
687         As a result, modtrees never hang on to parts of an environment that
688         has already been rendered "stale" by filtering or rebinding.
689    
690       - I went through many, many trials and errors before arriving at the
691         current solution.  (The initial idea of "linkpaths" did not work.)
692         But the result of all this is that I have touched a lot of files that
693         depend on the "modules" and "types" data structures (most of the
694         elaborator). There were a lot of changes during my "linkpath" trials
695         that could have been reverted to their original state but weren't.
696         Please, don't be too harsh on me for messing with this code a bit more
697         than what was strictly necessary...  (I _did_ resist the temptation
698         of doing any "global reformatting" to avoid an untimely death at
699         Dave's hands. :)
700    
701       - One positive aspect of the previous point:  At least I made sure that
702         all files that I touched now compile without warnings (other than
703         "polyEqual").
704    
705       - compiler now tends to run "leaner" (i.e., ties up less memory in
706         redundant modmaps)
707    
708    ----------------------------------------------------------------------
709    Name: Allen Leung
710    Date: 2000/03/29 18:00:00
711    Tag: leunga-20000327-mlriscGen_hppa_alpha_x86
712    Boot files (optional): ftp://react-ilp.cs.nyu.edu/leunga/110.26.1-sml.boot.x86-unix-20000330.tar.gz
713    Description:
714    
715       This update contains *MAJOR* changes to the way code is generated from CPS
716    in the module mlriscGen, and in various backend modules.
717    
718    CHANGES
719    =======
720    
721    1. MLRiscGen: forward propagation fix.
722    
723       There was a bug in forward propagation introduced at about the same time
724       as the MLRISC x86 backend, which prevented coalescing from being
725       performed effectively in loops.
726    
727       Effect: speed up of loops in RISC architectures.
728               By itself, this actually slowed down certain benchmarks on the x86.
729    
730    2. MLRiscGen:  forward propagating addresses from consing.
731    
732       I've changed the way consing code is generated.  Basically I separated
733       out the initialization part:
734    
735            store tag,   offset(allocptr)
736            store elem1, offset+4(allocptr)
737            store elem2, offset+8(allocptr)
738            ...
739            store elemn, offset+4n(allocptr)
740    
741       and the address computation part:
742    
743            celladdr <- offset+4+allocptr
744    
745       and move the address computation part
746    
747       Effect:  register pressure is generally lower as a result.  This
748                makes compilation of certain expressions much faster, such as
749                long lists with non-trivial elements.
750    
751                 [(0,0), (0,0), .... (0,0)]
752    
753    3. MLRiscGen: base pointer elimination.
754    
755        As part of the linkage mechanism, we generate the sequence:
756    
757         L:  ...  <- start of the code fragment
758    
759         L1:
760             base pointer <- linkreg - L1 + L
761    
762         The base pointer was then used for computing relocatable addresses
763       in the code fragment.  Frequently (such as in lots of continuations)
764       this is not needed.  We now eliminate this sequence whenever possible.
765    
766         For compile time efficiency, I'm using a very stupid local heuristic.
767       But in general, this should be done as a control flow analysis.
768    
769       Effect:  Smaller code size.  Speed up of most programs.
770    
771    4. Hppa back end
772    
773         Long jumps in span dependence resolution used to depend on the existence
774      of the base pointer.
775    
776         A jump to a long label L was expanded into the following sequence:
777    
778          LDIL %hi(L-8192), %r29
779          LDO  %lo(L-8192)(%r29), %r29
780          ADD  %r29, baseptr, %r29
781          BV,n %r0(%r29)
782    
783         In the presence of change (3) above, this will not work.  I've changed
784       it so that the following sequence of instructions is generated, which
785       doesn't mention the base pointer at all:
786    
787             BL,n  L', %r29           /* branch and link, L' + 4 -> %r29 */
788        L':  ADDIL L-(L'+4), %r29     /* Compute address of L */
789             BV,n  %r0(%r29)          /* Jump */
790    
791    5. Alpha back end
792    
793          New alpha instructions LDB/LDW have been added, as per Fermin's
794       suggestions.   This is unrelated to all other changes.
795    
796    6. X86 back end
797    
798         I've changed andl to testl in the floating point test sequence
799         whenever appropriate.  The Intel optimization guide states that
800         testl is preferable to andl.
801    
802    7. RA (x86 only)
803    
804         I've improved the spill propagation algorithm, using an approximation
805       of maximal weighted independent sets.   This seems to be necessary to
806       alleviate the negative effect of the slowdown in (1).
807    
808         I'll write down the algorithm one of these days.
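For readers unfamiliar with the underlying problem, a common way to approximate a maximum weighted independent set is a greedy pass that takes nodes in order of decreasing weight and skips any node adjacent to one already taken. The Python sketch below shows only that generic heuristic. It is not the algorithm used in the register allocator, which is not described in this entry, and the graph and weights are invented.

```python
def greedy_weighted_independent_set(weights, edges):
    """Greedy approximation of a maximum weighted independent set.

    Nodes are visited by decreasing weight; a node is taken unless it is
    adjacent to a node taken earlier. The result is a maximal independent
    set (no node can be added), not necessarily one of maximum weight.
    """
    adj = {v: set() for v in weights}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    chosen = set()
    blocked = set()
    for v in sorted(weights, key=lambda n: (-weights[n], n)):
        if v not in blocked:
            chosen.add(v)
            blocked.add(v)
            blocked |= adj[v]
    return chosen


# A path a - b - c - d with hypothetical weights.
weights = {"a": 5, "b": 4, "c": 3, "d": 1}
edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(sorted(greedy_weighted_independent_set(weights, edges)))
```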
809    
810    8. MLRiscGen: frequencies
811    
812         I've added an annotation that states that all call gc blocks have zero
813       execution frequencies.  This improves register allocation on the x86.
814    
815    BENCHMARKS
816    ==========
817    
818       I've only performed the comparison against 110.25.
819    
820       The platforms are:
821    
822        HPPA  A four processor HP machine (E9000) with 5G of memory.
823        X86   A 300MHz Pentium II with 128M of memory, and
824        SPARC An Ultra sparc 2 with 512M of memory.
825    
826       I used the following parameters for the SML benchmarks:
827    
828                 @SMLalloc
829         HPPA    256k
830         SPARC   512k
831         X86     256k
832    
833    COMPILATION TIME
834    ----------------
835       Here are the numbers comparing the compilation times of the compilers.
836       I've only compared 110.25 compiling the new sources versus
837       a fixpoint version of the new compiler compiling the same.
838    
839                     110.25                                  New
840               Total  Time in RA  Spill+Reload   Total  Time in RA Spill+Reload
841         HPPA   627s    116s        2684+3584     599s    95s       1003+1879
842         SPARC  892s    173s        2891+3870     708s    116s      1004+1880
843         X86    999s    315s       94006+130691   987s    296s    108877+141957
844    
845                   110.25         New
846                Code Size      Code Size
847         HPPA   8596736         8561421
848         SPARC  8974299         8785143
849         X86    9029180         8716783
850    
851       So in summary, things are at least as good as before.   A dramatic
852       reduction in compilation time is obtained on the SPARC; I can't explain it,
853       but it is reproducible.  Perhaps someone should try to reproduce this
854       on their own machines.
855    
856    SML BENCHMARKS
857    --------------
858    
859        On average, all benchmarks perform at least as well as before.
860    
861          HPPA         Compilation Time     Spill+Reload      Run Time
862                     110.25  New            110.25    New   110.25  New
863    
864          barnesHut  3.158  3.015  4.75%    1+1       0+0   2.980  2.922   2.00%
865              boyer  6.152  5.708  7.77%    0+0       0+0   0.218  0.213   2.34%
866       count-graphs  1.168  1.120  4.32%    0+0       0+0  22.705 23.073  -1.60%
867                fft  0.877  0.792 10.74%    1+3       1+3   0.602  0.587   2.56%
868        knuthBendix  3.180  2.857 11.32%    0+0       0+0   0.675  0.662   2.02%
869             lexgen  6.190  5.290 17.01%    0+0       0+0   0.913  0.788  15.86%
870               life  0.803  0.703 14.22%   25+25      0+0   0.153  0.140   9.52%
871              logic  2.048  2.007  2.08%    6+6       1+1   4.133  4.008   3.12%
872         mandelbrot  0.077  0.080 -4.17%    0+0       0+0   0.765  0.712   7.49%
873             mlyacc 22.932 20.937  9.53%  154+181    32+57  0.468  0.430   8.91%
874            nucleic  5.183  5.060  2.44%    2+2       0+0   0.125  0.120   4.17%
875      ratio-regions  3.357  3.142  6.84%    0+0       0+0  116.225 113.173 2.70%
876                ray  1.283  1.290 -0.52%    0+0       0+0   2.887  2.855   1.11%
877             simple  6.307  6.032  4.56%   28+30      5+7   3.705  3.658   1.28%
878                tsp  0.888  0.862  3.09%    0+0       0+0   7.040  6.893   2.13%
879               vliw 24.378 23.455  3.94%  106+127    25+45  2.758  2.707   1.91%
880      --------------------------------------------------------------------------
881       Average                     6.12%                                   4.09%
882    
883          SPARC        Compilation Time     Spill+Reload      Run Time
884                     110.25  New            110.25    New   110.25  New
885    
886          barnesHut  3.778  3.592  5.20%    2+2       0+0   3.648  3.453    5.65%
887              boyer  6.632  6.110  8.54%    0+0       0+0   0.258  0.242    6.90%
888       count-graphs  1.435  1.325  8.30%    0+0       0+0  33.672 34.737   -3.07%
889                fft  0.980  0.940  4.26%    3+9       2+6   0.838  0.827    1.41%
890        knuthBendix  3.590  3.138 14.39%    0+0       0+0   0.962  0.967   -0.52%
891             lexgen  6.593  6.072  8.59%    1+1       0+0   1.077  1.078   -0.15%
892               life  0.972  0.868 11.90%   26+26      0+0   0.143  0.140    2.38%
893              logic  2.525  2.387  5.80%    7+7       1+1   5.625  5.158    9.05%
894         mandelbrot  0.090  0.093 -3.57%    0+0       0+0   0.855  0.728   17.39%
895             mlyacc 26.732 23.827 12.19%  162+189    32+57  0.550  0.560   -1.79%
896            nucleic  6.233  6.197  0.59%    3+3       0+0   0.163  0.173   -5.77%
897      ratio-regions  3.780  3.507  7.79%    0+0       0+0 133.993 131.035   2.26%
898                ray  1.595  1.550  2.90%    1+1       0+0   3.440  3.418    0.63%
899             simple  6.972  6.487  7.48%   29+32      5+7   3.523  3.525   -0.05%
900                tsp  1.115  1.063  4.86%    0+0       0+0   7.393  7.265    1.77%
901               vliw 27.765 24.818 11.87%  110+135    25+45  2.265  2.135    6.09%
902      ----------------------------------------------------------------------------
903       Average                     6.94%                                    2.64%
904    
905          X86          Compilation Time     Spill+Reload      Run Time
906                     110.25  New            110.25    New   110.25  New
907    
908          barnesHut  5.530  5.420  2.03%  593+893   597+915   3.532  3.440   2.66%
909              boyer  8.768  7.747 13.19%  493+199   301+289   0.327  0.297  10.11%
910       count-graphs  2.040  2.010  1.49%  298+394   315+457  26.578 28.660  -7.26%
911                fft  1.327  1.302  1.92%  112+209   115+210   1.055  0.962   9.71%
912        knuthBendix  5.218  5.475 -4.69%  451+598   510+650   0.928  0.932  -0.36%
913             lexgen  9.970  9.623  3.60% 1014+841  1157+885   0.947  0.928   1.97%
914               life  1.183  1.183  0.00%  162+182   145+148   0.127  0.103  22.58%
915              logic  3.285  3.512 -6.45%  514+684   591+836   5.682  5.577   1.88%
916         mandelbrot  0.147  0.143  2.33%   38+41     33+54    0.703  0.690   1.93%
917             mlyacc 35.457 32.763  8.22% 3496+4564 3611+4860  0.552  0.550   0.30%
918            nucleic  7.100  6.888  3.07%  239+168   201+158   0.175  0.173   0.96%
919      ratio-regions  6.388  6.843 -6.65% 1182+257   981+300  120.142 120.345 -0.17%
920                ray  2.332  2.338 -0.29%  346+398   402+494   3.593  3.540   1.51%
921             simple  9.912  9.903  0.08% 1475+941  1579+1168  3.057  3.178  -3.83%
922                tsp  1.623  1.532  5.98%  266+200   250+211   8.045  7.878   2.12%
923               vliw 33.947 35.470 -4.29% 2629+2774 2877+3171  2.072  1.890   9.61%
924      ----------------------------------------------------------------------------
925       Average                     1.22%                                     3.36%
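The percentage columns in the three tables are not defined in the text. The printed numbers appear consistent with old/new - 1, so that a positive value means the new compiler is faster, and the Average rows with the arithmetic mean of the per-benchmark percentages. This is inferred from the figures, not stated by the author. The Python check below uses rows copied from the HPPA table; rows recomputed from the rounded times can differ from the printed value in the last digits.

```python
def improvement_percent(old: float, new: float) -> float:
    """Improvement of `new` over `old`, computed as (old/new - 1) * 100."""
    return (old / new - 1.0) * 100.0


# Compilation times (110.25, New) for two HPPA rows, copied from the table.
lexgen = round(improvement_percent(6.190, 5.290), 2)
mlyacc = round(improvement_percent(22.932, 20.937), 2)
print(lexgen, mlyacc)

# Compilation-time percentages of the HPPA table, in row order.
hppa_compile = [4.75, 7.77, 4.32, 10.74, 11.32, 17.01, 14.22, 2.08,
                -4.17, 9.53, 2.44, 6.84, -0.52, 4.56, 3.09, 3.94]
hppa_average = round(sum(hppa_compile) / len(hppa_compile), 2)
print(hppa_average)
```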
926    
927    ----------------------------------------------------------------------
928    Name: Allen Leung
929  Date: 2000/03/23 16:25:00  Date: 2000/03/23 16:25:00
930  Tag: leunga-20000323-fix_x86_alpha  Tag: leunga-20000323-fix_x86_alpha
931  Description:  Description:
