Diff of /sml/trunk/HISTORY

revision 570, Wed Mar 8 17:30:13 2000 UTC revision 647, Tue May 16 02:53:02 2000 UTC
# Line 11  Line 11 
11  Date:  Date:
12  Tag: <post-commit CVS tag>  Tag: <post-commit CVS tag>
13  Description:  Description:
14    ----------------------------------------------------------------------
15    Name: Allen Leung
16    Date: 2000/05/15 22:53 EDT
17    Tag: leunga-20000515-alpha-x86-ra
18    Description:
19    
20      1. Alpha
21    
22          Slight cleanup.  Removed the instruction SGNXL
23    
24      2. X86
25    
26          Added the following instructions to the instruction set:
27    
28            ROLx, RORx,
29            BTx, BTSx, BTLx, BTRx,
30            XCHGx, and variants with the LOCK prefix
31    
32      3. Register Allocation
33    
34          The module ra-rewrite-with-renaming has been improved.
35    
36      These have no effect on SML/NJ.
37    
38  ----------------------------------------------------------------------  ----------------------------------------------------------------------
39  Name:  Name: Matthias Blume
40  Date:  Date: 2000/05/15 16:20:00 JST
41  Tag:  Tag: blume-20000515-lightrebuild
42    Description:
43    
44    1. I added an alternative to "-rebuild" to "makeml".  The difference is
45       that prior to calling CMB.make' the CM-variable "LIGHT" will be
46       defined.  In effect, the command will not build any cross-compiler
47       backends and therefore finish more quickly.
48    
49       The "fixpt" script also takes a "-light" switch to be able to use
50       this new facility while compiling for a fixpoint.
51    
52    2. I replaced all mentions of anchored paths in group owner specifications
53       with simple relative paths (usually starting with "..").
54       The rationale is that a library's internal workings should not be
55       compromised by the lack of some anchor.  (An anchor is necessary
56       for someone who wants to refer to the library by an anchored path,
57       but it should not be necessary to build the same library in the first
58       place.)
59    
60    3. I changed the way CM's tool mechanism determines the shell command
61       string used for things like ml-yacc etc. so that it does not break
62       when CM.Control.implicit_anchors is turned off.
63    
64    ----------------------------------------------------------------------
65    Name: Matthias Blume
66    Date: 2000/05/12 18:20:00 JST
67    Tag: blume-20000512-ml-build
68    Description:
69    
70    Fixed a bug in config/_ml-build that prevented ml-yacc and ml-lex from
71    getting installed properly (by config/install.sh).
72    
73    ----------------------------------------------------------------------
74    Name: Matthias Blume
75    Date: 2000/05/12 17:30:00 JST
76    Tag: blume-20000512-anchors
77    Description:
78    
79    !!! NEW BOOT FILES !!!
80    
81    This change is in preparation of fading out support for "implicitly
82    anchored path names".  I went through all sources and used the
83    explicit (and relatively new) $-notation.  See system/README and the
84    CM manual for more info on this.
85    
86    I also modified the anchoring scheme for some things such as "smlnj",
87    "MLRISC", "cm", etc. to take advantage of the fact that explicit
88    anchors are more expressive: anchor name and first arc do not have to
89    coincide.  This entails the following user-visible change:
90    
91    You have to write $smlnj/foo/bar instead of smlnj/foo/bar.  In
92    particular, when you fire up sml with a command-line argument, say,
93    e.g.:
94    
95       sml '$smlnj/cmb.cm'
96    
97    At the ML toplevel prompt:
98    
99       CM.autoload "$smlnj/cmb.cm";
100    
101    There is also a new controller in CM.Control that can be used to turn
102    off all remaining support for implicit anchors by saying:
103    
104        CM.autoload "$smlnj/
105        #set CM.Control.implicit_anchors false;
106    
107    This causes CM to reject implicitly anchored paths.  This is (for the
108    time being) less permissive than the "final" version where there will
109    be no more such implicit anchors and relative paths will be just that:
110    relative.
111    
112    The next step (version after next version?) will be to make the
113    default for CM.Control.implicit_anchors false.  After the dust has
114    settled, I can then produce the "final" version of this...
115    
116    Note: Since bootstrapping is a bit tricky, I provided new boot files.
117    
118    ----------------------------------------------------------------------
119    Name: Matthias Blume
120    Date: 2000/05/11 16:30:00 JST
121    Tag: blume-20000511-sources
122    Description:
123    
124    The main change is that I added function CM.sources as a generalized
125    version of the earlier CM.makedepend.  This entails the following
126    additional changes:
127    
128      - CM.makedepend has been dropped.
129    
130      - CM manual has been updated.
131    
132      - TOOLS signature and API have been changed.
133    
134    ----------------------------------------------------------------------
135    Name: Allen Leung
136    Date: 2000/05/10 21:17 EDT
137    Tag: leunga-20000510-moby-c--ssa
138    Description:
139    
140      Various bug fixes and new features for C--, Moby and MLRISC optimizations.
141    None of these affect SML/NJ.
142    
143    1. Register Allocation
144    
145        a. A new ra spilling module (ra/ra-spill-with-renaming) is implemented.
146           This module tries to remove local (i.e. basic block level) redundancies
147           during spilling.
148    
149        b. A new framework for performing region based register allocation.
150           Not yet entirely functional.
151    
152    2. X86
153    
154       a. DefUse for POP was missing the stack pointer [found by Lal]
155       b. Reload for CALL was incorrect in X86Spill [found by John]
156       c. Various fixes in X86Spill so that it can be used correctly for
157          the new spilling module.
158    
159    3. SSA/IR
160    
161       a. New module ir/dj-dataflow.sml implements elimination based
162          data flow analysis.
163    
164    4. MLRiscGen
165    
166       a. Fix for gc type annotation
167    
168    5. MDGen
169    
170       Various fixes for machine description -> ml code translation.  For ssa
171       only.
172    
173    ----------------------------------------------------------------------
174    Name: Allen Leung
175    Date: 2000/05/08 22:17 EDT
176    Tag: leunga-20000508-labexp
177    Description:
178    
179      Fermin has found a few assembly problems with constant expressions
180      generated in LabelExp.  Mostly, the problems involve extra parentheses,
181      which dumb assemblers choke on.  This is his fix.
182    
183    ----------------------------------------------------------------------
184    Name: Dave MacQueen
185    Date: 2000/04/09 14:00 EDT
186    Tag: dbm-20000502-Version_110_28
187    Description:
188    
189    1. Updated src/compiler/TopLevel/main/version.sml to version 110.28
190    
191    2. Updated config/version to 110.28
192    
193    3. Updated config/srcarchiveurl
194    
195    4. New boot files!
196       ftp://ftp.research.bell-labs.com/dist/smlnj/working/110.28/
197    
198    ----------------------------------------------------------------------
199    Name: Matthias Blume
200    Date: 2000/05/01 19:05:00 JST
201    Tag: blume-20000501-noweb
202    Description:
203    
204    A new noweb tool has been added.  The existing system is entirely
205    unaffected by this, but some CM users have asked for renewed noweb
206    support.  Everything is documented in the CM manual.
207    
208    New (plugin) libraries:
209    
210       noweb-tool.cm
211       nw-ext.cm
212    
213    ----------------------------------------------------------------------
214    Name: Dave MacQueen
215    Date: 2000/04/30 12:40PM EDT
216    Tag: dbm-20000430-bug_fixes
217    Description:
218    
219    1. Fix for bug 1498
220       smlnj/src/system/Basis/Implementation/Unsafe/object.sig
221       smlnj/src/system/Basis/Implementation/Unsafe/object.sml
222         added toRealArray function
223       smlnj/src/compiler/MiscUtil/print/ppobj.sml
224         added check for tag Obj.RealArray to array printing case in ppObj
225    
226    2. Fix for bug 1510
227       smlnj/src/compiler/Semant/types/typesutil.sml
228         fixed definition of dummyargs (used by equalTycon) so that
229         dummy args are distinct types
230    
231    ----------------------------------------------------------------------
232    Name: Matthias Blume
233    Date: 2000/04/30 01:00:00 JST
234    Tag: blume-20000430-versions
235    Description:
236    
237    1. CM version numbering added.  This is an implementation of Lal's
238       proposal for adding version numbers and version checking to .cm
239       files.  Lal said that his proposal was just that -- a proposal.
240       For the time being I went ahead and implemented it so that people
241       can comment on it.  Everything is completely backward-compatible
242       (except for the stable library format, i.e., new bootfiles!).
243    
244       As usual, see the CM manual for details.
245    
246    2. An alternative syntax for anchored paths has been implemented.
247       Dave has recently voiced the same concerns that I had when I did
248       this, so there should be some support.  My take is that eventually
249       I will let support for the current syntax (where anchors are
250       "implicit") fade out in favor of the new, explicit syntax.
251       In order to be backward-compatible, both old and new syntax are
252       currently supported.
253    
254       Again, see the CM manual for details.
255    
256    3. Parallel make is trying to be slightly smarter:  When the master
257       process finds a "bottleneck", i.e., when there is only one
258       compilation unit that can be compiled and everybody else is
259       waiting on it, then it will simply compile it directly instead
260       of clumsily telling one of the slaves to do it.
261    
262    4. Support for "unsharing" added.  This is necessary in order to be
263       able to have two different versions of the same library running
264       at the same time (e.g., for trying out a new MLRISC while still
265       having the old MLRISC linked into the current compiler, etc.)
266       See the CM manual.
267    
268    5. Simple "makedepend" functionality added for generating Makefile
269       dependency information.  (This is rather crude at the moment.
270       Expect some changes here in the future.)
271    
272    6. ".fun" added as a recognized suffix for ML files. Also documented
273       explicitly in the manual that the fallback behavior (unknown suffix
274       -> ML file) is not an official feature!
275    
276    7. Small changes to the pickler for stable libraries.
277    
278    8. Several internal changes to CM (for cleanup/improvement).
279    
280    
281    !!!! NEW BINFILES !!!!
282    
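The bottleneck rule in item 3 above can be sketched as follows. This is only an illustration in Python, not CM's actual scheduler: the function names, the dictionary-of-sets dependency representation, and the "master"/"slave" return values are all assumptions made for the example.

```python
def ready_units(deps, done):
    """Units not yet compiled whose dependencies have all been compiled.

    deps maps each unit name to the set of units it depends on (hypothetical
    representation); done is the set of units already compiled.
    """
    return sorted(u for u, ds in deps.items() if u not in done and ds <= done)


def pick_worker(deps, done, running):
    """Decide who compiles next: 'master', 'slave', or None if nothing is ready."""
    ready = [u for u in ready_units(deps, done) if u not in running]
    if not ready:
        return None
    # Bottleneck: exactly one unit can be compiled and nothing else is in
    # flight, so every remaining unit is waiting on it.  The master compiles
    # it directly instead of handing it to a slave.
    if len(ready) == 1 and not running:
        return "master"
    return "slave"


# Hypothetical dependency graph: b and c both depend on a.
DEPS = {"a": set(), "b": {"a"}, "c": {"a"}}
```

With this graph, `a` is a bottleneck at the start (the master compiles it), while `b` and `c` can afterwards be farmed out in parallel.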
283    ----------------------------------------------------------------------
284    Name: Matthias Blume
285    Date: 2000/04/28 17:30:00 JST
286    Tag: blume-20000428-pathconfig
287    Description:
288    
289    1. I changed config/install.sh to remove duplicate entries from the
290       lib/pathconfig file at the end.  Moreover, the final version of
291       lib/pathconfig is sorted alphabetically.  The same (sorting) is done
292       in src/system/installml.
293    
294    2. The config/install.sh script now consistently uses relative
295       pathnames in lib/pathconfig whenever the anchor is in the lib
296       directory.  (So far this was true for the libraries that come
297       pre-compiled and bundled as part of the bootfiles but not for
298       libraries that are compiled by the script itself.)
299    
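The cleanup described in item 1 above (drop duplicate entries, then sort alphabetically) amounts to something like the following Python sketch. It is not the code in config/install.sh; it assumes that a duplicate is an exactly repeated line, and the sample entries are made up.

```python
def normalize_pathconfig(lines):
    """Return the pathconfig entries with duplicates removed, sorted alphabetically.

    Assumption: two entries are duplicates only if the whole line matches
    after stripping surrounding whitespace.  Blank lines are dropped.
    """
    entries = {line.strip() for line in lines if line.strip()}
    return sorted(entries)


# Hypothetical file contents, for illustration only.
SAMPLE = [
    "smlnj-lib.cm ../lib\n",
    "basis.cm ../lib\n",
    "smlnj-lib.cm ../lib\n",
]
```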
300    ----------------------------------------------------------------------
301    Name: Matthias Blume
302    Date: 2000/04/26 13:10:00 JST
303    Tag: blume-20000426-fun_suffix
304    Description:
305    
306    Added ".fun" as a recognized file name suffix (for ML code).
307    
308    ----------------------------------------------------------------------
309    Name: Allen Leung
310    Date: 2000/04/25 17:00:00 EST
311    Tag: leunga-20000425-alpha-ra
312    Description:
313    
314    1. Alpha
315    
316        PSEUDOARITH was missing in AlphaRewrite.  This causes an endless loop
317    in C--.
318    
319    2. RA
320    
321       Added a flag "ra-dump-size" to print out the size of the flowgraph
322       and the interference graph.
323    
324    ----------------------------------------------------------------------
325    Name: Dave MacQueen
326    Date: 2000/04/25
327    Tag: dbm-20000425-mlyacc_doc_examples
328    Description:
329      Updated mlyacc.tex sections 5 and 7 for SML '97 and CM.
330      Updated all three examples in src/ml-yacc/examples to run
331      under 110.* using CM.make.
332    
333    ----------------------------------------------------------------------
334    Name: Allen Leung
335    Date: 2000/04/20 23:04:00 EST
336    Tag: leunga-20000420-ssa-c---stuff
337    Description:
338    
339      This update synchronizes my repository with Yale's.  Most of these
340    changes, however, do not affect SML/NJ at all (the RA is an exception).
341    
342    1. Register Allocator
343    
344       a. An improvement in the interference graph construction:
345          Given a copy
346    
347                s <- t
348    
349          no interference edge between s and t is added for this definition of s.
350    
351       b. I've added two new spill heuristic modules that Fermin and I developed
352          (in the new library RA.cm). These are unused in SML/NJ but may be
353          useful for others (Moby?)
354    
355    2. X86
356    
357       a. Various fixes in the backend provided by Fermin [C--] and Lal.
358    
359    3. Alpha
360    
361       a. Added the BSR instruction and code generation that goes with it [C--]
362       b. Other fixes too numerous to recount provided by Fermin [C--]
363    
364    4. Regmaps
365    
366       a. The regmaps are not initialized with the identity physical bindings
367          at creation time.  This is unneeded.
368    
369    5. MLRISC Optimizations
370    
371       a. The DJ-Graph module can now compute the iterated dominance frontiers
372          intersected with liveness incrementally in linear time! Woohoo!
373          This is now used in my new SSA construction algorithm.
374    
375       b. The branch reorganization module is now smarter about linear chains of
376          basic blocks.
377    
378    
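The copy rule from item 1a above can be illustrated with a small backward scan over one basic block. This is a textbook-style sketch in Python, not the MLRISC implementation; the instruction tuple format and the function name are assumptions made for the example.

```python
def build_interference(instrs, live_out):
    """Build interference edges for one basic block.

    instrs is a list of (dst, srcs, is_copy) tuples in program order
    (hypothetical format); live_out is the set of names live at block exit.
    """
    edges = set()
    live = set(live_out)
    for dst, srcs, is_copy in reversed(instrs):
        for v in live:
            if v == dst:
                continue
            # Copy rule: for "s <- t", this definition of s adds no edge
            # between s and t, which leaves them free to be coalesced.
            if is_copy and v in srcs:
                continue
            edges.add(frozenset((dst, v)))
        live.discard(dst)   # dst is defined here, so it is dead above
        live.update(srcs)   # sources are live above this instruction
    return edges


# t is defined, copied into s, and both are used afterwards.
WITH_COPY = [("t", [], False), ("s", ["t"], True), ("u", ["s", "t"], False)]
# Same shape, but the middle instruction is an ordinary operation.
WITHOUT_COPY = [("t", [], False), ("s", ["t"], False), ("u", ["s", "t"], False)]
```

In the first block `s` and `t` do not interfere even though both are live after the copy; in the second they do.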
379    ----------------------------------------------------------------------
380    Name: Matthias Blume
381    Date: 2000/04/12 13:52:00 JST
382    Tag: blume_main_v110p27_1
383    Description:
384    
385    Changed install.sh script to handle archive files without version number
386    and to use "boot.<arch>-<os>" instead of "sml.boot.<arch>-<os>" for the
387    name of the boot file archive.
388    
389    ----------------------------------------------------------------------
390    Name: Dave MacQueen
391    Date: 2000/04/09 14:00 EDT
392    Tag: dbm-20000410-Version_110_27
393    Description:
394    
395    1. Updated src/compiler/TopLevel/main/version.sml to version 110.27
396    
397    2. Updated src/config/version to 110.27
398    
399    3. New boot files!
400    
401    ----------------------------------------------------------------------
402    Name: Allen Leung
403    Date: 2000/04/09 19:09:00 EST
404    Tag: leunga-20000409-misc
405    Description:
406    
407    1.  Yet another fix for x86 assembly for idivl, imull, mull and friends.
408    
409    2.  Miscellaneous improvements to MLRISC (unused in sml/nj)
410    
411    ----------------------------------------------------------------------
412    Name: Stefan
413    Date: 2000/04/07 10:00:00 EDT
414    Tag: monnier-20000406-branch-handling
415    Description:
416    
417    Improved handling of branches (mostly those generated from
418    polymorphic equality), removed switchoff and changed the
419    default optimization settings (more cpsopt and less flintopt).
420    
421    ----------------------------------------------------------------------
422    Name: Allen Leung
423    Date: 2000/04/06 01:30:00 EST
424    Tag: leunga-20000406-peephole-x86-SSA-2
425    Description:
426    
427       Forgot a few files.
428    
429    ----------------------------------------------------------------------
430    Name: Allen Leung
431    Date: 2000/04/06 00:36:00 EST
432    Tag: leunga-20000406-peephole-x86-SSA
433    Description:
434    
435    1.  New Peephole code
436    
437    2.  Minor improvement to X86 instruction selection
438    
439    3.  Various fixes to SSA and machine description -> code translator
440    
441    ----------------------------------------------------------------------
442    Name: Matthias Blume
443    Date: 2000/04/05 12:30:00 JST
444    Tag: blume_main_v110p26p2_3
445    Description:
446    
447    This update just merges three minor cosmetic updates to CM's sources
448    to get ready for the 110.27 code freeze on Friday.  No functionality
449    has changed.
450    
451    ----------------------------------------------------------------------
452    Name: Allen Leung
453    Date: 2000/04/04 19:39:00 EST
454    Tag: leunga-20000404-x86-asm
455    Description:
456    
457    1.  Fixed a problem in X86 assembly.
458    
459        Things like
460    
461           jmp %eax
462           jmp (%eax)
463    
464        should be output as
465    
466           jmp *%eax
467           jmp *(%eax)
468    
469    2.  Assembly output
470    
471          Added a new flag
472    
473              "asm-indent-copies" (defaults to false)
474    
475          When this flag is on, parallel copies will be indented an extra level.
476    
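The fix in item 1 above follows the AT&T assembler convention that indirect jump targets carry a `*` prefix. A minimal Python sketch of such a formatting rule is shown below; the function name is hypothetical, and it assumes an operand is indirect exactly when it mentions a register (absolute memory operands are not handled).

```python
def format_jmp(operand):
    """Format a jmp instruction in AT&T syntax.

    Assumption: any operand containing '%' is a register or register-based
    memory operand and therefore an indirect target; everything else is
    treated as a direct label.
    """
    if "%" in operand:
        return "jmp *" + operand
    return "jmp " + operand
```

For example, `format_jmp("%eax")` yields `jmp *%eax`, while a plain label such as `L1` is emitted unchanged.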
477    ----------------------------------------------------------------------
478    Name: Allen Leung
479    Date: 2000/04/04 03:18:00 EST
480    Tag: leunga-20000404-C--Moby
481    Description:
482    
483        All of these fixes are related to C--, Moby, and my own optimization
484        stuff; so they shouldn't affect SML/NJ.
485    
486    1.  X86
487    
488        Various fixes related to floating point, and extensions.
489    
490    2.  Alpha
491    
492        Some extra patterns related to loads with signed/zero extension
493        provided by Fermin.
494    
495    3.  Assembly
496    
497        When generating assembly, resolve the value of client defined constants,
498        instead of generating symbolic values.  This is controlled by the
499        new flag "asm-resolve-constants", which defaults to true.
500    
501    4.  Machine Descriptions
502    
503        a. The precedence parser was slightly broken when parsing infixr symbols.
504        b. The type generalizing code had the bound variables reversed, resulting
505           in a problem during arity raising.
506        c. Various fixes in machine descriptions.
507    
508    ----------------------------------------------------------------------
509    Name: Matthias Blume
510    Date: 2000/04/03 16:05:00 JST
511    Tag: blume_main_v110p26p2_2
512    Description:
513    
514    I eliminated coreEnv from compInfo.  Access to the "Core" structure is
515    now done via the ordinary static environment that is context to each
516    compilation unit.
517    
518    To this end, I arranged that "structure _Core" is bound in the pervasive
519    environment instead of "structure Core".  Core access is done via
520    _Core (which can never be accidentally rebound because _Core is not a
521    legal surface-syntax symbol).
522    
523    The current solution is much cleaner because the core environment is
524    now simply part of the pervasive environment which is part of every
525    compilation unit's context anyway.  In particular, this eliminates all
526    special-case handling that was necessary until now in order to deal
527    with dynamic and symbolic parts of the core environment.
528    
529    Remaining hackery (to bind the "magic" symbol _Core) is localized in the
530    compilation manager's bootstrap compiler (actually: in the "init group"
531    handling).  See the comments in src/system/smlnj/init/init.cmi for
532    more details.
533    
534    I also tried to track down all mentions of "Core" (as string argument
535    to Symbol.strSymbol) in the compiler and replaced them with a
536    reference to the new CoreSym.coreSym.  Seems cleaner since the actual
537    name appears in one place only.
538    
539    Binfile and bootfile format have not changed, but the switchover from
540    the old "init.cmi" to the new one is a bit tricky, so I supplied new
541    bootfiles anyway.
542    
543    ----------------------------------------------------------------------
544    Name: Allen Leung
545    Date: 2000/04/02 21:17:00 EST
546    Tag: leunga-20000402-mltree
547    Description:
548    
549       1. Renamed the constructor CALL in MLTREE by popular demand.
550       2. Added a bunch of files from my repository.  These are currently
551          used by other non-SMLNJ backends.
552    
553    ----------------------------------------------------------------------
554    Name: Allen Leung
555    Date: 2000/03/31 21:15:00 EST
556    Tag: leunga-20000331-aliasing
557    Description:
558    
559    This update contains a rewritten (and hopefully more correct) module
560    for extracting aliasing information from CPS.
561    
562       To turn on this feature:
563    
564            Compiler.Control.CG.memDisambiguate := true
565    
566       To pretty print the region information with assembly
567    
568           Compiler.Control.MLRISC.getFlag "asm-show-region" := true;
569    
570       To control how many levels of aliasing information are printed, use:
571    
572           Compiler.Control.MLRISC.getInt "points-to-show-level" := n
573    
574       The default of n is 3.
575    
576    ----------------------------------------------------------------------
577    Name: David MacQueen
578    Date: 2000/03/31 11:15:00 EST
579    Tag: dbm-20000331-runtime_fix
580    Description:
581    
582    This update contains:
583    
584    1. runtime/c-lib/c-libraries.c
585       includes added in revision 1.2 caused compilation errors on hppa-hpux
586    
587    2. fix for bug 1556
588       system/Basis/Implementation/NJ/internal-signals.sml
589    
590    ----------------------------------------------------------------------
591    Name: Matthias Blume
592    Date: 2000/03/31 18:00:00 JST
593    Tag: blume_main_v110p26p2_1
594    Description:
595    
596    This update contains:
597    
598    1. A small change to CM's handling of stable libraries:
599       CM now maintains one "global" modmap that is used for all stable
600       libraries.  The use of such a global modmap maximizes sharing and
601       minimizes the need for re-traversing parts of environments during
602       modmap construction.  (However, this has minor impact since modmap
603       construction seems to account for just one percent or less of total
604       compile time.)
605    
606    2. I added a "genmap" phase to the statistics.  This is where I got the
607       "one percent" number (see above).
608    
609    3. CM's new tool parameter mechanism just became _even_ better. :)
610       - The parser understands named parameters and recursive options.
611       - The "make" and "shell" tools use these new features.
612         (This makes it a lot easier to cascade these tools.)
613       - There is a small syntax change: named parameters use a
614    
615           <name> : ( <option> ... )            or
616           <name> : <string>
617    
618         syntax.  Previously, named parameters were implemented in an
619         ad-hoc fashion by each tool individually (by parsing strings)
620         and had the form
621    
622           <name>=<string>
623    
624       See the CM manual for a full description of these issues.
625    
626    ----------------------------------------------------------------------
627    Name: Matthias Blume
628    Date: 2000/03/30 18:00:00 JST
629    Tag: blume_main_v110p26p2_0
630    Description:
631    
632    !!!!! WARNING !!!!!!
633    !!  New binfiles  !!
634    !!!!!!!!!!!!!!!!!!!!
635    
636    This update contains:
637    
638    1. Moderate changes to CM:
639    
640       - Changes to CM's tools mechanism.  In particular, it is now possible
641       to have tools that accept additional "command line" parameters
642       (specified in the .cm file at each instance where the tool's class is
643       used).
644    
645       This was done to accommodate the new "make" and "shell" tools which
646       facilitate fairly seamless hookup to portions of code managed using
647       Makefiles or Shell scripts.
648    
649       There are no classes "shared" or "private" anymore.  Instead, the
650       sharing annotation is now a parameter to the "sml" class.
651    
652       There is a bit of generic machinery for implementing one's own
653       tools that accept command-line parameters.  However, I am not yet fully
654       satisfied with that part, so expect changes here in the future.
655    
656       All existing tools are described in the CM manual.
657    
658       - Slightly better error handling.  (CM now suppresses many followup
659       error messages that tended to be more annoying than helpful.)
660    
661    2. Major changes to the compiler's static environment data structures.
662    
663       - no CMStaticEnv anymore.
664            - no CMEnv, no "BareEnvironment" (actually, _only_ BareEnvironment,
665              but it is called Environment), no conversions between different
666              kinds of static environments
667    
668       - There is still a notion of a "modmap", but such modmaps are generated
669         on demand at the time when they are needed.  This sounds slow, but I
670         sped up the code that generates modmaps enough for this not to lead to
671         a slowdown of the compiler (at least I didn't detect any).
672    
673       - To facilitate rapid modmap generation, static environments now
674         contain an (optional) "modtree" structure.  Modtree annotations are
675         constructed by the unpickler during unpickling.  (This means that
676         the elaborator does not have to worry about modtrees at all.)
677         Modtrees have the advantage that they are compositional in the same
678         way as the environment data structure itself is compositional.
679         As a result, modtrees never hang on to parts of an environment that
680         has already been rendered "stale" by filtering or rebinding.
681    
682       - I went through many, many trials and errors before arriving at the
683         current solution.  (The initial idea of "linkpaths" did not work.)
684         But the result of all this is that I have touched a lot of files that
685         depend on the "modules" and "types" data structures (most of the
686         elaborator). There were a lot of changes during my "linkpath" trials
687         that could have been reverted to their original state but weren't.
688         Please, don't be too harsh on me for messing with this code a bit more
689         than what was strictly necessary...  (I _did_ resist the temptation
690         of doing any "global reformatting" to avoid an untimely death at
691         Dave's hands. :)
692    
693       - One positive aspect of the previous point:  At least I made sure that
694         all files that I touched now compile without warnings (other than
695         "polyEqual").
696    
697       - compiler now tends to run "leaner" (i.e., ties up less memory in
698         redundant modmaps)
699    
700    ----------------------------------------------------------------------
701    Name: Allen Leung
702    Date: 2000/03/29 18:00:00
703    Tag: leunga-20000327-mlriscGen_hppa_alpha_x86
704    Boot files (optional): ftp://react-ilp.cs.nyu.edu/leunga/110.26.1-sml.boot.x86-unix-20000330.tar.gz
705    Description:
706    
707       This update contains *MAJOR* changes to the way code is generated from CPS
708    in the module mlriscGen, and in various backend modules.
709    
710    CHANGES
711    =======
712    
713    1. MLRiscGen: forward propagation fix.
714    
715       There was a bug in forward propagation introduced at about the same time
716       as the MLRISC x86 backend, which prevented coalescing from being
717       performed effectively in loops.
718    
719       Effect: speed up of loops in RISC architectures.
720               By itself, this actually slowed down certain benchmarks on the x86.
721    
722    2. MLRiscGen:  forward propagating addresses from consing.
723    
724       I've changed the way consing code is generated.  Basically I separated
725       out the initialization part:
726    
727            store tag,   offset(allocptr)
728            store elem1, offset+4(allocptr)
729            store elem2, offset+8(allocptr)
730            ...
731            store elemn, offset+4n(allocptr)
732    
733       and the address computation part:
734    
735            celladdr <- offset+4+allocptr
736    
737       and move the address computation part
738    
739       Effect:  register pressure is generally lower as a result.  This
740                makes compilation of certain expressions much faster, such as
741                long lists with non-trivial elements.
742    
743                 [(0,0), (0,0), .... (0,0)]
744    
745    3. MLRiscGen: base pointer elimination.
746    
747        As part of the linkage mechanism, we generate the sequence:
748    
749         L:  ...  <- start of the code fragment
750    
751         L1:
752             base pointer <- linkreg - L1 + L
753    
754         The base pointer was then used for computing relocatable addresses
755       in the code fragment.  Frequently (such as in lots of continuations)
756       this is not needed.  We now eliminate this sequence whenever possible.
757    
758         For compile time efficiency, I'm using a very stupid local heuristic.
759       But in general, this should be done as a control flow analysis.
760    
761       Effect:  Smaller code size.  Speed up of most programs.
762    
763    4. Hppa back end
764    
765         Long jumps in span dependence resolution used to depend on the existence
766      of the base pointer.
767    
768         A jump to a long label L was expanded into the following sequence:
769    
770          LDIL %hi(L-8192), %r29
771          LDO  %lo(L-8192)(%r29), %r29
772          ADD  %r29, baseptr, %r29
773          BV,n %r0(%r29)
774    
775         In the presence of change (3) above, this will not work.  I've changed
776       it so that the following sequence of instructions are generated, which
777       doesn't mention the base pointer at all:
778    
779             BL,n  L', %r29           /* branch and link, L' + 4 -> %r29 */
780        L':  ADDIL L-(L'+4), %r29     /* Compute address of L */
781             BV,n  %r0(%r29)          /* Jump */
782    
783    5. Alpha back end
784    
785          New alpha instructions LDB/LDW have been added, as per Fermin's
786       suggestions.   This is unrelated to all other changes.
787    
788    6. X86 back end
789    
790         I've changed andl to testl in the floating point test sequence
791         whenever appropriate.  The Intel optimization guide states that
792         testl is preferable to andl.
793    
794    7. RA (x86 only)
795    
796         I've improved the spill propagation algorithm, using an approximation
797       of maximal weighted independent sets.   This seems to be necessary to
798       alleviate the negative effect in light of the slow down in (1).
799    
800         I'll write down the algorithm one of these days.
801    
802    8. MLRiscGen: frequencies
803    
804         I've added an annotation that states that all call gc blocks have zero
805       execution frequencies.  This improves register allocation on the x86.
806    
807    BENCHMARKS
808    ==========
809    
810       I've only performed the comparison against 110.25.
811    
812       The platforms are:
813    
814        HPPA  A four processor HP machine (E9000) with 5G of memory.
815        X86   A 300MHz Pentium II with 128M of memory, and
816        SPARC An Ultra sparc 2 with 512M of memory.
817    
818       I used the following parameters for the SML benchmarks:
819    
820                 @SMLalloc
821         HPPA    256k
822         SPARC   512k
823         X86     256k
824    
825    COMPILATION TIME
826    ----------------
827       Here are the numbers comparing the compilation times of the compilers.
828       I've only compared 110.25 compiling the new sources versus
829       a fixpoint version of the new compiler compiling the same.
830    
831                     110.25                                  New
832               Total  Time in RA  Spill+Reload   Total  Time In RA Spill+Reload
833         HPPA   627s    116s        2684+3584     599s    95s       1003+1879
834         SPARC  892s    173s        2891+3870     708s    116s      1004+1880
835         X86    999s    315s       94006+130691   987s    296s    108877+141957
836    
837                   110.25         New
838                Code Size      Code Size
839         HPPA   8596736         8561421
840         SPARC  8974299         8785143
841         X86    9029180         8716783
842    
843       So in summary, things are at least as good as before.   Dramatic
844       reduction in compilation time is obtained on the Sparc; I can't explain it,
845       but it is reproducible.  Perhaps someone should try to reproduce this
846       on their own machines.
847    
848    SML BENCHMARKS
849    --------------
850    
851        On the average, all benchmarks perform at least as well as before.
852    
853          HPPA         Compilation Time     Spill+Reload      Run Time
854                     110.25  New            110.25    New   110.25  New
855    
856          barnesHut  3.158  3.015  4.75%    1+1       0+0   2.980  2.922   2.00%
857              boyer  6.152  5.708  7.77%    0+0       0+0   0.218  0.213   2.34%
858       count-graphs  1.168  1.120  4.32%    0+0       0+0  22.705 23.073  -1.60%
859                fft  0.877  0.792 10.74%    1+3       1+3   0.602  0.587   2.56%
860        knuthBendix  3.180  2.857 11.32%    0+0       0+0   0.675  0.662   2.02%
861             lexgen  6.190  5.290 17.01%    0+0       0+0   0.913  0.788  15.86%
862               life  0.803  0.703 14.22%   25+25      0+0   0.153  0.140   9.52%
863              logic  2.048  2.007  2.08%    6+6       1+1   4.133  4.008   3.12%
864         mandelbrot  0.077  0.080 -4.17%    0+0       0+0   0.765  0.712   7.49%
865             mlyacc 22.932 20.937  9.53%  154+181    32+57  0.468  0.430   8.91%
866            nucleic  5.183  5.060  2.44%    2+2       0+0   0.125  0.120   4.17%
867      ratio-regions  3.357  3.142  6.84%    0+0       0+0  116.225 113.173 2.70%
868                ray  1.283  1.290 -0.52%    0+0       0+0   2.887  2.855   1.11%
869             simple  6.307  6.032  4.56%   28+30      5+7   3.705  3.658   1.28%
870                tsp  0.888  0.862  3.09%    0+0       0+0   7.040  6.893   2.13%
871               vliw 24.378 23.455  3.94%  106+127    25+45  2.758  2.707   1.91%
872      --------------------------------------------------------------------------
873       Average                     6.12%                                   4.09%
874    
875          SPARC        Compilation Time     Spill+Reload      Run Time
876                     110.25  New            110.25    New   110.25  New
877    
878          barnesHut  3.778  3.592  5.20%    2+2       0+0   3.648  3.453    5.65%
879              boyer  6.632  6.110  8.54%    0+0       0+0   0.258  0.242    6.90%
880       count-graphs  1.435  1.325  8.30%    0+0       0+0  33.672 34.737   -3.07%
881                fft  0.980  0.940  4.26%    3+9       2+6   0.838  0.827    1.41%
882        knuthBendix  3.590  3.138 14.39%    0+0       0+0   0.962  0.967   -0.52%
883             lexgen  6.593  6.072  8.59%    1+1       0+0   1.077  1.078   -0.15%
884               life  0.972  0.868 11.90%   26+26      0+0   0.143  0.140    2.38%
885              logic  2.525  2.387  5.80%    7+7       1+1   5.625  5.158    9.05%
886         mandelbrot  0.090  0.093 -3.57%    0+0       0+0   0.855  0.728   17.39%
887             mlyacc 26.732 23.827 12.19%  162+189    32+57  0.550  0.560   -1.79%
888            nucleic  6.233  6.197  0.59%    3+3       0+0   0.163  0.173   -5.77%
889      ratio-regions  3.780  3.507  7.79%    0+0       0+0 133.993 131.035   2.26%
890                ray  1.595  1.550  2.90%    1+1       0+0   3.440  3.418    0.63%
891             simple  6.972  6.487  7.48%   29+32      5+7   3.523  3.525   -0.05%
892                tsp  1.115  1.063  4.86%    0+0       0+0   7.393  7.265    1.77%
893               vliw 27.765 24.818 11.87%  110+135    25+45  2.265  2.135    6.09%
894      ----------------------------------------------------------------------------
895       Average                     6.94%                                    2.64%
896    
897          X86          Compilation Time     Spill+Reload      Run Time
898                     110.25  New            110.25    New   110.25  New
899    
900          barnesHut  5.530  5.420  2.03%  593+893   597+915   3.532  3.440   2.66%
901              boyer  8.768  7.747 13.19%  493+199   301+289   0.327  0.297  10.11%
902       count-graphs  2.040  2.010  1.49%  298+394   315+457  26.578 28.660  -7.26%
903                fft  1.327  1.302  1.92%  112+209   115+210   1.055  0.962   9.71%
904        knuthBendix  5.218  5.475 -4.69%  451+598   510+650   0.928  0.932  -0.36%
905             lexgen  9.970  9.623  3.60% 1014+841  1157+885   0.947  0.928   1.97%
906               life  1.183  1.183  0.00%  162+182   145+148   0.127  0.103  22.58%
907              logic  3.285  3.512 -6.45%  514+684   591+836   5.682  5.577   1.88%
908         mandelbrot  0.147  0.143  2.33%   38+41     33+54    0.703  0.690   1.93%
909             mlyacc 35.457 32.763  8.22% 3496+4564 3611+4860  0.552  0.550   0.30%
910            nucleic  7.100  6.888  3.07%  239+168   201+158   0.175  0.173   0.96%
911      ratio-regions  6.388  6.843 -6.65% 1182+257   981+300  120.142 120.345 -0.17%
912                ray  2.332  2.338 -0.29%  346+398   402+494   3.593  3.540   1.51%
913             simple  9.912  9.903  0.08% 1475+941  1579+1168  3.057  3.178  -3.83%
914                tsp  1.623  1.532  5.98%  266+200   250+211   8.045  7.878   2.12%
915               vliw 33.947 35.470 -4.29% 2629+2774 2877+3171  2.072  1.890   9.61%
916      ----------------------------------------------------------------------------
917       Average                     1.22%                                     3.36%
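
The percentage columns in the three tables appear to be computed relative to
the new time, that is (old / new - 1) * 100. This is an inference from the
printed numbers, not something the notes state, and a few rows differ in the
last digit, presumably because the printed times are themselves rounded. A
minimal Python sketch (the function name is hypothetical):

```python
def improvement_percent(old_time, new_time):
    """Percentage by which old_time exceeds new_time.

    Positive values mean the new compiler is faster.
    """
    return (old_time / new_time - 1.0) * 100.0


# HPPA compilation times for lexgen and life, taken from the table above.
print(f"lexgen: {improvement_percent(6.190, 5.290):.2f}%")
print(f"life:   {improvement_percent(0.803, 0.703):.2f}%")
```

Under this reading a negative entry, such as mandelbrot's compilation time on
HPPA, means the new compiler was slower on that benchmark.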
918    
919    ----------------------------------------------------------------------
920    Name: Allen Leung
921    Date: 2000/03/23 16:25:00
922    Tag: leunga-20000323-fix_x86_alpha
923    Description:
924    
925    1. X86 fixes/changes
926    
927       a.  The old code generated for SETcc was completely wrong.
928           The Intel optimization guide is VERY misleading.
929    
930    2. ALPHA fixes/changes
931    
932       a.  Added the instructions LDBU, LDWU, STB, STW as per Fermin's suggestion.
933       b.  Added a new mode byteWordLoadStores to the functor parameter to Alpha()
934       c.  Added reassociation code for address computation.
935    
936    ----------------------------------------------------------------------
937    Name: Allen Leung
938    Date: 2000/03/22 01:23:00
939    Tag: leunga-20000322-fix_x86_hppa_ra
940    Description:
941    
942    1. X86 fixes/changes
943    
944       a.  x86Rewrite bug with MUL3 (found by Lal)
945       b.  Added the instructions FSTS, FSTL
946    
947    2. PA-RISC fixes/changes
948    
949       a.  B label should not be a delay slot candidate!  Why did this work?
950       b.  ADDT(32, REG(32, r), LI n) now generates one instruction instead of two,
951           as it should be.
952       c.  The assembly syntax for fstds and fstdd was wrong.
953       d.  Added the composite instruction COMICLR/LDO, which is the immediate
954           operand variant of COMCLR/LDO.
955    
956    3. Generic MLRISC
957    
958       a.  shuffle.sml rewritten to be slightly more efficient
959       b.  DIV bug in mltree-simplify fixed (found by Fermin)
960    
961    4. Register Allocator
962    
963       a.  I now release the interference graph earlier during spilling.
964           May improve memory usage.
965    
966    ----------------------------------------------------------------------
967    Name: Matthias Blume
968    Date: 2000/03/14 14:15:32
969    Tag: blume_main_v110p26p1_2
970    Description:
971    
972    1. Tools.registerStdShellCmdTool (from smlnj/cm/tool.cm) takes an
973    additional argument called "template" which is an optional string that
974    specifies the layout of the tool command line.  See the CM manual for
975    explanation.
976    
977    2. A special-purpose tool can be "registered" by simply dropping the
978    corresponding <...>-tool.cm (and/or <...>-ext.cm) into the same
979    directory where the .cm file lives that uses this tool.  (The
980    behavior/misfeature until now was to look for the tool description
981    files in the current working directory.)  As before, tool description
982    files could also be anchored -- in which case they can live anywhere
983    they like.  Following the recent e-mail discussion, this change should
984    make it easier to have special-purpose tools that are shipped together
985    with the sources of the program that uses them.
986    
987    ----------------------------------------------------------------------
988    Name: Matthias Blume
989    Date: 2000/03/10 07:48:34
990    Tag: blume_main_v110p26p1_1
991    Description:
992    
993    I added a re-written version of Dave's fixpt script to src/system.
994    Changes relative to the original version:
995      - sh-ified (not everybody has ksh)
996      - automatically figures out which architecture it runs on
997      - uses ./makeml a bit more cleverly
998      - never invokes ./installml (and, thus, does not clobber your
999        good and working installation of sml in case something goes wrong)
1000      - accepts max iteration count using option "-iter <n>"
1001      - accepts a "base" name using option "-base <base>"
1002    
1003    It does not build any extraneous heap images but directly rebuilds
1004    bin- and boot-hierarchies using makeml's "-rebuild" switch. Finally,
1005    it can incorporate existing bin- and boot- hierarchies.  For example,
1006    suppose the base is set to "sml" (which is the default).  Then it
1007    successively builds
1008    
1009            sml.bin.<arch>-unix and sml.boot.<arch>-unix
1010    then    sml1.bin.<arch>-unix and sml1.boot.<arch>-unix
1011    then    sml2.bin.<arch>-unix and sml2.boot.<arch>-unix
1012    ...
1013    then    sml<n>.bin.<arch>-unix and sml<n>.boot.<arch>-unix
1014    
1015    and so on.  If any of these already exist, it will just use what's
1016    there.  In particular, many people will have the initial set of bin
1017    and boot files around, so this saves time for at least one full
1018    rebuild.  Having sets of the form <base><k>.{bin,boot}.<arch>-unix for
1019    <k>=1,2,... is normally not a good idea when invoking fixpt.  However,
1020    they might be the result of an earlier partial run of fixpt (which
1021    perhaps got accidentally killed).  In this case, fixpt will quickly
1022    move through what exists before continuing where it left off earlier,
1023    and, thus, saves a lot of time.
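
The naming scheme and the reuse of existing hierarchies can be sketched as
follows. This is a Python illustration, not the sh script itself: the function
names are hypothetical, and treating an iteration as "present" only when both
its bin and boot directories exist is an assumption, since the notes do not say
how a partial set is handled. The fixpoint comparison itself is left out.

```python
def hierarchy_names(base, arch, iteration):
    """Return the (bin, boot) directory names for a given fixpt iteration.

    Iteration 0 uses the bare base name; later iterations append the count.
    """
    prefix = base if iteration == 0 else f"{base}{iteration}"
    return (f"{prefix}.bin.{arch}-unix", f"{prefix}.boot.{arch}-unix")


def first_missing_iteration(base, arch, existing, max_iter):
    """Find the first iteration whose hierarchies still have to be built.

    existing is a set of directory names already on disk. Iterations whose
    bin and boot hierarchies both exist are reused rather than rebuilt.
    Returns None if every iteration up to max_iter is already present.
    """
    for k in range(max_iter + 1):
        bin_dir, boot_dir = hierarchy_names(base, arch, k)
        if bin_dir not in existing or boot_dir not in existing:
            return k
    return None
```

For example, with base "sml" and sets for iterations 0 and 1 left over from an
interrupted run, first_missing_iteration returns 2, which is where the rebuild
would resume.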
1024    
1025    ----------------------------------------------------------------------
1026    Name: Allen Leung
1027    Date: 00/03/10 02:20:00
1028    Tag: leunga-20000310-fix_x86_asm_ra
1029    Description:
1030    
1031    More assembly output problems involving the indexed addressing mode
1032    on the x86 have been found and corrected. Thanks to Fermin Reig for the
1033    fix.
1034    
1035    The interface and implementation of the register allocator have been changed
1036    slightly to accommodate the possibility to skip the register allocation
1037    phases completely and go directly to memory allocation.  This is needed
1038    for C-- use.
1039    
1040    ----------------------------------------------------------------------
1041    Name: Matthias Blume
1042    Date: 00/03/09 10:23:53
1043    Tag: blume_main_v110p26p1_0
1044    Description:
1045    
1046    * Complete re-organization of library names.  Many libraries have been
1047    consolidated so that they share the same path anchor.  For example,
1048    all MLRISC-related libraries are anchored at MLRISC, most libraries that
1049    are SML/NJ-specific are under "smlnj".  Notice that names like
1050    host-cmb.cm or host-compiler.cm no longer exist.  See system/README
1051    for a complete description of the new naming scheme.  Quick reference:
1052    
1053       host-cmb.cm        -> smlnj/cmb.cm
1054       host-compiler.cm   -> smlnj/compiler.cm
1055       full-cm.cm         -> smlnj/cm.cm
1056       <arch>-<os>.cm     -> smlnj/cmb/<arch>-<os>.cm
1057       <arch>-compiler.cm -> smlnj/compiler/<arch>.cm
1058    
1059    * Bug fixes in CM.
1060        - exceptions in user code are being passed through (i.e., reach top level)
1061        - more bugs in paranoia mode fixed
1062        - bug related to checking group owners fixed
1063    
1064    * New install.sh script that automagically fetches archive files:
1065      The new file config/srcarchiveurl must contain the URL of the
1066      (remote) directory that contains bin files (or other source archives).
1067      If install.sh does not find the archive locally, it tries to get
1068      it from that remote directory.
1069      This should simplify installation further:  For machines that have
1070      access to the internet, just fetch <version>-config.tgz, unpack it,
1071      edit config/targets, and go (run config/install.sh).  The script will
1072      fetch everything else that it might need all by itself.
1073    
1074      For CVS users, this mechanism is not relevant for source archives, but
1075      it is convenient for getting new sets of binfiles.
1076    
1077      Archives should be tar files compressed with either gzip, compress, or
1078      bzip2.  The script recognizes .tgz, .tar, .tar.gz, .tz, .tar.Z, and .tar.bz2.
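
The lookup order described for install.sh (local archive first, then the
remote directory) can be sketched as below. This is a Python illustration
under stated assumptions, not the script's actual logic: the function name, the
use of sets in place of real file and network checks, the order in which
suffixes are tried, and the example URL are all hypothetical.

```python
# Suffixes taken from the list above, each written with a leading dot.
SUFFIXES = [".tgz", ".tar", ".tar.gz", ".tz", ".tar.Z", ".tar.bz2"]


def locate_archive(name, local_files, remote_files, base_url):
    """Decide where an archive called name would come from.

    local_files and remote_files are sets of file names standing in for the
    local directory and the remote directory named in config/srcarchiveurl.
    Returns ("local", filename), ("remote", url), or None if nothing matches.
    """
    # Prefer an archive that is already present locally.
    for suffix in SUFFIXES:
        candidate = name + suffix
        if candidate in local_files:
            return ("local", candidate)
    # Otherwise fall back to the remote directory.
    for suffix in SUFFIXES:
        candidate = name + suffix
        if candidate in remote_files:
            return ("remote", f"{base_url.rstrip('/')}/{candidate}")
    return None
```

A call such as locate_archive("runtime", set(), {"runtime.tgz"},
"http://example.org/dist/") would report the archive as remote, at
http://example.org/dist/runtime.tgz.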
1079    
1080    ----------------------------------------------------------------------
1081    Name: Matthias Blume
1082    Date: 2000/03/07 04:01:04
1083    Tag: blume_main_v110_26_2
1084    Description:
1085    - size info in BOOTLIST
1086         * no fixed upper limits for number of bootfiles or length of
1087           bootfile names in runtime
1088         * falling back to old behavior if no BOOTLIST size info found
1089    - allocation size heuristics in .run-sml
1090         * tries to read cache size from /proc/cpuinfo (this is important for
1091            small-cache Celeron systems!)
1092    - install.sh robustified
1093    - CM manual updates
1094    - paranoid mode
1095         * no more CMB.deliver() (i.e., all done by CMB.make())
1096         * can re-use existing sml.boot.* files
1097         * init.cmi now treated as library
1098         * library stamps for consistency checks
1099    - sml.boot.<arch>-<os>/PIDMAP file
1100         * This file is read by the CM startup code.  This is used to minimize
1101           the amount of dynamic state that needs to be stowed away for the
1102           purpose of sharing between interactive system and user code.
1103    - CM.Anchor.anchor instead of CM.Anchor.{set,cancel}
1104         * Upon request by Elsa.  Anchors now controlled by get-set-pair
1105           like most other CM state variables.
1106    - Compiler.CMSA eliminated
1107         * No longer supported by CM anyway.
1108    - fixed bugs in pickler that kept biting Stefan
1109         * past refs to past refs (was caused by the possibility that
1110           ad-hoc sharing is more discriminating than hash-cons sharing)
1111         * integer overflow on LargeInt.minInt
1112    - ml-{lex,yacc} build scripts now use new mechanism
1113      for building standalone programs
1114    - fixed several gcc -Wall warnings that were caused by missing header
1115      files, missing initializations, etc., in runtime (not all warnings
1116      eliminated, though)
