
[smlnj] Diff of /sml/trunk/READMES/110.27-README

revision 616, Wed Apr 12 15:17:05 2000 UTC revision 617, Tue Apr 18 21:53:41 2000 UTC
# Line 31  Line 31 
31    
32    for further information about these changes.    for further information about these changes.
33    
34    MLRISC, and particularly the x86 back end have been modiefied extensively.    MLRISC, and particularly the x86 back end have been modified extensively.
35    
36    There are a few updates to the SML/NJ Library    There are a few updates to the SML/NJ Library
37    
# Line 77  Line 77 
77  MLRISC:  MLRISC:
78  ======================================================================  ======================================================================
79    
80  Name: Allen Leung  1. Register Allocator
81  Date: 2000/03/10 02:20:00  
82  Tag: leunga-20000310-fix_x86_asm_ra     a. The interface and implementation of the register allocator have been
83  Description:        changed slightly to accommodate the possibility of skipping
84          the register allocation phases completely and going directly to
85          memory allocation.  This is needed for C-- use.
86    
87       b. I've improved the spill propagation algorithm, using an approximation
88          of maximal weighted independent sets.     This affects only the x86
89          platform.
90    
91    2. MLTREE
92    
93       a. Renamed the constructor CALL in MLTREE by popular demand.
94    
95  More assembly output problems involving the indexed addressing mode  3. X86
96    
97       a. More assembly output problems involving the indexed addressing mode
98  on the x86 have been found and corrected. Thanks to Fermin Reig for the  on the x86 have been found and corrected. Thanks to Fermin Reig for the
99  fix.  fix.
100    
101  The interface and implementation of the register allocator have been changed     b.  x86Rewrite bug with MUL3 (found by Lal)
 slightly to accommodate the possibility to skip the register allocation  
 phases completely and go directly to memory allocation.  This is needed  
 for C-- use.  
102    
103  ----------------------------------------------------------------------     c.  Added the instructions FSTS, FSTL
 Name: Allen Leung  
 Date: 2000/03/22 01:23:00  
 Tag: leunga-20000322-fix_x86_hppa_ra  
 Description:  
104    
105  1. X86 fixes/changes     d.  The old code generated for SETcc was completely wrong.
106           The Intel optimization guide is VERY misleading.
107    
108     a.  x86Rewrite bug with MUL3 (found by Lal)     e.  Various fixes related to floating point, and extensions.
109     b.  Added the instructions FSTS, FSTL  
110       f.  Things like
111    
112           jmp %eax
113           jmp (%eax)
114    
115           are now output as
116    
117           jmp *%eax
118           jmp *(%eax)
119    
120  2. PA-RISC fixes/changes      g.  Yet another fix for x86 assembly for idivl, imull, mull and friends.
121    
122        h.  I've changed andl to testl in the floating point test sequence
123            whenever appropriate.  The Intel optimization guide states that
124            testl is preferable to andl.
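
Item 3f above concerns AT&T assembler syntax, in which the operand of an indirect jump is prefixed with `*`. The following is a minimal sketch of that output rule only; the function name `format_indirect_jump` is hypothetical and is not taken from the MLRISC sources, and the operand is assumed to be already rendered as text.

```python
def format_indirect_jump(operand: str) -> str:
    """Render an indirect x86 jmp in AT&T syntax.

    Assumption: `operand` is an already-formatted register or memory
    operand such as "%eax" or "(%eax)". This is an illustration of the
    syntax rule, not the MLRISC assembly emitter.
    """
    # AT&T syntax marks indirect jump targets with a leading '*'.
    if operand.startswith("*"):
        return f"jmp {operand}"
    return f"jmp *{operand}"


if __name__ == "__main__":
    for op in ("%eax", "(%eax)"):
        print(format_indirect_jump(op))
```
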
125    
126    4. Alpha
127    
128        a. Some extra patterns related to loads with signed/zero extension
129           provided by Fermin.
130        b.  Added the instructions LDBU, LDWU, STB, STW as per Fermin's suggestion.
131        c.  Added a new mode byteWordLoadStores to the functor parameter to Alpha()
132        d.  Added reassociation code for address computation.
133    
134    5. PA-RISC
135    
136     a.  B label should not be a delay slot candidate!  Why did this work?     a.  B label should not be a delay slot candidate!  Why did this work?
137     b.  ADDT(32, REG(32, r), LI n) now generates one instruction instead of two,     b.  ADDT(32, REG(32, r), LI n) now generates one instruction instead of two,
# Line 110  Line 139 
139     c.  The assembly syntax for fstds and fstdd was wrong.     c.  The assembly syntax for fstds and fstdd was wrong.
140     d.  Added the composite instruction COMICLR/LDO, which is the immediate     d.  Added the composite instruction COMICLR/LDO, which is the immediate
141         operand variant of COMCLR/LDO.         operand variant of COMCLR/LDO.
142       e.  Long jumps in span dependence resolution used to depend on the existence
143           of the base pointer in the SML/NJ runtime.
144    
145            A jump to a long label L was expanded into the following sequence:
146    
147          LDIL %hi(L-8192), %r29
148          LDO  %lo(L-8192)(%r29), %r29
149          ADD  %r29, baseptr, %r29
150          BV,n %r0(%r29)
151    
152  3. Generic MLRISC           I've changed it so that the following sequence of instructions
153           is generated, which doesn't mention the base pointer at all:
154    
155             BL,n  L', %r29           /* branch and link, L' + 4 -> %r29 */
156        L':  ADDIL L-(L'+4), %r29     /* Compute address of L */
157             BV,n  %r0(%r29)          /* Jump */
158    
159    6. Generic MLRISC
160    
161     a.  shuffle.sml rewritten to be slightly more efficient     a.  shuffle.sml rewritten to be slightly more efficient
162     b.  DIV bug in mltree-simplify fixed (found by Fermin)     b.  DIV bug in mltree-simplify fixed (found by Fermin)
163    
164  4. Register Allocator  7. Assembly Output
165    
166     a.  I now release the interference graph earlier during spilling.      a.  When generating assembly, resolve the value of client defined constants,
167         May improve memory usage.          instead of generating symbolic values.  This is controlled by the
168            new flag "asm-resolve-constants", which defaults to true.
169    
170  ----------------------------------------------------------------------      b.  Added a new flag
 Name: Allen Leung  
 Date: 2000/03/23 16:25:00  
 Tag: leunga-20000323-fix_x86_alpha  
 Description:  
171    
172  1. X86 fixes/changes            "asm-indent-copies" (default to false)
173    
174     a.  The old code generated for SETcc was completely wrong.         When this flag is on, parallel copies will be indented an extra level.
        The Intel optimization guide is VERY misleading.  
175    
 2. ALPHA fixes/changes  
176    
177     a.  Added the instructions LDBU, LDWU, STB, STW as per Fermin's suggestion.  8. Machine Descriptions/Generation
    b.  Added a new mode byteWordLoadStores to the functor parameter to Alpha()  
    c.  Added reassociation code for address computation.  
178    
179  ----------------------------------------------------------------------      a. The precedence parser was slightly broken when parsing infixr symbols.
180  Name: Allen Leung      b. The type generalizing code had the bound variables reversed, resulting
181  Date: 2000/03/29 18:00:00         in a problem during arity raising.
182  Tag: leunga-20000327-mlriscGen_hppa_alpha_x86      c. Various fixes in machine descriptions.
 Boot files (optional): ftp://react-ilp.cs.nyu.edu/leunga/110.26.1-sml.boot.x86-unix-20000330.tar.gz  
 Description:  
183    
184     This update contains *MAJOR* changes to the way code is generated from CPS  ======================================================================
185  in the module mlriscGen, and in various backend modules.  CPS->MLRISC Code Generation
186    ======================================================================
187    
188  CHANGES     This release contains *MAJOR* changes to the way code is generated from CPS
189  =======     in the module mlriscGen, and in various backend modules.
190    
191  1. MLRiscGen: forward propagation fix.  1. Forward propagation fix.
192    
193     There was a bug in forward propagation introduced at about the same time     There was a bug in forward propagation introduced at about the same time
194     as the MLRISC x86 backend, which prohibits coalescing to be     as the MLRISC x86 backend, which prohibits coalescing to be
# Line 160  Line 197 
197     Effect: speed up of loops in RISC architectures.     Effect: speed up of loops in RISC architectures.
198             By itself, this actually slowed down certain benchmarks on the x86.             By itself, this actually slowed down certain benchmarks on the x86.
199    
200  2. MLRiscGen:  forward propagating addresses from consing.  2. Forward propagating addresses from consing.
201    
202     I've changed the way consing code is generated.  Basically I separated     I've changed the way consing code is generated.  Basically I separated
203     out the initialization part:     out the initialization part:
# Line 183  Line 220 
220    
221               [(0,0), (0,0), .... (0,0)]               [(0,0), (0,0), .... (0,0)]
222    
223  3. MLRiscGen: base pointer elimination.  3. Base pointer elimination.
224    
225      As part of the linkage mechanism, we generate the sequence:      As part of the linkage mechanism, we generate the sequence:
226    
# Line 201  Line 238 
238    
239     Effect:  Smaller code size.  Speed up of most programs.     Effect:  Smaller code size.  Speed up of most programs.
240    
 4. Hppa back end  
   
      Long jumps in span dependence resolution used to depend on the existence  
   of the base pointer.  
   
      A jump to a long label L was expanded into the following sequence:  
   
       LDIL %hi(L-8192), %r29  
       LDO  %lo(L-8192)(%r29), %r29  
       ADD  %r29, baseptr, %r29  
       BV,n %r0(%r29)  
241    
242       In the presence of change (3) above, this will not work.  I've changed  4. Frequency annotations
    it so that the following sequence of instructions are generated, which  
    doesn't mention the base pointer at all:  
   
          BL,n  L', %r29           /* branch and link, L' + 4 -> %r29 */  
     L':  ADDIL L-(L'+4), %r29     /* Compute address of L */  
          BV,n  %r0(%r29)          /* Jump */  
   
 5. Alpha back end  
   
       New alpha instructions LDB/LDW have been added, as per Fermin's  
    suggestions.   This is unrelated to all other changes.  
   
 6. X86 back end  
   
      I've changed andl to testl in the floating point test sequence  
      whenever appropriate.  The Intel optimization guide states that  
    testl is preferable to andl.  
   
 7. RA (x86 only)  
   
      I've improved the spill propagation algorithm, using an approximation  
    of maximal weighted independent sets.   This seems to be necessary to  
    alleviate the negative effect in light of the slow down in (1).  
   
      I'll write down the algorithm one of these days.  
   
 8. MLRiscGen: frequencies  
243    
244       I've added an annotation that states that all call gc blocks have zero       I've added an annotation that states that all call gc blocks have zero
245     execution frequencies.  This improves register allocation on the x86.     execution frequencies.  This improves register allocation on the x86.
# Line 357  Line 356 
356    ----------------------------------------------------------------------------    ----------------------------------------------------------------------------
357     Average                     1.22%                                     3.36%     Average                     1.22%                                     3.36%
358    
 ----------------------------------------------------------------------  
 Name: Allen Leung  
 Date: 2000/03/31 21:15:00 EST  
 Tag: leunga-20000331-aliasing  
 Description:  
359    
360    
361    Aliasing
362    ---------
363  This update contains a rewritten (and hopefully more correct) module  This update contains a rewritten (and hopefully more correct) module
364  for extracting aliasing information from CPS.  for extracting aliasing information from CPS.
365    
# Line 381  Line 378 
378     The default of n is 3.     The default of n is 3.
379    
380  ----------------------------------------------------------------------  ----------------------------------------------------------------------
 Name: Allen Leung  
 Date: 2000/04/02 21:17:00 EST  
 Tag: leunga-20000402-mltree  
 Description:  
   
    1. Renamed the constructor CALL in MLTREE by popular demand.  
    2. Added a bunch of files from my repository.  These are currently  
       used by other non-SMLNJ backends.  
   
 ----------------------------------------------------------------------  
 Name: Allen Leung  
 Date: 2000/04/04 03:18:00 EST  
 Tag: leunga-20000404-C--Moby  
 Description:  
   
     All of these fixes are related to C--, Moby, and my own optimization  
     stuff; so they shouldn't affect SML/NJ.  
   
 1.  X86  
   
     Various fixes related to floating point, and extensions.  
   
 2.  Alpha  
   
     Some extra patterns related to loads with signed/zero extension  
     provided by Fermin.  
   
 3.  Assembly  
   
     When generating assembly, resolve the value of client defined constants,  
     instead of generating symbolic values.  This is controlled by the  
     new flag "asm-resolve-constants", which defaults to true.  
   
 4.  Machine Descriptions  
   
     a. The precedence parser was slightly broken when parsing infixr symbols.  
     b. The type generalizing code had the bound variables reversed, resulting  
        in a problem during arity raising.  
     c. Various fixes in machine descriptions.  
   
 ----------------------------------------------------------------------  
 Name: Allen Leung  
 Date: 2000/04/04 19:39:00 EST  
 Tag: leunga-20000404-x86-asm  
 Description:  
   
 1.  Fixed a problem in X86 assembly.  
   
     Things like  
   
        jmp %eax  
        jmp (%eax)  
   
     should be output as  
   
        jmp *%eax  
        jmp *(%eax)  
   
 2.  Assembly output  
   
       Added a new flag  
   
           "asm-indent-copies" (default to false)  
   
       When this flag is on, parallel copies will be indented an extra level.  
   
 ----------------------------------------------------------------------  
 Name: Allen Leung  
 Date: 2000/04/06 00:36:00 EST  
 Tag: leunga-20000406-peephole-x86-SSA  
 Description:  
   
 1.  New Peephole code  
   
 2.  Minor improvement to X86 instruction selection  
   
 3.  Various fixes to SSA and machine description -> code translator  
   
 ----------------------------------------------------------------------  
 Name: Allen Leung  
 Date: 2000/04/09 19:09:00 EST  
 Tag: leunga-20000409-misc  
 Description:  
   
 1.  Yet another fix for x86 assembly for idivl, imull, mull and friends.  
   
 2.  Miscellaneous improvements to MLRISC (unused in sml/nj)  
   
381    
382  ======================================================================  ======================================================================
383  CM  CM
