revision 1116, Tue Mar 5 23:17:18 2002 UTC revision 1137, Tue Mar 12 22:28:55 2002 UTC
# Line 11  Line 11 
11  Date: yyyy/mm/dd  Date: yyyy/mm/dd
12  Tag: <post-commit CVS tag>  Tag: <post-commit CVS tag>
13  Description:  Description:
14    ----------------------------------------------------------------------
15    Name: Matthias Blume
16    Date: 2002/03/12 17:30:00 EST
17    Tag: blume-20020312-version110392
18    Description:
20    Yet another version number bump (because of small changes to the
21    binfile format).  Version number is now 110.39.2.  NEW BOOTFILES!
23    Changes:
25      The new pid generation scheme described a few weeks ago was overly
26      complicated.  I implemented a new mechanism that is simpler and
27      provides a bit more "stability":  Once CM has seen a compilation
28      unit, it keeps its identity constant (as long as you do not delete
29      those crucial CM/GUID/* files).  This means that when you change
30      an interface, compile, then go back to the old interface, and
31      compile again, you arrive at the original pid.
33      There now also is a mechanism that instructs CM to use the plain
34      environment hash as a module's pid (effectively making its GUID
35      the empty string).  For this, "noguid" must be specified as an
36      option to the .sml file in question within its .cm file.
37      This is most useful for code that is being generated by tools such
38      as ml-nlffigen (because during development programmers tend to
39      erase the tool's entire output directory tree including CM's cached
40      GUIDs).  "noguid" is somewhat dangerous (since it can be used to locally
41      revert to the old, broken behavior of SML/NJ), but in specific cases
42      where there is no danger of interface confusion, its use is ok
43      (I think).
45      ml-nlffigen by default generates "noguid" annotations.  They can be
46      turned off by specifying -guid on its command line.
48    ----------------------------------------------------------------------
49    Name: Lal George
50    Date: 2002/03/12 14:42:36 EST
51    Tag: george-20020312-frequency-computation
52    Description:
54    Integrated jump chaining and static block frequency into the
55    compiler. More details and numbers later.
57    ----------------------------------------------------------------------
58    Name: Lal George
59    Date: 2002/03/11 22:38:53 EST
60    Tag: george-20020311-jump-chain-elim
61    Description:
63    Tested the jump chain elimination on all architectures (except the
64    hppa).  This is on by default right now and is profitable for the
65    alpha and x86; however, it may not be profitable for the sparc and ppc
66    when compiling the compiler.
68    The gc test will typically jump to a label at the end of the cluster,
69    where there is another jump to an external cluster containing the actual
70    code to invoke gc. This is to allow factoring of common gc invocation
71    sequences. That is to say, we generate:
73            f:
74               testgc
75               ja   L1      % jump if above to L1
77            L1:
78               jmp L2
81    After jump chain elimination the 'ja L1' instruction is converted to
82    'ja L2'. On the sparc and ppc, many of the 'ja L2' instructions may end
83    up being implemented in their long form (if L2 is far away) using:
85            jbe     L3      % jump if below or equal to L3
86            jmp     L2
87         L3:
88            ...
91    For large compilation units L2  may be far away.
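
The rewrite described above can be sketched as follows.  This is only an
illustration and not the MLRISC implementation: the mini-IR (a dictionary
from labels to lists of (opcode, target) pairs) and the names BRANCH_OPS,
final_target and eliminate_jump_chains are all made up for this example.

```python
# Hypothetical mini-IR: each block is a list of (opcode, target) pairs.
# None of these names come from MLRISC; they exist only for illustration.
BRANCH_OPS = {"jmp", "ja", "jbe"}  # opcodes whose operand is a label


def final_target(label, trampolines):
    """Follow blocks that consist of a single 'jmp' until a real block is hit."""
    seen = set()
    while label in trampolines and label not in seen:
        seen.add(label)  # guards against a cycle of jumps
        label = trampolines[label]
    return label


def eliminate_jump_chains(blocks):
    """Retarget every branch that points at a jump-only block."""
    # A "trampoline" is a block whose only instruction is an unconditional jump.
    trampolines = {
        label: instrs[0][1]
        for label, instrs in blocks.items()
        if len(instrs) == 1 and instrs[0][0] == "jmp"
    }
    return {
        label: [
            (op, final_target(arg, trampolines)) if op in BRANCH_OPS else (op, arg)
            for op, arg in instrs
        ]
        for label, instrs in blocks.items()
    }


# The gc-test example from the entry above: 'ja L1' becomes 'ja L2'.
blocks = {
    "f": [("testgc", None), ("ja", "L1")],
    "L1": [("jmp", "L2")],
}
rewritten = eliminate_jump_chains(blocks)
```

Note that the sketch says nothing about branch encoding.  Whether the
retargeted 'ja L2' needs the long form on the sparc and ppc depends on the
distance to L2, which is only known at a later stage.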
94    ----------------------------------------------------------------------
95    Name: Matthias Blume
96    Date: 2002/03/11 13:30:00 EST
97    Tag: blume-20020311-mltreeeval
98    Description:
100    A functor parameter was missing.
102    ----------------------------------------------------------------------
103    Name: Allen Leung
104    Date: 2002/03/11 10:30:00 EST
105    Tag: leunga-20020310-runtime-string0
106    Description:
108       The representation of the empty string now points to a
109    legal null-terminated C string instead of unit.  It is now possible
110    to convert an ML string into a C string with InlineT.CharVector.getData.
111    This compiles into one single machine instruction.
113    ----------------------------------------------------------------------
114    Name: Allen Leung
115    Date: 2002/03/10 23:55:00 EST
116    Tag: leunga-20020310-x86-call
117    Description:
119       Added machine code generation for the CALL instruction (relative displacement mode).
121    ----------------------------------------------------------------------
122    Name: Matthias Blume
123    Date: 2002/03/08 16:05:00
124    Tag: blume-20020308-entrypoints
125    Description:
127    Version number bumped to 110.39.1.  NEW BOOTFILES!
129    Entrypoints: a (possibly non-zero) offset into a code object where execution should begin.
131    - Added the notion of an entrypoint to CodeObj.
132    - Added reading/writing of entrypoint info to Binfile.
133    - Made runtime system bootloader aware of entrypoints.
134    - Use the address of the label of the first function given to mlriscGen
135      as the entrypoint.  This address is currently always 0, but it will
136      not be 0 once we turn on block placement.
137    - Removed the linkage cluster code (which was The Other Way(tm) of dealing
138      with entry points) from mlriscGen.
140    ----------------------------------------------------------------------
141    Name: Allen Leung
142    Date: 2002/03/07 20:45:00 EST
143    Tag: leunga-20020307-x86-cmov
144    Description:
146       Bug fixes for CMOVcc on x86.
148       1. Added machine code generation for CMOVcc
149       2. CMOVcc is now generated in preference to SETcc on PentiumPro or above.
150       3. CMOVcc cannot have an immediate operand as argument.
152    ----------------------------------------------------------------------
153    Name: Matthias Blume
154    Date: 2002/03/07 16:15:00 EST
155    Tag: blume-20020307-controls
156    Description:
158    This is a very large but mostly boring patch which makes (almost)
159    every tuneable compiler knob (i.e., pretty much everything under
160    Control.* plus a few other things) configurable via both the command
161    line and environment variables in the style CM did its configuration
162    until now.
164    Try starting sml with '-h' (or, if you are brave, '-H')
166    To this end, I added a structure Controls : CONTROLS to smlnj-lib.cm which
167    implements the underlying generic mechanism.
169    The interface to some of the existing such facilities has changed somewhat.
170    For example, the MLRiscControl module now provides mkFoo instead of getFoo.
171    (The getFoo interface is still there for backward-compatibility, but its
172    use is deprecated.)
174    The ml-build script passes -Cxxx=yyy command-line arguments through so
175    that one can now twiddle the compiler settings when using this "batch"
176    compiler.
178    TODO items:
180    We should go through and throw out all controls that are no longer
181    connected to anything.  Moreover, we should go through and provide
182    meaningful (and correct!) documentation strings for those controls
183    that still are connected.
185    Currently, multiple calls to Controls.new are accepted (only the first
186    has any effect).  Eventually we should make sure that every control
187    is being made (via Controls.new) exactly once.  Future access can then
188    be done using Controls.acc.
190    Finally, it would probably be a good idea to use the getter-setter
191    interface to controls rather than ref cells.  For the time being, both
192    styles are provided by the Controls module, but getter-setter pairs are
193    better if thread-safety is of any concern because they can be wrapped.
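
To illustrate why getter-setter pairs "can be wrapped", here is a minimal
sketch.  It is written in Python rather than SML, and it does not reflect
the actual Controls interface: make_control and synchronized are
hypothetical names invented for this example.

```python
import threading


def make_control(initial):
    """Hypothetical control: a value reachable only through a get/set pair."""
    cell = [initial]  # the underlying storage is hidden from clients

    def get():
        return cell[0]

    def set_(value):
        cell[0] = value

    return get, set_


def synchronized(get, set_):
    """Wrap an existing get/set pair so that every access takes a lock."""
    lock = threading.Lock()

    def safe_get():
        with lock:
            return get()

    def safe_set(value):
        with lock:
            set_(value)

    return safe_get, safe_set


# Clients that only ever see the pair cannot tell the two versions apart.
get_flag, set_flag = synchronized(*make_control(False))
set_flag(True)
```

A bare ref cell offers no such hook: every client reads and writes the
storage directly, so there is no place to insert the lock after the fact.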
195    *****************************************
197    One bug fix: The function blockPlacement in three of the MLRISC
198    backpatch files used to be hard-wired to one of two possibilities at
199    link time (according to the value of the placementFlag).  But (I
200    think) it should rather sense the flag every time.
202    *****************************************
204    Other assorted changes (by other people who did not supply a HISTORY entry):
206    1. the cross-module inliner now works much better (Monnier)
207    2. representation of weights, frequencies, and probabilities in MLRISC
208       changed in preparation for using those for weighted block placement
209       (Reppy, George)
211    ----------------------------------------------------------------------
212    Name: Lal George
213    Date: 2002/03/07 14:44:24 EST
214    Tag: george-20020307-weighted-block-placement
216    Tested the weighted block placement optimization on all architectures
217    (except the hppa) using AMPL to generate the block and edge frequencies.
218    Changes were required in the machine properties to correctly
219    categorize trap instructions. There is an MLRISC flag
220    "weighted-block-placement" that can be used to enable weighted block
221    placement, but this will be ineffective without block/edge
222    frequencies (coming soon).
225  ----------------------------------------------------------------------  ----------------------------------------------------------------------
226  Name: Lal George  Name: Lal George
227  Date: 2002/03/05 17:24:48 EST  Date: 2002/03/05 17:24:48 EST
228  Tag: george-20020305-linkage-cluster  Tag: george-20020305-linkage-cluster
230  In order to support the block placement optimization, the first  In order to support the block placement optimization, a new cluster
231  cluster that is generated (called the linkage cluster) contains a jump  is generated as the very first cluster (called the linkage cluster).
232  to the entry point for the compilation unit. The linkage cluster  It contains a single jump to the 'real' entry point for the compilation
233  contains only one function, so block placement will have no effect on  unit. Block placement has no effect on the linkage cluster itself, but
234  the linkage cluster itself, but all the other clusters have full  all the other clusters  have full freedom in the manner in which they
235  freedom in the manner in which they reorder blocks or functions.  reorder blocks or functions.
237  On the x86 the typical linkage code that is generated is:  On the x86 the typical linkage code that is generated is:
238     ----------------------     ----------------------
239          .align 2          .align 2
240     L0:     L0:
241          addl    $L1-L0, 72(%esp)          addl    $L1-L0, 72(%esp)
242          jmp     L0          jmp     L1
245          .align  2          .align  2
# Line 38  Line 248 
249  72(%esp) is the memory location for the stdlink register. This  72(%esp) is the memory location for the stdlink register. This
250  must contain the address of the CPS function being called. In the  must contain the address of the CPS function being called. In the
251  above example, it contains the address of memory for  L0; before  above example, it contains the address of  L0; before
252  calling L1 (the real entry point for the compilation unit), it  calling L1 (the real entry point for the compilation unit), it
253  must contain the address for L1, and hence  must contain the address for L1, and hence
255          addl $L1-L0, 72(%esp)          addl $L1-L0, 72(%esp)
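
The arithmetic behind this fix-up can be checked with a small sketch.  The
base addresses and the 16-byte distance between L0 and L1 are made-up
numbers, and fix_up_stdlink is a hypothetical helper, not part of the
compiler.

```python
def fix_up_stdlink(stdlink, l0_offset, l1_offset):
    """Model of 'addl $L1-L0, 72(%esp)': add the label difference to stdlink."""
    return stdlink + (l1_offset - l0_offset)


# Only the difference L1-L0 is encoded in the instruction, so the result is
# correct wherever the code object happens to be loaded.
results = []
for base in (0x1000, 0x7F3A0000):  # hypothetical load addresses
    addr_l0 = base + 0             # stdlink holds this on entry
    addr_l1 = base + 16            # hypothetical distance from L0 to L1
    results.append(fix_up_stdlink(addr_l0, 0, 16) == addr_l1)
```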
257  I have tested this on all architectures except the hppa.  I have tested this on all architectures except the hppa. The increase
258    in code size is of course negligible.
260  ----------------------------------------------------------------------  ----------------------------------------------------------------------
261  Name: Allen Leung  Name: Allen Leung

