Diff of /sml/trunk/HISTORY

revision 1098, Wed Feb 27 17:29:47 2002 UTC revision 1183, Fri Mar 29 19:09:48 2002 UTC
# Line 14  Line 14 
14    
15  ----------------------------------------------------------------------  ----------------------------------------------------------------------
16  Name: Matthias Blume  Name: Matthias Blume
17    Date: 2002/03/29 14:10:00 EST
18    Tag: blume-20020329-inlprims
19    Description:
20    
21    NEW BOOTFILES!!!    Version number bumped to 110.39.3.
22    
23    Primops have changed. This means that the bin/boot-file formats have
24    changed as well.
25    
26    To make sure that there is no confusion, I made a new version.
27    
28    
29    CHANGES:
30    
31    * removed REMT from mltree (remainder should never overflow).
32    
33    * added primops to deal with divisions of all flavors to the frontend
34    
35    * handled these primops all the way through so they map to their respective
36      MLRISC support
37    
38    * used these primops in the implementation of Int, Int32, Word, Word32
39    
40    * removed INLDIV, INLMOD, and INLREM as they are no longer necessary
41    
42    * parameterized INLMIN, INLMAX, and INLABS by a numkind
43    
44    * translate.sml now deals with all flavors of INL{MIN,MAX,ABS}, including
45      floating point
46    
47    * used INL{MIN,MAX,ABS} in the implementation of Int, Int32, Word, Word32,
48      and Real (but Real.abs maps to a separate floating-point-only primop)
49    
50    
51    TODO items:
52    
53    * Hacked Alpha32 instruction selection, disabling the selection of REMx
54      instructions because the machine instruction encoder cannot handle
55      them.  (Hppa, PPC, and Sparc instruction selection did not handle
56      REM in the first place, and REM is supported by the x86 machine coder.)
57    
58    * Handle DIV and MOD with DIV_TO_NEGINF directly in the x86 instruction
59      selection phase.  (The two can be streamlined because the hardware
60      delivers both quotient and remainder at the same time anyway.)
61    
62    * Think about what to do with "valOf(Int32.minInt) div ~1" and friends.
63      (Currently the behavior is inconsistent both across architectures and
64      wrt. the draft Basis spec.)
65    
66    * Word8 should eventually be handled natively, too.
67    
68    * There seems to be one serious bug in mltree-gen.sml.  It appears, though,
69      as if there currently is no execution path that could trigger it in
70      SML/NJ.  (The assumptions underlying functions arith and promotable do not
71      hold for things like multiplication and division.)
72    
73    ----------------------------------------------------------------------
74    Name: Matthias Blume
75    Date: 2002/03/27 16:27:00 EST
76    Tag: blume-20020327-mlrisc-divisions
77    Description:
78    
79    Added support for all four division operations (ML's div, mod, quot,
80    and rem) to MLRISC.  In the course of doing so, I also rationalized
81    the naming (no more annoying switch-around of DIV and QUOT), by
82    parameterizing the operation by div_rounding_mode (which can be either
83    DIV_TO_ZERO or DIV_TO_NEGINF).
84    
85    The generic MLTreeGen functor takes care of compiling all four
86    operations down to only round-to-zero div.
87    
88    Missing pieces:
89    
90      * Doing something smarter than relying on MLTreeGen on architectures
91        like, e.g., the x86 where hardware division delivers both quotient and
92        remainder at the same time.  With this, the implementation of the
93        round-to-neginf operations could be further streamlined.
94    
95      * Remove inlining support for div/mod/rem from the frontend and replace it
96        with primops that get carried through to the backend.  Do this for all
97        int and word types.
98    
99    ----------------------------------------------------------------------
100    Name: Matthias Blume
101    Date: 2002/03/25 17:25:00 EST
102    Tag: blume-20020325-divmod
103    Description:
104    
105    I improved (hopefully without breaking them) the implementation of Int.div,
106    Int.mod, and Int.rem.   For this, the code in translate.sml now takes
107    advantage of the following observations:
108    
109      Let  q = x quot y      r = x rem y
110           d = x div  y      m = x mod y
111    
112    where "quot" is the round-to-zero version of integer division that
113    hardware usually provides.  Then we have:
114    
115         r = x - q * y        where neither the * nor the - will overflow
116         d = if q >= 0 orelse x = q * y then q else q - 1
117                              where neither the * nor the - will overflow
118         m = if q >= 0 orelse r = 0 then r else r + y
119                              where the + will not overflow
120    
121    This results in substantial simplification of the generated code.
122    The following table shows the number of CFG nodes and edges generated
123    for
124            fun f (x, y) = x OPER y
125            (* with OPER \in div, mod, quot, rem *)
126    
127    
128        OPER | nodes(old) | edges(old) | nodes(new) | edges(new)
129        --------------------------------------------------------
130         div |         24 |         39 |         12 |         16
131         mod |         41 |         71 |         12 |         16
132        quot |          8 |         10 |          8 |         10
133         rem |         10 |         14 |          8 |         10
134    
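A minimal sketch of the identities above in Standard ML, assuming only the
Basis functions Int.quot and Int.rem.  The names divFloor and modFloor are
illustrative and are not taken from translate.sml.  One caveat: when q = 0 and
the operands have opposite signs (for example 1 div ~2), the test q >= 0 as
written above does not select the correction, so the sketch compares signs
instead.

```sml
(* Round-toward-negative-infinity division and modulus, built from the
 * round-to-zero primitives Int.quot and Int.rem.
 * divFloor and modFloor are hypothetical names used for illustration. *)
fun divFloor (x : int, y : int) : int =
    let val q = Int.quot (x, y)
        val r = Int.rem (x, y)
    in
        (* q is already the floor when the division is exact or when
         * both operands have the same sign *)
        if r = 0 orelse (x < 0) = (y < 0) then q else q - 1
    end

fun modFloor (x : int, y : int) : int =
    let val r = Int.rem (x, y)
    in
        (* the result of mod carries the sign of the divisor y *)
        if r = 0 orelse (r < 0) = (y < 0) then r else r + y
    end
```

Both functions should agree with the built-in div and mod for every non-zero
divisor, which makes them easy to check against the Basis operators.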
135    
136    ----------------------------------------------------------------------
137    Name: Matthias Blume
138    Date: 2002/03/25 22:06:00 EST
139    Tag: blume-20020325-cprotobug
140    Description:
141    
142    Fixed a bug in cproto (c prototype decoder).
143    
144    ----------------------------------------------------------------------
145    Name: Matthias Blume
146    Date: 2002/03/25 16:00:00 EST
147    Tag: blume-20020325-raw-primops
148    Description:
149    
150    I did some cleanup to Allen's new primop code and
151    replaced yesterday's bootfiles with new ones.
152    (But they are stored in the same place.)
153    
154    ----------------------------------------------------------------------
155    Name: Matthias Blume
156    Date: 2002/03/24 22:40:00 EST
157    Tag: blume-20020324-bootfiles
158    Description:
159    
160    Made the bootfiles that Allen asked for.
161    
162    ----------------------------------------------------------------------
163    Name: Allen Leung
164    Date: 2002/03/23 15:50:00 EST
165    Tag: leunga-20020323-flint-cps-rcc-primops
166    Description:
167    
168      1. Changes to FLINT primops:
169    
170        (* make a call to a C-function;
171         * The primop carries C function prototype information and specifies
172         * which of its (ML-) arguments are floating point. C prototype
173         * information is for use by the backend, ML information is for
174         * use by the CPS converter. *)
175      | RAW_CCALL of { c_proto: CTypes.c_proto,
176                       ml_args: ccall_type list,
177                       ml_res_opt: ccall_type option,
178                       reentrant : bool
179                     } option
180       (* Allocate uninitialized storage on the heap.
181        * The record is meant to hold short-lived C objects, i.e., they
182        * are not ML pointers.  With the tag, the representation is
183        * the same as RECORD with tag tag_raw32 (sz=4), or tag_fblock (sz=8)
184        *)
185      | RAW_RECORD of {tag:bool,sz:int}
186      and ccall_type = CCALL_INT32 | CCALL_REAL64 | CCALL_ML_PTR
187    
188      2.  These CPS primops are now overloaded:
189    
190           rawload of {kind:numkind}
191           rawstore of {kind:numkind}
192    
193          The one argument form is:
194    
195             rawload {kind} address
196    
197          The two argument form is:
198    
199             rawload {kind} [ml object, byte-offset]
200    
201      3. RAW_CCALL/RCC now takes two extra arguments:
202    
203         a. The first is whether the C call is reentrant, i.e., whether
204            ML state should be saved and restored.
205         b. The second argument is a string argument specifying the name of
206            library and the C function.
207    
208         These things are not yet handled in the code generator.
209    
210      4. In CProto,
211    
212         An encoding type of "bool" means "ml object" and is mapped into
213         C prototype of PTR.  Note that "bool" is different than "string",
214         even though "string" is also mapped into PTR, because "bool"
215         is assigned a CPS type of BOGt, while "string" is assigned INT32t.
216    
217      5. Pickler/unpickler
218    
219         Changed to handle RAW_RECORD and newest RAW_CCALL
220    
221      6. MLRiscGen,
222    
223         1. Changed to handle the new rawload/rawstore/rawrecord operators.
224         2. Code for handling C Calls has been moved to a new module CPSCCalls,
225            in the file CodeGen/cpscompile/cps-c-calls.sml
226    
227      7. Added the conditional move operator
228    
229             condmove of branch
230    
231         to cps.  Generation of this is still buggy so it is currently
232         disabled.
233    
234    ----------------------------------------------------------------------
235    Name: Lal George
236    Date: 2002/03/22 14:18:25 EST
237    Tag: george-20020322-cps-branch-prob
238    Description:
239    
240    Implemented the Ball-Larus branch-prediction heuristics, and
241    incorporated graphical viewers for control flow graphs.
242    
243    Ball-Larus Heuristics:
244    ---------------------
245    See the file compiler/CodeGen/cpscompile/cpsBranchProb.sml.
246    
247    By design it uses the Dempster-Shafer theory for combining
248    probabilities.  For example, in the function:
249    
250        fun f(n,acc) = if n = 0 then acc else f(n-1, n*acc)
251    
252    the Ball-Larus heuristics predict that n=0 is unlikely
253    (OH-heuristic), and the 'then' branch is unlikely because of the
254    RH-heuristic -- giving the 'then' branch an even lower combined
255    probability using the Dempster-Shafer theory.
256    
257    Finally, John Reppy's loop analysis in MLRISC further lowers the
258    probability of the 'then' branch because of the loop in the else
259    branch.
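
The combination step can be sketched as follows.  This is the two-outcome
form of Dempster-Shafer combination commonly used with Ball-Larus style
heuristics; whether cpsBranchProb.sml uses exactly this form is an assumption,
and the probabilities 0.16 and 0.28 are illustrative values, not the ones in
the compiler.

```sml
(* Combine two independent estimates p1 and p2 of the probability that
 * the same branch is taken (two-outcome Dempster-Shafer rule). *)
fun combine (p1 : real, p2 : real) : real =
    let val taken    = p1 * p2
        val notTaken = (1.0 - p1) * (1.0 - p2)
    in
        taken / (taken + notTaken)
    end

(* Assumed per-heuristic probabilities that the 'then' branch is taken. *)
val ohThen = 0.16   (* hypothetical value for the OH-heuristic *)
val rhThen = 0.28   (* hypothetical value for the RH-heuristic *)

(* Two heuristics that both consider the branch unlikely yield a combined
 * probability lower than either one alone. *)
val combinedThen = combine (ohThen, rhThen)
```

An estimate of 0.5 carries no information under this rule: combining it with
any p gives p back.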
260    
261    
262    Graphical Viewing:
263    ------------------
264    I merely plugged in Allen's graphical viewers into the compiler. The
265    additional code is not much. At the top level, saying:
266    
267            Control.MLRISC.getFlag "cfg-graphical-view" := true;
268    
269    will display the graphical view of the control flow graph just before
270    back-patching.  daVinci must be in your path for this to work. If
271    daVinci is not available, then the default viewer can be changed
272    using:
273    
274            Control.MLRISC.getString "viewer"
275    
276    which can be set to "dot" or "vcg" for the corresponding viewers. Of
277    course, these viewers must be in your path.
278    
279    The above will display the compilation unit at the level of clusters,
280    many of which are small, boring, and uninteresting. Also setting:
281    
282            Control.MLRISC.getInt "cfg-graphical-view_size"
283    
284    will display only those clusters that are larger than the given value.
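
Put together, a top-level session might look like the sketch below.  It
assumes that getString and getInt return ref cells in the same way getFlag
does; the threshold 10 is an arbitrary illustrative value.

```sml
(* Typed at the SML/NJ top level; control names are the ones given above. *)
Control.MLRISC.getFlag "cfg-graphical-view" := true;
(* assumption: the viewer control is a string ref *)
Control.MLRISC.getString "viewer" := "dot";
(* assumption: the size control is an int ref; 10 is only an example *)
Control.MLRISC.getInt "cfg-graphical-view_size" := 10;
```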
285    
286    
287    ----------------------------------------------------------------------
288    Name: Matthias Blume
289    Date: 2002/03/21 22:20:00 EST
290    Tag: blume-20020321-kmp-bugfix
291    Description:
292    
293    Changed the interface to the KMP routine in PreString and fixed
294    a minor bug in one place where it was used.
295    
296    ----------------------------------------------------------------------
297    Name: Allen Leung
298    Date: 2002/03/21 20:30:00 EST
299    Tag: leunga-20020321-cfg
300    Description:
301    
302      Fixed a potential problem in cfg edge splitting.
303    
304    ----------------------------------------------------------------------
305    Name: Allen Leung
306    Date: 2002/03/21 17:15:00 EST
307    Tag: leunga-20020321-x86-fp-cfg
308    Description:
309    
310      1. Recoded the buggy parts of x86-fp.
311    
312         a. All the block reordering code has been removed.
313            We now depend on the block placement phases to do this work.
314    
315         b. Critical edge splitting code has been simplified and moved into the
316            CFG modules, where it belongs.
317    
318         Both of these were quite buggy and complex.  The code is now much, much
319         simpler.
320    
321      2. X86 backend.
322    
323         a. Added instructions for 64-bit support.  Instruction selection for
324            64-bit has not been committed, however, since that
325            requires changes to MLTREE which haven't been approved by
326            Lal and John.
327    
328         b. Added support for FUCOMI and FUCOMIP when generating code for
329            PentiumPro and above.  We only generate these instructions in
330            the fast-fp mode.
331    
332         c. Added cases for JP and JNP in X86FreqProps.
333    
334      3. CFG
335    
336         CFG now has a bunch of methods for edge splitting and merging.
337    
338      4. Machine description.
339    
340         John's simplification of MLTREE_BASIS.fcond broke a few machine
341         description things:
342    
343         rtl-build.{sig,sml} and hppa.mdl fixed.
344    
345         NOTE: the machine description stuff in the repository is still broken.
346               Again, I can't put my fixes in because that involves
347               changes to MLTREE.
348    
349    ----------------------------------------------------------------------
350    Name: Matthias Blume
351    Date: 2002/03/20 15:55:00 EST
352    Tag: blume-20020320-kmp
353    Description:
354    
355    Implemented Knuth-Morris-Pratt string matching in PreString and used
356    it for String.isSubstring, Substring.isSubstring, and
357    Substring.position.
358    
359    (Might need some stress-testing.  Simple examples worked fine.)
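
For readers unfamiliar with the algorithm, here is a minimal, self-contained
sketch of Knuth-Morris-Pratt matching in Standard ML.  It is not the PreString
code, and the names failureTable, kmpFind, and isSubstringKMP are hypothetical.

```sml
(* fail[i] = length of the longest proper prefix of pat[0..i] that is
 * also a suffix of pat[0..i]. *)
fun failureTable (pat : string) : int array =
    let val m = size pat
        val fail = Array.array (m, 0)
        fun sub i = String.sub (pat, i)
        (* i = position being filled, k = length of the current border *)
        fun loop (i, k) =
            if i >= m then ()
            else if sub i = sub k then
                (Array.update (fail, i, k + 1); loop (i + 1, k + 1))
            else if k > 0 then loop (i, Array.sub (fail, k - 1))
            else (Array.update (fail, i, 0); loop (i + 1, 0))
    in
        if m > 1 then loop (1, 0) else ();
        fail
    end

(* Index of the leftmost occurrence of pat in s, or NONE. *)
fun kmpFind (pat : string) (s : string) : int option =
    let val m = size pat
        val n = size s
        val fail = failureTable pat
        (* i = position in s, k = number of pattern characters matched *)
        fun loop (i, k) =
            if k = m then SOME (i - m)
            else if i >= n then NONE
            else if String.sub (s, i) = String.sub (pat, k) then
                loop (i + 1, k + 1)
            else if k > 0 then loop (i, Array.sub (fail, k - 1))
            else loop (i + 1, 0)
    in
        loop (0, 0)
    end

fun isSubstringKMP pat s = Option.isSome (kmpFind pat s)
```

The text position i never moves backwards, which is what gives the linear
running time in the combined length of pattern and text.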
360    
361    ----------------------------------------------------------------------
362    Name: Matthias Blume
363    Date: 2002/03/19 16:37:00 EST
364    Tag: blume-20020319-witnesses
365    Description:
366    
367    Added a structure C.W and functions convert/Ptr.convert to ml-nlffi-lib.
368    
369    This implements a generic mechanism for changing constness qualifiers
370    anywhere within big C types without resorting to outright "casts".
371    (So far, functions such as C.rw/C.ro or C.Ptr.rw/C.Ptr.ro only let you
372    modify the constness at the outermost level.)
373    The implementation of "convert" is based on the idea of "witness"
374    values -- values that are not used by the operation but whose types
375    "testify" to their applicability.  On the implementation side, "convert"
376    is simply a projection (returning its second curried argument).  With
377    cross-module inlining, it should not result in any machine code being
378    generated.
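
The witness idea can be illustrated with phantom types.  The sketch below is
not the ml-nlffi-lib interface: the signature WITNESS_SKETCH, the structure
WitnessSketch, and every name in them are hypothetical, and C objects are
faked as plain integers.

```sml
signature WITNESS_SKETCH =
sig
    type ro and rw                    (* constness tags *)
    type 'c int_t                     (* phantom: C int with constness 'c *)
    type ('t, 'c) ptr_t               (* phantom: pointer, constness 'c, to 't *)
    type 't obj                       (* handle to a C object of type 't *)
    type ('s, 't) witness             (* evidence that 's may be viewed as 't *)

    val same  : ('t, 't) witness
    val intRo : ('c int_t, ro int_t) witness
    val ptr   : ('s, 't) witness -> (('s, 'c) ptr_t, ('t, 'c) ptr_t) witness

    val convert : ('s, 't) witness -> 's obj -> 't obj

    val mk   : int -> (rw int_t, rw) ptr_t obj   (* fake address for the demo *)
    val addr : 't obj -> int
end

structure WitnessSketch :> WITNESS_SKETCH =
struct
    type ro = unit
    type rw = unit
    type 'c int_t = unit
    type ('t, 'c) ptr_t = unit
    type 't obj = int
    type ('s, 't) witness = unit      (* witnesses carry no run-time data *)

    val same = ()
    val intRo = ()
    fun ptr () = ()
    (* a projection: ignores the witness, returns the second argument *)
    fun convert () (x : int) = x

    fun mk a = a
    fun addr a = a
end

local open WitnessSketch in
(* change constness one level below the outermost pointer *)
val p : (rw int_t, rw) ptr_t obj = mk 4096
val q : (ro int_t, rw) ptr_t obj = convert (ptr intRo) p
end
```

Only the types of the witness values matter; the opaque signature makes sure
that a conversion type-checks only when a matching witness can be built.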
379    
380    ----------------------------------------------------------------------
381    Name: Matthias Blume
382    Date: 2002/03/15 16:40:00 EST
383    Tag: blume-20020315-basis
384    Description:
385    
386    Provided (preliminary?) implementations for
387    
388      {String,Substring}.{concatWith,isSuffix,isSubstring}
389    
390    and
391    
392      Substring.full
393    
394    Those are in the Basis spec but they were missing in SML/NJ.
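
A few illustrative calls, using the signatures given in the Basis
specification (the argument strings are arbitrary examples):

```sml
(* join strings with a separator *)
val joined  = String.concatWith ", " ["div", "mod", "quot", "rem"]
(* suffix and substring tests take the pattern first *)
val isSml   = String.isSuffix ".sml" "translate.sml"
val hasKmp  = String.isSubstring "KMP" "the KMP routine"
(* Substring.full views a whole string as a substring *)
val whole   = Substring.full "PreString"
val subSuf  = Substring.isSuffix "String" whole
val subJoin = Substring.concatWith "/"
                  [Substring.full "smlnj-lib", Substring.full "Controls"]
```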
395    
396    ----------------------------------------------------------------------
397    Name: Matthias Blume
398    Date: 2002/03/14 21:30:00 EST
399    Tag: blume-20020314-controls
400    Description:
401    
402    Controls:
403    ---------
404    
405    1. Factored out the recently-added Controls : CONTROLS stuff and put
406       it into its own library $/controls-lib.cm.  The source tree for
407       this is under src/smlnj-lib/Controls.
408    
409    2. Changed the names of types and functions in this interface, so they
410       make a bit more "sense":
411    
412          module -> registry
413          'a registry -> 'a group
414    
415    3. The interface now deals in ref cells only.  The getter/setter interface
416       is (mostly) gone.
417    
418    4. Added a function that lets one register an already-existing ref cell.
419    
420    5. Made the corresponding modifications to the rest of the code so that
421       everything compiles again.
422    
423    6. Changed the implementation of Controls.MLRISC back to something closer
424       to the original.  In particular, this module (and therefore MLRISC)
425       does not depend on Controls.  There now is some link-time code in
426       int-sys.sml that registers the MLRISC controls with the Controls
427       module.
428    
429    CM:
430    ---
431    
432      * One can now specify the lambda-split aggressiveness in init.cmi.
433    
434    ----------------------------------------------------------------------
435    Name: Allen Leung
436    Date: 2002/03/13 17:30:00 EST
437    Tag: leunga-20020313-x86-fp-unary
438    Description:
439    
440    Bug fix for:
441    
442    > leunga@weaselbane:~/Yale/tmp/sml-dist{21} bin/sml
443    > Standard ML of New Jersey v110.39.1 [FLINT v1.5], March 08, 2002
444    > - fun f(x,(y,z)) = Real.~ y;
445    > [autoloading]
446    > [autoloading done]
447    >       fchsl   (%eax), 184(%esp)
448    > Error: MLRisc bug: X86MCEmitter.emitInstr
449    >
450    > uncaught exception Error
451    >   raised at: ../MLRISC/control/mlriscErrormsg.sml:16.14-16.19
452    
453    The problem was that the code generator did not generate any fp registers
454    in this case, and the ra didn't know that it needed to run the X86FP phase to
455    translate the pseudo fp instruction.   This only happened with unary fp
456    operators in certain situations.
457    
458    ----------------------------------------------------------------------
459    Name: Matthias Blume
460    Date: 2002/03/13 14:00:00 EST
461    Tag: blume-20020313-overload-etc
462    Description:
463    
464    1. Added _overload as a synonym for overload for backward compatibility.
465       (Control.overloadKW must be true for either version to be accepted.)
466    
467    2. Fixed bug in install script that caused more things to be installed
468       than what was requested in config/targets.
469    
470    3. Made CM aware of the (_)overload construct so that autoloading
471       works.
472    
473    ----------------------------------------------------------------------
474    Name: Matthias Blume
475    Date: 2002/03/12 22:03:00 EST
476    Tag: blume-20020312-url
477    Description:
478    
479    Forgot to update BOOT and srcarchiveurl.
480    
481    ----------------------------------------------------------------------
482    Name: Matthias Blume
483    Date: 2002/03/12 17:30:00 EST
484    Tag: blume-20020312-version110392
485    Description:
486    
487    Yet another version number bump (because of small changes to the
488    binfile format).  Version number is now 110.39.2.  NEW BOOTFILES!
489    
490    Changes:
491    
492      The new pid generation scheme described a few weeks ago was overly
493      complicated.  I implemented a new mechanism that is simpler and
494      provides a bit more "stability":  Once CM has seen a compilation
495      unit, it keeps its identity constant (as long as you do not delete
496      those crucial CM/GUID/* files).  This means that when you change
497      an interface, compile, then go back to the old interface, and
498      compile again, you arrive at the original pid.
499    
500      There now also is a mechanism that instructs CM to use the plain
501      environment hash as a module's pid (effectively making its GUID
502      the empty string).  For this, "noguid" must be specified as an
503      option to the .sml file in question within its .cm file.
504      This is most useful for code that is being generated by tools such
505      as ml-nlffigen (because during development programmers tend to
506      erase the tool's entire output directory tree including CM's cached
507      GUIDs).  "noguid" is somewhat dangerous (since it can be used to locally
508      revert to the old, broken behavior of SML/NJ), but in specific cases
509      where there is no danger of interface confusion, its use is ok
510      (I think).
511    
512      ml-nlffigen by default generates "noguid" annotations.  They can be
513      turned off by specifying -guid in its command line.
514    
515    ----------------------------------------------------------------------
516    Name: Lal George
517    Date: 2002/03/12 14:42:36 EST
518    Tag: george-20020312-frequency-computation
519    Description:
520    
521    Integrated jump chaining and static block frequency into the
522    compiler. More details and numbers later.
523    
524    ----------------------------------------------------------------------
525    Name: Lal George
526    Date: 2002/03/11 22:38:53 EST
527    Tag: george-20020311-jump-chain-elim
528    Description:
529    
530    Tested the jump chain elimination on all architectures (except the
531    hppa).  This is on by default right now and is profitable for the
532    alpha and x86; however, it may not be profitable for the sparc and ppc
533    when compiling the compiler.
534    
535    The gc test will typically jump to a label at the end of the cluster,
536    where there is another jump to an external cluster containing the actual
537    code to invoke gc. This is to allow factoring of common gc invocation
538    sequences. That is to say, we generate:
539    
540            f:
541               testgc
542               ja   L1      % jump if above to L1
543    
544            L1:
545               jmp L2
546    
547    
548    After jump chain elimination the 'ja L1' instruction is converted to
549    'ja L2'. On the sparc and ppc, many of the 'ja L2' instructions may end
550    up being implemented in their long form (if L2 is far away) using:
551    
552            jbe     L3      % jump if below or equal to L3
553            jmp     L2
554         L3:
555            ...
556    
557    
558    For large compilation units L2  may be far away.
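
The retargeting step itself can be sketched as follows.  This is a simplified
model, not the MLRISC implementation: the block representation and the names
body, block, and finalTarget are hypothetical.

```sml
(* A block either consists solely of an unconditional jump, or holds
 * other code. *)
datatype body = JumpOnly of string | Other
type block = string * body            (* label and contents *)

(* Follow a chain of jump-only blocks starting at lab and return the
 * final label.  The fuel bound guards against cyclic chains. *)
fun finalTarget (blocks : block list) (lab : string) : string =
    let fun lookup l =
            case List.find (fn (l', _) => l' = l) blocks of
                SOME (_, JumpOnly t) => SOME t
              | _ => NONE
        fun follow (l, fuel) =
            if fuel = 0 then l
            else case lookup l of
                     SOME t => follow (t, fuel - 1)
                   | NONE => l
    in
        follow (lab, length blocks)
    end

(* The example above: f branches to L1, and L1 only jumps to L2. *)
val example = [("f", Other), ("L1", JumpOnly "L2"), ("L2", Other)]
val target = finalTarget example "L1"
```

Whether the retargeted branch is profitable then depends on the distance to
the final label, as described for the sparc and ppc above.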
559    
560    
561    ----------------------------------------------------------------------
562    Name: Matthias Blume
563    Date: 2002/03/11 13:30:00 EST
564    Tag: blume-20020311-mltreeeval
565    Description:
566    
567    A functor parameter was missing.
568    
569    ----------------------------------------------------------------------
570    Name: Allen Leung
571    Date: 2002/03/11 10:30:00 EST
572    Tag: leunga-20020311-runtime-string0
573    Description:
574    
575       The representation of the empty string now points to a
576    legal null terminated C string instead of unit.  It is now possible
577    to convert an ML string into C string with InlineT.CharVector.getData.
578    This compiles into one single machine instruction.
579    
580    ----------------------------------------------------------------------
581    Name: Allen Leung
582    Date: 2002/03/10 23:55:00 EST
583    Tag: leunga-20020310-x86-call
584    Description:
585    
586       Added machine code generation for the CALL instruction (relative displacement mode).
587    
588    ----------------------------------------------------------------------
589    Name: Matthias Blume
590    Date: 2002/03/08 16:05:00
591    Tag: blume-20020308-entrypoints
592    Description:
593    
594    Version number bumped to 110.39.1.  NEW BOOTFILES!
595    
596    Entrypoints: non-zero offset into a code object where execution should begin.
597    
598    - Added the notion of an entrypoint to CodeObj.
599    - Added reading/writing of entrypoint info to Binfile.
600    - Made runtime system bootloader aware of entrypoints.
601    - Use the address of the label of the first function given to mlriscGen
602      as the entrypoint.  This address is currently always 0, but it will
603      not be 0 once we turn on block placement.
604    - Removed the linkage cluster code (which was The Other Way(tm) of dealing
605      with entry points) from mlriscGen.
606    
607    ----------------------------------------------------------------------
608    Name: Allen Leung
609    Date: 2002/03/07 20:45:00 EST
610    Tag: leunga-20020307-x86-cmov
611    Description:
612    
613       Bug fixes for CMOVcc on x86.
614    
615       1. Added machine code generation for CMOVcc
616       2. CMOVcc is now generated in preference over SETcc on PentiumPro or above.
617       3. CMOVcc cannot have an immediate operand as argument.
618    
619    ----------------------------------------------------------------------
620    Name: Matthias Blume
621    Date: 2002/03/07 16:15:00 EST
622    Tag: blume-20020307-controls
623    Description:
624    
625    This is a very large but mostly boring patch which makes (almost)
626    every tuneable compiler knob (i.e., pretty much everything under
627    Control.* plus a few other things) configurable via both the command
628    line and environment variables in the style CM did its configuration
629    until now.
630    
631    Try starting sml with '-h' (or, if you are brave, '-H')
632    
633    To this end, I added a structure Controls : CONTROLS to smlnj-lib.cm which
634    implements the underlying generic mechanism.
635    
636    The interface to some of the existing such facilities has changed somewhat.
637    For example, the MLRiscControl module now provides mkFoo instead of getFoo.
638    (The getFoo interface is still there for backward-compatibility, but its
639    use is deprecated.)
640    
641    The ml-build script passes -Cxxx=yyy command-line arguments through so
642    that one can now twiddle the compiler settings when using this "batch"
643    compiler.
644    
645    TODO items:
646    
647    We should go through and throw out all controls that are no longer
648    connected to anything.  Moreover, we should go through and provide
649    meaningful (and correct!) documentation strings for those controls
650    that still are connected.
651    
652    Currently, multiple calls to Controls.new are accepted (only the first
653    has any effect).  Eventually we should make sure that every control
654    is being made (via Controls.new) exactly once.  Future access can then
655    be done using Controls.acc.
656    
657    Finally, it would probably be a good idea to use the getter-setter
658    interface to controls rather than ref cells.  For the time being, both
659    styles are provided by the Controls module, but getter-setter pairs are
660    better if thread-safety is of any concern because they can be wrapped.
661    
662    *****************************************
663    
664    One bug fix: The function blockPlacement in three of the MLRISC
665    backpatch files used to be hard-wired to one of two possibilities at
666    link time (according to the value of the placementFlag).  But (I
667    think) it should rather sense the flag every time.
668    
669    *****************************************
670    
671    Other assorted changes (by other people who did not supply a HISTORY entry):
672    
673    1. the cross-module inliner now works much better (Monnier)
674    2. representation of weights, frequencies, and probabilities in MLRISC
675       changed in preparation of using those for weighted block placement
676       (Reppy, George)
677    
678    ----------------------------------------------------------------------
679    Name: Lal George
680    Date: 2002/03/07 14:44:24 EST
681    Tag: george-20020307-weighted-block-placement
682    
683    Tested the weighted block placement optimization on all architectures
684    (except the hppa) using AMPL to generate the block and edge frequencies.
685    Changes were required in the machine properties to correctly
686    categorize trap instructions. There is an MLRISC flag
687    "weighted-block-placement" that can be used to enable weighted block
688    placement, but this will be ineffective without block/edge
689    frequencies (coming soon).
690    
691    
692    ----------------------------------------------------------------------
693    Name: Lal George
694    Date: 2002/03/05 17:24:48 EST
695    Tag: george-20020305-linkage-cluster
696    
697    In order to support the block placement optimization, a new cluster
698    is generated as the very first cluster (called the linkage cluster).
699    It contains a single jump to the 'real' entry point for the compilation
700    unit. Block placement has no effect on the linkage cluster itself, but
701    all the other clusters  have full freedom in the manner in which they
702    reorder blocks or functions.
703    
704    On the x86 the typical linkage code that is generated is:
705       ----------------------
706            .align 2
707       L0:
708            addl    $L1-L0, 72(%esp)
709            jmp     L1
710    
711    
712            .align  2
713       L1:
714       ----------------------
715    
716    72(%esp) is the memory location for the stdlink register. This
717    must contain the address of the CPS function being called. In the
718    above example, it contains the address of  L0; before
719    calling L1 (the real entry point for the compilation unit), it
720    must contain the address for L1, and hence
721    
722            addl $L1-L0, 72(%esp)
723    
724    I have tested this on all architectures except the hppa.  The increase
725    in code size is of course negligible.
726    
727    ----------------------------------------------------------------------
728    Name: Allen Leung
729    Date: 2002/03/03 13:20:00 EST
730    Tag: leunga-20020303-mlrisc-tools
731    
732      Added #[ ... ] expressions to mlrisc tools
733    
734    ----------------------------------------------------------------------
735    Name: Matthias Blume
736  Date: 2002/02/27 12:29:00 EST  Date: 2002/02/27 12:29:00 EST
737  Tag: blume-20020227-cdebug  Tag: blume-20020227-cdebug
738  Description:  Description:
# Line 134  Line 853 
853         TOTAL                                   2375.26u  57.21s  48.00g         TOTAL                                   2375.26u  57.21s  48.00g
854    
855  As you can see the old cps spill module suffers from some serious  As you can see the old cps spill module suffers from some serious
856  performance problem but since I cannot decipher the old code fully,  performance problem.  But since I cannot decipher the old code fully,
857  innstead of patching the problems up, I'm reimplementing it  instead of patching the problems up, I'm reimplementing it
858  with a different algorithm.  The new code is more modular,  with a different algorithm.  The new code is more modular,
859  smaller when compiled, and substantially faster  smaller when compiled, and substantially faster
860  (O(n log n) time and O(n) space).  Timing of the new spill module:  (O(n log n) time and O(n) space).  Timing of the new spill module:
