Diff of /sml/trunk/HISTORY

revision 1130, Mon Mar 11 04:49:41 2002 UTC revision 1185, Mon Apr 1 22:06:47 2002 UTC
# Line 13  Line 13 
13  Description:
14    
15  ----------------------------------------------------------------------
16    Name: Matthias Blume
17    Date: 2002/04/01 (no joke!) 17:07:00 EST
18    Tag: blume-20020401-x86div
19    Description:
20    
21    Added full support for div/mod/rem/quot on the x86, using the machine
22    instruction's two results (without clumsily recomputing the remainder)
23    directly where appropriate.
24    
25    Some more extensive power-of-two support was added to the x86 instruction
26    selector (avoiding expensive divs, mods, and muls where they can be
27    replaced with cheaper shifts and masks).  However, this sort of thing
28    ought to be done earlier, e.g., within the CPS optimizer so that
29    all architectures benefit from it.
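The power-of-two replacements mentioned in this entry can be illustrated with a minimal Python sketch. It is not the x86 instruction selector's code; the function names are hypothetical. It relies on Python's arithmetic right shift and on masking, which match round-to-neginf div/mod (ML's div and mod). The round-to-zero variants (quot/rem) would need an extra fixup for negative operands, which is not shown.

```python
def is_power_of_two(n: int) -> bool:
    """True if n is 2**k for some k >= 0."""
    return n > 0 and (n & (n - 1)) == 0


def div_pow2(x: int, k: int) -> int:
    """x div 2**k (rounding toward negative infinity) as an arithmetic shift."""
    return x >> k


def mod_pow2(x: int, k: int) -> int:
    """x mod 2**k (result in 0 .. 2**k - 1) as a mask."""
    return x & ((1 << k) - 1)


def mul_pow2(x: int, k: int) -> int:
    """x * 2**k as a left shift."""
    return x << k
```

For example, div_pow2(-7, 1) gives -4 and mod_pow2(-7, 1) gives 1, the same as -7 div 2 and -7 mod 2 in ML.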
30    
31    The compiler compiles to a fixed point, but changes might be somewhat
32    fragile nevertheless.  Please report any strange things that you might
33    see wrt. div/mod/quot/rem...
34    
35    ----------------------------------------------------------------------
36    Name: Matthias Blume
37    Date: 2002/03/29 17:22:00
38    Tag: blume-20020329-div
39    Description:
40    
41    Fixed my broken div/mod logic.  Unfortunately, this means that the
42    inline code for div/mod now has one more comparison than before.
43    Fast paths (quotient > 0 or remainder = 0) are not affected, though.
44    The problem was with quotient = 0, because that alone does not tell
45    us which way the rounding went.  One then has to look at whether
46    remainder and divisor have the same sign...  :(
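A minimal Python sketch of the rule described here, deriving round-to-neginf div/mod from a round-to-zero quot/rem. It is an illustration under assumptions, not the inline code the compiler generates; quot_rem and div_mod are hypothetical names.

```python
from typing import Tuple


def quot_rem(x: int, y: int) -> Tuple[int, int]:
    """Round-to-zero division (ML's quot and rem), as hardware usually provides."""
    q = abs(x) // abs(y)
    if (x < 0) != (y < 0):
        q = -q
    return q, x - q * y


def div_mod(x: int, y: int) -> Tuple[int, int]:
    """Round-to-neginf division (ML's div and mod) built on quot_rem."""
    q, r = quot_rem(x, y)
    # Fast paths: a positive quotient or a zero remainder needs no fixup.
    if q > 0 or r == 0:
        return q, r
    # Otherwise (this includes q = 0) the quotient alone does not say which
    # way the rounding went, so compare the signs of remainder and divisor.
    if (r < 0) == (y < 0):
        return q, r
    return q - 1, r + y
```

The case that needs the sign comparison is, for instance, x = -1 and y = 2: quot is 0 and rem is -1, but div must be -1 and mod must be 1.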
47    
48    Anyway, I replaced the bootfiles with fresh ones...
49    
50    ----------------------------------------------------------------------
51    Name: Matthias Blume
52    Date: 2002/03/29 14:10:00 EST
53    Tag: blume-20020329-inlprims
54    Description:
55    
56    NEW BOOTFILES!!!    Version number bumped to 110.39.3.
57    
58    Primops have changed. This means that the bin/boot-file formats have
59    changed as well.
60    
61    To make sure that there is no confusion, I made a new version.
62    
63    
64    CHANGES:
65    
66    * removed REMT from mltree (remainder should never overflow).
67    
68    * added primops to deal with divisions of all flavors to the frontend
69    
70    * handled these primops all the way through so they map to their respective
71      MLRISC support
72    
73    * used these primops in the implementation of Int, Int32, Word, Word32
74    
75    * removed INLDIV, INLMOD, and INLREM as they are no longer necessary
76    
77    * parameterized INLMIN, INLMAX, and INLABS by a numkind
78    
79    * translate.sml now deals with all flavors of INL{MIN,MAX,ABS}, including
80      floating point
81    
82    * used INL{MIN,MAX,ABS} in the implementation of Int, Int32, Word, Word32,
83      and Real (but Real.abs maps to a separate floating-point-only primop)
84    
85    
86    TODO items:
87    
88    * Hacked Alpha32 instruction selection, disabling the selection of REMx
89      instructions because the machine instruction encoder cannot handle
90      them.  (Hppa, PPC, and Sparc instruction selection did not handle
91      REM in the first place, and REM is supported by the x86 machine coder.)
92    
93    * Handle DIV and MOD with DIV_TO_NEGINF directly in the x86 instruction
94      selection phase.  (The two can be streamlined because the hardware
95      delivers both quotient and remainder at the same time anyway.)
96    
97    * Think about what to do with "valOf(Int32.minInt) div ~1" and friends.
98      (Currently the behavior is inconsistent both across architectures and
99      wrt. the draft Basis spec.)
100    
101    * Word8 should eventually be handled natively, too.
102    
103    * There seems to be one serious bug in mltree-gen.sml.  It appears, though,
104      as if there currently is no execution path that could trigger it in
105      SML/NJ.  (The assumptions underlying functions arith and promotable do not
106      hold for things like multiplication and division.)
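The Int32.minInt item in the TODO list comes down to simple arithmetic: the mathematical quotient is 2^31, which is one more than the largest 32-bit signed integer. The Python sketch below shows one possible policy, raising an overflow error; it is an assumption for illustration and says nothing about what each architecture currently does. checked_div32 is a hypothetical name.

```python
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def checked_div32(x: int, y: int) -> int:
    """Round-to-neginf division that rejects results outside the Int32 range."""
    result = x // y
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError("quotient does not fit in 32 bits")
    return result
```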
107    
108    ----------------------------------------------------------------------
109    Name: Matthias Blume
110    Date: 2002/03/27 16:27:00 EST
111    Tag: blume-20020327-mlrisc-divisions
112    Description:
113    
114    Added support for all four division operations (ML's div, mod, quot,
115    and rem) to MLRISC.  In the course of doing so, I also rationalized
116    the naming (no more annoying switch-around of DIV and QUOT), by
117    parameterizing the operation by div_rounding_mode (which can be either
118    DIV_TO_ZERO or DIV_TO_NEGINF).
119    
120    The generic MLTreeGen functor takes care of compiling all four
121    operations down to only round-to-zero div.
122    
123    Missing pieces:
124    
125      * Doing something smarter than relying on MLTreeGen on architectures
126        like, e.g., the x86 where hardware division delivers both quotient and
127        remainder at the same time.  With this, the implementation of the
128        round-to-neginf operations could be further streamlined.
129    
130      * Remove inlining support for div/mod/rem from the frontend and replace it
131        with primops that get carried through to the backend.  Do this for all
132        int and word types.
133    
134    ----------------------------------------------------------------------
135    Name: Matthias Blume
136    Date: 2002/03/25 17:25:00 EST
137    Tag: blume-20020325-divmod
138    Description:
139    
140    I improved (hopefully without breaking them) the implementation of Int.div,
141    Int.mod, and Int.rem.   For this, the code in translate.sml now takes
142    advantage of the following observations:
143    
144      Let  q = x quot y      r = x rem y
145           d = x div  y      m = x mod y
146    
147    where "quot" is the round-to-zero version of integer division that
148    hardware usually provides.  Then we have:
149    
150         r = x - q * y        where neither the * nor the - will overflow
151         d = if q >= 0 orelse x = q * y then q else q - 1
152                              where neither the * nor the - will overflow
153         m = if q >= 0 orelse r = 0 then r else r + y
154                              where the + will not overflow
155    
156    This results in substantial simplification of the generated code.
157    The following table shows the number of CFG nodes and edges generated
158    for
159            fun f (x, y) = x OPER y
160            (* with OPER \in {div, mod, quot, rem} *)
161    
162    
163        OPER | nodes(old) | edges(old) | nodes(new) | edges(new)
164        --------------------------------------------------------
165         div |         24 |         39 |         12 |         16
166         mod |         41 |         71 |         12 |         16
167        quot |          8 |         10 |          8 |         10
168         rem |         10 |         14 |          8 |         10
169    
170    
171    ----------------------------------------------------------------------
172    Name: Matthias Blume
173    Date: 2002/03/25 22:06:00 EST
174    Tag: blume-20020325-cprotobug
175    Description:
176    
177    Fixed a bug in cproto (c prototype decoder).
178    
179    ----------------------------------------------------------------------
180    Name: Matthias Blume
181    Date: 2002/03/25 16:00:00 EST
182    Tag: blume-20020325-raw-primops
183    Description:
184    
185    I did some cleanup to Allen's new primop code and
186    replaced yesterday's bootfiles with new ones.
187    (But they are stored in the same place.)
188    
189    ----------------------------------------------------------------------
190    Name: Matthias Blume
191    Date: 2002/03/24 22:40:00 EST
192    Tag: blume-20020324-bootfiles
193    Description:
194    
195    Made the bootfiles that Allen asked for.
196    
197    ----------------------------------------------------------------------
198    Name: Allen Leung
199    Date: 2002/03/23 15:50:00 EST
200    Tag: leunga-20020323-flint-cps-rcc-primops
201    Description:
202    
203      1. Changes to FLINT primops:
204    
205        (* make a call to a C-function;
206         * The primop carries C function prototype information and specifies
207         * which of its (ML-) arguments are floating point. C prototype
208         * information is for use by the backend, ML information is for
209         * use by the CPS converter. *)
210      | RAW_CCALL of { c_proto: CTypes.c_proto,
211                       ml_args: ccall_type list,
212                       ml_res_opt: ccall_type option,
213                       reentrant : bool
214                     } option
215       (* Allocate uninitialized storage on the heap.
216        * The record is meant to hold short-lived C objects, i.e., they
217        * are not ML pointers.  With the tag, the representation is
218        * the same as RECORD with tag tag_raw32 (sz=4), or tag_fblock (sz=8)
219        *)
220      | RAW_RECORD of {tag:bool,sz:int}
221      and ccall_type = CCALL_INT32 | CCALL_REAL64 | CCALL_ML_PTR
222    
223      2.  These CPS primops are now overloaded:
224    
225           rawload of {kind:numkind}
226           rawstore of {kind:numkind}
227    
228          The one argument form is:
229    
230             rawload {kind} address
231    
232          The two argument form is:
233    
234             rawload {kind} [ml object, byte-offset]
235    
236      3. RAW_CCALL/RCC now takes two extra arguments:
237    
238         a. The first is whether the C call is reentrant, i.e., whether
239            ML state should be saved and restored.
240         b. The second argument is a string argument specifying the name of
241            library and the C function.
242    
243         These things are not yet handled in the code generator.
244    
245      4. In CProto,
246    
247         An encoding type of "bool" means "ml object" and is mapped into
248         C prototype of PTR.  Note that "bool" is different than "string",
249         even though "string" is also mapped into PTR, because "bool"
250         is assigned a CPS type of BOGt, while "string" is assigned INT32t.
251    
252      5. Pickler/unpickler
253    
254         Changed to handle RAW_RECORD and newest RAW_CCALL
255    
256      6. MLRiscGen,
257    
258         1. Changed to handle the new rawload/rawstore/rawrecord operators.
259         2. Code for handling C Calls has been moved to a new module CPSCCalls,
260            in the file CodeGen/cpscompile/cps-c-calls.sml
261    
262      7. Added the conditional move operator
263    
264             condmove of branch
265    
266         to cps.  Generation of this is still buggy so it is currently
267         disabled.
268    
269    ----------------------------------------------------------------------
270    Name: Lal George
271    Date: 2002/03/22 14:18:25 EST
272    Tag: george-20020322-cps-branch-prob
273    Description:
274    
275    Implemented the Ball-Larus branch-prediction heuristics and
276    incorporated graphical viewers for control flow graphs.
277    
278    Ball-Larus Heuristics:
279    ---------------------
280    See the file compiler/CodeGen/cpscompile/cpsBranchProb.sml.
281    
282    By design it uses the Dempster-Shafer theory for combining
283    probabilities.  For example, in the function:
284    
285        fun f(n,acc) = if n = 0 then acc else f(n-1, n*acc)
286    
287    the Ball-Larus heuristics predict that n=0 is unlikely
288    (OH-heuristic), and the 'then' branch is unlikely because of the
289    RH-heuristic -- giving the 'then' branch an even lower combined
290    probability using the Dempster-Shafer theory.
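A minimal Python sketch of how two heuristic estimates combine under the usual two-outcome Dempster-Shafer rule. The probabilities 0.16 and 0.28 in the usage note are illustrative assumptions, not values taken from cpsBranchProb.sml, and combine is a hypothetical name.

```python
def combine(p1: float, p2: float) -> float:
    """Combine two independent estimates that the same branch is taken.

    Undefined (division by zero) when the estimates flatly contradict each
    other, e.g. p1 = 1.0 and p2 = 0.0.
    """
    taken = p1 * p2
    not_taken = (1.0 - p1) * (1.0 - p2)
    return taken / (taken + not_taken)
```

With two estimates below one half, say combine(0.16, 0.28), the result is lower than either input, which is the effect described for the 'then' branch. An estimate of 0.5 carries no information: combine(0.5, p) is p.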
291    
292    Finally, John Reppy's loop analysis in MLRISC further lowers the
293    probability of the 'then' branch because of the loop in the else
294    branch.
295    
296    
297    Graphical Viewing:
298    ------------------
299    I merely plugged Allen's graphical viewers into the compiler. The
300    additional code is not much. At the top level, saying:
301    
302            Control.MLRISC.getFlag "cfg-graphical-view" := true;
303    
304    will display the graphical view of the control flow graph just before
305    back-patching.  daVinci must be in your path for this to work. If
306    daVinci is not available, then the default viewer can be changed
307    using:
308    
309            Control.MLRISC.getString "viewer"
310    
311    which can be set to "dot" or "vcg" for the corresponding viewers. Of
312    course, these viewers must be in your path.
313    
314    The above will display the compilation unit at the level of clusters,
315    many of which are small, boring, and un-interesting. Also setting:
316    
317            Control.MLRISC.getInt "cfg-graphical-view_size"
318    
319    will display clusters that are larger than the value set by the above.
320    
321    
322    ----------------------------------------------------------------------
323    Name: Matthias Blume
324    Date: 2002/03/21 22:20:00 EST
325    Tag: blume-20020321-kmp-bugfix
326    Description:
327    
328    Changed the interface to the KMP routine in PreString and fixed
329    a minor bug in one place where it was used.
330    
331    ----------------------------------------------------------------------
332    Name: Allen Leung
333    Date: 2002/03/21 20:30:00 EST
334    Tag: leunga-20020321-cfg
335    Description:
336    
337      Fixed a potential problem in cfg edge splitting.
338    
339    ----------------------------------------------------------------------
340    Name: Allen Leung
341    Date: 2002/03/21 17:15:00 EST
342    Tag: leunga-20020321-x86-fp-cfg
343    Description:
344    
345      1. Recoded the buggy parts of x86-fp.
346    
347         a. All the block reordering code has been removed.
348            We now depend on the block placement phases to do this work.
349    
350         b. Critical edge splitting code has been simplified and moved into the
351            CFG modules, where they belong.
352    
353         Both of these were quite buggy and complex.  The code is now much, much
354         simpler.
355    
356      2. X86 backend.
357    
358         a. Added instructions for 64-bit support.  Instruction selection for
359            64-bit has not been committed, however, since that
360            requires changes to MLTREE which haven't been approved by
361            Lal and John.
362    
363         b. Added support for FUCOMI and FUCOMIP when generating code for
364            PentiumPro and above.  We only generate these instructions in
365            the fast-fp mode.
366    
367         c. Added cases for JP and JNP in X86FreqProps.
368    
369      3. CFG
370    
371         CFG now has a bunch of methods for edge splitting and merging.
372    
373      4. Machine description.
374    
375         John's simplification of MLTREE_BASIS.fcond broke a few machine
376         description things:
377    
378         rtl-build.{sig,sml} and hppa.mdl fixed.
379    
380         NOTE: the machine description stuff in the repository is still broken.
381               Again, I can't put my fixes in because that involves
382               changes to MLTREE.
383    
384    ----------------------------------------------------------------------
385    Name: Matthias Blume
386    Date: 2002/03/20 15:55:00 EST
387    Tag: blume-20020320-kmp
388    Description:
389    
390    Implemented Knuth-Morris-Pratt string matching in PreString and used
391    it for String.isSubstring, Substring.isSubstring, and
392    Substring.position.
393    
394    (Might need some stress-testing.  Simple examples worked fine.)
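For reference, a minimal Python sketch of Knuth-Morris-Pratt matching. It shows the general technique only and does not mirror the PreString interface; failure_table, find_first, and is_substring are hypothetical names.

```python
from typing import List


def failure_table(pattern: str) -> List[int]:
    """table[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def find_first(pattern: str, text: str) -> int:
    """Index of the first occurrence of pattern in text, or -1 if absent."""
    if not pattern:
        return 0
    table = failure_table(pattern)
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]  # fall back without re-reading the text
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1


def is_substring(pattern: str, text: str) -> bool:
    return find_first(pattern, text) >= 0
```

The text is scanned once from left to right and never re-read, so the running time is linear in the combined length of pattern and text.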
395    
396    ----------------------------------------------------------------------
397    Name: Matthias Blume
398    Date: 2002/03/19 16:37:00 EST
399    Tag: blume-20020319-witnesses
400    Description:
401    
402    Added a structure C.W and functions convert/Ptr.convert to ml-nlffi-lib.
403    
404    This implements a generic mechanism for changing constness qualifiers
405    anywhere within big C types without resorting to outright "casts".
406    (So far, functions such as C.rw/C.ro or C.Ptr.rw/C.Ptr.ro only let you
407    modify the constness at the outermost level.)
408    The implementation of "convert" is based on the idea of "witness"
409    values -- values that are not used by the operation but whose types
410    "testify" to their applicability.  On the implementation side, "convert"
411    is simply a projection (returning its second curried argument).  With
412    cross-module inlining, it should not result in any machine code being
413    generated.
414    
415    ----------------------------------------------------------------------
416    Name: Matthias Blume
417    Date: 2002/03/15 16:40:00 EST
418    Tag: blume-20020315-basis
419    Description:
420    
421    Provided (preliminary?) implementations for
422    
423      {String,Substring}.{concatWith,isSuffix,isSubstring}
424    
425    and
426    
427      Substring.full
428    
429    Those are in the Basis spec but they were missing in SML/NJ.
430    
431    ----------------------------------------------------------------------
432    Name: Matthias Blume
433    Date: 2002/03/14 21:30:00 EST
434    Tag: blume-20020314-controls
435    Description:
436    
437    Controls:
438    ---------
439    
440    1. Factored out the recently-added Controls : CONTROLS stuff and put
441       it into its own library $/controls-lib.cm.  The source tree for
442       this is under src/smlnj-lib/Controls.
443    
444    2. Changed the names of types and functions in this interface, so they
445       make a bit more "sense":
446    
447          module -> registry
448          'a registry -> 'a group
449    
450    3. The interface now deals in ref cells only.  The getter/setter interface
451       is (mostly) gone.
452    
453    4. Added a function that lets one register an already-existing ref cell.
454    
455    5. Made the corresponding modifications to the rest of the code so that
456       everything compiles again.
457    
458    6. Changed the implementation of Controls.MLRISC back to something closer
459       to the original.  In particular, this module (and therefore MLRISC)
460       does not depend on Controls.  There now is some link-time code in
461       int-sys.sml that registers the MLRISC controls with the Controls
462       module.
463    
464    CM:
465    ---
466    
467      * One can now specify the lambda-split aggressiveness in init.cmi.
468    
469    ----------------------------------------------------------------------
470    Name: Allen Leung
471    Date: 2002/03/13 17:30:00 EST
472    Tag: leunga-20020313-x86-fp-unary
473    Description:
474    
475    Bug fix for:
476    
477    > leunga@weaselbane:~/Yale/tmp/sml-dist{21} bin/sml
478    > Standard ML of New Jersey v110.39.1 [FLINT v1.5], March 08, 2002
479    > - fun f(x,(y,z)) = Real.~ y;
480    > [autoloading]
481    > [autoloading done]
482    >       fchsl   (%eax), 184(%esp)
483    > Error: MLRisc bug: X86MCEmitter.emitInstr
484    >
485    > uncaught exception Error
486    >   raised at: ../MLRISC/control/mlriscErrormsg.sml:16.14-16.19
487    
488    The problem was that the code generator did not generate any fp registers
489    in this case, and the register allocator (ra) did not know that it needed
490    to run the X86FP phase to translate the pseudo fp instruction.  This only
491    happened with unary fp operators in certain situations.
492    
493    ----------------------------------------------------------------------
494    Name: Matthias Blume
495    Date: 2002/03/13 14:00:00 EST
496    Tag: blume-20020313-overload-etc
497    Description:
498    
499    1. Added _overload as a synonym for overload for backward compatibility.
500       (Control.overloadKW must be true for either version to be accepted.)
501    
502    2. Fixed bug in install script that caused more things to be installed
503       than what was requested in config/targets.
504    
505    3. Made CM aware of the (_)overload construct so that autoloading
506       works.
507    
508    ----------------------------------------------------------------------
509    Name: Matthias Blume
510    Date: 2002/03/12 22:03:00 EST
511    Tag: blume-20020312-url
512    Description:
513    
514    Forgot to update BOOT and srcarchiveurl.
515    
516    ----------------------------------------------------------------------
517    Name: Matthias Blume
518    Date: 2002/03/12 17:30:00 EST
519    Tag: blume-20020312-version110392
520    Description:
521    
522    Yet another version number bump (because of small changes to the
523    binfile format).  Version number is now 110.39.2.  NEW BOOTFILES!
524    
525    Changes:
526    
527      The new pid generation scheme described a few weeks ago was overly
528      complicated.  I implemented a new mechanism that is simpler and
529      provides a bit more "stability":  Once CM has seen a compilation
530      unit, it keeps its identity constant (as long as you do not delete
531      those crucial CM/GUID/* files).  This means that when you change
532      an interface, compile, then go back to the old interface, and
533      compile again, you arrive at the original pid.
534    
535      There now also is a mechanism that instructs CM to use the plain
536      environment hash as a module's pid (effectively making its GUID
537      the empty string).  For this, "noguid" must be specified as an
538      option to the .sml file in question within its .cm file.
539      This is most useful for code that is being generated by tools such
540      as ml-nlffigen (because during development programmers tend to
541      erase the tool's entire output directory tree including CM's cached
542      GUIDs).  "noguid" is somewhat dangerous (since it can be used to locally
543      revert to the old, broken behavior of SML/NJ), but in specific cases
544      where there is no danger of interface confusion, its use is ok
545      (I think).
546    
547      ml-nlffigen by default generates "noguid" annotations.  They can be
548      turned off by specifying -guid in its command line.
549    
550    ----------------------------------------------------------------------
551    Name: Lal George
552    Date: 2002/03/12 14:42:36 EST
553    Tag: george-20020312-frequency-computation
554    Description:
555    
556    Integrated jump chaining and static block frequency into the
557    compiler. More details and numbers later.
558    
559    ----------------------------------------------------------------------
560    Name: Lal George
561    Date: 2002/03/11 22:38:53 EST
562    Tag: george-20020311-jump-chain-elim
563    Description:
564    
565    Tested the jump chain elimination on all architectures (except the
566    hppa).  This is on by default right now and is profitable for the
567    alpha and x86; however, it may not be profitable for the sparc and ppc
568    when compiling the compiler.
569    
570    The gc test will typically jump to a label at the end of the cluster,
571    where there is another jump to an external cluster containing the actual
572    code to invoke gc. This is to allow factoring of common gc invocation
573    sequences. That is to say, we generate:
574    
575            f:
576               testgc
577               ja   L1      % jump if above to L1
578    
579            L1:
580               jmp L2
581    
582    
583    After jump chain elimination the 'ja L1' instruction is converted to
584    'ja L2'. On the sparc and ppc, many of the 'ja L2' instructions may end
585    up being implemented in their long form (if L2 is far away) using:
586    
587            jbe     L3      % jump if below or equal to L3
588            jmp     L2
589         L3:
590            ...
591    
592    
593    For large compilation units L2  may be far away.
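A minimal Python sketch of the retargeting step in jump chain elimination. It models the control flow as plain dictionaries, which is an assumption made for illustration and not the MLRISC representation; resolve and eliminate_jump_chains are hypothetical names.

```python
from typing import Dict


def resolve(label: str, jump_only: Dict[str, str]) -> str:
    """Follow blocks that consist of a single unconditional jump until a
    label with real code is reached.  The seen-set guards against cycles."""
    seen = set()
    while label in jump_only and label not in seen:
        seen.add(label)
        label = jump_only[label]
    return label


def eliminate_jump_chains(branches: Dict[str, str],
                          jump_only: Dict[str, str]) -> Dict[str, str]:
    """Retarget every branch to the final destination of its jump chain."""
    return {src: resolve(dst, jump_only) for src, dst in branches.items()}
```

With jump_only = {"L1": "L2"} and branches = {"f": "L1"}, the branch in f is retargeted to L2, as in the 'ja L1' to 'ja L2' example. Whether that is profitable depends on how far away L2 is, which this sketch does not model.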
594    
595    
596    ----------------------------------------------------------------------
597    Name: Matthias Blume
598    Date: 2002/03/11 13:30:00 EST
599    Tag: blume-20020311-mltreeeval
600    Description:
601    
602    A functor parameter was missing.
603    
604    ----------------------------------------------------------------------
605    Name: Allen Leung
606    Date: 2002/03/11 10:30:00 EST
607    Tag: leunga-20020311-runtime-string0
608    Description:
609    
610       The representation of the empty string now points to a
611    legal null-terminated C string instead of unit.  It is now possible
612    to convert an ML string into a C string with InlineT.CharVector.getData.
613    This compiles into one single machine instruction.
614    
615    ----------------------------------------------------------------------
616  Name: Allen Leung
617  Date: 2002/03/10 23:55:00 EST
618  Tag: leunga-20020310-x86-call
