Diff of /sml/trunk/HISTORY

Revision 1148 (Fri Mar 15 21:38:57 2002 UTC) vs. revision 1185 (Mon Apr 1 22:06:47 2002 UTC)
# Line 14
14    
15  ----------------------------------------------------------------------
16  Name: Matthias Blume
17    Date: 2002/04/01 (no joke!) 17:07:00 EST
18    Tag: blume-20020401-x86div
19    Description:
20    
21    Added full support for div/mod/rem/quot on the x86, using the machine
22    instruction's two results (without clumsily recomputing the remainder)
23    directly where appropriate.
24    
25    Some more extensive power-of-two support was added to the x86 instruction
26    selector (avoiding expensive divs, mods, and muls where they can be
27    replaced with cheaper shifts and masks).  However, this sort of thing
28    ought to be done earlier, e.g., within the CPS optimizer so that
29    all architectures benefit from it.
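
A minimal sketch of the shift-and-mask idea, written at the source level
rather than as instruction selection.  The names divPow2 and modPow2 are
hypothetical and do not come from the x86 instruction selector; the sketch
assumes that Int and Word have the same width and that k is smaller than
that width minus one.

```sml
(* Illustrative only: divPow2/modPow2 are made-up names. *)

(* x div 2^k: an arithmetic (sign-propagating) right shift rounds
 * toward negative infinity, which is exactly what ML's div does. *)
fun divPow2 (x : int, k : word) : int =
  Word.toIntX (Word.~>> (Word.fromInt x, k))

(* x mod 2^k: masking the low k bits of the two's complement
 * representation yields a result in the range 0 .. 2^k - 1. *)
fun modPow2 (x : int, k : word) : int =
  Word.toInt (Word.andb (Word.fromInt x, Word.<< (0w1, k) - 0w1))
```

The same trick does not carry over unchanged to quot and rem, which round
toward zero and therefore need an adjustment for negative dividends.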
30    
31    The compiler compiles to a fixed point, but the changes might be somewhat
32    fragile nevertheless.  Please report anything strange that you might
33    see wrt. div/mod/quot/rem...
34    
35    ----------------------------------------------------------------------
36    Name: Matthias Blume
37    Date: 2002/03/29 17:22:00
38    Tag: blume-20020329-div
39    Description:
40    
41    Fixed my broken div/mod logic.  Unfortunately, this means that the
42    inline code for div/mod now has one more comparison than before.
43    Fast paths (quotient > 0 or remainder = 0) are not affected, though.
44    The problem was with quotient = 0, because that alone does not tell
45    us which way the rounding went.  One then has to look at whether
46    remainder and divisor have the same sign...  :(
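
The quotient = 0 problem can be reproduced at the source level.  The sketch
below is not the code from translate.sml; naiveDiv, floorDiv, and floorMod
are hypothetical names.  It contrasts a rule that trusts q >= 0 with one
that also compares the signs of remainder and divisor.

```sml
(* Sketch only: all three function names are made up. *)

(* Rule that treats q >= 0 as "no adjustment needed".  It goes wrong
 * when the true quotient lies strictly between ~1 and 0. *)
fun naiveDiv (x, y) =
  let val q = Int.quot (x, y)
  in if q >= 0 orelse x = q * y then q else q - 1 end

(* True if r and y are both negative or both non-negative. *)
fun sameSign (r, y) = (r < 0) = (y < 0)

(* Round-to-neginf quotient from the round-to-zero one. *)
fun floorDiv (x, y) =
  let val q = Int.quot (x, y)
      val r = Int.rem (x, y)
  in
    if q > 0 orelse r = 0 then q          (* fast paths *)
    else if sameSign (r, y) then q        (* q = 0, no rounding error *)
    else q - 1                            (* rounded up; step back down *)
  end

(* Matching remainder: takes the sign of the divisor. *)
fun floorMod (x, y) =
  let val q = Int.quot (x, y)
      val r = Int.rem (x, y)
  in
    if q > 0 orelse r = 0 then r
    else if sameSign (r, y) then r
    else r + y
  end
```

For example, naiveDiv (1, ~2) evaluates to 0, whereas 1 div ~2 and
floorDiv (1, ~2) both evaluate to ~1.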
47    
48    Anyway, I replaced the bootfiles with fresh ones...
49    
50    ----------------------------------------------------------------------
51    Name: Matthias Blume
52    Date: 2002/03/29 14:10:00 EST
53    Tag: blume-20020329-inlprims
54    Description:
55    
56    NEW BOOTFILES!!!    Version number bumped to 110.39.3.
57    
58    Primops have changed. This means that the bin/boot-file formats have
59    changed as well.
60    
61    To make sure that there is no confusion, I made a new version.
62    
63    
64    CHANGES:
65    
66    * removed REMT from mltree (remainder should never overflow).
67    
68    * added primops to deal with divisions of all flavors to the frontend
69    
70    * handled these primops all the way through so they map to their respective
71      MLRISC support
72    
73    * used these primops in the implementation of Int, Int32, Word, Word32
74    
75    * removed INLDIV, INLMOD, and INLREM as they are no longer necessary
76    
77    * parameterized INLMIN, INLMAX, and INLABS by a numkind
78    
79    * translate.sml now deals with all flavors of INL{MIN,MAX,ABS}, including
80      floating point
81    
82    * used INL{MIN,MAX,ABS} in the implementation of Int, Int32, Word, Word32,
83      and Real (but Real.abs maps to a separate floating-point-only primop)
84    
85    
86    TODO items:
87    
88    * Hacked Alpha32 instruction selection, disabling the selection of REMx
89      instructions because the machine instruction encoder cannot handle
90      them.  (Hppa, PPC, and Sparc instruction selection did not handle
91      REM in the first place, and REM is supported by the x86 machine coder.)
92    
93    * Handle DIV and MOD with DIV_TO_NEGINF directly in the x86 instruction
94      selection phase.  (The two can be streamlined because the hardware
95      delivers both quotient and remainder at the same time anyway.)
96    
97    * Think about what to do with "valOf(Int32.minInt) div ~1" and friends.
98      (Currently the behavior is inconsistent both across architectures and
99      wrt. the draft Basis spec.)
100    
101    * Word8 should eventually be handled natively, too.
102    
103    * There seems to be one serious bug in mltree-gen.sml.  It appears, though,
104      as if there currently is no execution path that could trigger it in
105      SML/NJ.  (The assumptions underlying functions arith and promotable do not
106      hold for things like multiplication and division.)
107    
108    ----------------------------------------------------------------------
109    Name: Matthias Blume
110    Date: 2002/03/27 16:27:00 EST
111    Tag: blume-20020327-mlrisc-divisions
112    Description:
113    
114    Added support for all four division operations (ML's div, mod, quot,
115    and rem) to MLRISC.  In the course of doing so, I also rationalized
116    the naming (no more annoying switch-around of DIV and QUOT), by
117    parameterizing the operation by div_rounding_mode (which can be either
118    DIV_TO_ZERO or DIV_TO_NEGINF).
119    
120    The generic MLTreeGen functor takes care of compiling all four
121    operations down to only round-to-zero div.
122    
123    Missing pieces:
124    
125      * Doing something smarter than relying on MLTreeGen on architectures
126        like, e.g., the x86 where hardware division delivers both quotient and
127        remainder at the same time.  With this, the implementation of the
128        round-to-neginf operations could be further streamlined.
129    
130      * Remove inlining support for div/mod/rem from the frontend and replace it
131        with primops that get carried through to the backend.  Do this for all
132        int and word types.
133    
134    ----------------------------------------------------------------------
135    Name: Matthias Blume
136    Date: 2002/03/25 17:25:00 EST
137    Tag: blume-20020325-divmod
138    Description:
139    
140    I improved (hopefully without breaking them) the implementation of Int.div,
141    Int.mod, and Int.rem.   For this, the code in translate.sml now takes
142    advantage of the following observations:
143    
144      Let  q = x quot y      r = x rem y
145           d = x div  y      m = x mod y
146    
147    where "quot" is the round-to-zero version of integer division that
148    hardware usually provides.  Then we have:
149    
150         r = x - q * y        where neither the * nor the - will overflow
151         d = if q >= 0 orelse x = q * y then q else q - 1
152                              where neither the * nor the - will overflow
153         m = if q >= 0 orelse r = 0 then r else r + y
154                              where the + will not overflow
155    
156    This results in substantial simplification of the generated code.
157    The following table shows the number of CFG nodes and edges generated
158    for
159            fun f (x, y) = x OPER y
160            (* with OPER \in div, mod, quot, rem *)
161    
162    
163        OPER | nodes(old) | edges(old) | nodes(new) | edges(new)
164        --------------------------------------------------------
165         div |         24 |         39 |         12 |         16
166         mod |         41 |         71 |         12 |         16
167        quot |          8 |         10 |          8 |         10
168         rem |         10 |         14 |          8 |         10
169    
170    
171    ----------------------------------------------------------------------
172    Name: Matthias Blume
173    Date: 2002/03/25 22:06:00 EST
174    Tag: blume-20020325-cprotobug
175    Description:
176    
177    Fixed a bug in cproto (c prototype decoder).
178    
179    ----------------------------------------------------------------------
180    Name: Matthias Blume
181    Date: 2002/03/25 16:00:00 EST
182    Tag: blume-20020325-raw-primops
183    Description:
184    
185    I did some cleanup to Allen's new primop code and
186    replaced yesterday's bootfiles with new ones.
187    (But they are stored in the same place.)
188    
189    ----------------------------------------------------------------------
190    Name: Matthias Blume
191    Date: 2002/03/24 22:40:00 EST
192    Tag: blume-20020324-bootfiles
193    Description:
194    
195    Made the bootfiles that Allen asked for.
196    
197    ----------------------------------------------------------------------
198    Name: Allen Leung
199    Date: 2002/03/23 15:50:00 EST
200    Tag: leunga-20020323-flint-cps-rcc-primops
201    Description:
202    
203      1. Changes to FLINT primops:
204    
205        (* make a call to a C-function;
206         * The primop carries C function prototype information and specifies
207         * which of its (ML-) arguments are floating point. C prototype
208         * information is for use by the backend, ML information is for
209         * use by the CPS converter. *)
210      | RAW_CCALL of { c_proto: CTypes.c_proto,
211                       ml_args: ccall_type list,
212                       ml_res_opt: ccall_type option,
213                       reentrant : bool
214                     } option
215       (* Allocate uninitialized storage on the heap.
216        * The record is meant to hold short-lived C objects, i.e., they
217        * are not ML pointers.  With the tag, the representation is
218        * the same as RECORD with tag tag_raw32 (sz=4), or tag_fblock (sz=8)
219        *)
220      | RAW_RECORD of {tag:bool,sz:int}
221      and ccall_type = CCALL_INT32 | CCALL_REAL64 | CCALL_ML_PTR
222    
223      2.  These CPS primops are now overloaded:
224    
225           rawload of {kind:numkind}
226           rawstore of {kind:numkind}
227    
228          The one argument form is:
229    
230             rawload {kind} address
231    
232          The two argument form is:
233    
234             rawload {kind} [ml object, byte-offset]
235    
236      3. RAW_CCALL/RCC now takes two extra arguments:
237    
238         a. The first is whether the C call is reentrant, i.e., whether
239            ML state should be saved and restored.
240         b. The second is a string specifying the names of the library
241            and the C function.
242    
243         Neither of these is handled in the code generator yet.
244    
245      4. In CProto,
246    
247         An encoding type of "bool" means "ml object" and is mapped into
248         C prototype of PTR.  Note that "bool" is different from "string",
249         even though "string" is also mapped into PTR, because "bool"
250         is assigned a CPS type of BOGt, while "string" is assigned INT32t.
251    
252      5. Pickler/unpickler
253    
254         Changed to handle RAW_RECORD and the newest RAW_CCALL
255    
256      6. MLRiscGen,
257    
258         1. Changed to handle the new rawload/rawstore/rawrecord operators.
259         2. Code for handling C Calls has been moved to a new module CPSCCalls,
260            in the file CodeGen/cpscompile/cps-c-calls.sml
261    
262      7. Added the conditional move operator
263    
264             condmove of branch
265    
266         to cps.  Generation of this is still buggy so it is currently
267         disabled.
268    
269    ----------------------------------------------------------------------
270    Name: Lal George
271    Date: 2002/03/22 14:18:25 EST
272    Tag: george-20020322-cps-branch-prob
273    Description:
274    
275    Implemented the Ball-Larus branch-prediction heuristics, and
276    incorporated graphical viewers for control flow graphs.
277    
278    Ball-Larus Heuristics:
279    ---------------------
280    See the file compiler/CodeGen/cpscompile/cpsBranchProb.sml.
281    
282    By design it uses the Dempster-Shafer theory for combining
283    probabilities.  For example, in the function:
284    
285        fun f(n,acc) = if n = 0 then acc else f(n-1, n*acc)
286    
287    the Ball-Larus heuristics predict that n = 0 is unlikely
288    (OH-heuristic) and that the 'then' branch is unlikely (RH-heuristic).
289    Combining the two using the Dempster-Shafer theory gives the 'then'
290    branch an even lower probability.
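
As a rough illustration of how two such predictions reinforce each other,
here is the usual two-outcome form of Dempster's rule of combination.
Whether cpsBranchProb.sml uses exactly this form is an assumption; the name
combine and the two input probabilities are made up for the example.

```sml
(* Sketch: combine two independent estimates that the same branch is
 * taken.  `combine` is a hypothetical name. *)
fun combine (p1 : real, p2 : real) : real =
  let val taken    = p1 * p2
      val notTaken = (1.0 - p1) * (1.0 - p2)
  in taken / (taken + notTaken) end

(* Invented probabilities for the 'then' branch from two heuristics;
 * the combined value comes out lower than either input. *)
val pThen = combine (0.16, 0.28)
```

An estimate of 0.5 carries no information under this rule:
combine (0.5, p) is p again, up to rounding.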
291    
292    Finally, John Reppy's loop analysis in MLRISC further lowers the
293    probability of the 'then' branch because of the loop in the 'else'
294    branch.
295    
296    
297    Graphical Viewing:
298    ------------------
299    I merely plugged Allen's graphical viewers into the compiler; little
300    additional code was needed. At the top level, saying:
301    
302            Control.MLRISC.getFlag "cfg-graphical-view" := true;
303    
304    will display the graphical view of the control flow graph just before
305    back-patching.  daVinci must be in your path for this to work. If
306    daVinci is not available, then the default viewer can be changed
307    using:
308    
309            Control.MLRISC.getString "viewer"
310    
311    which can be set to "dot" or "vcg" for the corresponding viewers. Of
312    course, these viewers must be in your path.
313    
314    The above will display the compilation unit at the level of clusters,
315    many of which are small and uninteresting. Also setting:
316    
317            Control.MLRISC.getInt "cfg-graphical-view_size"
318    
319    will display only clusters that are larger than the value set.
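
Put together, a top-level session using the three controls named above
might look like the following.  The choice of "dot" and the threshold of 10
are arbitrary example values, and the sketch assumes that getString and
getInt return references in the same way getFlag does.

```sml
(* Example settings only; the values are arbitrary. *)
Control.MLRISC.getFlag "cfg-graphical-view" := true;
Control.MLRISC.getString "viewer" := "dot";
Control.MLRISC.getInt "cfg-graphical-view_size" := 10;
```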
320    
321    
322    ----------------------------------------------------------------------
323    Name: Matthias Blume
324    Date: 2002/03/21 22:20:00 EST
325    Tag: blume-20020321-kmp-bugfix
326    Description:
327    
328    Changed the interface to the KMP routine in PreString and fixed
329    a minor bug in one place where it was used.
330    
331    ----------------------------------------------------------------------
332    Name: Allen Leung
333    Date: 2002/03/21 20:30:00 EST
334    Tag: leunga-20020321-cfg
335    Description:
336    
337      Fixed a potential problem in cfg edge splitting.
338    
339    ----------------------------------------------------------------------
340    Name: Allen Leung
341    Date: 2002/03/21 17:15:00 EST
342    Tag: leunga-20020321-x86-fp-cfg
343    Description:
344    
345      1. Recoded the buggy parts of x86-fp.
346    
347         a. All the block reordering code has been removed.
348            We now depend on the block placement phases to do this work.
349    
350         b. Critical edge splitting code has been simplified and moved into the
351            CFG modules, where it belongs.
352    
353         Both of these were quite buggy and complex.  The code is now much, much
354         simpler.
355    
356      2. X86 backend.
357    
358         a. Added instructions for 64-bit support.  Instruction selection for
359            64-bit has not been committed, however, since that
360            requires changes to MLTREE which haven't been approved by
361            Lal and John.
362    
363         b. Added support for FUCOMI and FUCOMIP when generating code for
364            PentiumPro and above.  We only generate these instructions in
365            the fast-fp mode.
366    
367         c. Added cases for JP and JNP in X86FreqProps.
368    
369      3. CFG
370    
371         CFG now has a bunch of methods for edge splitting and merging.
372    
373      4. Machine description.
374    
375         John's simplification of MLTREE_BASIS.fcond broke a few machine
376         description things:
377    
378         rtl-build.{sig,sml} and hppa.mdl fixed.
379    
380         NOTE: the machine description stuff in the repository is still broken.
381               Again, I can't put my fixes in because that involves
382               changes to MLTREE.
383    
384    ----------------------------------------------------------------------
385    Name: Matthias Blume
386    Date: 2002/03/20 15:55:00 EST
387    Tag: blume-20020320-kmp
388    Description:
389    
390    Implemented Knuth-Morris-Pratt string matching in PreString and used
391    it for String.isSubstring, Substring.isSubstring, and
392    Substring.position.
393    
394    (Might need some stress-testing.  Simple examples worked fine.)
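
For readers unfamiliar with the technique, here is a compact sketch of
Knuth-Morris-Pratt matching.  It is not the PreString code, and its
interface (kmpFind, returning the index of the first match) is invented
for this example.

```sml
(* Sketch of Knuth-Morris-Pratt; kmpFind is a hypothetical name. *)
fun kmpFind (pat : string) (text : string) : int option =
  let
    val m = size pat
    val n = size text
    fun p i = String.sub (pat, i)
    (* fail[i] = length of the longest proper prefix of pat[0..i]
     * that is also a suffix of pat[0..i] *)
    val fail = Array.array (m, 0)
    (* Shrink the matched prefix length k until pat[k] = c or k = 0. *)
    fun fallback (k, c) =
      if k > 0 andalso p k <> c
      then fallback (Array.sub (fail, k - 1), c)
      else k
    (* Fill in the failure table from left to right. *)
    fun build (i, k) =
      if i >= m then ()
      else
        let val k' = fallback (k, p i)
            val k'' = if p k' = p i then k' + 1 else k'
        in Array.update (fail, i, k''); build (i + 1, k'') end
    val () = build (1, 0)
    (* Scan the text once; k is the number of pattern characters
     * currently matched. *)
    fun search (i, k) =
      if k = m then SOME (i - m)
      else if i >= n then NONE
      else
        let val c = String.sub (text, i)
            val k' = fallback (k, c)
        in search (i + 1, if p k' = c then k' + 1 else k') end
  in
    search (0, 0)
  end
```

The text index never moves backwards, so the scan takes time linear in the
combined length of pattern and text.  A substring test along the lines of
String.isSubstring is then just isSome (kmpFind pat text).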
395    
396    ----------------------------------------------------------------------
397    Name: Matthias Blume
398    Date: 2002/03/19 16:37:00 EST
399    Tag: blume-20020319-witnesses
400    Description:
401    
402    Added a structure C.W and functions convert/Ptr.convert to ml-nlffi-lib.
403    
404    This implements a generic mechanism for changing constness qualifiers
405    anywhere within big C types without resorting to outright "casts".
406    (So far, functions such as C.rw/C.ro or C.Ptr.rw/C.Ptr.ro only let you
407    modify the constness at the outermost level.)
408    The implementation of "convert" is based on the idea of "witness"
409    values -- values that are not used by the operation but whose types
410    "testify" to their applicability.  On the implementation side, "convert"
411    is simply a projection (returning its second curried argument).  With
412    cross-module inlining, it should not result in any machine code being
413    generated.
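
The witness idea can be shown in miniature.  The structure below is a toy
and not the ml-nlffi-lib interface: MiniPtr and all of its members are
invented, and it only handles constness at a single level.  What it shares
with the description above is that the witness value is ignored at run
time and "convert" merely returns its second argument.

```sml
(* Toy model of witness-based conversion; every name is hypothetical. *)
structure MiniPtr :> sig
  type ro                       (* phantom tag: read-only  *)
  type rw                       (* phantom tag: read-write *)
  type 'c ptr                   (* pointer tagged with constness 'c *)
  type ('a, 'b) witness         (* evidence that 'a ptr may become 'b ptr *)
  val same    : ('a, 'a) witness
  val freeze  : (rw, ro) witness
  val alloc   : int -> rw ptr
  val get     : 'c ptr -> int
  val set     : rw ptr * int -> unit
  val convert : ('a, 'b) witness -> 'a ptr -> 'b ptr
end = struct
  datatype ro = RO
  datatype rw = RW
  type 'c ptr = int ref         (* the tag has no run-time representation *)
  type ('a, 'b) witness = unit  (* only the type of a witness matters *)
  val same = ()
  val freeze = ()
  fun alloc n = ref n
  fun get p = !p
  fun set (p, n) = p := n
  fun convert _ p = p           (* a projection onto the second argument *)
end

val p = MiniPtr.alloc 42
val q = MiniPtr.convert MiniPtr.freeze p   (* q : MiniPtr.ro MiniPtr.ptr *)
```

Because no witness of type (ro, rw) witness is exported, a read-only
pointer cannot be turned back into a writable one, and MiniPtr.set (q, 0)
is rejected by the type checker.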
414    
415    ----------------------------------------------------------------------
416    Name: Matthias Blume
417  Date: 2002/03/15 16:40:00 EST
418  Tag: blume-20020315-basis
419  Description:
