Annotation of /sml/trunk/HISTORY
Revision 633
1 : | dbm | 570 | This is the HISTORY file for the Yale SML/NJ CVS repository. |
2 : | |||
3 : | An entry should be made for _every_ commit to the repository. | ||
4 : | The entries in this file will be used when creating the README | ||
5 : | for new versions, so keep that in mind when writing the | ||
6 : | description. | ||
7 : | |||
8 : | The form of an entry should be: | ||
9 : | |||
10 : | Name: | ||
11 : | Date: | ||
12 : | Tag: <post-commit CVS tag> | ||
13 : | Description: | ||
14 : | leunga | 591 | ---------------------------------------------------------------------- |
15 : | dbm | 633 | Name: Dave MacQueen |
16 : | Date: 2000/04/30 12:40PM EDT | ||
17 : | Tag: dbm-20000430-bug_fixes | ||
18 : | Description: | ||
19 : | |||
20 : | 1. Fix for bug 1498 | ||
21 : | smlnj/src/system/Basis/Implementation/Unsafe/object.sig | ||
22 : | smlnj/src/system/Basis/Implementation/Unsafe/object.sml | ||
23 : | added toRealArray function | ||
24 : | smlnj/src/compiler/MiscUtil/print/ppobj.sml | ||
25 : | added check for tag Obj.RealArray to array printing case in ppObj | ||
26 : | |||
27 : | 2. Fix for bug 1510 | ||
28 : | smlnj/src/compiler/Semant/types/typesutil.sml | ||
29 : | fixed definition of dummyargs (used by equalTycon) so that | ||
30 : | dummy args are distinct types | ||
31 : | |||
32 : | ---------------------------------------------------------------------- | ||
33 : | blume | 629 | Name: Matthias Blume |
34 : | blume | 632 | Date: 2000/04/30 01:00:00 JST |
35 : | Tag: blume-20000430-versions | ||
36 : | Description: | ||
37 : | |||
38 : | 1. CM version numbering added. This is an implementation of Lal's | ||
39 : | proposal for adding version numbers and version checking to .cm | ||
40 : | files. Lal said that his proposal was just that -- a proposal. | ||
41 : | For the time being I went ahead and implemented it so that people | ||
42 : | can comment on it. Everything is completely backward-compatible | ||
43 : | (except for the stable library format, i.e., new bootfiles!). | ||
44 : | |||
45 : | As usual, see the CM manual for details. | ||
46 : | |||
47 : | 2. An alternative syntax for anchored paths has been implemented. | ||
48 : | Dave has recently voiced the same concerns that I had when I did | ||
49 : | this, so there should be some support. My take is that eventually | ||
50 : | I will let support for the current syntax (where anchors are | ||
51 : | "implicit") fade out in favor of the new, explicit syntax. | ||
52 : | In order to be backward-compatible, both old and new syntax are | ||
53 : | currently supported. | ||
54 : | |||
55 : | Again, see the CM manual for details. | ||
56 : | |||
57 : | 3. Parallel make is trying to be slightly smarter: When the master | ||
58 : | process finds a "bottleneck", i.e., when there is only one | ||
59 : | compilation unit that can be compiled and everybody else is | ||
60 : | waiting on it, then it will simply compile it directly instead | ||
61 : | of clumsily telling one of the slaves to do it. | ||
62 : | |||
63 : | 4. Support for "unsharing" added. This is necessary in order to be | ||
64 : | able to have two different versions of the same library running | ||
65 : | at the same time (e.g., for trying out a new MLRISC while still | ||
66 : | having the old MLRISC linked into the current compiler, etc.) | ||
67 : | See the CM manual. | ||
68 : | |||
69 : | 5. Simple "makedepend" functionality added for generating Makefile | ||
70 : | dependency information. (This is rather crude at the moment. | ||
71 : | Expect some changes here in the future.) | ||
72 : | |||
73 : | 6. ".fun" added as a recognized suffix for ML files. Also documented | ||
74 : | explicitly in the manual that the fallback behavior (unknown suffix | ||
75 : | -> ML file) is not an official feature! | ||
76 : | |||
77 : | 7. Small changes to the pickler for stable libraries. | ||
78 : | |||
79 : | 8. Several internal changes to CM (for cleanup/improvement). | ||
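The bottleneck rule from item 3 above can be illustrated with a minimal sketch. This is not CM's implementation: the function name, the argument layout, and the "one unit per idle slave" fallback are assumptions made only for illustration.

```python
def assign(ready_units, busy_slaves, idle_slaves):
    """Return a list of (worker, unit) pairs for the next scheduling step.

    Hypothetical model: `ready_units` are compilation units whose
    dependencies are already built; `busy_slaves` and `idle_slaves`
    are lists of slave names.
    """
    # Bottleneck: exactly one unit can be compiled and no slave is busy,
    # so everybody else is waiting on that unit. The master compiles it
    # directly instead of shipping it to a slave.
    if len(ready_units) == 1 and not busy_slaves:
        return [("master", ready_units[0])]
    # Otherwise hand ready units to idle slaves, one unit per slave.
    return list(zip(idle_slaves, ready_units))
```

With a single ready unit and all slaves idle, the sketch returns the master as the worker; with two ready units it falls back to ordinary distribution.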
80 : | |||
81 : | |||
82 : | !!!! NEW BINFILES !!!! | ||
83 : | |||
84 : | ---------------------------------------------------------------------- | ||
85 : | Name: Matthias Blume | ||
86 : | blume | 631 | Date: 2000/04/28 17:30:00 JST |
87 : | Tag: blume-20000428-pathconfig | ||
88 : | Description: | ||
89 : | |||
90 : | 1. I changed config/install.sh to remove duplicate entries from the | ||
91 : | lib/pathconfig file at the end. Moreover, the final version of | ||
92 : | lib/pathconfig is sorted alphabetically. The same (sorting) is done | ||
93 : | in src/system/installml. | ||
94 : | |||
95 : | 2. The config/install.sh script now consistently uses relative | ||
96 : | pathnames in lib/pathconfig whenever the anchor is in the lib | ||
97 : | directory. (So far this was true for the libraries that come | ||
98 : | pre-compiled and bundled as part of the bootfiles but not for | ||
99 : | libraries that are compiled by the script itself.) | ||
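The cleanup described in item 1 above (drop duplicates, then sort) can be sketched as follows. The real work is done by the config/install.sh shell script; this Python version only illustrates the idea and assumes that each pathconfig entry is one line of text and that "duplicate" means an identical line. The entry contents in the usage example are made up.

```python
def normalize_pathconfig(lines):
    """Remove duplicate entries and return them sorted alphabetically.

    Assumption: an entry is a whole line; blank lines are ignored.
    """
    entries = {line.strip() for line in lines if line.strip()}
    return sorted(entries)


if __name__ == "__main__":
    sample = ["b.cm lib-b\n", "a.cm lib-a\n", "b.cm lib-b\n"]
    for entry in normalize_pathconfig(sample):
        print(entry)
```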
100 : | |||
101 : | ---------------------------------------------------------------------- | ||
102 : | Name: Matthias Blume | ||
103 : | blume | 629 | Date: 2000/04/26 13:10:00 JST |
104 : | Tag: blume-20000426-fun_suffix | ||
105 : | Description: | ||
106 : | |||
107 : | Added ".fun" as a recognized file name suffix (for ML code). | ||
108 : | |||
109 : | ---------------------------------------------------------------------- | ||
110 : | leunga | 628 | Name: Allen Leung |
111 : | Date: 2000/04/25 17:00:00 EST | ||
112 : | Tag: leunga-20000425-alpha-ra | ||
113 : | Description: | ||
114 : | |||
115 : | 1. Alpha | ||
116 : | |||
117 : | PSEUDOARITH was missing in AlphaRewrite. This causes an endless loop | ||
118 : | in C--. | ||
119 : | |||
120 : | 2. RA | ||
121 : | |||
122 : | Added a flag "ra-dump-size" to print out the size of the flowgraph | ||
123 : | and the interference graph. | ||
124 : | |||
125 : | ---------------------------------------------------------------------- | ||
126 : | dbm | 627 | Name: Dave MacQueen |
127 : | Date: 2000/04/25/ | ||
128 : | Tag: dbm-20000425-mlyacc_doc_examples | ||
129 : | Description: | ||
130 : | Updated mlyacc.tex sections 5 and 7 for SML '97 and CM. | ||
131 : | Updated all three examples in src/ml-yacc/examples to run | ||
132 : | under 110.* using CM.make. | ||
133 : | |||
134 : | ---------------------------------------------------------------------- | ||
135 : | leunga | 624 | Name: Allen Leung |
136 : | Date: 2000/04/20 23:04:00 EST | ||
137 : | Tag: leunga-20000420-ssa-c---stuff | ||
138 : | Description: | ||
139 : | |||
140 : | This update synchronizes my repository with Yale's. Most of these | ||
141 : | changes, however, do not affect SML/NJ at all (the RA is an exception). | ||
142 : | |||
143 : | 1. Register Allocator | ||
144 : | |||
145 : | a. An improvement in the interference graph construction: | ||
146 : | Given a copy | ||
147 : | |||
148 : | s <- t | ||
149 : | |||
150 : | leunga | 628 | no interference edge between s and t is added for this definition of s. |
151 : | leunga | 624 | |
152 : | b. I've added two new spill heuristic modules that Fermin and I developed | ||
153 : | (in the new library RA.cm). These are unused in SML/NJ but may be | ||
154 : | useful for others (Moby?) | ||
155 : | |||
156 : | 2. X86 | ||
157 : | |||
158 : | a. Various fixes in the backend provided by Fermin [C--] and Lal. | ||
159 : | |||
160 : | 3. Alpha | ||
161 : | |||
162 : | a. Added the BSR instruction and code generation that goes with it [C--] | ||
163 : | b. Other fixes too numerous to recount provided by Fermin [C--] | ||
164 : | |||
165 : | 4. Regmaps | ||
166 : | |||
167 : | a. The regmaps are not initialized with the identity physical bindings | ||
168 : | at creation time. This is unneeded. | ||
169 : | |||
170 : | 5. MLRISC Optimizations | ||
171 : | |||
172 : | a. The DJ-Graph module can now compute the iterated dominance frontiers | ||
173 : | intersected with liveness incrementally in linear time! Woohoo! | ||
174 : | This is now used in my new SSA construction algorithm. | ||
175 : | |||
176 : | b. The branch reorganization module is now smarter about linear chains of | ||
177 : | basic blocks. | ||
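The copy rule from item 1a above can be illustrated with a small sketch of interference-graph construction over a single basic block. This is not the MLRISC code: the instruction encoding as (dst, srcs, is_copy) tuples and the function name are assumptions chosen for illustration.

```python
def build_interference(instructions, live_out):
    """Build interference edges by scanning one basic block backwards.

    instructions: list of (dst, srcs, is_copy) tuples in program order
                  (hypothetical encoding).
    live_out:     names live at the end of the block.
    Returns a set of frozenset({a, b}) edges.
    """
    live = set(live_out)
    edges = set()
    for dst, srcs, is_copy in reversed(instructions):
        for v in live:
            if v == dst:
                continue
            # For a copy s <- t, do not add the edge (s, t)
            # for this definition of s.
            if is_copy and v in srcs:
                continue
            edges.add(frozenset((dst, v)))
        # Standard backward liveness update: kill dst, then add uses.
        live.discard(dst)
        live.update(srcs)
    return edges
```

In a block such as `t <- ...; s <- t; u <- s + t`, the name t is still live after the copy, so a naive construction would add the edge (s, t); with the copy rule no such edge is added, which leaves s and t free to be coalesced.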
178 : | |||
179 : | |||
180 : | ---------------------------------------------------------------------- | ||
181 : | blume | 615 | Name: Matthias Blume |
182 : | Date: 2000/04/12 13:52:00 JST | ||
183 : | Tag: blume_main_v110p27_1 | ||
184 : | Description: | ||
185 : | |||
186 : | Changed install.sh script to handle archive files without version number | ||
187 : | and to use "boot.<arch>-<os>" instead of "sml.boot.<arch>-<os>" for the | ||
188 : | name of the boot file archive. | ||
189 : | |||
190 : | ---------------------------------------------------------------------- | ||
191 : | dbm | 607 | Name: Dave MacQueen |
192 : | Date: 2000/04/09 14:00 EDT | ||
193 : | Tag: dbm-20000410-Version_110_27 | ||
194 : | Description: | ||
195 : | |||
196 : | 1. Updated src/compiler/TopLevel/main/version.sml to version 110.27 | ||
197 : | |||
198 : | 2. Updated src/config/version to 110.27 | ||
199 : | |||
200 : | 3. New boot files! | ||
201 : | |||
202 : | ---------------------------------------------------------------------- | ||
203 : | leunga | 606 | Name: Allen Leung |
204 : | Date: 2000/04/09 19:09:00 EST | ||
205 : | Tag: leunga-20000409-misc | ||
206 : | Description: | ||
207 : | |||
208 : | 1. Yet another fix for x86 assembly for idivl, imull, mull and friends. | ||
209 : | |||
210 : | 2. Miscellaneous improvements to MLRISC (unused in sml/nj) | ||
211 : | |||
212 : | ---------------------------------------------------------------------- | ||
213 : | monnier | 605 | Name: Stefan |
214 : | Date: 2000/04/07 10:00:00 EDT | ||
215 : | Tag: monnier-20000406-branch-handling | ||
216 : | Description: | ||
217 : | |||
218 : | Improved handling of branches (mostly those generated from | ||
219 : | polymorphic equality), removed switchoff and changed the | ||
220 : | default optimization settings (more cpsopt and less flintopt). | ||
221 : | |||
222 : | ---------------------------------------------------------------------- | ||
223 : | leunga | 601 | Name: Allen Leung |
224 : | leunga | 602 | Date: 2000/04/06 01:30:00 EST |
225 : | Tag: leunga-20000406-peephole-x86-SSA-2 | ||
226 : | Description: | ||
227 : | |||
228 : | Forgot a few files. | ||
229 : | |||
230 : | ---------------------------------------------------------------------- | ||
231 : | Name: Allen Leung | ||
232 : | leunga | 601 | Date: 2000/04/06 00:36:00 EST |
233 : | Tag: leunga-20000406-peephole-x86-SSA | ||
234 : | Description: | ||
235 : | |||
236 : | 1. New Peephole code | ||
237 : | |||
238 : | 2. Minor improvement to X86 instruction selection | ||
239 : | |||
240 : | 3. Various fixes to SSA and machine description -> code translator | ||
241 : | |||
242 : | ---------------------------------------------------------------------- | ||
243 : | blume | 595 | Name: Matthias Blume |
244 : | Date: 2000/04/05 12:30:00 JST | ||
245 : | Tag: blume_main_v110p26p2_3 | ||
246 : | Description: | ||
247 : | |||
248 : | This update just merges three minor cosmetic updates to CM's sources | ||
249 : | to get ready for the 110.27 code freeze on Friday. No functionality | ||
250 : | has changed. | ||
251 : | |||
252 : | ---------------------------------------------------------------------- | ||
253 : | leunga | 593 | Name: Allen Leung |
254 : | leunga | 594 | Date: 2000/04/04 19:39:00 EST |
255 : | Tag: leunga-20000404-x86-asm | ||
256 : | Description: | ||
257 : | |||
258 : | 1. Fixed a problem in X86 assembly. | ||
259 : | |||
260 : | Things like | ||
261 : | |||
262 : | jmp %eax | ||
263 : | jmp (%eax) | ||
264 : | |||
265 : | should be output as | ||
266 : | |||
267 : | jmp *%eax | ||
268 : | jmp *(%eax) | ||
269 : | |||
270 : | 2. Assembly output | ||
271 : | |||
272 : | Added a new flag | ||
273 : | |||
274 : | "asm-indent-copies" (defaults to false) | ||
275 : | |||
276 : | When this flag is on, parallel copies will be indented an extra level. | ||
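The fix in item 1 above follows the AT&T assembler convention that indirect jump targets are prefixed with `*`. A minimal sketch of that formatting rule is shown below; the function name and the string-based operand test are assumptions for illustration and do not reflect how the MLRISC assembly emitter is actually structured.

```python
def format_jmp(operand):
    """Format a jmp in AT&T syntax.

    Assumption: a register operand starts with '%' and a memory operand
    contains '('; anything else is treated as a direct label.
    """
    indirect = operand.startswith("%") or "(" in operand
    return f"jmp *{operand}" if indirect else f"jmp {operand}"


if __name__ == "__main__":
    for op in ("%eax", "(%eax)", "L1"):
        print(format_jmp(op))
```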
277 : | |||
278 : | ---------------------------------------------------------------------- | ||
279 : | Name: Allen Leung | ||
280 : | leunga | 593 | Date: 2000/04/04 03:18:00 EST |
281 : | Tag: leunga-20000404-C--Moby | ||
282 : | Description: | ||
283 : | |||
284 : | All of these fixes are related to C--, Moby, and my own optimization | ||
285 : | stuff; so they shouldn't affect SML/NJ. | ||
286 : | |||
287 : | 1. X86 | ||
288 : | |||
289 : | Various fixes related to floating point, and extensions. | ||
290 : | |||
291 : | 2. Alpha | ||
292 : | |||
293 : | Some extra patterns related to loads with signed/zero extension | ||
294 : | provided by Fermin. | ||
295 : | |||
296 : | 3. Assembly | ||
297 : | |||
298 : | When generating assembly, resolve the value of client defined constants, | ||
299 : | instead of generating symbolic values. This is controlled by the | ||
300 : | new flag "asm-resolve-constants", which defaults to true. | ||
301 : | |||
302 : | 4. Machine Descriptions | ||
303 : | |||
304 : | a. The precedence parser was slightly broken when parsing infixr symbols. | ||
305 : | b. The type generalizing code had the bound variables reversed, resulting | ||
306 : | in a problem during arity raising. | ||
307 : | c. Various fixes in machine descriptions. | ||
308 : | |||
309 : | ---------------------------------------------------------------------- | ||
310 : | blume | 592 | Name: Matthias Blume |
311 : | Date: 2000/04/03 16:05:00 JST | ||
312 : | Tag: blume_main_v110p26p2_2 | ||
313 : | Description: | ||
314 : | |||
315 : | I eliminated coreEnv from compInfo. Access to the "Core" structure is | ||
316 : | now done via the ordinary static environment that is context to each | ||
317 : | compilation unit. | ||
318 : | |||
319 : | To this end, I arranged that "structure _Core" is bound in the | ||
320 : | pervasive environment instead of "structure Core". Core access is done via | ||
321 : | _Core (which can never be accidentally rebound because _Core is not a | ||
322 : | legal surface-syntax symbol). | ||
323 : | |||
324 : | The current solution is much cleaner because the core environment is | ||
325 : | now simply part of the pervasive environment which is part of every | ||
326 : | compilation unit's context anyway. In particular, this eliminates all | ||
327 : | special-case handling that was necessary until now in order to deal | ||
328 : | with dynamic and symbolic parts of the core environment. | ||
329 : | |||
330 : | Remaining hackery (to bind the "magic" symbol _Core) is localized in the | ||
331 : | compilation manager's bootstrap compiler (actually: in the "init group" | ||
332 : | handling). See the comments in src/system/smlnj/init/init.cmi for | ||
333 : | more details. | ||
334 : | |||
335 : | I also tried to track down all mentions of "Core" (as string argument | ||
336 : | to Symbol.strSymbol) in the compiler and replaced them with a | ||
337 : | reference to the new CoreSym.coreSym. Seems cleaner since the actual | ||
338 : | name appears in one place only. | ||
339 : | |||
340 : | Binfile and bootfile format have not changed, but the switchover from | ||
341 : | the old "init.cmi" to the new one is a bit tricky, so I supplied new | ||
342 : | bootfiles anyway. | ||
343 : | |||
344 : | ---------------------------------------------------------------------- | ||
345 : | leunga | 591 | Name: Allen Leung |
346 : | Date: 2000/04/02 21:17:00 EST | ||
347 : | Tag: leunga-20000402-mltree | ||
348 : | Description: | ||
349 : | leunga | 585 | |
350 : | leunga | 591 | 1. Renamed the constructor CALL in MLTREE by popular demand. |
351 : | 2. Added a bunch of files from my repository. These are currently | ||
352 : | used by other non-SMLNJ backends. | ||
353 : | |||
354 : | leunga | 576 | ---------------------------------------------------------------------- |
355 : | leunga | 590 | Name: Allen Leung |
356 : | Date: 2000/03/31 21:15:00 EST | ||
357 : | Tag: leunga-20000331-aliasing | ||
358 : | Description: | ||
359 : | |||
360 : | This update contains a rewritten (and hopefully more correct) module | ||
361 : | for extracting aliasing information from CPS. | ||
362 : | |||
363 : | To turn on this feature: | ||
364 : | |||
365 : | Compiler.Control.CG.memDisambiguate := true | ||
366 : | |||
367 : | To pretty print the region information with assembly | ||
368 : | |||
369 : | Compiler.Control.MLRISC.getFlag "asm-show-region" := true; | ||
370 : | |||
371 : | To control how many levels of aliasing information are printed, use: | ||
372 : | |||
373 : | Compiler.Control.MLRISC.getInt "points-to-show-level" := n | ||
374 : | |||
375 : | The default of n is 3. | ||
376 : | |||
377 : | ---------------------------------------------------------------------- | ||
378 : | dbm | 589 | Name: David MacQueen |
379 : | Date: 2000/03/31 11:15:00 EST | ||
380 : | Tag: dbm-20000331-runtime_fix | ||
381 : | Description: | ||
382 : | |||
383 : | This update contains: | ||
384 : | |||
385 : | 1. runtime/c-lib/c-libraries.c | ||
386 : | includes added in revision 1.2 caused compilation errors on hppa-hpux | ||
387 : | |||
388 : | 2. fix for bug 1556 | ||
389 : | system/Basis/Implementation/NJ/internal-signals.sml | ||
390 : | |||
391 : | ---------------------------------------------------------------------- | ||
392 : | blume | 587 | Name: Matthias Blume |
393 : | blume | 588 | Date: 2000/03/31 18:00:00 JST |
394 : | Tag: blume_main_v110p26p2_1 | ||
395 : | Description: | ||
396 : | |||
397 : | This update contains: | ||
398 : | |||
399 : | 1. A small change to CM's handling of stable libraries: | ||
400 : | CM now maintains one "global" modmap that is used for all stable | ||
401 : | libraries. The use of such a global modmap maximizes sharing and | ||
402 : | minimizes the need for re-traversing parts of environments during | ||
403 : | modmap construction. (However, this has minor impact since modmap | ||
404 : | construction seems to account for just one percent or less of total | ||
405 : | compile time.) | ||
406 : | |||
407 : | 2. I added a "genmap" phase to the statistics. This is where I got the | ||
408 : | "one percent" number (see above). | ||
409 : | |||
410 : | 3. CM's new tool parameter mechanism just became _even_ better. :) | ||
411 : | - The parser understands named parameters and recursive options. | ||
412 : | - The "make" and "shell" tools use these new features. | ||
413 : | (This makes it a lot easier to cascade these tools.) | ||
414 : | - There is a small syntax change: named parameters use a | ||
415 : | |||
416 : | <name> : ( <option> ... ) or | ||
417 : | <name> : <string> | ||
418 : | |||
419 : | syntax. Previously, named parameters were implemented in an | ||
420 : | ad-hoc fashion by each tool individually (by parsing strings) | ||
421 : | and had the form | ||
422 : | |||
423 : | <name>=<string> | ||
424 : | |||
425 : | See the CM manual for a full description of these issues. | ||
426 : | |||
427 : | ---------------------------------------------------------------------- | ||
428 : | Name: Matthias Blume | ||
429 : | blume | 587 | Date: 2000/03/30 18:00:00 JST |
430 : | Tag: blume_main_v110p26p2_0 | ||
431 : | Description: | ||
432 : | |||
433 : | !!!!! WARNING !!!!!! | ||
434 : | !! New binfiles !! | ||
435 : | !!!!!!!!!!!!!!!!!!!! | ||
436 : | |||
437 : | This update contains: | ||
438 : | |||
439 : | 1. Moderate changes to CM: | ||
440 : | |||
441 : | - Changes to CM's tools mechanism. In particular, it is now possible | ||
442 : | to have tools that accept additional "command line" parameters | ||
443 : | (specified in the .cm file at each instance where the tool's class is | ||
444 : | used). | ||
445 : | |||
446 : | This was done to accommodate the new "make" and "shell" tools which | ||
447 : | facilitate fairly seamless hookup to portions of code managed using | ||
448 : | Makefiles or Shell scripts. | ||
449 : | |||
450 : | There are no classes "shared" or "private" anymore. Instead, the | ||
451 : | sharing annotation is now a parameter to the "sml" class. | ||
452 : | |||
453 : | There is a bit of generic machinery for implementing one's own | ||
454 : | tools that accept command-line parameters. However, I am not yet fully | ||
455 : | satisfied with that part, so expect changes here in the future. | ||
456 : | |||
457 : | All existing tools are described in the CM manual. | ||
458 : | |||
459 : | - Slightly better error handling. (CM now suppresses many followup | ||
460 : | error messages that tended to be more annoying than helpful.) | ||
461 : | |||
462 : | 2. Major changes to the compiler's static environment data structures. | ||
463 : | |||
464 : | - no CMStaticEnv anymore. | ||
465 : | - no CMEnv, no "BareEnvironment" (actually, _only_ BareEnvironment, | ||
466 : | but it is called Environment), no conversions between different | ||
467 : | kinds of static environments | ||
468 : | |||
469 : | - There is still a notion of a "modmap", but such modmaps are generated | ||
470 : | on demand at the time when they are needed. This sounds slow, but I | ||
471 : | sped up the code that generates modmaps enough for this not to lead to | ||
472 : | a slowdown of the compiler (at least I didn't detect any). | ||
473 : | |||
474 : | - To facilitate rapid modmap generation, static environments now | ||
475 : | contain an (optional) "modtree" structure. Modtree annotations are | ||
476 : | constructed by the unpickler during unpickling. (This means that | ||
477 : | the elaborator does not have to worry about modtrees at all.) | ||
478 : | Modtrees have the advantage that they are compositional in the same | ||
479 : | way as the environment data structure itself is compositional. | ||
480 : | As a result, modtrees never hang on to parts of an environment that | ||
481 : | has already been rendered "stale" by filtering or rebinding. | ||
482 : | |||
483 : | - I went through many, many trials and errors before arriving at the | ||
484 : | current solution. (The initial idea of "linkpaths" did not work.) | ||
485 : | But the result of all this is that I have touched a lot of files that | ||
486 : | depend on the "modules" and "types" data structures (most of the | ||
487 : | elaborator). There were a lot of changes during my "linkpath" trials | ||
488 : | that could have been reverted to their original state but weren't. | ||
489 : | Please, don't be too harsh on me for messing with this code a bit more | ||
490 : | than what was strictly necessary... (I _did_ resist the temptation | ||
491 : | of doing any "global reformatting" to avoid an untimely death at | ||
492 : | Dave's hands. :) | ||
493 : | |||
494 : | - One positive aspect of the previous point: At least I made sure that | ||
495 : | all files that I touched now compile without warnings (other than | ||
496 : | "polyEqual"). | ||
497 : | |||
498 : | - compiler now tends to run "leaner" (i.e., ties up less memory in | ||
499 : | redundant modmaps) | ||
500 : | |||
501 : | ---------------------------------------------------------------------- | ||
502 : | leunga | 580 | Name: Allen Leung |
503 : | leunga | 585 | Date: 2000/03/29 18:00:00 |
504 : | Tag: leunga-20000327-mlriscGen_hppa_alpha_x86 | ||
505 : | leunga | 586 | Boot files (optional): ftp://react-ilp.cs.nyu.edu/leunga/110.26.1-sml.boot.x86-unix-20000330.tar.gz |
506 : | leunga | 585 | Description: |
507 : | |||
508 : | This update contains *MAJOR* changes to the way code is generated from CPS | ||
509 : | in the module mlriscGen, and in various backend modules. | ||
510 : | |||
511 : | CHANGES | ||
512 : | ======= | ||
513 : | |||
514 : | 1. MLRiscGen: forward propagation fix. | ||
515 : | |||
516 : | There was a bug in forward propagation introduced at about the same time | ||
517 : | as the MLRISC x86 backend, which prevented coalescing from being | ||
518 : | performed effectively in loops. | ||
519 : | |||
520 : | Effect: speed up of loops in RISC architectures. | ||
521 : | By itself, this actually slowed down certain benchmarks on the x86. | ||
522 : | |||
523 : | 2. MLRiscGen: forward propagating addresses from consing. | ||
524 : | |||
525 : | I've changed the way consing code is generated. Basically I separated | ||
526 : | out the initialization part: | ||
527 : | |||
528 : | store tag, offset(allocptr) | ||
529 : | store elem1, offset+4(allocptr) | ||
530 : | store elem2, offset+8(allocptr) | ||
531 : | ... | ||
532 : | store elemn, offset+4n(allocptr) | ||
533 : | |||
534 : | and the address computation part: | ||
535 : | |||
536 : | celladdr <- offset+4+allocptr | ||
537 : | |||
538 : | and move the address computation part | ||
539 : | |||
540 : | Effect: register pressure is generally lower as a result. This | ||
541 : | makes compilation of certain expressions much faster, such as | ||
542 : | long lists with non-trivial elements. | ||
543 : | |||
544 : | [(0,0), (0,0), .... (0,0)] | ||
545 : | |||
546 : | 3. MLRiscGen: base pointer elimination. | ||
547 : | |||
548 : | As part of the linkage mechanism, we generate the sequence: | ||
549 : | |||
550 : | L: ... <- start of the code fragment | ||
551 : | |||
552 : | L1: | ||
553 : | base pointer <- linkreg - L1 + L | ||
554 : | |||
555 : | The base pointer was then used for computing relocatable addresses | ||
556 : | in the code fragment. Frequently (such as in lots of continuations) | ||
557 : | this is not needed. We now eliminate this sequence whenever possible. | ||
558 : | |||
559 : | For compile time efficiency, I'm using a very stupid local heuristic. | ||
560 : | But in general, this should be done as a control flow analysis. | ||
561 : | |||
562 : | Effect: Smaller code size. Speed up of most programs. | ||
563 : | |||
564 : | 4. Hppa back end | ||
565 : | |||
566 : | Long jumps in span dependence resolution used to depend on the existence | ||
567 : | of the base pointer. | ||
568 : | |||
569 : | A jump to a long label L was expanded into the following sequence: | ||
570 : | |||
571 : | LDIL %hi(L-8192), %r29 | ||
572 : | LDO %lo(L-8192)(%r29), %r29 | ||
573 : | ADD %r29, baseptr, %r29 | ||
574 : | BV,n %r0(%r29) | ||
575 : | |||
576 : | In the presence of change (3) above, this will not work. I've changed | ||
577 : | it so that the following sequence of instructions is generated, which | ||
578 : | doesn't mention the base pointer at all: | ||
579 : | |||
580 : | BL,n L', %r29 /* branch and link, L' + 4 -> %r29 */ | ||
581 : | L': ADDIL L-(L'+4), %r29 /* Compute address of L */ | ||
582 : | BV,n %r0(%r29) /* Jump */ | ||
583 : | |||
584 : | 5. Alpha back end | ||
585 : | |||
586 : | New alpha instructions LDB/LDW have been added, as per Fermin's | ||
587 : | suggestions. This is unrelated to all other changes. | ||
588 : | |||
589 : | 6. X86 back end | ||
590 : | |||
591 : | I've changed andl to testl in the floating point test sequence | ||
592 : | whenever appropriate. The Intel optimization guide states that | ||
593 : | testl is preferable to andl. | ||
594 : | |||
595 : | 7. RA (x86 only) | ||
596 : | |||
597 : | I've improved the spill propagation algorithm, using an approximation | ||
598 : | of maximal weighted independent sets. This seems to be necessary to | ||
599 : | alleviate the negative effect in light of the slow down in (1). | ||
600 : | |||
601 : | I'll write down the algorithm one of these days. | ||
602 : | |||
603 : | 8. MLRiscGen: frequencies | ||
604 : | |||
605 : | I've added an annotation that states that all call gc blocks have zero | ||
606 : | execution frequencies. This improves register allocation on the x86. | ||
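The split described in item 2 above (initialization stores versus address computation) can be sketched as a tiny generator of the pseudo-instructions shown there. It assumes 4-byte words, as in the listing, and uses hypothetical names; it only produces the two parts separately and makes no claim about where MLRiscGen places the address computation.

```python
def cons_record(tag, elems, offset):
    """Return (init_stores, address_computation) for one allocated record.

    Assumptions: 4-byte words; `offset` is the byte offset of the tag
    word from allocptr; operands are plain strings.
    """
    # Initialization part: the tag word, then one store per element.
    init = [f"store {tag}, {offset}(allocptr)"]
    for i, elem in enumerate(elems, start=1):
        init.append(f"store {elem}, {offset + 4 * i}(allocptr)")
    # Address computation part: the record address is the word after the tag.
    addr = f"celladdr <- {offset + 4}+allocptr"
    return init, addr
```

Because the address computation is a separate value here, a caller is free to emit it later than the stores, which is the property the item credits for lower register pressure.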
607 : | |||
608 : | BENCHMARKS | ||
609 : | ========== | ||
610 : | |||
611 : | I've only performed the comparison against 110.25. | ||
612 : | |||
613 : | The platforms are: | ||
614 : | |||
615 : | HPPA A four processor HP machine (E9000) with 5G of memory. | ||
616 : | X86 A 300MHz Pentium II with 128M of memory, and | ||
617 : | SPARC An Ultra sparc 2 with 512M of memory. | ||
618 : | |||
619 : | I used the following parameters for the SML benchmarks: | ||
620 : | |||
621 : | @SMLalloc | ||
622 : | HPPA 256k | ||
623 : | SPARC 512k | ||
624 : | X86 256k | ||
625 : | |||
626 : | COMPILATION TIME | ||
627 : | ---------------- | ||
628 : | Here are the numbers comparing the compilation times of the compilers. | ||
629 : | I've only compared 110.25 compiling the new sources versus | ||
630 : | a fixpoint version of the new compiler compiling the same. | ||
631 : | |||
632 : | 110.25 New | ||
633 : | Total Time in RA Spill+Reload Total Time In RA Spill+Reload | ||
634 : | HPPA 627s 116s 2684+3584 599s 95s 1003+1879 | ||
635 : | SPARC 892s 173s 2891+3870 708s 116s 1004+1880 | ||
636 : | X86 999s 315s 94006+130691 987s 296s 108877+141957 | ||
637 : | |||
638 : | 110.25 New | ||
639 : | Code Size Code Size | ||
640 : | HPPA 8596736 8561421 | ||
641 : | SPARC 8974299 8785143 | ||
642 : | X86 9029180 8716783 | ||
643 : | |||
644 : | So in summary, things are at least as good as before. A dramatic | ||
645 : | reduction in compilation time is obtained on the Sparc; I can't explain it, | ||
646 : | but it is reproducible. Perhaps someone should try to reproduce this | ||
647 : | on their own machines. | ||
648 : | |||
649 : | SML BENCHMARKS | ||
650 : | -------------- | ||
651 : | |||
652 : | On the average, all benchmarks perform at least as well as before. | ||
653 : | |||
654 : | HPPA Compilation Time Spill+Reload Run Time | ||
655 : | 110.25 New 110.25 New 110.25 New | ||
656 : | |||
657 : | barnesHut 3.158 3.015 4.75% 1+1 0+0 2.980 2.922 2.00% | ||
658 : | boyer 6.152 5.708 7.77% 0+0 0+0 0.218 0.213 2.34% | ||
659 : | count-graphs 1.168 1.120 4.32% 0+0 0+0 22.705 23.073 -1.60% | ||
660 : | fft 0.877 0.792 10.74% 1+3 1+3 0.602 0.587 2.56% | ||
661 : | knuthBendix 3.180 2.857 11.32% 0+0 0+0 0.675 0.662 2.02% | ||
662 : | lexgen 6.190 5.290 17.01% 0+0 0+0 0.913 0.788 15.86% | ||
663 : | life 0.803 0.703 14.22% 25+25 0+0 0.153 0.140 9.52% | ||
664 : | logic 2.048 2.007 2.08% 6+6 1+1 4.133 4.008 3.12% | ||
665 : | mandelbrot 0.077 0.080 -4.17% 0+0 0+0 0.765 0.712 7.49% | ||
666 : | mlyacc 22.932 20.937 9.53% 154+181 32+57 0.468 0.430 8.91% | ||
667 : | nucleic 5.183 5.060 2.44% 2+2 0+0 0.125 0.120 4.17% | ||
668 : | ratio-regions 3.357 3.142 6.84% 0+0 0+0 116.225 113.173 2.70% | ||
669 : | ray 1.283 1.290 -0.52% 0+0 0+0 2.887 2.855 1.11% | ||
670 : | simple 6.307 6.032 4.56% 28+30 5+7 3.705 3.658 1.28% | ||
671 : | tsp 0.888 0.862 3.09% 0+0 0+0 7.040 6.893 2.13% | ||
672 : | vliw 24.378 23.455 3.94% 106+127 25+45 2.758 2.707 1.91% | ||
673 : | -------------------------------------------------------------------------- | ||
674 : | Average 6.12% 4.09% | ||
675 : | |||
676 : | SPARC         Compilation Time                  Spill+Reload          Run Time | ||
677 : |                 110.25     New             110.25        New   110.25      New | ||
678 : | |||
679 : | barnesHut        3.778   3.592   5.20%        2+2        0+0    3.648    3.453   5.65% | ||
680 : | boyer            6.632   6.110   8.54%        0+0        0+0    0.258    0.242   6.90% | ||
681 : | count-graphs     1.435   1.325   8.30%        0+0        0+0   33.672   34.737  -3.07% | ||
682 : | fft              0.980   0.940   4.26%        3+9        2+6    0.838    0.827   1.41% | ||
683 : | knuthBendix      3.590   3.138  14.39%        0+0        0+0    0.962    0.967  -0.52% | ||
684 : | lexgen           6.593   6.072   8.59%        1+1        0+0    1.077    1.078  -0.15% | ||
685 : | life             0.972   0.868  11.90%      26+26        0+0    0.143    0.140   2.38% | ||
686 : | logic            2.525   2.387   5.80%        7+7        1+1    5.625    5.158   9.05% | ||
687 : | mandelbrot       0.090   0.093  -3.57%        0+0        0+0    0.855    0.728  17.39% | ||
688 : | mlyacc          26.732  23.827  12.19%    162+189      32+57    0.550    0.560  -1.79% | ||
689 : | nucleic          6.233   6.197   0.59%        3+3        0+0    0.163    0.173  -5.77% | ||
690 : | ratio-regions    3.780   3.507   7.79%        0+0        0+0  133.993  131.035   2.26% | ||
691 : | ray              1.595   1.550   2.90%        1+1        0+0    3.440    3.418   0.63% | ||
692 : | simple           6.972   6.487   7.48%      29+32        5+7    3.523    3.525  -0.05% | ||
693 : | tsp              1.115   1.063   4.86%        0+0        0+0    7.393    7.265   1.77% | ||
694 : | vliw            27.765  24.818  11.87%    110+135      25+45    2.265    2.135   6.09% | ||
695 : | ---------------------------------------------------------------------------- | ||
696 : | Average                          6.94%                                           2.64% | ||
697 : | |||
698 : | X86           Compilation Time                  Spill+Reload          Run Time | ||
699 : |                 110.25     New             110.25        New   110.25      New | ||
700 : | |||
701 : | barnesHut        5.530   5.420   2.03%    593+893    597+915    3.532    3.440   2.66% | ||
702 : | boyer            8.768   7.747  13.19%    493+199    301+289    0.327    0.297  10.11% | ||
703 : | count-graphs     2.040   2.010   1.49%    298+394    315+457   26.578   28.660  -7.26% | ||
704 : | fft              1.327   1.302   1.92%    112+209    115+210    1.055    0.962   9.71% | ||
705 : | knuthBendix      5.218   5.475  -4.69%    451+598    510+650    0.928    0.932  -0.36% | ||
706 : | lexgen           9.970   9.623   3.60%   1014+841   1157+885    0.947    0.928   1.97% | ||
707 : | life             1.183   1.183   0.00%    162+182    145+148    0.127    0.103  22.58% | ||
708 : | logic            3.285   3.512  -6.45%    514+684    591+836    5.682    5.577   1.88% | ||
709 : | mandelbrot       0.147   0.143   2.33%      38+41      33+54    0.703    0.690   1.93% | ||
710 : | mlyacc          35.457  32.763   8.22%  3496+4564  3611+4860    0.552    0.550   0.30% | ||
711 : | nucleic          7.100   6.888   3.07%    239+168    201+158    0.175    0.173   0.96% | ||
712 : | ratio-regions    6.388   6.843  -6.65%   1182+257    981+300  120.142  120.345  -0.17% | ||
713 : | ray              2.332   2.338  -0.29%    346+398    402+494    3.593    3.540   1.51% | ||
714 : | simple           9.912   9.903   0.08%   1475+941  1579+1168    3.057    3.178  -3.83% | ||
715 : | tsp              1.623   1.532   5.98%    266+200    250+211    8.045    7.878   2.12% | ||
716 : | vliw            33.947  35.470  -4.29%  2629+2774  2877+3171    2.072    1.890   9.61% | ||
717 : | ---------------------------------------------------------------------------- | ||
718 : | Average                          1.22%                                           3.36% | ||
719 : | |||
720 : | ---------------------------------------------------------------------- | ||
721 : | Name: Allen Leung | ||
722 : | leunga | 583 | Date: 2000/03/23 16:25:00 |
723 : | Tag: leunga-20000323-fix_x86_alpha | ||
724 : | Description: | ||
725 : | |||
726 : | 1. X86 fixes/changes | ||
727 : | |||
728 : | a. The old code generated for SETcc was completely wrong. | ||
729 : | The Intel optimization guide is VERY misleading. | ||
730 : | |||
731 : | 2. ALPHA fixes/changes | ||
732 : | |||
733 : | a. Added the instructions LDBU, LDWU, STB, STW as per Fermin's suggestion. | ||
734 : | b. Added a new mode byteWordLoadStores to the functor parameter to Alpha() | ||
735 : | c. Added reassociation code for address computation. | ||
736 : | |||
737 : | ---------------------------------------------------------------------- | ||
738 : | Name: Allen Leung | ||
739 : | leunga | 580 | Date: 2000/03/22 01:23:00 |
740 : | Tag: leunga-20000322-fix_x86_hppa_ra | ||
741 : | Description: | ||
742 : | |||
743 : | 1. X86 fixes/changes | ||
744 : | |||
745 : | a. x86Rewrite bug with MUL3 (found by Lal) | ||
746 : | b. Added the instructions FSTS, FSTL | ||
747 : | |||
748 : | 2. PA-RISC fixes/changes | ||
749 : | |||
750 : | a. B label should not be a delay slot candidate! Why did this work? | ||
751 : | b. ADDT(32, REG(32, r), LI n) now generates one instruction instead of two, | ||
752 : | as it should be. | ||
753 : | c. The assembly syntax for fstds and fstdd was wrong. | ||
754 : | d. Added the composite instruction COMICLR/LDO, which is the immediate | ||
755 : | operand variant of COMCLR/LDO. | ||
756 : | |||
757 : | 3. Generic MLRISC | ||
758 : | |||
759 : | a. shuffle.sml rewritten to be slightly more efficient | ||
760 : | b. DIV bug in mltree-simplify fixed (found by Fermin) | ||
761 : | |||
762 : | 4. Register Allocator | ||
763 : | |||
764 : | a. I now release the interference graph earlier during spilling. | ||
765 : | May improve memory usage. | ||
766 : | |||
767 : | ---------------------------------------------------------------------- | ||
768 : | blume | 577 | Name: Matthias Blume |
769 : | blume | 578 | Date: 2000/03/14 14:15:32 |
770 : | Tag: blume_main_v110p26p1_2 | ||
771 : | Description: | ||
772 : | |||
773 : | 1. Tools.registerStdShellCmdTool (from smlnj/cm/tool.cm) takes an | ||
774 : | additional argument called "template" which is an optional string that | ||
775 : | specifies the layout of the tool command line. See the CM manual for | ||
776 : | explanation. | ||
777 : | |||
778 : | 2. A special-purpose tool can be "registered" by simply dropping the | ||
779 : | corresponding <...>-tool.cm (and/or <...>-ext.cm) into the same | ||
780 : | directory where the .cm file lives that uses this tool. (The | ||
781 : | behavior/misfeature until now was to look for the tool description | ||
782 : | files in the current working directory.) As before, tool description | ||
783 : | files could also be anchored -- in which case they can live anywhere | ||
784 : | they like. Following the recent e-mail discussion, this change should | ||
785 : | make it easier to have special-purpose tools that are shipped together | ||
786 : | with the sources of the program that uses them. | ||
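The lookup change in item 2 can be sketched as follows. This is a Python illustration, not CM code: the function name `tool_description_candidates` and the `stem` argument (standing in for the `<...>` part of the file name) are hypothetical, and the sketch only shows that the search directory is the one containing the .cm file rather than the current working directory.

```python
import posixpath

def tool_description_candidates(cm_file, stem):
    """Return the paths where a tool description file would be looked for.

    cm_file: path of the .cm file that uses the tool.
    stem:    hypothetical stand-in for the <...> part of the file name.
    """
    # New behavior: search next to the .cm file, not in the working directory.
    base_dir = posixpath.dirname(cm_file)
    return [
        posixpath.join(base_dir, stem + "-tool.cm"),
        posixpath.join(base_dir, stem + "-ext.cm"),
    ]
```

Anchored tool description files are not covered by this sketch.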
787 : | |||
788 : | ---------------------------------------------------------------------- | ||
789 : | Name: Matthias Blume | ||
790 : | blume | 577 | Date: 2000/03/10 07:48:34 |
791 : | Tag: blume_main_v110p26p1_1 | ||
792 : | Description: | ||
793 : | |||
794 : | I added a re-written version of Dave's fixpt script to src/system. | ||
795 : | Changes relative to the original version: | ||
796 : | - sh-ified (not everybody has ksh) | ||
797 : | - automatically figures out which architecture it runs on | ||
798 : | - uses ./makeml a bit more cleverly | ||
799 : | - never invokes ./installml (and, thus, does not clobber your | ||
800 : | good and working installation of sml in case something goes wrong) | ||
801 : | - accepts max iteration count using option "-iter <n>" | ||
802 : | - accepts a "base" name using option "-base <base>" | ||
803 : | |||
804 : | It does not build any extraneous heap images but directly rebuilds | ||
805 : | bin- and boot-hierarchies using makeml's "-rebuild" switch. Finally, | ||
806 : | it can incorporate existing bin- and boot- hierarchies. For example, | ||
807 : | suppose the base is set to "sml" (which is the default). Then it | ||
808 : | successively builds | ||
809 : | |||
810 : | sml.bin.<arch>-unix and sml.boot.<arch>-unix | ||
811 : | then sml1.bin.<arch>-unix and sml1.boot.<arch>-unix | ||
812 : | then sml2.bin.<arch>-unix and sml2.boot.<arch>-unix | ||
813 : | ... | ||
814 : | then sml<n>.bin.<arch>-unix and sml<n>.boot.<arch>-unix | ||
815 : | |||
816 : | and so on. If any of these already exist, it will just use what's | ||
817 : | there. In particular, many people will have the initial set of bin | ||
818 : | and boot files around, so this saves time for at least one full | ||
819 : | rebuild. Having sets of the form <base><k>.{bin,boot}.<arch>-unix for | ||
820 : | <k>=1,2,... is normally not a good idea when invoking fixpt. However, | ||
821 : | they might be the result of an earlier partial run of fixpt (which | ||
822 : | perhaps got accidentally killed). In this case, fixpt will quickly | ||
823 : | move through what exists before continuing where it left off earlier, | ||
824 : | and, thus, saves a lot of time. | ||
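The naming sequence described above can be sketched as follows. This is a Python illustration, not the fixpt script itself (which is a sh script): the function names are hypothetical, and both the way existing hierarchies are detected (here a plain set of names) and the meaning of the iteration bound are assumptions.

```python
def hierarchy_names(base, arch, iteration):
    """Return the (bin, boot) hierarchy names for one fixpt iteration.

    Iteration 0 uses the bare base name; iteration k > 0 appends k.
    """
    prefix = base if iteration == 0 else base + str(iteration)
    return (prefix + ".bin." + arch + "-unix", prefix + ".boot." + arch + "-unix")

def first_missing_iteration(base, arch, existing, max_iter):
    """Return the first iteration whose bin or boot hierarchy is missing.

    existing: set of hierarchy names already present (an assumption about
              how "already exists" would be checked).
    Returns None if every iteration up to max_iter is already present.
    """
    for k in range(max_iter + 1):
        bin_name, boot_name = hierarchy_names(base, arch, k)
        if bin_name not in existing or boot_name not in existing:
            return k
    return None
```

With the default base "sml" and hierarchies for iterations 0 and 1 already on disk, a resumed run would pick up at iteration 2, which matches the "continuing where it left off" behavior described in the entry.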
825 : | |||
826 : | ---------------------------------------------------------------------- | ||
827 : | leunga | 576 | Name: Allen Leung |
828 : | Date: 00/03/10 02:20:00 | ||
829 : | Tag: leunga-20000310-fix_x86_asm_ra | ||
830 : | Description: | ||
831 : | dbm | 570 | |
832 : | leunga | 576 | More assembly output problems involving the indexed addressing mode |
833 : | on the x86 have been found and corrected. Thanks to Fermin Reig for the | ||
834 : | fix. | ||
835 : | |||
836 : | The interface and implementation of the register allocator have been changed | ||
837 : | slightly so that the register allocation phases can be skipped | ||
838 : | completely, going directly to memory allocation. This is needed | ||
839 : | for C-- use. | ||
840 : | |||
841 : | dbm | 570 | ---------------------------------------------------------------------- |
842 : | blume | 572 | Name: Matthias Blume |
843 : | blume | 575 | Date: 00/03/09 10:23:53 |
844 : | Tag: blume_main_v110p26p1_0 | ||
845 : | Description: | ||
846 : | |||
847 : | * Complete re-organization of library names. Many libraries have been | ||
848 : | consolidated so that they share the same path anchor. For example, | ||
849 : | all MLRISC-related libraries are anchored at MLRISC, most libraries that | ||
850 : | are SML/NJ-specific are under "smlnj". Notice that names like | ||
851 : | host-cmb.cm or host-compiler.cm no longer exist. See system/README | ||
852 : | for a complete description of the new naming scheme. Quick reference: | ||
853 : | |||
854 : | host-cmb.cm -> smlnj/cmb.cm | ||
855 : | host-compiler.cm -> smlnj/compiler.cm | ||
856 : | full-cm.cm -> smlnj/cm.cm | ||
857 : | <arch>-<os>.cm -> smlnj/cmb/<arch>-<os>.cm | ||
858 : | <arch>-compiler.cm -> smlnj/compiler/<arch>.cm | ||
859 : | |||
860 : | * Bug fixes in CM. | ||
861 : | - exceptions in user code are being passed through (i.e., reach top level) | ||
862 : | - more bugs in paranoia mode fixed | ||
863 : | - bug related to checking group owners fixed | ||
864 : | |||
865 : | * New install.sh script that automagically fetches archive files: | ||
866 : | The new file config/srcarchiveurl must contain the URL of the | ||
867 : | (remote) directory that contains bin files (or other source archives). | ||
868 : | If install.sh does not find the archive locally, it tries to get | ||
869 : | it from that remote directory. | ||
870 : | This should simplify installation further: For machines that have | ||
871 : | access to the internet, just fetch <version>-config.tgz, unpack it, | ||
872 : | edit config/targets, and go (run config/install.sh). The script will | ||
873 : | fetch everything else that it might need all by itself. | ||
874 : | |||
875 : | For CVS users, this mechanism is not relevant for source archives, but | ||
876 : | it is convenient for getting new sets of binfiles. | ||
877 : | |||
878 : | Archives should be tar files compressed with either gzip, compress, or | ||
879 : | bzip2. The script recognizes .tgz, .tar, .tar.gz, .tz, .tar.Z, and .tar.bz2. | ||
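The suffix recognition mentioned above can be sketched as follows. This is a Python illustration, not the logic of install.sh: the function name `split_archive_name` is hypothetical, and reading the listed `tar.gz` and `tz` as `.tar.gz` and `.tz` is an assumption. The sketch does not guess which decompressor goes with which suffix.

```python
# Suffixes taken from the entry above (with the assumed leading dots).
ARCHIVE_SUFFIXES = (".tgz", ".tar", ".tar.gz", ".tz", ".tar.Z", ".tar.bz2")

def split_archive_name(filename):
    """Split an archive file name into (stem, suffix).

    Returns None if the name does not end in a recognized suffix.
    """
    # Try longer suffixes first so the most specific one is reported.
    for suffix in sorted(ARCHIVE_SUFFIXES, key=len, reverse=True):
        if filename.endswith(suffix) and len(filename) > len(suffix):
            return filename[: -len(suffix)], suffix
    return None
```

A caller could try each recognized suffix locally before falling back to the remote directory named in config/srcarchiveurl.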
880 : | |||
881 : | ---------------------------------------------------------------------- | ||
882 : | Name: Matthias Blume | ||
883 : | blume | 572 | Date: 2000/03/07 04:01:04 |
884 : | Tag: blume_main_v110_26_2 | ||
885 : | dbm | 570 | Description: |
886 : | blume | 572 | - size info in BOOTLIST |
887 : | * no fixed upper limits for number of bootfiles or length of | ||
888 : | bootfile names in runtime | ||
889 : | * falling back to old behavior if no BOOTLIST size info found | ||
890 : | - allocation size heuristics in .run-sml | ||
891 : | * tries to read cache size from /proc/cpuinfo (this is important for | ||
892 : | small-cache Celeron systems!) | ||
893 : | - install.sh robustified | ||
894 : | - CM manual updates | ||
895 : | - paranoid mode | ||
896 : | * no more CMB.deliver() (i.e., all done by CMB.make()) | ||
897 : | * can re-use existing sml.boot.* files | ||
898 : | * init.cmi now treated as library | ||
899 : | * library stamps for consistency checks | ||
900 : | - sml.boot.<arch>-<os>/PIDMAP file | ||
901 : | * This file is read by the CM startup code. This is used to minimize | ||
902 : | the amount of dynamic state that needs to be stowed away for the | ||
903 : | purpose of sharing between interactive system and user code. | ||
904 : | - CM.Anchor.anchor instead of CM.Anchor.{set,cancel} | ||
905 : | * Upon request by Elsa. Anchors now controlled by get-set-pair | ||
906 : | like most other CM state variables. | ||
907 : | - Compiler.CMSA eliminated | ||
908 : | * No longer supported by CM anyway. | ||
909 : | - fixed bugs in pickler that kept biting Stefan | ||
910 : | * past refs to past refs (was caused by the possibility that | ||
911 : | ad-hoc sharing is more discriminating than hash-cons sharing) | ||
912 : | * integer overflow on LargeInt.minInt | ||
913 : | - ml-{lex,yacc} build scripts now use new mechanism | ||
914 : | for building standalone programs | ||
915 : | - fixed several gcc -Wall warnings that were caused by missing header | ||
916 : | files, missing initializations, etc., in runtime (not all warnings | ||
917 : | eliminated, though) |
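
The cache-size lookup mentioned under "allocation size heuristics in .run-sml" can be sketched as follows. This is a Python illustration, not the .run-sml script: the function name `cache_size_kb` is hypothetical, the `cache size : <n> KB` field layout is the one Linux reports on x86, and how the value feeds into the allocation size is not shown in this entry and is left out.

```python
import re

def cache_size_kb(cpuinfo_text):
    """Return the first "cache size" value (in KB) found in the text.

    cpuinfo_text: contents of /proc/cpuinfo as a string.
    Returns None if no such field is present.
    """
    match = re.search(r"^cache size\s*:\s*(\d+)\s*KB", cpuinfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None
```

A caller would pass in the file contents and fall back to a default when the result is None.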