Annotation of /sml/trunk/HISTORY
Revision 627
1 : | dbm | 570 | This is the HISTORY file for the Yale SML/NJ CVS repository. |
2 : | |||
3 : | An entry should be made for _every_ commit to the repository. | ||
4 : | The entries in this file will be used when creating the README | ||
5 : | for new versions, so keep that in mind when writing the | ||
6 : | description. | ||
7 : | |||
8 : | The form of an entry should be: | ||
9 : | |||
10 : | Name: | ||
11 : | Date: | ||
12 : | Tag: <post-commit CVS tag> | ||
13 : | Description: | ||
14 : | leunga | 591 | ---------------------------------------------------------------------- |
15 : | dbm | 627 | Name: Dave MacQueen |
16 : | Date: 2000/04/25 | ||
17 : | Tag: dbm-20000425-mlyacc_doc_examples | ||
18 : | Description: | ||
19 : | Updated mlyacc.tex sections 5 and 7 for SML '97 and CM. | ||
20 : | Updated all three examples in src/ml-yacc/examples to run | ||
21 : | under 110.* using CM.make. | ||
22 : | |||
23 : | ---------------------------------------------------------------------- | ||
24 : | leunga | 624 | Name: Allen Leung |
25 : | Date: 2000/04/20 23:04:00 EST | ||
26 : | Tag: leunga-20000420-ssa-c---stuff | ||
27 : | Description: | ||
28 : | |||
29 : | This update synchronizes my repository with Yale's. Most of these | ||
30 : | changes, however, do not affect SML/NJ at all (the RA is an exception). | ||
31 : | |||
32 : | 1. Register Allocator | ||
33 : | |||
34 : | a. An improvement in the interference graph construction: | ||
35 : | Given a copy | ||
36 : | |||
37 : | s <- t | ||
38 : | |||
39 : | no interference edge between s and t is added for this definition of s. | ||
40 : | |||
41 : | b. I've added two new spill heuristic modules that Fermin and I developed | ||
42 : | (in the new library RA.cm). These are unused in SML/NJ but may be | ||
43 : | useful for others (Moby?) | ||
44 : | |||
45 : | 2. X86 | ||
46 : | |||
47 : | a. Various fixes in the backend provided by Fermin [C--] and Lal. | ||
48 : | |||
49 : | 3. Alpha | ||
50 : | |||
51 : | a. Added the BSR instruction and code generation that goes with it [C--] | ||
52 : | b. Other fixes too numerous to recount provided by Fermin [C--] | ||
53 : | |||
54 : | 4. Regmaps | ||
55 : | |||
56 : | a. The regmaps are no longer initialized with the identity physical bindings | ||
57 : | at creation time, since this is unneeded. | ||
58 : | |||
59 : | 5. MLRISC Optimizations | ||
60 : | |||
61 : | a. The DJ-Graph module can now compute the iterated dominance frontiers | ||
62 : | intersected with liveness incrementally in linear time! Woohoo! | ||
63 : | This is now used in my new SSA construction algorithm. | ||
64 : | |||
65 : | b. The branch reorganization module is now smarter about linear chains of | ||
66 : | basic blocks. | ||
67 : | |||
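As an illustration of item 1a above, here is a minimal Python sketch of interference graph construction with the copy exception. It is not the MLRISC code: the instruction encoding `(dst, srcs, is_copy)`, the per-instruction `live_out` sets, and the name `build_interference` are all assumptions made for this example.

```python
from collections import defaultdict

def build_interference(instrs, live_out):
    """Build an undirected interference graph.

    instrs   : list of (dst, srcs, is_copy) tuples (hypothetical encoding)
    live_out : list of sets, the variables live after each instruction
    """
    graph = defaultdict(set)
    for (dst, srcs, is_copy), live in zip(instrs, live_out):
        for v in live:
            if v == dst:
                continue
            # For a copy s <- t, s and t hold the same value here,
            # so this definition of s adds no edge between s and t.
            if is_copy and v in srcs:
                continue
            graph[dst].add(v)
            graph[v].add(dst)
    return graph

# s <- t, with t still live after the copy: no s/t edge is recorded.
copy_graph = build_interference([("s", ("t",), True)], [{"s", "t"}])

# The same shape as an ordinary definition: s and t do interfere.
plain_graph = build_interference([("s", ("t",), False)], [{"s", "t"}])
```

Without the edge, s and t remain candidates for coalescing into the same register.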
68 : | |||
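Item 5a refers to iterated dominance frontiers intersected with liveness. The Python sketch below shows what that set is, using the classic worklist formulation over precomputed dominance frontiers. It is not the incremental linear-time DJ-graph algorithm mentioned in the entry, and the names `pruned_idf`, `df`, `def_blocks` and `live_in` are assumptions made for this example.

```python
def pruned_idf(df, def_blocks, live_in):
    """Blocks that need a phi-function for one variable.

    df         : dict mapping block -> iterable of blocks in its dominance frontier
    def_blocks : blocks that define the variable
    live_in    : blocks where the variable is live on entry
    """
    result = set()
    seen = set(def_blocks)
    work = list(def_blocks)
    while work:
        b = work.pop()
        for f in df.get(b, ()):
            # Liveness pruning: no phi (and hence no new definition)
            # is created where the variable is dead on entry.
            if f in live_in and f not in result:
                result.add(f)
                if f not in seen:
                    seen.add(f)
                    work.append(f)
    return result

# Diamond CFG: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3; the variable is defined in 1 and 2.
df = {1: [3], 2: [3]}
live_case = pruned_idf(df, {1, 2}, live_in={3})
dead_case = pruned_idf(df, {1, 2}, live_in=set())
```

In the diamond, block 3 receives a phi only when the variable is live on entry to it.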
69 : | ---------------------------------------------------------------------- | ||
70 : | blume | 615 | Name: Matthias Blume |
71 : | Date: 2000/04/12 13:52:00 JST | ||
72 : | Tag: blume_main_v110p27_1 | ||
73 : | Description: | ||
74 : | |||
75 : | Changed install.sh script to handle archive files without version number | ||
76 : | and to use "boot.<arch>-<os>" instead of "sml.boot.<arch>-<os>" for the | ||
77 : | name of the boot file archive. | ||
78 : | |||
79 : | ---------------------------------------------------------------------- | ||
80 : | dbm | 607 | Name: Dave MacQueen |
81 : | Date: 2000/04/09 14:00 EDT | ||
82 : | Tag: dbm-20000410-Version_110_27 | ||
83 : | Description: | ||
84 : | |||
85 : | 1. Updated src/compiler/TopLevel/main/version.sml to version 110.27 | ||
86 : | |||
87 : | 2. Updated src/config/version to 110.27 | ||
88 : | |||
89 : | 3. New boot files! | ||
90 : | |||
91 : | ---------------------------------------------------------------------- | ||
92 : | leunga | 606 | Name: Allen Leung |
93 : | Date: 2000/04/09 19:09:00 EST | ||
94 : | Tag: leunga-20000409-misc | ||
95 : | Description: | ||
96 : | |||
97 : | 1. Yet another fix for x86 assembly for idivl, imull, mull and friends. | ||
98 : | |||
99 : | 2. Miscellaneous improvements to MLRISC (unused in sml/nj) | ||
100 : | |||
101 : | ---------------------------------------------------------------------- | ||
102 : | monnier | 605 | Name: Stefan |
103 : | Date: 2000/04/07 10:00:00 EDT | ||
104 : | Tag: monnier-20000406-branch-handling | ||
105 : | Description: | ||
106 : | |||
107 : | Improved handling of branches (mostly those generated from | ||
108 : | polymorphic equality), removed switchoff and changed the | ||
109 : | default optimization settings (more cpsopt and less flintopt). | ||
110 : | |||
111 : | ---------------------------------------------------------------------- | ||
112 : | leunga | 601 | Name: Allen Leung |
113 : | leunga | 602 | Date: 2000/04/06 01:30:00 EST |
114 : | Tag: leunga-20000406-peephole-x86-SSA-2 | ||
115 : | Description: | ||
116 : | |||
117 : | Forgot a few files. | ||
118 : | |||
119 : | ---------------------------------------------------------------------- | ||
120 : | Name: Allen Leung | ||
121 : | leunga | 601 | Date: 2000/04/06 00:36:00 EST |
122 : | Tag: leunga-20000406-peephole-x86-SSA | ||
123 : | Description: | ||
124 : | |||
125 : | 1. New Peephole code | ||
126 : | |||
127 : | 2. Minor improvement to X86 instruction selection | ||
128 : | |||
129 : | 3. Various fixes to SSA and machine description -> code translator | ||
130 : | |||
131 : | ---------------------------------------------------------------------- | ||
132 : | blume | 595 | Name: Matthias Blume |
133 : | Date: 2000/04/05 12:30:00 JST | ||
134 : | Tag: blume_main_v110p26p2_3 | ||
135 : | Description: | ||
136 : | |||
137 : | This update just merges three minor cosmetic updates to CM's sources | ||
138 : | to get ready for the 110.27 code freeze on Friday. No functionality | ||
139 : | has changed. | ||
140 : | |||
141 : | ---------------------------------------------------------------------- | ||
142 : | leunga | 593 | Name: Allen Leung |
143 : | leunga | 594 | Date: 2000/04/04 19:39:00 EST |
144 : | Tag: leunga-20000404-x86-asm | ||
145 : | Description: | ||
146 : | |||
147 : | 1. Fixed a problem in X86 assembly. | ||
148 : | |||
149 : | Things like | ||
150 : | |||
151 : | jmp %eax | ||
152 : | jmp (%eax) | ||
153 : | |||
154 : | should be output as | ||
155 : | |||
156 : | jmp *%eax | ||
157 : | jmp *(%eax) | ||
158 : | |||
159 : | 2. Assembly output | ||
160 : | |||
161 : | Added a new flag | ||
162 : | |||
163 : | "asm-indent-copies" (defaults to false) | ||
164 : | |||
165 : | When this flag is on, parallel copies will be indented an extra level. | ||
166 : | |||
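The rule in item 1 can be sketched as a small formatter. This is an illustrative Python sketch, not the MLRISC assembly emitter: the function name `format_jmp` and the string-based operand test are assumptions. It treats an operand as indirect if it is a register (`%...`) or contains a parenthesized memory reference.

```python
def format_jmp(operand: str) -> str:
    """Format a jmp in AT&T syntax, prefixing indirect targets with '*'."""
    op = operand.strip()
    # Assumption: registers start with '%' and memory operands contain '('.
    is_indirect = op.startswith("%") or "(" in op
    if is_indirect and not op.startswith("*"):
        op = "*" + op
    return "jmp " + op

register_jump = format_jmp("%eax")
memory_jump = format_jmp("(%eax)")
label_jump = format_jmp("L1")
```

Direct jumps to labels are left unchanged; only register and memory targets gain the `*`.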
167 : | ---------------------------------------------------------------------- | ||
168 : | Name: Allen Leung | ||
169 : | leunga | 593 | Date: 2000/04/04 03:18:00 EST |
170 : | Tag: leunga-20000404-C--Moby | ||
171 : | Description: | ||
172 : | |||
173 : | All of these fixes are related to C--, Moby, and my own optimization | ||
174 : | stuff; so they shouldn't affect SML/NJ. | ||
175 : | |||
176 : | 1. X86 | ||
177 : | |||
178 : | Various fixes related to floating point, and extensions. | ||
179 : | |||
180 : | 2. Alpha | ||
181 : | |||
182 : | Some extra patterns related to loads with signed/zero extension | ||
183 : | provided by Fermin. | ||
184 : | |||
185 : | 3. Assembly | ||
186 : | |||
187 : | When generating assembly, resolve the value of client defined constants, | ||
188 : | instead of generating symbolic values. This is controlled by the | ||
189 : | new flag "asm-resolve-constants", which defaults to true. | ||
190 : | |||
191 : | 4. Machine Descriptions | ||
192 : | |||
193 : | a. The precedence parser was slightly broken when parsing infixr symbols. | ||
194 : | b. The type generalizing code had the bound variables reversed, resulting | ||
195 : | in a problem during arity raising. | ||
196 : | c. Various fixes in machine descriptions. | ||
197 : | |||
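Item 4a concerns right-associative (infixr) operators in a precedence parser. The Python sketch below shows the usual precedence-climbing treatment of associativity. It is not the machine description tool's parser; the operator table `OPS`, the tokenizer and the tuple-based syntax tree are assumptions made for this example.

```python
import re

# Hypothetical operator table: symbol -> (precedence, associativity)
OPS = {"+": (6, "left"), "-": (6, "left"), "*": (7, "left"), "::": (5, "right")}

def tokenize(text):
    """Split into identifiers and the operators listed in OPS."""
    return re.findall(r"::|[-+*]|\w+", text)

def parse(tokens, min_prec=0):
    """Precedence climbing over a token list (consumed in place)."""
    lhs = tokens.pop(0)  # atoms only, no parentheses in this sketch
    while tokens and tokens[0] in OPS and OPS[tokens[0]][0] >= min_prec:
        op = tokens.pop(0)
        prec, assoc = OPS[op]
        # Left-associative: the right operand must bind strictly tighter.
        # Right-associative (infixr): the same precedence may recur on the right.
        next_min = prec + 1 if assoc == "left" else prec
        rhs = parse(tokens, next_min)
        lhs = (op, lhs, rhs)
    return lhs

right_nested = parse(tokenize("a :: b :: c"))
left_nested = parse(tokenize("a - b - c"))
```

Using `prec + 1` for an infixr operator by mistake would parse `a :: b :: c` as left-nested, which is the kind of error an associativity bug produces.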
198 : | ---------------------------------------------------------------------- | ||
199 : | blume | 592 | Name: Matthias Blume |
200 : | Date: 2000/04/03 16:05:00 JST | ||
201 : | Tag: blume_main_v110p26p2_2 | ||
202 : | Description: | ||
203 : | |||
204 : | I eliminated coreEnv from compInfo. Access to the "Core" structure is | ||
205 : | now done via the ordinary static environment that is context to each | ||
206 : | compilation unit. | ||
207 : | |||
208 : | To this end, I arranged that "structure _Core" is bound in the | ||
209 : | pervasive environment instead of "structure Core". Core access is done via | ||
210 : | _Core (which can never be accidentally rebound because _Core is not a | ||
211 : | legal surface-syntax symbol). | ||
212 : | |||
213 : | The current solution is much cleaner because the core environment is | ||
214 : | now simply part of the pervasive environment which is part of every | ||
215 : | compilation unit's context anyway. In particular, this eliminates all | ||
216 : | special-case handling that was necessary until now in order to deal | ||
217 : | with dynamic and symbolic parts of the core environment. | ||
218 : | |||
219 : | Remaining hackery (to bind the "magic" symbol _Core) is localized in the | ||
220 : | compilation manager's bootstrap compiler (actually: in the "init group" | ||
221 : | handling). See the comments in src/system/smlnj/init/init.cmi for | ||
222 : | more details. | ||
223 : | |||
224 : | I also tried to track down all mentions of "Core" (as string argument | ||
225 : | to Symbol.strSymbol) in the compiler and replaced them with a | ||
226 : | reference to the new CoreSym.coreSym. Seems cleaner since the actual | ||
227 : | name appears in one place only. | ||
228 : | |||
229 : | Binfile and bootfile format have not changed, but the switchover from | ||
230 : | the old "init.cmi" to the new one is a bit tricky, so I supplied new | ||
231 : | bootfiles anyway. | ||
232 : | |||
233 : | ---------------------------------------------------------------------- | ||
234 : | leunga | 591 | Name: Allen Leung |
235 : | Date: 2000/04/02 21:17:00 EST | ||
236 : | Tag: leunga-20000402-mltree | ||
237 : | Description: | ||
238 : | leunga | 585 | |
239 : | leunga | 591 | 1. Renamed the constructor CALL in MLTREE by popular demand. |
240 : | 2. Added a bunch of files from my repository. These are currently | ||
241 : | used by other non-SMLNJ backends. | ||
242 : | |||
243 : | leunga | 576 | ---------------------------------------------------------------------- |
244 : | leunga | 590 | Name: Allen Leung |
245 : | Date: 2000/03/31 21:15:00 EST | ||
246 : | Tag: leunga-20000331-aliasing | ||
247 : | Description: | ||
248 : | |||
249 : | This update contains a rewritten (and hopefully more correct) module | ||
250 : | for extracting aliasing information from CPS. | ||
251 : | |||
252 : | To turn on this feature: | ||
253 : | |||
254 : | Compiler.Control.CG.memDisambiguate := true | ||
255 : | |||
256 : | To pretty print the region information with assembly | ||
257 : | |||
258 : | Compiler.Control.MLRISC.getFlag "asm-show-region" := true; | ||
259 : | |||
260 : | To control how many levels of aliasing information are printed, use: | ||
261 : | |||
262 : | Compiler.Control.MLRISC.getInt "points-to-show-level" := n | ||
263 : | |||
264 : | The default of n is 3. | ||
265 : | |||
266 : | ---------------------------------------------------------------------- | ||
267 : | dbm | 589 | Name: David MacQueen |
268 : | Date: 2000/03/31 11:15:00 EST | ||
269 : | Tag: dbm-20000331-runtime_fix | ||
270 : | Description: | ||
271 : | |||
272 : | This update contains: | ||
273 : | |||
274 : | 1. runtime/c-lib/c-libraries.c | ||
275 : | includes added in revision 1.2 caused compilation errors on hppa-hpux | ||
276 : | |||
277 : | 2. fix for bug 1556 | ||
278 : | system/Basis/Implementation/NJ/internal-signals.sml | ||
279 : | |||
280 : | ---------------------------------------------------------------------- | ||
281 : | blume | 587 | Name: Matthias Blume |
282 : | blume | 588 | Date: 2000/03/31 18:00:00 JST |
283 : | Tag: blume_main_v110p26p2_1 | ||
284 : | Description: | ||
285 : | |||
286 : | This update contains: | ||
287 : | |||
288 : | 1. A small change to CM's handling of stable libraries: | ||
289 : | CM now maintains one "global" modmap that is used for all stable | ||
290 : | libraries. The use of such a global modmap maximizes sharing and | ||
291 : | minimizes the need for re-traversing parts of environments during | ||
292 : | modmap construction. (However, this has minor impact since modmap | ||
293 : | construction seems to account for just one percent or less of total | ||
294 : | compile time.) | ||
295 : | |||
296 : | 2. I added a "genmap" phase to the statistics. This is where I got the | ||
297 : | "one percent" number (see above). | ||
298 : | |||
299 : | 3. CM's new tool parameter mechanism just became _even_ better. :) | ||
300 : | - The parser understands named parameters and recursive options. | ||
301 : | - The "make" and "shell" tools use these new features. | ||
302 : | (This makes it a lot easier to cascade these tools.) | ||
303 : | - There is a small syntax change: named parameters use a | ||
304 : | |||
305 : | <name> : ( <option> ... ) or | ||
306 : | <name> : <string> | ||
307 : | |||
308 : | syntax. Previously, named parameters were implemented in an | ||
309 : | ad-hoc fashion by each tool individually (by parsing strings) | ||
310 : | and had the form | ||
311 : | |||
312 : | <name>=<string> | ||
313 : | |||
314 : | See the CM manual for a full description of these issues. | ||
315 : | |||
316 : | ---------------------------------------------------------------------- | ||
317 : | Name: Matthias Blume | ||
318 : | blume | 587 | Date: 2000/03/30 18:00:00 JST |
319 : | Tag: blume_main_v110p26p2_0 | ||
320 : | Description: | ||
321 : | |||
322 : | !!!!! WARNING !!!!!! | ||
323 : | !! New binfiles !! | ||
324 : | !!!!!!!!!!!!!!!!!!!! | ||
325 : | |||
326 : | This update contains: | ||
327 : | |||
328 : | 1. Moderate changes to CM: | ||
329 : | |||
330 : | - Changes to CM's tools mechanism. In particular, it is now possible | ||
331 : | to have tools that accept additional "command line" parameters | ||
332 : | (specified in the .cm file at each instance where the tool's class is | ||
333 : | used). | ||
334 : | |||
335 : | This was done to accommodate the new "make" and "shell" tools which | ||
336 : | facilitate fairly seamless hookup to portions of code managed using | ||
337 : | Makefiles or Shell scripts. | ||
338 : | |||
339 : | There are no classes "shared" or "private" anymore. Instead, the | ||
340 : | sharing annotation is now a parameter to the "sml" class. | ||
341 : | |||
342 : | There is a bit of generic machinery for implementing one's own | ||
343 : | tools that accept command-line parameters. However, I am not yet fully | ||
344 : | satisfied with that part, so expect changes here in the future. | ||
345 : | |||
346 : | All existing tools are described in the CM manual. | ||
347 : | |||
348 : | - Slightly better error handling. (CM now suppresses many followup | ||
349 : | error messages that tended to be more annoying than helpful.) | ||
350 : | |||
351 : | 2. Major changes to the compiler's static environment data structures. | ||
352 : | |||
353 : | - no CMStaticEnv anymore. | ||
354 : | - no CMEnv, no "BareEnvironment" (actually, _only_ BareEnvironment, | ||
355 : | but it is called Environment), no conversions between different | ||
356 : | kinds of static environments | ||
357 : | |||
358 : | - There is still a notion of a "modmap", but such modmaps are generated | ||
359 : | on demand at the time when they are needed. This sounds slow, but I | ||
360 : | sped up the code that generates modmaps enough for this not to lead to | ||
361 : | a slowdown of the compiler (at least I didn't detect any). | ||
362 : | |||
363 : | - To facilitate rapid modmap generation, static environments now | ||
364 : | contain an (optional) "modtree" structure. Modtree annotations are | ||
365 : | constructed by the unpickler during unpickling. (This means that | ||
366 : | the elaborator does not have to worry about modtrees at all.) | ||
367 : | Modtrees have the advantage that they are compositional in the same | ||
368 : | way as the environment data structure itself is compositional. | ||
369 : | As a result, modtrees never hang on to parts of an environment that | ||
370 : | has already been rendered "stale" by filtering or rebinding. | ||
371 : | |||
372 : | - I went through many, many trials and errors before arriving at the | ||
373 : | current solution. (The initial idea of "linkpaths" did not work.) | ||
374 : | But the result of all this is that I have touched a lot of files that | ||
375 : | depend on the "modules" and "types" data structures (most of the | ||
376 : | elaborator). There were a lot of changes during my "linkpath" trials | ||
377 : | that could have been reverted to their original state but weren't. | ||
378 : | Please, don't be too harsh on me for messing with this code a bit more | ||
379 : | than what was strictly necessary... (I _did_ resist the temptation | ||
380 : | of doing any "global reformatting" to avoid an untimely death at | ||
381 : | Dave's hands. :) | ||
382 : | |||
383 : | - One positive aspect of the previous point: At least I made sure that | ||
384 : | all files that I touched now compile without warnings (other than | ||
385 : | "polyEqual"). | ||
386 : | |||
387 : | - compiler now tends to run "leaner" (i.e., ties up less memory in | ||
388 : | redundant modmaps) | ||
389 : | |||
390 : | ---------------------------------------------------------------------- | ||
391 : | leunga | 580 | Name: Allen Leung |
392 : | leunga | 585 | Date: 2000/03/29 18:00:00 |
393 : | Tag: leunga-20000327-mlriscGen_hppa_alpha_x86 | ||
394 : | leunga | 586 | Boot files (optional): ftp://react-ilp.cs.nyu.edu/leunga/110.26.1-sml.boot.x86-unix-20000330.tar.gz |
395 : | leunga | 585 | Description: |
396 : | |||
397 : | This update contains *MAJOR* changes to the way code is generated from CPS | ||
398 : | in the module mlriscGen, and in various backend modules. | ||
399 : | |||
400 : | CHANGES | ||
401 : | ======= | ||
402 : | |||
403 : | 1. MLRiscGen: forward propagation fix. | ||
404 : | |||
405 : | There was a bug in forward propagation introduced at about the same time | ||
406 : | as the MLRISC x86 backend, which prevented coalescing from being | ||
407 : | performed effectively in loops. | ||
408 : | |||
409 : | Effect: speed up of loops in RISC architectures. | ||
410 : | By itself, this actually slowed down certain benchmarks on the x86. | ||
411 : | |||
412 : | 2. MLRiscGen: forward propagating addresses from consing. | ||
413 : | |||
414 : | I've changed the way consing code is generated. Basically I separated | ||
415 : | out the initialization part: | ||
416 : | |||
417 : | store tag, offset(allocptr) | ||
418 : | store elem1, offset+4(allocptr) | ||
419 : | store elem2, offset+8(allocptr) | ||
420 : | ... | ||
421 : | store elemn, offset+4n(allocptr) | ||
422 : | |||
423 : | and the address computation part: | ||
424 : | |||
425 : | celladdr <- offset+4+allocptr | ||
426 : | |||
427 : | and move the address computation part | ||
428 : | |||
429 : | Effect: register pressure is generally lower as a result. This | ||
430 : | makes compilation of certain expressions much faster, such as | ||
431 : | long lists with non-trivial elements. | ||
432 : | |||
433 : | [(0,0), (0,0), .... (0,0)] | ||
434 : | |||
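The split described in item 2 can be illustrated with a small generator of pseudo-instructions. This is a Python sketch under stated assumptions, not the MLRiscGen code: the function name `gen_record`, the textual instruction format and the 4-byte word size are chosen to mirror the listing above.

```python
def gen_record(tag, elems, offset, word=4):
    """Return (initialization stores, address computation) for one record.

    The two parts are returned separately so that a caller can place the
    address computation independently of the stores.
    """
    init = [f"store {tag}, {offset}(allocptr)"]
    for i, elem in enumerate(elems, start=1):
        init.append(f"store {elem}, {offset + word * i}(allocptr)")
    # The cell address points just past the tag word.
    addr = f"celladdr <- {offset + word}+allocptr"
    return init, addr

init_part, addr_part = gen_record("tag", ["elem1", "elem2"], offset=0)
```

Here `init_part` holds the three stores and `addr_part` holds the single address computation.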
435 : | 3. MLRiscGen: base pointer elimination. | ||
436 : | |||
437 : | As part of the linkage mechanism, we generate the sequence: | ||
438 : | |||
439 : | L: ... <- start of the code fragment | ||
440 : | |||
441 : | L1: | ||
442 : | base pointer <- linkreg - L1 + L | ||
443 : | |||
444 : | The base pointer was then used for computing relocatable addresses | ||
445 : | in the code fragment. Frequently (such as in lots of continuations) | ||
446 : | this is not needed. We now eliminate this sequence whenever possible. | ||
447 : | |||
448 : | For compile time efficiency, I'm using a very stupid local heuristic. | ||
449 : | But in general, this should be done as a control flow analysis. | ||
450 : | |||
451 : | Effect: Smaller code size. Speed up of most programs. | ||
452 : | |||
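The arithmetic behind the base pointer sequence in item 3 can be checked with a few numbers. The Python sketch below assumes that linkreg holds the run-time address of L1 when the sequence executes and that L and L1 are offsets within the code object; both points are assumptions for illustration, as are the function names and the sample addresses.

```python
def base_pointer(linkreg, off_L, off_L1):
    """base pointer <- linkreg - L1 + L"""
    return linkreg - off_L1 + off_L

def relocate(baseptr, off_L, off_target):
    """Run-time address of another label, computed relative to the base pointer."""
    return baseptr + (off_target - off_L)

load_addr = 0x1000                      # where the code object happens to be loaded
off_L, off_L1, off_M = 0x20, 0x28, 0x80 # label offsets inside the code object
linkreg = load_addr + off_L1            # assumed: linkreg = run-time address of L1

base = base_pointer(linkreg, off_L, off_L1)
addr_M = relocate(base, off_L, off_M)
```

A fragment that never needs a relocatable address such as `addr_M` has no use for `base`, which is the case the elimination targets.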
453 : | 4. Hppa back end | ||
454 : | |||
455 : | Long jumps in span dependence resolution used to depend on the existence | ||
456 : | of the base pointer. | ||
457 : | |||
458 : | A jump to a long label L was expanded into the following sequence: | ||
459 : | |||
460 : | LDIL %hi(L-8192), %r29 | ||
461 : | LDO %lo(L-8192)(%r29), %r29 | ||
462 : | ADD %r29, baseptr, %r29 | ||
463 : | BV,n %r0(%r29) | ||
464 : | |||
465 : | In the presence of change (3) above, this will not work. I've changed | ||
466 : | it so that the following sequence of instructions is generated, which | ||
467 : | doesn't mention the base pointer at all: | ||
468 : | |||
469 : | BL,n L', %r29 /* branch and link, L' + 4 -> %r29 */ | ||
470 : | L': ADDIL L-(L'+4), %r29 /* Compute address of L */ | ||
471 : | BV,n %r0(%r29) /* Jump */ | ||
472 : | |||
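The new long jump sequence in item 4 works at any load address because it only uses the difference between two labels. The Python sketch below models the three instructions exactly as annotated in the comments above (link value L' + 4, then an add of L-(L'+4)); it is not a model of real PA-RISC instruction semantics, and the function name and addresses are assumptions.

```python
def long_jump_target(addr_Lp, addr_L):
    """Follow the BL / ADDIL / BV sequence and return the jump target."""
    r29 = addr_Lp + 4               # BL,n L', %r29 : L' + 4 -> %r29
    disp = addr_L - (addr_Lp + 4)   # constant L-(L'+4), fixed at assembly time
    r29 = r29 + disp                # ADDIL L-(L'+4), %r29
    return r29                      # BV,n %r0(%r29) jumps here

# The same code loaded at two different addresses reaches L both times.
targets = [long_jump_target(base + 0x100, base + 0x90000) for base in (0x0, 0x4000)]
```

Since `disp` does not depend on where the code is loaded, no base pointer is required.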
473 : | 5. Alpha back end | ||
474 : | |||
475 : | New alpha instructions LDB/LDW have been added, as per Fermin's | ||
476 : | suggestions. This is unrelated to all other changes. | ||
477 : | |||
478 : | 6. X86 back end | ||
479 : | |||
480 : | I've changed andl to testl in the floating point test sequence | ||
481 : | whenever appropriate. The Intel optimization guide states that | ||
482 : | testl is preferable to andl. | ||
483 : | |||
484 : | 7. RA (x86 only) | ||
485 : | |||
486 : | I've improved the spill propagation algorithm, using an approximation | ||
487 : | of maximal weighted independent sets. This seems to be necessary to | ||
488 : | alleviate the negative effect in light of the slow down in (1). | ||
489 : | |||
490 : | I'll write down the algorithm one of these days. | ||
491 : | |||
492 : | 8. MLRiscGen: frequencies | ||
493 : | |||
494 : | I've added an annotation that states that all call gc blocks have zero | ||
495 : | execution frequencies. This improves register allocation on the x86. | ||
496 : | |||
497 : | BENCHMARKS | ||
498 : | ========== | ||
499 : | |||
500 : | I've only performed the comparison against 110.25. | ||
501 : | |||
502 : | The platforms are: | ||
503 : | |||
504 : | HPPA A four processor HP machine (E9000) with 5G of memory. | ||
505 : | X86 A 300MHz Pentium II with 128M of memory, and | ||
506 : | SPARC An Ultra sparc 2 with 512M of memory. | ||
507 : | |||
508 : | I used the following parameters for the SML benchmarks: | ||
509 : | |||
510 : | @SMLalloc | ||
511 : | HPPA 256k | ||
512 : | SPARC 512k | ||
513 : | X86 256k | ||
514 : | |||
515 : | COMPILATION TIME | ||
516 : | ---------------- | ||
517 : | Here are the numbers comparing the compilation times of the compilers. | ||
518 : | I've only compared 110.25 compiling the new sources versus | ||
519 : | a fixpoint version of the new compiler compiling the same. | ||
520 : | |||
521 : |        110.25                              New | ||
522 : |        Total  Time in RA  Spill+Reload     Total  Time in RA  Spill+Reload | ||
523 : | HPPA   627s   116s        2684+3584        599s   95s         1003+1879 | ||
524 : | SPARC  892s   173s        2891+3870        708s   116s        1004+1880 | ||
525 : | X86    999s   315s        94006+130691     987s   296s        108877+141957 | ||
526 : | |||
527 : |        110.25     New | ||
528 : |        Code Size  Code Size | ||
529 : | HPPA   8596736    8561421 | ||
530 : | SPARC  8974299    8785143 | ||
531 : | X86    9029180    8716783 | ||
532 : | |||
533 : | So in summary, things are at least as good as before. Dramatic | ||
534 : | reduction in compilation time is obtained on the Sparc; I can't explain it, | ||
535 : | but it is reproducible. Perhaps someone should try to reproduce this | ||
536 : | on their own machines. | ||
537 : | |||
538 : | SML BENCHMARKS | ||
539 : | -------------- | ||
540 : | |||
541 : | On the average, all benchmarks perform at least as well as before. | ||
542 : | |||
543 : | HPPA           Compilation Time            Spill+Reload        Run Time | ||
544 : |                110.25     New              110.25        New   110.25      New | ||
545 : | |||
546 : | barnesHut       3.158   3.015    4.75%        1+1        0+0    2.980    2.922    2.00% | ||
547 : | boyer           6.152   5.708    7.77%        0+0        0+0    0.218    0.213    2.34% | ||
548 : | count-graphs    1.168   1.120    4.32%        0+0        0+0   22.705   23.073   -1.60% | ||
549 : | fft             0.877   0.792   10.74%        1+3        1+3    0.602    0.587    2.56% | ||
550 : | knuthBendix     3.180   2.857   11.32%        0+0        0+0    0.675    0.662    2.02% | ||
551 : | lexgen          6.190   5.290   17.01%        0+0        0+0    0.913    0.788   15.86% | ||
552 : | life            0.803   0.703   14.22%      25+25        0+0    0.153    0.140    9.52% | ||
553 : | logic           2.048   2.007    2.08%        6+6        1+1    4.133    4.008    3.12% | ||
554 : | mandelbrot      0.077   0.080   -4.17%        0+0        0+0    0.765    0.712    7.49% | ||
555 : | mlyacc         22.932  20.937    9.53%    154+181      32+57    0.468    0.430    8.91% | ||
556 : | nucleic         5.183   5.060    2.44%        2+2        0+0    0.125    0.120    4.17% | ||
557 : | ratio-regions   3.357   3.142    6.84%        0+0        0+0  116.225  113.173    2.70% | ||
558 : | ray             1.283   1.290   -0.52%        0+0        0+0    2.887    2.855    1.11% | ||
559 : | simple          6.307   6.032    4.56%      28+30        5+7    3.705    3.658    1.28% | ||
560 : | tsp             0.888   0.862    3.09%        0+0        0+0    7.040    6.893    2.13% | ||
561 : | vliw           24.378  23.455    3.94%    106+127      25+45    2.758    2.707    1.91% | ||
562 : | --------------------------------------------------------------------------------------- | ||
563 : | Average                          6.12%                                            4.09% | ||
564 : | |||
565 : | SPARC          Compilation Time            Spill+Reload        Run Time | ||
566 : |                110.25     New              110.25        New   110.25      New | ||
567 : | |||
568 : | barnesHut       3.778   3.592    5.20%        2+2        0+0    3.648    3.453    5.65% | ||
569 : | boyer           6.632   6.110    8.54%        0+0        0+0    0.258    0.242    6.90% | ||
570 : | count-graphs    1.435   1.325    8.30%        0+0        0+0   33.672   34.737   -3.07% | ||
571 : | fft             0.980   0.940    4.26%        3+9        2+6    0.838    0.827    1.41% | ||
572 : | knuthBendix     3.590   3.138   14.39%        0+0        0+0    0.962    0.967   -0.52% | ||
573 : | lexgen          6.593   6.072    8.59%        1+1        0+0    1.077    1.078   -0.15% | ||
574 : | life            0.972   0.868   11.90%      26+26        0+0    0.143    0.140    2.38% | ||
575 : | logic           2.525   2.387    5.80%        7+7        1+1    5.625    5.158    9.05% | ||
576 : | mandelbrot      0.090   0.093   -3.57%        0+0        0+0    0.855    0.728   17.39% | ||
577 : | mlyacc         26.732  23.827   12.19%    162+189      32+57    0.550    0.560   -1.79% | ||
578 : | nucleic         6.233   6.197    0.59%        3+3        0+0    0.163    0.173   -5.77% | ||
579 : | ratio-regions   3.780   3.507    7.79%        0+0        0+0  133.993  131.035    2.26% | ||
580 : | ray             1.595   1.550    2.90%        1+1        0+0    3.440    3.418    0.63% | ||
581 : | simple          6.972   6.487    7.48%      29+32        5+7    3.523    3.525   -0.05% | ||
582 : | tsp             1.115   1.063    4.86%        0+0        0+0    7.393    7.265    1.77% | ||
583 : | vliw           27.765  24.818   11.87%    110+135      25+45    2.265    2.135    6.09% | ||
584 : | --------------------------------------------------------------------------------------- | ||
585 : | Average                          6.94%                                            2.64% | ||
586 : | |||
587 : | X86            Compilation Time            Spill+Reload        Run Time | ||
588 : |                110.25     New              110.25        New   110.25      New | ||
589 : | |||
590 : | barnesHut       5.530   5.420    2.03%    593+893    597+915    3.532    3.440    2.66% | ||
591 : | boyer           8.768   7.747   13.19%    493+199    301+289    0.327    0.297   10.11% | ||
592 : | count-graphs    2.040   2.010    1.49%    298+394    315+457   26.578   28.660   -7.26% | ||
593 : | fft             1.327   1.302    1.92%    112+209    115+210    1.055    0.962    9.71% | ||
594 : | knuthBendix     5.218   5.475   -4.69%    451+598    510+650    0.928    0.932   -0.36% | ||
595 : | lexgen          9.970   9.623    3.60%   1014+841   1157+885    0.947    0.928    1.97% | ||
596 : | life            1.183   1.183    0.00%    162+182    145+148    0.127    0.103   22.58% | ||
597 : | logic           3.285   3.512   -6.45%    514+684    591+836    5.682    5.577    1.88% | ||
598 : | mandelbrot      0.147   0.143    2.33%      38+41      33+54    0.703    0.690    1.93% | ||
599 : | mlyacc         35.457  32.763    8.22%  3496+4564  3611+4860    0.552    0.550    0.30% | ||
600 : | nucleic         7.100   6.888    3.07%    239+168    201+158    0.175    0.173    0.96% | ||
601 : | ratio-regions   6.388   6.843   -6.65%   1182+257    981+300  120.142  120.345   -0.17% | ||
602 : | ray             2.332   2.338   -0.29%    346+398    402+494    3.593    3.540    1.51% | ||
603 : | simple          9.912   9.903    0.08%   1475+941  1579+1168    3.057    3.178   -3.83% | ||
604 : | tsp             1.623   1.532    5.98%    266+200    250+211    8.045    7.878    2.12% | ||
605 : | vliw           33.947  35.470   -4.29%  2629+2774  2877+3171    2.072    1.890    9.61% | ||
606 : | --------------------------------------------------------------------------------------- | ||
607 : | Average                          1.22%                                            3.36% | ||
608 : | |||
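The percentage columns in the three tables above are not defined in the text. The listed values are consistent with (old / new - 1) * 100, presumably computed from unrounded timings, so recomputing from the printed three-decimal figures can differ in the last digit. This formula is an inference from the numbers, not something the entry states; the function name `improvement` is an assumption.

```python
def improvement(old, new):
    """Percentage by which the 110.25 figure exceeds the new one.

    Positive values mean the new compiler is faster; negative values
    mean it is slower.
    """
    return (old / new - 1.0) * 100.0

# HPPA compilation times from the table above.
lexgen_pct = round(improvement(6.190, 5.290), 2)
life_pct = round(improvement(0.803, 0.703), 2)
```

For these two rows the recomputed values match the table entries of 17.01% and 14.22%.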
609 : | ---------------------------------------------------------------------- | ||
610 : | Name: Allen Leung | ||
611 : | leunga | 583 | Date: 2000/03/23 16:25:00 |
612 : | Tag: leunga-20000323-fix_x86_alpha | ||
613 : | Description: | ||
614 : | |||
615 : | 1. X86 fixes/changes | ||
616 : | |||
617 : | a. The old code generated for SETcc was completely wrong. | ||
618 : | The Intel optimization guide is VERY misleading. | ||
619 : | |||
620 : | 2. ALPHA fixes/changes | ||
621 : | |||
622 : | a. Added the instructions LDBU, LDWU, STB, STW as per Fermin's suggestion. | ||
623 : | b. Added a new mode byteWordLoadStores to the functor parameter to Alpha() | ||
624 : | c. Added reassociation code for address computation. | ||
625 : | |||
626 : | ---------------------------------------------------------------------- | ||
627 : | Name: Allen Leung | ||
628 : | leunga | 580 | Date: 2000/03/22 01:23:00 |
629 : | Tag: leunga-20000322-fix_x86_hppa_ra | ||
630 : | Description: | ||
631 : | |||
632 : | 1. X86 fixes/changes | ||
633 : | |||
634 : | a. x86Rewrite bug with MUL3 (found by Lal) | ||
635 : | b. Added the instructions FSTS, FSTL | ||
636 : | |||
637 : | 2. PA-RISC fixes/changes | ||
638 : | |||
639 : | a. B label should not be a delay slot candidate! Why did this work? | ||
640 : | b. ADDT(32, REG(32, r), LI n) now generates one instruction instead of two, | ||
641 : | as it should be. | ||
642 : | c. The assembly syntax for fstds and fstdd was wrong. | ||
643 : | d. Added the composite instruction COMICLR/LDO, which is the immediate | ||
644 : | operand variant of COMCLR/LDO. | ||
645 : | |||
646 : | 3. Generic MLRISC | ||
647 : | |||
648 : | a. shuffle.sml rewritten to be slightly more efficient | ||
649 : | b. DIV bug in mltree-simplify fixed (found by Fermin) | ||
650 : | |||
651 : | 4. Register Allocator | ||
652 : | |||
653 : | a. I now release the interference graph earlier during spilling. | ||
654 : | May improve memory usage. | ||
655 : | |||
656 : | ---------------------------------------------------------------------- | ||
657 : | blume | 577 | Name: Matthias Blume |
658 : | blume | 578 | Date: 2000/03/14 14:15:32 |
659 : | Tag: blume_main_v110p26p1_2 | ||
660 : | Description: | ||
661 : | |||
662 : | 1. Tools.registerStdShellCmdTool (from smlnj/cm/tool.cm) takes an | ||
663 : | additional argument called "template" which is an optional string that | ||
664 : | specifies the layout of the tool command line. See the CM manual for | ||
665 : | explanation. | ||
666 : | |||
667 : | 2. A special-purpose tool can be "registered" by simply dropping the | ||
668 : | corresponding <...>-tool.cm (and/or <...>-ext.cm) into the same | ||
669 : | directory where the .cm file lives that uses this tool. (The | ||
670 : | behavior/misfeature until now was to look for the tool description | ||
671 : | files in the current working directory.) As before, tool description | ||
672 : | files could also be anchored -- in which case they can live anywhere | ||
673 : | they like. Following the recent e-mail discussion, this change should | ||
674 : | make it easier to have special-purpose tools that are shipped together | ||
675 : | with the sources of the program that uses them. | ||
676 : | |||
677 : | ---------------------------------------------------------------------- | ||
678 : | Name: Matthias Blume | ||
679 : | blume | 577 | Date: 2000/03/10 07:48:34 |
680 : | Tag: blume_main_v110p26p1_1 | ||
681 : | Description: | ||
682 : | |||
683 : | I added a re-written version of Dave's fixpt script to src/system. | ||
684 : | Changes relative to the original version: | ||
685 : | - sh-ified (not everybody has ksh) | ||
686 : | - automatically figures out which architecture it runs on | ||
687 : | - uses ./makeml a bit more cleverly | ||
688 : | - never invokes ./installml (and, thus, does not clobber your | ||
689 : | good and working installation of sml in case something goes wrong) | ||
690 : | - accepts max iteration count using option "-iter <n>" | ||
691 : | - accepts a "base" name using option "-base <base>" | ||
692 : | |||
693 : | It does not build any extraneous heap images but directly rebuilds | ||
694 : | bin- and boot-hierarchies using makeml's "-rebuild" switch. Finally, | ||
695 : | it can incorporate existing bin- and boot- hierarchies. For example, | ||
696 : | suppose the base is set to "sml" (which is the default). Then it | ||
697 : | successively builds | ||
698 : | |||
699 : | sml.bin.<arch>-unix and sml.boot.<arch>-unix | ||
700 : | then sml1.bin.<arch>-unix and sml1.boot.<arch>-unix | ||
701 : | then sml2.bin.<arch>-unix and sml2.boot.<arch>-unix | ||
702 : | ... | ||
703 : | then sml<n>.bin.<arch>-unix and sml<n>.boot.<arch>-unix | ||
704 : | |||
705 : | and so on. If any of these already exist, it will just use what's | ||
706 : | there. In particular, many people will have the initial set of bin | ||
707 : | and boot files around, so this saves time for at least one full | ||
708 : | rebuild. Having sets of the form <base><k>.{bin,boot}.<arch>-unix for | ||
709 : | <k>=1,2,... is normally not a good idea when invoking fixpt. However, | ||
710 : | they might be the result of an earlier partial run of fixpt (which | ||
711 : | perhaps got accidentally killed). In this case, fixpt will quickly | ||
712 : | move through what exists before continuing where it left off earlier, | ||
713 : | and, thus, save a lot of time. | ||
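
The naming and reuse rule described above can be sketched in sh as follows. This is only an illustration, not the fixpt script itself: the function names hier_name and plan are hypothetical, and the sketch only reports which hierarchies would be reused or rebuilt without invoking makeml.

```shell
#!/bin/sh
# Hypothetical sketch of the directory naming and reuse rule; not the real fixpt.

# hier_name BASE K KIND ARCH
# Prints <base>.<kind>.<arch>-unix for K=0 and <base><k>.<kind>.<arch>-unix otherwise.
hier_name() {
    h_base=$1; h_k=$2; h_kind=$3; h_arch=$4
    if [ "$h_k" -eq 0 ]; then
        echo "${h_base}.${h_kind}.${h_arch}-unix"
    else
        echo "${h_base}${h_k}.${h_kind}.${h_arch}-unix"
    fi
}

# plan BASE ARCH ITER
# For steps 0..ITER, prints "reuse" if both hierarchies already exist
# in the current directory, and "build" otherwise.
plan() {
    p_base=$1; p_arch=$2; p_iter=$3
    p_k=0
    while [ "$p_k" -le "$p_iter" ]; do
        p_bin=$(hier_name "$p_base" "$p_k" bin "$p_arch")
        p_boot=$(hier_name "$p_base" "$p_k" boot "$p_arch")
        if [ -d "$p_bin" ] && [ -d "$p_boot" ]; then
            echo "reuse $p_bin $p_boot"
        else
            echo "build $p_bin $p_boot"
        fi
        p_k=$((p_k + 1))
    done
}
```

For example, running "plan sml x86 2" in a directory that holds only the initial sml.bin.x86-unix and sml.boot.x86-unix would report that pair as reused and the sml1 and sml2 pairs as needing a build.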
714 : | |||
715 : | ---------------------------------------------------------------------- | ||
716 : | leunga | 576 | Name: Allen Leung |
717 : | Date: 00/03/10 02:20:00 | ||
718 : | Tag: leunga-20000310-fix_x86_asm_ra | ||
719 : | Description: | ||
720 : | dbm | 570 | |
721 : | leunga | 576 | More assembly output problems involving the indexed addressing mode |
722 : | on the x86 have been found and corrected. Thanks to Fermin Reig for the | ||
723 : | fix. | ||
724 : | |||
725 : | The interface and implementation of the register allocator have been changed | ||
726 : | slightly to accommodate the possibility of skipping the register allocation | ||
727 : | phases completely and go directly to memory allocation. This is needed | ||
728 : | for C-- use. | ||
729 : | |||
730 : | dbm | 570 | ---------------------------------------------------------------------- |
731 : | blume | 572 | Name: Matthias Blume |
732 : | blume | 575 | Date: 00/03/09 10:23:53 |
733 : | Tag: blume_main_v110p26p1_0 | ||
734 : | Description: | ||
735 : | |||
736 : | * Complete re-organization of library names. Many libraries have been | ||
737 : | consolidated so that they share the same path anchor. For example, | ||
738 : | all MLRISC-related libraries are anchored at MLRISC, most libraries that | ||
739 : | are SML/NJ-specific are under "smlnj". Notice that names like | ||
740 : | host-cmb.cm or host-compiler.cm no longer exist. See system/README | ||
741 : | for a complete description of the new naming scheme. Quick reference: | ||
742 : | |||
743 : | host-cmb.cm -> smlnj/cmb.cm | ||
744 : | host-compiler.cm -> smlnj/compiler.cm | ||
745 : | full-cm.cm -> smlnj/cm.cm | ||
746 : | <arch>-<os>.cm -> smlnj/cmb/<arch>-<os>.cm | ||
747 : | <arch>-compiler.cm -> smlnj/compiler/<arch>.cm | ||
748 : | |||
749 : | * Bug fixes in CM. | ||
750 : | - exceptions in user code are being passed through (i.e., reach top level) | ||
751 : | - more bugs in paranoia mode fixed | ||
752 : | - bug related to checking group owners fixed | ||
753 : | |||
754 : | * New install.sh script that automagically fetches archive files: | ||
755 : | The new file config/srcarchiveurl must contain the URL of the | ||
756 : | (remote) directory that contains bin files (or other source archives). | ||
757 : | If install.sh does not find the archive locally, it tries to get | ||
758 : | it from that remote directory. | ||
759 : | This should simplify installation further: For machines that have | ||
760 : | access to the internet, just fetch <version>-config.tgz, unpack it, | ||
761 : | edit config/targets, and go (run config/install.sh). The script will | ||
762 : | fetch everything else that it might need all by itself. | ||
763 : | |||
764 : | For CVS users, this mechanism is not relevant for source archives, but | ||
765 : | it is convenient for getting new sets of binfiles. | ||
766 : | |||
767 : | Archives should be tar files compressed with either gzip, compress, or | ||
768 : | bzip2. The script recognizes .tgz, .tar, .tar.gz, .tz, .tar.Z, and .tar.bz2. | ||
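
As a rough illustration of recognizing archives by suffix, the following sh sketch maps a file name to the compression it implies. It is not taken from install.sh: the function name archive_kind is hypothetical, and treating .tz as a compress-style archive is an assumption.

```shell
#!/bin/sh
# Hypothetical sketch of suffix-based archive recognition; not the real install.sh.

# archive_kind FILE
# Prints the compression implied by FILE's suffix: gzip, compress, bzip2,
# or none (plain tar). Prints "unknown" and returns 1 for anything else.
archive_kind() {
    case "$1" in
        *.tgz | *.tar.gz) echo gzip ;;
        *.tar.Z | *.tz)   echo compress ;;  # assumption: .tz is a compress(1) tarball
        *.tar.bz2)        echo bzip2 ;;
        *.tar)            echo none ;;
        *)                echo unknown; return 1 ;;
    esac
}
```

A caller could use the result to pick a decompressor before piping the archive into tar.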
769 : | |||
770 : | ---------------------------------------------------------------------- | ||
771 : | Name: Matthias Blume | ||
772 : | blume | 572 | Date: 2000/03/07 04:01:04 |
773 : | Tag: blume_main_v110_26_2 | ||
774 : | dbm | 570 | Description: |
775 : | blume | 572 | - size info in BOOTLIST |
776 : | * no fixed upper limits for number of bootfiles or length of | ||
777 : | bootfile names in runtime | ||
778 : | * falling back to old behavior if no BOOTLIST size info found | ||
779 : | - allocation size heuristics in .run-sml | ||
780 : | * tries to read cache size from /proc/cpuinfo (this is important for | ||
781 : | small-cache Celeron systems!) | ||
782 : | - install.sh robustified | ||
783 : | - CM manual updates | ||
784 : | - paranoid mode | ||
785 : | * no more CMB.deliver() (i.e., all done by CMB.make()) | ||
786 : | * can re-use existing sml.boot.* files | ||
787 : | * init.cmi now treated as library | ||
788 : | * library stamps for consistency checks | ||
789 : | - sml.boot.<arch>-<os>/PIDMAP file | ||
790 : | * This file is read by the CM startup code. This is used to minimize | ||
791 : | the amount of dynamic state that needs to be stowed away for the | ||
792 : | purpose of sharing between interactive system and user code. | ||
793 : | - CM.Anchor.anchor instead of CM.Anchor.{set,cancel} | ||
794 : | * Upon request by Elsa. Anchors now controlled by get-set-pair | ||
795 : | like most other CM state variables. | ||
796 : | - Compiler.CMSA eliminated | ||
797 : | * No longer supported by CM anyway. | ||
798 : | - fixed bugs in pickler that kept biting Stefan | ||
799 : | * past refs to past refs (was caused by the possibility that | ||
800 : | ad-hoc sharing is more discriminating than hash-cons sharing) | ||
801 : | * integer overflow on LargeInt.minInt | ||
802 : | - ml-{lex,yacc} build scripts now use new mechanism | ||
803 : | for building standalone programs | ||
804 : | - fixed several gcc -Wall warnings that were caused by missing header | ||
805 : | files, missing initializations, etc., in runtime (not all warnings | ||
806 : | eliminated, though) |