                          S M L / N J

                      1 1 0 . 2 8   N E W S

                         April 10, 2000

                            WARNING

        This version is intended for compiler hackers.
        We are in the midst of substantial structural changes,
        and this is a snapshot.

        http://cm.bell-labs.com/cm/cs/what/smlnj/index.html

------------------------------------------------------------------------
Summary:

* This version has some tuning of phase ordering for FLINT, which
  mostly solves the excessive space problem during compilation. With
  this tuning, 110.28 is slightly better than 110.25 in performance,
  but not quite as good as 110.25+latest MLRISC. See the benchmarks
  below. Additional work on the register allocator in MLRISC should
  improve space performance further.

* CM
  Installation scripts have been tweaked. A new noweb tool has been
  added. ".fun" has been added to the suffixes that imply SML compilation.
  New version numbering scheme added. Alternative syntax for anchored
  paths added. Parallel make made smarter. Support for "unsharing".
  Simple "makedepend" functionality added. Further details below.

  See src/system/README and the latest version of the CM manual at

    <http://www.kurims.kyoto-u.ac.jp/~blume/SMLNJ-DEV/manual/index.html>
    <http://www.kurims.kyoto-u.ac.jp/~blume/SMLNJ-DEV/manual.ps>

  for further information about user-visible changes.

* A few MLRISC changes, most not SML/NJ related.

* ML-Yacc: updated manual and examples to be SML '97, SML/NJ 110+ compliant.
  Examples now compiled with CM.

* Reported bug fixes (details in dist/smlnj/working/bugs.1400-1600):
    1498. Specialized real arrays not pretty-printed correctly
    1510. Signature matching bug makes "casts" possible
    1562. CM complains about unrecognized file extension ".fun"
    1563. redundant pathconfig contents with multiple builds


======================================================================
Details of changes


======================================================================
FLINT:
======================================================================

Tuned ordering of phases to improve time/space performance. See
src/compiler/FLINT/opt/flintopt.txt for a description of some of the
issues.


======================================================================
MLRISC:
======================================================================

This update synchronizes my repository with Yale's. Most of these
changes, however, do not affect SML/NJ at all (the RA is an exception).

1. Register Allocator

   a. An improvement in the interference graph construction:
      Given a copy

            s <- t

      no interference edge between s and t is added for this definition of s.

   b. I've added two new spill heuristic modules that Fermin and I developed
      (in the new library RA.cm). These are unused in SML/NJ but may be
      useful for others (Moby?)

   c. Added a flag "ra-dump-size" to print out the size of the flowgraph
      and the interference graph.
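
The copy rule in item 1a can be illustrated with a minimal sketch. This is
not the MLRISC code: the function name build_interference, the
(dst, srcs, is_copy) instruction tuples, and the single-block setting are
all assumptions made for illustration. It shows the usual backward scan in
which each definition interferes with everything live after it, except that
a copy "s <- t" adds no edge between s and t.

```python
from collections import defaultdict


def build_interference(instrs, live_out):
    """Build an undirected interference graph for one basic block.

    instrs   -- list of (dst, srcs, is_copy) tuples in program order
                (hypothetical instruction format, for illustration only)
    live_out -- set of registers live at the end of the block
    """
    graph = defaultdict(set)
    live = set(live_out)  # registers live *after* the current instruction
    for dst, srcs, is_copy in reversed(instrs):
        for r in live:
            # A register never interferes with itself, and for a copy
            # "s <- t" no edge s--t is added at this definition of s.
            if r == dst or (is_copy and r in srcs):
                continue
            graph[dst].add(r)
            graph[r].add(dst)
        # Step backwards over the instruction: kill dst, then gen srcs.
        live.discard(dst)
        live.update(srcs)
    return graph


# "s <- t" as a copy: s and t stay non-interfering, so they may be coalesced.
copy_block = [("t", (), False), ("s", ("t",), True), ("u", ("s", "t"), False)]
# The same shape with an ordinary instruction defining s: s and t interfere.
arith_block = [("t", (), False), ("s", ("t",), False), ("u", ("s", "t"), False)]
```

Note that an edge between s and t can still be added elsewhere, for
example at a later redefinition of one while the other is live; the rule
only suppresses the edge at the copy itself.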

2. X86

   a. Various fixes in the backend provided by Fermin [C--] and Lal.

3. Alpha

   a. Added the BSR instruction and code generation that goes with it [C--]

   b. Other fixes too numerous to recount provided by Fermin [C--]

   c. PSEUDOARITH was missing in AlphaRewrite. This caused an endless
      loop in C--.

4. Regmaps

   a. The regmaps are no longer initialized with the identity physical
      bindings at creation time. This was unneeded.

5. MLRISC Optimizations

   a. The DJ-Graph module can now compute the iterated dominance frontiers
      intersected with liveness incrementally in linear time! Woohoo!
      This is now used in my new SSA construction algorithm.

   b. The branch reorganization module is now smarter about linear chains of
      basic blocks.
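
To make item 5a concrete, here is a small sketch of what "iterated
dominance frontier intersected with liveness" computes. It uses the
textbook worklist iteration, not the incremental linear-time DJ-graph
algorithm mentioned above, and it takes the per-block dominance frontiers
as given. The names iterated_dominance_frontier and phi_blocks, and the
example flowgraph, are assumptions for illustration.

```python
def iterated_dominance_frontier(df, defs):
    """Return the iterated dominance frontier of the blocks in `defs`.

    df   -- dict mapping each block to its dominance frontier (a set)
    defs -- blocks that contain a definition of the variable
    """
    idf = set()
    worklist = list(defs)
    while worklist:
        block = worklist.pop()
        for frontier_block in df.get(block, ()):
            if frontier_block not in idf:
                idf.add(frontier_block)
                # A phi placed here is itself a definition, so iterate.
                worklist.append(frontier_block)
    return idf


def phi_blocks(df, defs, live_in):
    """Blocks needing a phi: the IDF restricted to blocks where the
    variable is live on entry (as in pruned SSA construction)."""
    return iterated_dominance_frontier(df, defs) & set(live_in)


# Hypothetical flowgraph: entry -> a, c; a -> b1, b2; b1, b2 -> j1;
# j1 -> j2; c -> j2.  Dominance frontiers written out by hand:
DF = {"b1": {"j1"}, "b2": {"j1"}, "j1": {"j2"}, "a": {"j2"}, "c": {"j2"}}
```

With a definition of x in b1 only, the iterated frontier is {j1, j2}; if x
is live on entry to j1 but not to j2, only j1 needs a phi.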


======================================================================
CM
======================================================================

Several manual updates

  I always try to keep the manual in sync with CM's latest features.

New noweb tool

  The existing system is entirely unaffected by this, but some CM users
  have asked for renewed noweb support. Everything is documented in the
  CM manual.

  New (plugin) libraries:

     noweb-tool.cm
     nw-ext.cm

CM version numbering added

  This is an implementation of Lal's proposal for adding version
  numbers and version checking to .cm files. Lal said that his
  proposal was just that -- a proposal. For the time being I went
  ahead and implemented it so that people can comment on it.
  Everything is completely backward-compatible (except for the stable
  library format, i.e., new bootfiles!).

  As usual, see the CM manual for details.

Alternative syntax for anchored paths

  Dave has recently voiced the same concerns that I had when I did
  this, so there should be some support. My take is that eventually
  I will let support for the current syntax (where anchors are
  "implicit") fade out in favor of the new, explicit syntax.
  In order to be backward-compatible, both old and new syntax are
  currently supported.

  Again, see the CM manual for details.

Parallel make is trying to be slightly smarter

  When the master process finds a "bottleneck", i.e., when there is
  only one compilation unit that can be compiled and everybody else
  is waiting on it, then it will simply compile it directly instead
  of clumsily telling one of the slaves to do it.
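
The bottleneck rule can be sketched roughly as follows. This is a
simplified model, not CM's actual scheduler: it proceeds in synchronous
rounds, and the function name plan_build, the dependency dictionary, and
the unit names are assumptions for illustration. Whenever exactly one unit
is ready, everything else is necessarily waiting on it, so the master
compiles it itself; otherwise the ready units are handed out.

```python
def plan_build(deps):
    """Decide who compiles each unit.

    deps -- dict mapping each unit to the set of units it depends on
    Returns a list of (unit, who) pairs, who being "master" or "worker".
    """
    done, log = set(), []
    while len(done) < len(deps):
        ready = sorted(u for u in deps if u not in done and deps[u] <= done)
        if not ready:
            raise ValueError("dependency cycle")
        # Bottleneck: one ready unit, and all remaining units wait on it.
        who = "master" if len(ready) == 1 else "worker"
        log.extend((unit, who) for unit in ready)
        done.update(ready)
    return log


# Hypothetical dependency graph: base <- a, b <- top
example = {"base": set(), "a": {"base"}, "b": {"base"}, "top": {"a", "b"}}
```

In this example "base" and "top" are bottlenecks and are compiled by the
master, while "a" and "b" can proceed in parallel on other processes.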

Support for "unsharing" added

  This is necessary in order to be able to have two different
  versions of the same library running at the same time (e.g., for
  trying out a new MLRISC while still having the old MLRISC linked
  into the current compiler, etc.) See the CM manual.

makedepend

  Simple "makedepend" functionality added for generating Makefile
  dependency information. (This is rather crude at the moment.
  Expect some changes here in the future.)

New sml class suffix

  ".fun" added as a recognized suffix for ML files. Also documented
  explicitly in the manual that the fallback behavior (unknown suffix
  -> ML file) is not an official feature!

Pickler

  Small changes to the pickler for stable libraries.

Internal cleanup

  Several internal changes to CM (for cleanup/improvement).

install.sh changes

  I changed config/install.sh to remove duplicate entries from the
  lib/pathconfig file at the end. Moreover, the final version of
  lib/pathconfig is sorted alphabetically. The same (sorting) is done
  in src/system/installml.
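
The effect of this cleanup step can be sketched as below. The real work is
done by the shell script; this Python version only illustrates the
transformation. The function name normalize_pathconfig and the sample
entries are made up, and it assumes one entry per line with duplicates
being exact repeats of a line.

```python
def normalize_pathconfig(text):
    """Drop duplicate entries and sort the remaining ones alphabetically.

    Assumes one entry per line; blank lines are discarded.
    """
    entries = {line.strip() for line in text.splitlines() if line.strip()}
    return "".join(entry + "\n" for entry in sorted(entries))


# Illustrative input only; the anchors and paths are placeholders.
sample = "smlnj-lib.cm lib\nbasis.cm lib\nsmlnj-lib.cm lib\n"
```

Applied to the sample, the repeated entry is kept once and the two
remaining lines come out in alphabetical order.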

  The config/install.sh script now consistently uses relative
  pathnames in lib/pathconfig whenever the anchor is in the lib
  directory. (So far this was true for the libraries that come
  pre-compiled and bundled as part of the bootfiles but not for
  libraries that are compiled by the script itself.)

  Changed install.sh script to handle archive files without version number
  and to use "boot.<arch>-<os>" instead of "sml.boot.<arch>-<os>" for the
  name of the boot file archive.


======================================================================
BENCHMARKS
======================================================================
Allen Leung, 28 April 2000:

I've rerun the benchmarks to see if anything has slowed down in
MLRISC recently, but found nothing. I compared 110.25, 110.25+latest
MLRISC, and 110.27+ (which also has the latest MLRISC). 110.25 uses
cpsopt, and 110.27+ uses flintopt. [110.27+ is repository state
at the end of April, which is essentially 110.28 - dbm].

The results are:

110.25 versus 110.25+latest MLRISC
----------------------------------
(+MLRISC denotes 110.25+latest MLRISC)

                     Compilation                   Runtime
Name            110.25  +MLRISC  Speedup    110.25  +MLRISC  Speedup
barnesHut        5.560    5.368    3.57%     3.690    3.302   11.76%
boyer            8.678    7.672   13.12%     0.313    0.312    0.53%
count-graphs     2.048    1.915    6.96%    28.577   27.128    5.34%
fft              1.325    1.233    7.43%     1.060    0.982    7.98%
knuthBendix      5.337    4.680   14.03%     0.962    0.918    4.72%
lexgen          10.310    9.950    3.62%     0.917    0.902    1.66%
life             1.230    1.127    9.17%     0.128    0.100   28.33%
logic            3.533    3.258    8.44%     5.750    5.413    6.22%
mandelbrot       0.145    0.148   -2.25%     0.700    0.685    2.19%
mlyacc          34.557   33.342    3.64%     0.553    0.533    3.75%
nucleic          6.675    6.507    2.59%     0.173    0.167    4.00%
ratio-regions    6.358    6.218    2.25%   119.753  120.772   -0.84%
ray              2.280    2.247    1.48%     3.563    3.517    1.33%
simple           9.798    9.650    1.54%     2.987    3.083   -3.14%
tsp              1.763    1.528   15.38%     8.657    7.718   12.16%
vliw            33.938   32.570    4.20%     1.982    2.000   -0.92%

Average runtime speedup:      5.32%
Average compile time speedup: 5.95%
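
The notes do not say how the speedup columns are computed. Judging from
the numbers in the table, each speedup looks like old/new - 1 expressed as
a percentage, and each average looks like the plain arithmetic mean of the
16 per-benchmark percentages. The sketch below is that inferred formula,
not the actual benchmark script; the function names are made up. Results
agree with the printed figures only up to rounding of the printed times.

```python
def speedup_percent(old, new):
    """Speedup of `new` over `old` in percent; positive means faster."""
    return (old / new - 1.0) * 100.0


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


# tsp compile times from the table above (110.25, then 110.25+latest MLRISC)
tsp_compile = speedup_percent(1.763, 1.528)

# Runtime speedup column of the table above, in row order
runtime_speedups = [11.76, 0.53, 5.34, 7.98, 4.72, 1.66, 28.33, 6.22,
                    2.19, 3.75, 4.00, -0.84, 1.33, -3.14, 12.16, -0.92]
average_runtime = mean(runtime_speedups)
```

Rounded to two decimals, tsp_compile matches the 15.38% in the table and
average_runtime matches the reported 5.32%.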

As you can see, MLRISC on x86 has improved slightly since
110.25. Now comparing 110.25 with 110.27+:

110.25 versus 110.27+
---------------------

                     Compilation                   Runtime
Name            110.25  110.27+  Speedup    110.25  110.27+  Speedup
barnesHut        5.560    5.568   -0.15%     3.690    3.347   10.26%
boyer            8.678   10.058  -13.72%     0.313    0.322   -2.59%
count-graphs     2.048    2.125   -3.61%    28.577   31.533   -9.38%
fft              1.325    1.297    2.19%     1.060    1.052    0.79%
knuthBendix      5.337    3.737   42.82%     0.962    1.175  -18.16%
lexgen          10.310    9.985    3.25%     0.917    0.992   -7.56%
life             1.230    1.105   11.31%     0.128    0.125    2.67%
logic            3.533    3.302    7.02%     5.750    5.357    7.34%
mandelbrot       0.145    0.162  -10.31%     0.700    0.753   -7.08%
mlyacc          34.557   37.292   -7.33%     0.553    0.493   12.16%
nucleic          6.675    7.717  -13.50%     0.173    0.165    5.05%
ratio-regions    6.358    3.915   62.41%   119.753  126.153   -5.07%
ray              2.280    2.142    6.46%     3.563    2.827   26.06%
simple           9.798   10.067   -2.67%     2.987    2.812    6.22%
tsp              1.763    1.605    9.87%     8.657    8.870   -2.41%
vliw            33.938   46.907  -27.65%     1.982    1.802    9.99%

Average runtime speedup:      1.77%
Average compile time speedup: 4.15%

110.25+latest MLRISC versus 110.27+
-----------------------------------
(+MLRISC denotes 110.25+latest MLRISC)

                     Compilation                   Runtime
Name           +MLRISC  110.27+  Speedup   +MLRISC  110.27+  Speedup
barnesHut        5.368    5.568   -3.59%     3.302    3.347   -1.34%
boyer            7.672   10.058  -23.73%     0.312    0.322   -3.11%
count-graphs     1.915    2.125   -9.88%    27.128   31.533  -13.97%
fft              1.233    1.297   -4.88%     0.982    1.052   -6.66%
knuthBendix      4.680    3.737   25.25%     0.918    1.175  -21.84%
lexgen           9.950    9.985   -0.35%     0.902    0.992   -9.08%
life             1.127    1.105    1.96%     0.100    0.125  -20.00%
logic            3.258    3.302   -1.31%     5.413    5.357    1.06%
mandelbrot       0.148    0.162   -8.25%     0.685    0.753   -9.07%
mlyacc          33.342   37.292  -10.59%     0.533    0.493    8.11%
nucleic          6.507    7.717  -15.68%     0.167    0.165    1.01%
ratio-regions    6.218    3.915   58.83%   120.772  126.153   -4.27%
ray              2.247    2.142    4.90%     3.517    2.827   24.41%
simple           9.650   10.067   -4.14%     3.083    2.812    9.66%
tsp              1.528    1.605   -4.78%     7.718    8.870  -12.98%
vliw            32.570   46.907  -30.56%     2.000    1.802   11.01%

Average runtime speedup:      -2.94%
Average compile time speedup: -1.68%

Overall, I'd say 110.27+ is pretty competitive with 110.25. There are
some big runtime improvements (ray, vliw, mlyacc) which can only be
attributed to FLINT changes. But some things have slowed down (tsp,
knuthBendix, life, mandelbrot, count-graphs). [We know the reasons for
knuthBendix.] Also, compilation time in 110.27+ is generally slower,
although the average is offset by huge improvements in knuthBendix and
ratio-regions.