Home My Page Projects Code Snippets Project Openings SML/NJ
Summary Activity Forums Tracker Lists Tasks Docs Surveys News SCM Files

SCM Repository

[smlnj] Annotation of /sml/branches/SMLNJ/src/MLRISC/alpha/mltree/alpha.sml
ViewVC logotype

Annotation of /sml/branches/SMLNJ/src/MLRISC/alpha/mltree/alpha.sml

Parent Directory Parent Directory | Revision Log Revision Log


Revision 475 - (view) (download)

1 : monnier 409 (*
2 :     * This is a revamping of the Alpha32 instruction selection module
3 :     * using the new MLTREE and instruction representation. I've dropped
4 :     * the suffix 32 since we now support 64 bit datatypes.
5 :     *
6 :     * -- Allen
7 :     *
8 :     * Notes: places with optimizations are marked ***OPT***
9 :     *)
10 :    
11 :     functor Alpha
12 :     (structure AlphaInstr : ALPHAINSTR
13 :     structure AlphaMLTree : MLTREE
14 :     structure PseudoInstrs : ALPHA_PSEUDO_INSTR
15 : monnier 475 sharing AlphaMLTree.Region = AlphaInstr.Region
16 :     sharing AlphaMLTree.Constant = AlphaInstr.Constant
17 :     sharing PseudoInstrs.I = AlphaInstr
18 :     sharing PseudoInstrs.T = AlphaMLTree
19 : monnier 409
20 :     (* Cost of multiplication in cycles *)
21 :     val multCost : int ref
22 :    
23 :     (* Should we just use the native multiply by a constant? *)
24 :     val useMultByConst : bool ref
25 :     ) : MLTREECOMP =
26 :     struct
27 :    
28 :     structure T = AlphaMLTree
29 : monnier 429 structure S = T.Stream
30 : monnier 409 structure R = AlphaMLTree.Region
31 :     structure I = AlphaInstr
32 :     structure C = AlphaInstr.C
33 :     structure LE = LabelExp
34 :     structure W32 = Word32
35 : monnier 429 structure P = PseudoInstrs
36 : monnier 409
37 :     (*********************************************************
38 :    
39 :     Trap Shadows, Floating Exceptions, and Denormalized
40 :     Numbers on the DEC Alpha
41 :    
42 :     Andrew W. Appel and Lal George
43 :     Nov 28, 1995
44 :    
45 :     See section 4.7.5.1 of the Alpha Architecture Reference Manual.
46 :    
47 :     The Alpha has imprecise exceptions, meaning that if a floating
48 :     point instruction raises an IEEE exception, the exception may
49 :     not interrupt the processor until several successive instructions have
50 :     completed. ML, on the other hand, may want a "precise" model
51 :     of floating point exceptions.
52 :    
53 :     Furthermore, the Alpha hardware does not support denormalized numbers
54 :     (for "gradual underflow"). Instead, underflow always rounds to zero.
55 :     However, each floating operation (add, mult, etc.) has a trapping
56 :     variant that will raise an exception (imprecisely, of course) on
57 :     underflow; in that case, the instruction will produce a zero result
58 :     AND an exception will occur. In fact, there are several variants
59 :     of each instruction; three variants of MULT are:
60 :    
61 :     MULT s1,s2,d truncate denormalized result to zero; no exception
62 :     MULT/U s1,s2,d truncate denormalized result to zero; raise UNDERFLOW
63 :     MULT/SU s1,s2,d software completion, producing denormalized result
64 :    
65 :     The hardware treats the MULT/U and MULT/SU instructions identically,
66 :     truncating a denormalized result to zero and raising the UNDERFLOW
67 :     exception. But the operating system, on an UNDERFLOW exception,
68 :     examines the faulting instruction to see if it's an /SU form, and if so,
69 :     recalculates s1*s2, puts the right answer in d, and continues,
70 :     all without invoking the user's signal handler.
71 :    
72 :     Because most machines compute with denormalized numbers in hardware,
73 :     to maximize portability of SML programs, we use the MULT/SU form.
74 :     (and ADD/SU, SUB/SU, etc.) But to use this form successfully,
75 :     certain rules have to be followed. Basically, d cannot be the same
76 :     register as s1 or s2, because the opsys needs to be able to
77 :     recalculate the operation using the original contents of s1 and s2,
78 :     and the MULT/SU instruction will overwrite d even if it traps.
79 :    
80 :     More generally, we may want to have a sequence of floating-point
81 :     instructions. The rules for such a sequence are:
82 :    
83 :     1. The sequence should end with a TRAPB (trap barrier) instruction.
84 :     (This could be relaxed somewhat, but certainly a TRAPB would
85 :     be a good idea sometime before the next branch instruction or
86 :     update of an ML reference variable, or any other ML side effect.)
87 :     2. No instruction in the sequence should destroy any operand of itself
88 :     or of any previous instruction in the sequence.
89 :     3. No two instructions in the sequence should write the same destination
90 :     register.
91 :    
92 :     We can achieve these conditions by the following trick in the
93 :     Alpha code generator. Each instruction in the sequence will write
94 :     to a different temporary; this is guaranteed by the translation from
95 :     ML-RISC. At the beginning of the sequence, we will put a special
96 :     pseudo-instruction (we call it DEFFREG) that "defines" the destination
97 :     register of the arithmetic instruction. If there are K arithmetic
98 :     instructions in the sequence, then we'll insert K DEFFREG instructions
99 :     all at the beginning of the sequence.
100 :     Then, each arithop will not only "define" its destination temporary
101 :     but will "use" it as well. When all these instructions are fed to
102 :     the liveness analyzer, the resulting interference graph will then
103 :     have inteference edges satisfying conditions 2 and 3 above.
104 :    
105 :     Of course, DEFFREG doesn't actually generate any code. In our model
106 :     of the Alpha, every instruction generates exactly 4 bytes of code
107 :     except the "span-dependent" ones. Therefore, we'll specify DEFFREG
108 :     as a span-dependent instruction whose minimum and maximum sizes are zero.
109 :    
110 :     At the moment, we do not group arithmetic operations into sequences;
111 :     that is, each arithop will be preceded by a single DEFFREG and
112 :     followed by a TRAPB. To avoid the cost of all those TRAPB's, we
113 :     should improve this when we have time. Warning: Don't put more
114 :     than 31 instructions in the sequence, because they're all required
115 :     to write to different destination registers!
116 :    
117 :     What about multiple traps? For example, suppose a sequence of
118 :     instructions produces an Overflow and a Divide-by-Zero exception?
119 :     ML would like to know only about the earliest trap, but the hardware
120 :     will report BOTH traps to the operating system. However, as long
121 :     as the rules above are followed (and the software-completion versions
122 :     of the arithmetic instructions are used), the operating system will
123 :     have enough information to know which instruction produced the
124 :     trap. It is very probable that the operating system will report ONLY
125 :     the earlier trap to the user process, but I'm not sure.
126 :    
127 :     For a hint about what the operating system is doing in its own
128 :     trap-handler (with software completion), see section 6.3.2 of
129 :     "OpenVMS Alpha Software" (Part II of the Alpha Architecture
130 :     Manual). This stuff should apply to Unix (OSF1) as well as VMS.
131 :    
132 :     ****************************************************************)
133 :    
134 :     fun error msg = MLRiscErrorMsg.error("Alpha",msg)
135 :    
136 :    
137 :     (*
138 :     * This module is used to simulate operations of non-standard widths.
139 :     *)
140 :     structure Gen = MLTreeGen(structure T = T
141 :     val intTy = 64
142 :     val naturalWidths = [32,64]
143 : monnier 429 datatype rep = SE | ZE | NEITHER
144 :     val rep = SE
145 : monnier 409 )
146 :    
147 :     val zeroR = C.GPReg 31
148 :     val zeroOpn = I.REGop zeroR
149 :    
150 :    
151 :     (*
152 :     * Specialize the modules for multiplication/division
153 :     * by constant optimizations.
154 :     *)
155 :     functor Multiply32 = MLTreeMult
156 :     (structure I = I
157 :     structure T = T
158 :    
159 :     val intTy = 32
160 :    
161 : monnier 429 type arg = {r1:C.cell,r2:C.cell,d:C.cell}
162 :     type argi = {r:C.cell,i:int,d:C.cell}
163 : monnier 409
164 :     fun mov{r,d} = I.COPY{dst=[d],src=[r],tmp=NONE,impl=ref NONE}
165 :     fun add{r1,r2,d} = I.OPERATE{oper=I.ADDL,ra=r1,rb=I.REGop r2,rc=d}
166 :     (*
167 :     * How to left shift by a constant (32bits)
168 :     *)
169 :     fun slli{r,i=1,d} = [I.OPERATE{oper=I.ADDL,ra=r,rb=I.REGop r,rc=d}]
170 :     | slli{r,i=2,d} = [I.OPERATE{oper=I.S4ADDL,ra=r,rb=zeroOpn,rc=d}]
171 :     | slli{r,i=3,d} = [I.OPERATE{oper=I.S8ADDL,ra=r,rb=zeroOpn,rc=d}]
172 :     | slli{r,i,d} =
173 :     let val tmp = C.newReg()
174 :     in [I.OPERATE{oper=I.SLL,ra=r,rb=I.IMMop i,rc=tmp},
175 :     I.OPERATE{oper=I.SGNXL,ra=tmp,rb=zeroOpn,rc=d}]
176 :     end
177 :    
178 :     (*
179 :     * How to right shift (unsigned) by a constant (32bits)
180 :     *)
181 :     fun srli{r,i,d} =
182 :     let val tmp = C.newReg()
183 :     in [I.OPERATE{oper=I.ZAP,ra=r,rb=I.IMMop 0xf0,rc=tmp},
184 :     I.OPERATE{oper=I.SRL,ra=tmp,rb=I.IMMop i,rc=d}]
185 :     end
186 :    
187 :     (*
188 :     * How to right shift (signed) by a constant (32bits)
189 :     *)
190 :     fun srai{r,i,d} =
191 :     let val tmp = C.newReg()
192 :     in [I.OPERATE{oper=I.SGNXL,ra=r,rb=zeroOpn,rc=tmp},
193 :     I.OPERATE{oper=I.SRA,ra=tmp,rb=I.IMMop i,rc=d}]
194 :     end
195 :     )
196 :    
197 :     functor Multiply64 = MLTreeMult
198 :     (structure I = I
199 :     structure T = T
200 :    
201 :     val intTy = 64
202 :    
203 : monnier 429 type arg = {r1:C.cell,r2:C.cell,d:C.cell}
204 :     type argi = {r:C.cell,i:int,d:C.cell}
205 : monnier 409
206 :     fun mov{r,d} = I.COPY{dst=[d],src=[r],tmp=NONE,impl=ref NONE}
207 :     fun add{r1,r2,d}= I.OPERATE{oper=I.ADDQ,ra=r1,rb=I.REGop r2,rc=d}
208 :     fun slli{r,i,d} = [I.OPERATE{oper=I.SLL,ra=r,rb=I.IMMop i,rc=d}]
209 :     fun srli{r,i,d} = [I.OPERATE{oper=I.SRL,ra=r,rb=I.IMMop i,rc=d}]
210 :     fun srai{r,i,d} = [I.OPERATE{oper=I.SRA,ra=r,rb=I.IMMop i,rc=d}]
211 :     )
212 :    
213 :     (* signed, trapping version of multiply and divide *)
214 :     structure Mult32 = Multiply32
215 :     (val trapping = true
216 :     val multCost = multCost
217 :     fun addv{r1,r2,d} = [I.OPERATEV{oper=I.ADDLV,ra=r1,rb=I.REGop r2,rc=d}]
218 :     fun subv{r1,r2,d} = [I.OPERATEV{oper=I.SUBLV,ra=r1,rb=I.REGop r2,rc=d}]
219 :     val sh1addv = NONE
220 :     val sh2addv = NONE
221 :     val sh3addv = NONE
222 :     )
223 : monnier 429 (val signed = true)
224 : monnier 409
225 : monnier 429 (* non-trapping version of multiply and divide *)
226 :     functor Mul32 = Multiply32
227 : monnier 409 (val trapping = false
228 :     val multCost = multCost
229 :     fun addv{r1,r2,d} = [I.OPERATE{oper=I.ADDL,ra=r1,rb=I.REGop r2,rc=d}]
230 :     fun subv{r1,r2,d} = [I.OPERATE{oper=I.SUBL,ra=r1,rb=I.REGop r2,rc=d}]
231 :     val sh1addv = NONE
232 :     val sh2addv = SOME(fn {r1,r2,d} =>
233 :     [I.OPERATE{oper=I.S4ADDL,ra=r1,rb=I.REGop r2,rc=d}])
234 :     val sh3addv = SOME(fn {r1,r2,d} =>
235 :     [I.OPERATE{oper=I.S8ADDL,ra=r1,rb=I.REGop r2,rc=d}])
236 :     )
237 : monnier 429 structure Mulu32 = Mul32(val signed = false)
238 :     structure Muls32 = Mul32(val signed = true)
239 : monnier 409
240 :     (* signed, trapping version of multiply and divide *)
241 :     structure Mult64 = Multiply64
242 :     (val trapping = true
243 :     val multCost = multCost
244 :     fun addv{r1,r2,d} = [I.OPERATEV{oper=I.ADDQV,ra=r1,rb=I.REGop r2,rc=d}]
245 :     fun subv{r1,r2,d} = [I.OPERATEV{oper=I.SUBQV,ra=r1,rb=I.REGop r2,rc=d}]
246 :     val sh1addv = NONE
247 :     val sh2addv = NONE
248 :     val sh3addv = NONE
249 :     )
250 : monnier 429 (val signed = true)
251 : monnier 409
252 :     (* unsigned, non-trapping version of multiply and divide *)
253 : monnier 429 functor Mul64 = Multiply64
254 : monnier 409 (val trapping = false
255 :     val multCost = multCost
256 :     fun addv{r1,r2,d} = [I.OPERATE{oper=I.ADDQ,ra=r1,rb=I.REGop r2,rc=d}]
257 :     fun subv{r1,r2,d} = [I.OPERATE{oper=I.SUBQ,ra=r1,rb=I.REGop r2,rc=d}]
258 :     val sh1addv = NONE
259 :     val sh2addv = SOME(fn {r1,r2,d} =>
260 :     [I.OPERATE{oper=I.S4ADDQ,ra=r1,rb=I.REGop r2,rc=d}])
261 :     val sh3addv = SOME(fn {r1,r2,d} =>
262 :     [I.OPERATE{oper=I.S8ADDQ,ra=r1,rb=I.REGop r2,rc=d}])
263 :     )
264 : monnier 429 structure Mulu64 = Mul64(val signed = false)
265 :     structure Muls64 = Mul64(val signed = true)
266 : monnier 409
267 :     (*
268 :     * The main stuff
269 :     *)
270 :    
271 :     datatype times4or8 = TIMES1
272 :     | TIMES4
273 :     | TIMES8
274 :     datatype zeroOne = ZERO | ONE | OTHER
275 :     datatype commutative = COMMUTE | NOCOMMUTE
276 :    
277 :     fun selectInstructions
278 : monnier 429 (S.STREAM{emit,beginCluster,endCluster,
279 :     defineLabel,entryLabel,pseudoOp,annotation,
280 : monnier 469 exitBlock,phi,alias,comment,...}) =
281 : monnier 409 let
282 :     infix || && << >> ~>>
283 :    
284 :     val op || = W32.orb
285 :     val op && = W32.andb
286 :     val op << = W32.<<
287 :     val op >> = W32.>>
288 :     val op ~>> = W32.~>>
289 :    
290 :     val itow = Word.fromInt
291 :     val wtoi = Word.toIntX
292 :    
293 :     val zeroFR = C.FPReg 31
294 :     val zeroEA = I.Direct zeroR
295 :     val zeroT = T.LI 0
296 :    
297 :     val newReg = C.newReg
298 :     val newFreg = C.newFreg
299 :    
300 :     val trapb = [I.TRAPB]
301 :    
302 :     (* Choose the appropriate rounding mode to generate.
303 :     * This stuff is used to support the alpha32x SML/NJ backend.
304 : monnier 429 *
305 :     *
306 :     * Floating point rounding mode.
307 :     * When this is set to true, we use the /SU rounding mode
308 :     * (chopped towards zero) for floating point arithmetic.
309 :     * This flag is only used to support the old alpha32x backend.
310 :     *
311 :     * Otherwise, we use /SUD. This is the default for SML/NJ.
312 :     *
313 : monnier 409 *)
314 : monnier 429 val useSU = false
315 : monnier 409 val (ADDT,SUBT,MULT,DIVT) =
316 :     if useSU then (I.ADDTSU,I.SUBTSU,I.MULTSU,I.DIVTSU)
317 :     else (I.ADDTSUD,I.SUBTSUD,I.MULTSUD,I.DIVTSUD)
318 :     val (ADDS,SUBS,MULS,DIVS) =
319 :     if useSU then (I.ADDSSU,I.SUBSSU,I.MULSSU,I.DIVSSU)
320 :     else (I.ADDSSUD,I.SUBSSUD,I.MULSSUD,I.DIVSSUD)
321 :    
322 :     fun mark'(i,[]) = i
323 :     | mark'(i,a::an) = mark'(I.ANNOTATION{i=i,a=a},an)
324 :     fun mark(i,an) = emit(mark'(i,an))
325 :    
326 :     (* Fit within 16 bits? *)
327 :     fun literal16 n = ~32768 <= n andalso n < 32768
328 :     fun literal16w w =
329 :     let val hi = W32.~>>(w,0wx16)
330 :     in hi = 0w0 orelse (W32.notb hi) = 0w0 end
331 :    
332 :     (* emit an LDA instruction; return the register that holds the value *)
333 :     fun lda(base,I.IMMop 0) = base
334 :     | lda(base,offset) =
335 :     let val r = newReg()
336 :     in emit(I.LDA{r=r, b=base, d=offset}); r end
337 :    
338 :     (* emit load immed *)
339 :     fun loadImmed(n, base, rd, an) =
340 : monnier 469 if n =0 then
341 :     move(base, rd, an)
342 :     else if ~32768 <= n andalso n < 32768 then
343 : monnier 409 mark(I.LDA{r=rd, b=base, d=I.IMMop n},an)
344 :     else
345 :     let val w = itow n
346 :     val hi = Word.~>>(w, 0w16)
347 :     val lo = Word.andb(w, 0w65535)
348 :     val (hi', lo') =
349 :     if lo < 0w32768 then (hi, lo) else (hi+0w1, lo-0w65536)
350 :     val t = lda(base,I.IMMop(wtoi lo'))
351 :     in mark(I.LDAH{r=rd, b=t, d=I.IMMop(wtoi hi')},an)
352 :     end
353 :    
354 :     (* loadImmed32 is used to load int32 and word32 constants.
355 :     * In either case we sign extend the 32-bit value. This is compatible
356 :     * with LDL which sign extends a 32-bit valued memory location.
357 :     *)
358 : monnier 469 and loadImmed32(0w0, base, rd, an) =
359 :     move(base, rd, an)
360 : monnier 409 | loadImmed32(n, base, rd, an) = let
361 :     val low = W32.andb(n, 0w65535) (* unsigned (0 .. 65535) *)
362 :     val high = W32.~>>(n, 0w16) (* signed (~32768 .. 32768] *)
363 :     fun loadimmed(0, high) =
364 :     mark(I.LDAH{r=rd, b=base, d=I.IMMop(high)},an)
365 :     | loadimmed(low, 0) =
366 :     mark(I.LDA{r=rd, b=base, d=I.IMMop(low)},an)
367 :     | loadimmed(low, high) =
368 :     (emit(I.LDA{r=rd, b=base, d=I.IMMop(low)});
369 :     mark(I.LDAH{r=rd, b=rd, d=I.IMMop(high)},an))
370 :     in
371 :     if W32.<(low, 0w32768) then
372 :     loadimmed(W32.toInt low, W32.toIntX high)
373 :     else let (* low = (32768 .. 65535) *)
374 :     val lowsgn = W32.-(low, 0w65536) (* signed (~1 .. ~32768) *)
375 :     val highsgn = W32.+(high, 0w1) (* (~32768 .. 32768) *)
376 :     val ilow = W32.toIntX lowsgn
377 :     val ihigh = W32.toIntX highsgn
378 :     in
379 :     if ihigh <> 32768 then loadimmed(ilow, ihigh)
380 :     else
381 :     let val tmpR1 = newReg()
382 :     val tmpR2 = newReg()
383 : monnier 469 val tmpR3 = newReg()
384 : monnier 409 in
385 :     (* you gotta do what you gotta do! *)
386 : monnier 469 emit(I.LDA{r=tmpR3, b=base, d=I.IMMop(ilow)});
387 :     emit(I.OPERATE{oper=I.ADDQ, ra=zeroR, rb=I.IMMop 1, rc=tmpR1});
388 : monnier 409 emit(I.OPERATE{oper=I.SLL, ra=tmpR1, rb=I.IMMop 31, rc=tmpR2});
389 : monnier 469 mark(I.OPERATE{oper=I.ADDQ, ra=tmpR2, rb=I.REGop tmpR3,
390 :     rc=rd},an)
391 : monnier 409 end
392 :     end
393 :     end
394 :    
395 :     (* emit load immed *)
396 : monnier 469 and loadConst(c,d,an) = mark(I.LDA{r=d,b=zeroR,d=I.CONSTop c},an)
397 : monnier 409
398 :     (* emit load label *)
399 : monnier 469 and loadLabel(l,d,an) = mark(I.LDA{r=d,b=zeroR,d=I.LABop l},an)
400 : monnier 409
401 :     (* emit a copy *)
402 : monnier 469 and copy(dst,src,an) =
403 : monnier 409 mark(I.COPY{dst=dst,src=src,impl=ref NONE,
404 :     tmp=case dst of
405 :     [_] => NONE | _ => SOME(I.Direct(newReg()))},an)
406 :    
407 :     (* emit a floating point copy *)
408 : monnier 469 and fcopy(dst,src,an) =
409 : monnier 409 mark(I.FCOPY{dst=dst,src=src,impl=ref NONE,
410 :     tmp=case dst of
411 :     [_] => NONE | _ => SOME(I.FDirect(newFreg()))},an)
412 :    
413 : monnier 469 and move(s,d,an) =
414 : monnier 409 if s = d orelse d = zeroR then () else
415 :     mark(I.COPY{dst=[d],src=[s],impl=ref NONE,tmp=NONE},an)
416 :    
417 : monnier 469 and fmove(s,d,an) =
418 : monnier 409 if s = d orelse d = zeroFR then () else
419 :     mark(I.FCOPY{dst=[d],src=[s],impl=ref NONE,tmp=NONE},an)
420 :    
421 :     (* emit an sign extension op *)
422 : monnier 469 and signExt32(r,d) =
423 : monnier 409 emit(I.OPERATE{oper=I.SGNXL,ra=r,rb=zeroOpn,rc=d})
424 :    
425 :     (* emit an commutative arithmetic op *)
426 : monnier 469 and commArith(opcode,a,b,d,an) =
427 : monnier 409 case (opn a,opn b) of
428 :     (I.REGop r,i) => mark(I.OPERATE{oper=opcode,ra=r,rb=i,rc=d},an)
429 :     | (i,I.REGop r) => mark(I.OPERATE{oper=opcode,ra=r,rb=i,rc=d},an)
430 :     | (r,i) => mark(I.OPERATE{oper=opcode,ra=reduceOpn r,rb=i,rc=d},an)
431 :    
432 :     (* emit an arithmetic op *)
433 :     and arith(opcode,a,b,d,an) =
434 :     mark(I.OPERATE{oper=opcode,ra=expr a,rb=opn b,rc=d},an)
435 :     and arith'(opcode,a,b,d,an) =
436 :     let val rb = opn b
437 :     val ra = expr a
438 :     in mark(I.OPERATE{oper=opcode,ra=ra,rb=rb,rc=d},an) end
439 :    
440 :     (* emit a trapping commutative arithmetic op *)
441 :     and commArithTrap(opcode,a,b,d,an) =
442 :     (case (opn a,opn b) of
443 :     (I.REGop r,i) => mark(I.OPERATEV{oper=opcode,ra=r,rb=i,rc=d},an)
444 :     | (i,I.REGop r) => mark(I.OPERATEV{oper=opcode,ra=r,rb=i,rc=d},an)
445 :     | (r,i) => mark(I.OPERATEV{oper=opcode,ra=reduceOpn r,rb=i,rc=d},an);
446 :     emit(I.TRAPB)
447 :     )
448 :    
449 :     (* emit a trapping arithmetic op *)
450 :     and arithTrap(opcode,a,b,d,an) =
451 :     (mark(I.OPERATEV{oper=opcode,ra=expr a,rb=opn b,rc=d},an);
452 :     emit(I.TRAPB)
453 :     )
454 :    
455 :     (* convert an operand into a register *)
456 :     and reduceOpn(I.REGop r) = r
457 :     | reduceOpn(I.IMMop 0) = zeroR
458 :     | reduceOpn opn =
459 :     let val d = newReg()
460 :     in emit(I.OPERATE{oper=I.BIS,ra=zeroR,rb=opn,rc=d}); d end
461 :    
462 :     (* convert an expression into an operand *)
463 :     and opn(T.REG(_,r)) = I.REGop r
464 :     | opn(e as T.LI n) =
465 :     if n <= 255 andalso n >= 0 then I.IMMop n
466 :     else let val tmpR = newReg()
467 :     in loadImmed(n,zeroR,tmpR,[]); I.REGop tmpR end
468 :     | opn(e as T.LI32 w) =
469 :     if w <= 0w255 then I.IMMop(W32.toIntX w)
470 :     else let val tmpR = newReg()
471 :     in loadImmed32(w,zeroR,tmpR,[]); I.REGop tmpR end
472 :     | opn(T.CONST c) = I.CONSTop c
473 :     | opn e = I.REGop(expr e)
474 :    
475 :     (* compute base+displacement from an expression *)
476 :     and addr exp =
477 :     case exp of
478 :     T.ADD(_,e,T.LI n) => makeEA(expr e,n)
479 :     | T.ADD(_,T.LI n,e) => makeEA(expr e,n)
480 :     | T.ADD(_,e,T.CONST c) => (expr e,I.CONSTop c)
481 :     | T.ADD(_,T.CONST c,e) => (expr e,I.CONSTop c)
482 :     | T.SUB(_,e,T.LI n) => makeEA(expr e,~n)
483 :     | e => makeEA(expr e,0)
484 :    
485 :     (* compute base+displacement+small offset *)
486 :     and offset(base,disp as I.IMMop n,off) =
487 :     let val n' = n+off
488 :     in if literal16 n' then (base,I.IMMop n')
489 :     else
490 :     let val tmp = newReg()
491 :     in emit(I.OPERATE{oper=I.ADDQ,ra=base,rb=disp,rc=tmp});
492 :     (tmp,I.IMMop off)
493 :     end
494 :     end
495 :     | offset(base,disp,off) =
496 :     let val tmp = newReg()
497 :     in emit(I.OPERATE{oper=I.ADDQ,ra=base,rb=disp,rc=tmp});
498 :     (tmp,I.IMMop off)
499 :     end
500 :    
501 :     (* check if base offset *)
502 :     and makeEA(base, offset) =
503 :     if ~32768 <= offset andalso offset <= 32767 then (base, I.IMMop offset)
504 :     else
505 :     let val tmpR = newReg()
506 :     (* unsigned low 16 bits *)
507 :     val low = wtoi(Word.andb(itow offset, 0wxffff))
508 :     val high = offset div 65536
509 :     val (lowsgn, highsgn) = (* Sign-extend *)
510 :     if low <= 32767 then (low, high) else (low -65536, high+1)
511 :     in
512 :     (emit(I.LDAH{r=tmpR, b=base, d=I.IMMop highsgn});
513 :     (tmpR, I.IMMop lowsgn))
514 :     end
515 :    
516 :     (* look for multiply by 4 and 8 of the given type *)
517 :     and times4or8(ty,e) =
518 :     let fun f(t,a,n) = if t = ty then
519 :     if n = 4 then (TIMES4,a)
520 :     else if n = 8 then (TIMES8,a)
521 :     else (TIMES1,e)
522 :     else (TIMES1,e)
523 :     fun g(t,a,n) = if t = ty then
524 :     if n = 0w4 then (TIMES4,a)
525 :     else if n = 0w8 then (TIMES8,a)
526 :     else (TIMES1,e)
527 :     else (TIMES1,e)
528 :     fun u(t,a,n) = if t = ty then
529 :     if n = 2 then (TIMES4,a)
530 :     else if n = 3 then (TIMES8,a)
531 :     else (TIMES1,e)
532 :     else (TIMES1,e)
533 :     fun v(t,a,n) = if t = ty then
534 :     if n = 0w2 then (TIMES4,a)
535 :     else if n = 0w3 then (TIMES8,a)
536 :     else (TIMES1,e)
537 :     else (TIMES1,e)
538 :     in case e of
539 :     T.MULU(t,a,T.LI n) => f(t,a,n)
540 :     | T.MULU(t,a,T.LI32 n) => g(t,a,n)
541 :     | T.MULS(t,T.LI n,a) => f(t,a,n)
542 :     | T.MULS(t,T.LI32 n,a) => g(t,a,n)
543 :     | T.SLL(t,a,T.LI n) => u(t,a,n)
544 :     | T.SLL(t,a,T.LI32 n) => v(t,a,n)
545 :     | _ => (TIMES1,e)
546 :     end
547 :    
548 :     (* generate an add instruction
549 :     * ***OPT*** look for multiply by 4 and 8 and use the S4ADD and S8ADD
550 :     * forms.
551 :     *)
552 :     and plus(ty,add,s4add,s8add,a,b,d,an) =
553 :     (case times4or8(ty,a) of
554 :     (TIMES4,a) => arith(s4add,a,b,d,an)
555 :     | (TIMES8,a) => arith(s8add,a,b,d,an)
556 :     | _ =>
557 :     case times4or8(ty,b) of
558 :     (TIMES4,b) => arith'(s4add,b,a,d,an)
559 :     | (TIMES8,b) => arith'(s8add,b,a,d,an)
560 :     | _ => commArith(add,a,b,d,an)
561 :     )
562 :    
563 :     (* generate a subtract instruction
564 :     * ***OPT*** look for multiply by 4 and 8
565 :     *)
566 :     and minus(ty,sub,s4sub,s8sub,a,b,d,an) =
567 :     (case times4or8(ty,a) of
568 :     (TIMES4,a) => arith(s4sub,a,b,d,an)
569 :     | (TIMES8,a) => arith(s8sub,a,b,d,an)
570 :     | _ =>
571 : monnier 429 if ty = 64 then
572 : monnier 409 (case b of
573 :     (* use LDA to handle subtraction when possible
574 :     * Note: this may have sign extension problems later.
575 :     *)
576 :     T.LI i => (loadImmed(~i,expr a,d,an) handle Overflow =>
577 :     arith(sub,a,b,d,an))
578 :     | _ => arith(sub,a,b,d,an)
579 : monnier 429 ) else arith(sub,a,b,d,an)
580 : monnier 409 )
581 :    
582 :     (* look for special constants *)
583 :     and wordOpn(T.LI n) = SOME(W32.fromInt n)
584 :     | wordOpn(T.LI32 w) = SOME w
585 :     | wordOpn e = NONE
586 :    
587 : monnier 429 (* look for special byte mask constants
588 :     * IMPORTANT: we must ALWAYS keep the sign bit!
589 :     *)
590 :     and byteMask(_,SOME 0wx00000000) = 0xff
591 :     | byteMask(_,SOME 0wx000000ff) = 0xfe
592 :     | byteMask(_,SOME 0wx0000ff00) = 0xfd
593 :     | byteMask(_,SOME 0wx0000ffff) = 0xfc
594 :     | byteMask(_,SOME 0wx00ff0000) = 0xfb
595 :     | byteMask(_,SOME 0wx00ff00ff) = 0xfa
596 :     | byteMask(_,SOME 0wx00ffff00) = 0xf9
597 :     | byteMask(_,SOME 0wx00ffffff) = 0xf8
598 :     | byteMask(ty,SOME 0wxff000000) = if ty = 64 then 0xf7 else 0x07
599 :     | byteMask(ty,SOME 0wxff0000ff) = if ty = 64 then 0xf6 else 0x06
600 :     | byteMask(ty,SOME 0wxff00ff00) = if ty = 64 then 0xf5 else 0x05
601 :     | byteMask(ty,SOME 0wxff00ffff) = if ty = 64 then 0xf4 else 0x04
602 :     | byteMask(ty,SOME 0wxffff0000) = if ty = 64 then 0xf3 else 0x03
603 :     | byteMask(ty,SOME 0wxffff00ff) = if ty = 64 then 0xf2 else 0x02
604 :     | byteMask(ty,SOME 0wxffffff00) = if ty = 64 then 0xf1 else 0x01
605 :     | byteMask(ty,SOME 0wxffffffff) = if ty = 64 then 0xf0 else 0x00
606 :     | byteMask _ = ~1
607 : monnier 409
608 :     (* generate an and instruction
609 :     * look for special masks.
610 :     *)
611 :     and andb(ty,a,b,d,an) =
612 :     case byteMask(ty,wordOpn a) of
613 : monnier 429 ~1 => (case byteMask(ty,wordOpn b) of
614 :     ~1 => commArith(I.AND,a,b,d,an)
615 :     | mask => arith(I.ZAP,a,T.LI mask,d,an)
616 :     )
617 :     | mask => arith(I.ZAP,b,T.LI mask,d,an)
618 : monnier 409
619 :     (* generate sll/sra/srl *)
620 :     and sll32(a,b,d,an) =
621 :     case wordOpn b of
622 :     SOME 0w0 => doExpr(a,d,an)
623 :     | SOME 0w1 =>
624 :     let val r = T.REG(32,expr a) in arith(I.ADDL,r,r,d,an) end
625 :     | SOME 0w2 => arith(I.S4ADDL,a,zeroT,d,an)
626 :     | SOME 0w3 => arith(I.S8ADDL,a,zeroT,d,an)
627 :     | _ => let val t = newReg()
628 :     in arith(I.SLL,a,b,t,an);
629 :     signExt32(t,d)
630 :     end
631 :    
632 :     and sll64(a,b,d,an) =
633 :     case wordOpn b of
634 :     SOME 0w0 => doExpr(a,d,an)
635 :     | SOME 0w1 =>
636 :     let val r = T.REG(64,expr a) in arith(I.ADDQ,r,r,d,an) end
637 :     | SOME 0w2 => arith(I.S4ADDQ,a,zeroT,d,an)
638 :     | SOME 0w3 => arith(I.S8ADDQ,a,zeroT,d,an)
639 :     | _ => arith(I.SLL,a,b,d,an)
640 :    
641 :     and sra32(a,b,d,an) =
642 :     let val ra = expr a
643 :     val rb = opn b
644 :     val t = newReg()
645 : monnier 475 in (* On the alpha, all 32 bit values are already sign extended.
646 :     * So no sign extension is necessary.
647 :     * signExt32(ra,t);
648 :     * mark(I.OPERATE{oper=I.SRA,ra=t,rb=rb,rc=d},an)
649 :     *)
650 :     mark(I.OPERATE{oper=I.SRA,ra=ra,rb=rb,rc=d},an)
651 : monnier 409 end
652 :    
653 :     and sra64(a,b,d,an) =
654 :     mark(I.OPERATE{oper=I.SRA,ra=expr a,rb=opn b,rc=d},an)
655 :    
656 :     and srl32(a,b,d,an) =
657 :     let val ra = expr a
658 :     val rb = opn b
659 :     val t = newReg()
660 :     in emit(I.OPERATE{oper=I.ZAP,ra=ra,rb=I.IMMop 0xf0,rc=t});
661 :     mark(I.OPERATE{oper=I.SRL,ra=t,rb=rb,rc=d},an)
662 :     end
663 :    
664 :     and srl64(a,b,d,an) =
665 :     mark(I.OPERATE{oper=I.SRL,ra=expr a,rb=opn b,rc=d},an)
666 :    
667 :     (*
668 :     * Generic multiply.
669 :     * We first try to use the multiply by constant heuristic
670 :     *)
671 :     and multiply(ty,gen,genConst,e1,e2,rd,trapb,an) =
672 :     let fun nonconst(e1,e2) =
673 :     let val instr =
674 :     case (opn e1,opn e2) of
675 :     (i,I.REGop r) => gen{ra=r,rb=i,rc=rd}
676 :     | (I.REGop r,i) => gen{ra=r,rb=i,rc=rd}
677 :     | (r,i) => gen{ra=reduceOpn r,rb=i,rc=rd}
678 :     in mark'(instr,an)::trapb end
679 :     fun const(e,i) =
680 :     let val r = expr e
681 :     in if !useMultByConst andalso i >= 0 andalso i < 256 then
682 :     mark'(gen{ra=r,rb=I.IMMop i,rc=rd},an)::trapb
683 :     else
684 :     (genConst{r=r,i=i,d=rd}@trapb
685 :     handle _ => nonconst(T.REG(ty,r),T.LI i))
686 :     end
687 :     fun constw(e,i) = const(e,Word32.toInt i)
688 :     handle _ => nonconst(e,T.LI32 i)
689 :     val instrs =
690 :     case (e1,e2) of
691 :     (e1,T.LI i) => const(e1,i)
692 :     | (e1,T.LI32 i) => constw(e1,i)
693 :     | (T.LI i,e2) => const(e2,i)
694 :     | (T.LI32 i,e2) => constw(e2,i)
695 :     | _ => nonconst(e1,e2)
696 :     in app emit instrs
697 :     end
698 :    
699 :     (* Round r towards zero.
700 :     * I generate the following sequence of code, which should get
701 :     * mapped into conditional moves.
702 :     *
703 :     * d <- r + i;
704 :     * d <- if (r > 0) then r else d
705 :     *)
706 :     and roundToZero{ty,r,i,d} =
707 :     (doStmt(T.MV(ty,d,T.ADD(ty,T.REG(ty,r),T.LI i)));
708 :     doStmt(T.MV(ty,d,T.COND(ty,T.CMP(ty,T.GE,T.REG(ty,r),T.LI 0),
709 :     T.REG(ty,r),T.REG(ty,d))))
710 :     )
711 :    
712 :     (*
713 :     * Generic division.
714 :     * We first try to use the division by constant heuristic
715 :     *)
716 :     and divide(ty,pseudo,genDiv,e1,e2,rd,an) =
717 :     let fun nonconst(e1,e2) =
718 :     pseudo({ra=expr e1,rb=opn e2,rc=rd},reduceOpn)
719 :    
720 :     fun const(e,i) =
721 :     let val r = expr e
722 :     in genDiv{mode=T.TO_ZERO,roundToZero=roundToZero}
723 :     {r=r,i=i,d=rd}
724 :     handle _ => nonconst(T.REG(ty,r),T.LI i)
725 :     end
726 :     fun constw(e,i) = const(e,Word32.toInt i)
727 :     handle _ => nonconst(e,T.LI32 i)
728 :     val instrs =
729 :     case e2 of
730 :     T.LI i => const(e1,i)
731 :     | T.LI32 i => constw(e1,i)
732 :     | _ => nonconst(e1,e2)
733 :     in app emit instrs
734 :     end
735 :    
736 :    
737 :     (*
738 :     and multTrap(MULV,ADD,ADDV,e1,e2,rd,an) = (* signed multiply and trap *)
739 :     let val ADD = fn {ra,rb,rc} => I.OPERATE{oper=ADD,ra=ra,rb=rb,rc=rc}
740 :     val ADDV = fn {ra,rb,rc} => I.OPERATEV{oper=ADDV,ra=ra,rb=rb,rc=rc}
741 :     val MULV = fn {ra,rb,rc} => I.OPERATEV{oper=MULV,ra=ra,rb=rb,rc=rc}
742 :     in multiply(MULV,ADD,ADDV,e1,e2,rd,an);
743 :     emit(I.TRAPB)
744 :     end
745 :    
746 :     and mulu(MUL,ADD,e1,e2,rd,an) = (* unsigned multiply *)
747 :     let val ADD = fn {ra,rb,rc} => I.OPERATE{oper=ADD,ra=ra,rb=rb,rc=rc}
748 :     val MUL = fn {ra,rb,rc} => I.OPERATE{oper=MUL,ra=ra,rb=rb,rc=rc}
749 :     in multiply(MUL,ADD,ADD,e1,e2,rd,an)
750 :     end
751 :    
752 :     (* Multiplication *)
753 :     and multiply(MULV, ADD, ADDV, e1, e2, rd, an) =
754 :     let val reg = expr e1
755 :     val opn = opn e2
756 :     fun emitMulvImmed (reg, 0, rd) =
757 :     emit(I.LDA{r=rd, b=zeroR, d=I.IMMop 0})
758 :     | emitMulvImmed (reg, 1, rd) =
759 :     emit(ADD{ra=reg, rb=zeroOpn, rc=rd})
760 :     | emitMulvImmed (reg, multiplier, rd) =
761 :     let fun log2 0w1 = 0 | log2 n = 1 + (log2 (Word.>> (n, 0w1)))
762 :     fun exp2 n = Word.<<(0w1, n)
763 :     fun bitIsSet (x,n) = Word.andb(x,exp2 n) <> 0w0
764 :     fun loop (~1) = ()
765 :     | loop n =
766 :     (if bitIsSet(itow multiplier, itow n) then
767 :     emit(ADDV{ra=reg,rb=I.REGop rd,rc=rd})
768 :     else ();
769 :     if n>0 then
770 :     emit(ADDV{ra=rd,rb=I.REGop rd,rc=rd})
771 :     else ();
772 :     loop (n-1))
773 :     in emit(ADDV{ra=reg, rb=I.REGop reg, rc=rd});
774 :     loop ((log2 (itow multiplier)) - 1)
775 :     end
776 :     in case opn of
777 :     (I.IMMop multiplier) => emitMulvImmed (reg, multiplier, rd)
778 :     | _ => mark(MULV{ra=reg, rb=opn, rc=rd},an)
779 :     (*esac*)
780 :     end
781 :     *)
782 :    
783 :     (* generate pseudo instruction *)
784 :     and pseudo(instr,e1,e2,rc) =
785 :     app emit (instr({ra=expr e1,rb=opn e2,rc=rc}, reduceOpn))
786 :    
787 :     (* generate a load *)
788 :     and load(ldOp,ea,d,mem,an) =
789 :     let val (base,disp) = addr ea
790 :     in mark(I.LOAD{ldOp=ldOp,r=d,b=base,d=disp,mem=mem},an) end
791 :    
792 :     (* generate a load with zero extension *)
793 :     and loadZext(ea,rd,mem,EXT,an) =
794 :     let val (b, d) = addr ea
795 :     val t1 = newReg()
796 :     val _ = mark(I.LOAD{ldOp=I.LDQ_U, r=t1, b=b, d=d, mem=mem},an);
797 :     val t2 = lda(b,d)
798 :     in emit(I.OPERATE{oper=EXT, ra=t1, rb=I.REGop t2, rc=rd}) end
799 :    
800 :     (* generate a load with sign extension *)
801 :     and loadSext(ea,rd,mem,off,EXT,shift,an) =
802 :     let val (b, d) = addr ea
803 :     val (b',d') = offset(b,d,off)
804 :     val t1 = newReg()
805 :     val t2 = newReg()
806 :     val t3 = newReg()
807 :     in mark(I.LOAD{ldOp=I.LDQ_U, r=t1, b=b, d=d, mem=mem},an);
808 :     emit(I.LDA{r=t2, b=b', d=d'});
809 :     emit(I.OPERATE{oper=EXT, ra=t1, rb=I.REGop t2, rc=t3});
810 :     emit(I.OPERATE{oper=I.SRA, ra=t3, rb=I.IMMop shift, rc=rd})
811 :     end
812 :    
813 :     (* generate a load byte with zero extension (page 4-48) *)
814 :     and load8(ea,rd,mem,an) = loadZext(ea,rd,mem,I.EXTBL,an)
815 :    
816 :     (* generate a load byte with sign extension (page 4-48) *)
817 :     and load8s(ea,rd,mem,an) = loadSext(ea,rd,mem,1,I.EXTQH,56,an)
818 :    
819 :     (* generate a load 16 bit *)
820 :     and load16(ea,rd,mem,an) = loadZext(ea,rd,mem,I.EXTWL,an)
821 :    
822 :     (* generate a load 16 bit with sign extension *)
823 :     and load16s(ea,rd,mem,an) = loadSext(ea,rd,mem,2,I.EXTQH,48,an)
824 :    
825 :     (* generate a load 32 bit with sign extension *)
826 :     and load32s(ea,rd,mem,an) = load(I.LDL,ea,rd,mem,an)
827 :    
828 :     (* generate a floating point load *)
829 :     and fload(ldOp,ea,d,mem,an) =
830 :     let val (base,disp) = addr ea
831 :     in mark(I.FLOAD{ldOp=ldOp,r=d,b=base,d=disp,mem=mem},an) end
832 :    
833 :     (* generate a store *)
834 :     and store(stOp,ea,data,mem,an) =
835 :     let val (base,disp) = addr ea
836 :     in mark(I.STORE{stOp=stOp,r=expr data,b=base,d=disp,mem=mem},an) end
837 :    
838 :     (* generate an store8 or store16 *)
839 :     and storeUnaligned(ea,data,mem,INS,MSK,an) =
840 :     let val (base,disp) = addr ea
841 :     val data = expr data
842 :     val t1 = newReg()
843 :     val t3 = newReg()
844 :     val t4 = newReg()
845 :     val t5 = newReg()
846 :     val _ = emit(I.LOAD{ldOp=I.LDQ_U, r=t1, b=base, d=disp, mem=mem})
847 :     val t2 = lda(base,disp)
848 :     in emit(I.OPERATE{oper=INS, ra=data, rb=I.REGop(t2), rc=t3});
849 :     emit(I.OPERATE{oper=MSK, ra=t1, rb=I.REGop(t2), rc=t4});
850 :     emit(I.OPERATE{oper=I.BIS, ra=t4, rb=I.REGop(t3), rc=t5});
851 :     mark(I.STORE{stOp=I.STQ_U, r=t5, b=base, d=disp, mem=mem},an)
852 :     end
853 :    
854 :     (* generate a store byte *)
855 :     and store8(ea,data,mem,an) =
856 :     storeUnaligned(ea,data,mem,I.INSBL,I.MSKBL,an)
857 :    
858 :     (* generate a store16 *)
859 :     and store16(ea,data,mem,an) =
860 :     storeUnaligned(ea,data,mem,I.INSWL,I.MSKWL,an)
861 :    
862 : monnier 429 (* generate conversion from floating point to integer *)
863 :     and cvtf2i(pseudo,rounding,e,rd,an) =
864 :     app emit (pseudo{mode=rounding, fs=fexpr e, rd=rd})
865 :    
866 : monnier 409 (* generate an expression and return the register that holds the result *)
867 :     and expr(T.REG(_,r)) = r
868 :     | expr(T.LI 0) = zeroR
869 :     | expr(T.LI32 0w0) = zeroR
870 :     | expr e = let val r = newReg()
871 :     in doExpr(e,r,[]); r end
872 :    
873 :     (* generate an expression that targets register d *)
874 : monnier 429 and doExpr(exp,d,an) =
875 :     case exp of
876 : monnier 409 T.REG(_,r) => move(r,d,an)
877 :     | T.LI n => loadImmed(n,zeroR,d,an)
878 :     | T.LI32 w => loadImmed32(w,zeroR,d,an)
879 :     | T.LABEL l => loadLabel(l,d,an)
880 :     | T.CONST c => loadConst(c,d,an)
881 :    
882 :     (* special optimizations for additions and subtraction
883 :     * Question: using LDA for all widths is not really correct
884 :     * since the result may not fit into the sign extension scheme.
885 :     *)
886 : monnier 429 | T.ADD(64,e,T.LABEL le) => mark(I.LDA{r=d,b=expr e,d=I.LABop le},an)
887 :     | T.ADD(64,T.LABEL le,e) => mark(I.LDA{r=d,b=expr e,d=I.LABop le},an)
888 :     | T.ADD(64,e,T.CONST c) => mark(I.LDA{r=d,b=expr e,d=I.CONSTop c},an)
889 :     | T.ADD(64,T.CONST c,e) => mark(I.LDA{r=d,b=expr e,d=I.CONSTop c},an)
890 :     | T.ADD(64,e,T.LI i) => loadImmed(i, expr e, d, an)
891 :     | T.ADD(64,T.LI i,e) => loadImmed(i, expr e, d, an)
892 :     | T.ADD(64,e,T.LI32 i) => loadImmed32(i, expr e, d, an)
893 :     | T.ADD(64,T.LI32 i,e) => loadImmed32(i, expr e, d, an)
894 : monnier 409 | T.SUB(_,a,(T.LI 0 | T.LI32 0w0)) => doExpr(a,d,an)
895 :    
896 :     (* 32-bit support *)
897 :     | T.ADD(32,a,b) => plus(32,I.ADDL,I.S4ADDL,I.S8ADDL,a,b,d,an)
898 :     | T.SUB(32,a,b) => minus(32,I.SUBL,I.S4SUBL,I.S8SUBL,a,b,d,an)
899 :     | T.ADDT(32,a,b) => commArithTrap(I.ADDLV,a,b,d,an)
900 :     | T.SUBT(32,a,b) => arithTrap(I.SUBLV,a,b,d,an)
901 : monnier 429 | T.MULT(32,a,b) =>
902 : monnier 409 multiply(32,
903 :     fn{ra,rb,rc} => I.OPERATEV{oper=I.MULLV,ra=ra,rb=rb,rc=rc},
904 :     Mult32.multiply,a,b,d,trapb,an)
905 : monnier 429 | T.MULU(32,a,b) =>
906 : monnier 409 multiply(32,
907 :     fn{ra,rb,rc} => I.OPERATE{oper=I.MULL,ra=ra,rb=rb,rc=rc},
908 :     Mulu32.multiply,a,b,d,[],an)
909 : monnier 429 | T.MULS(32,a,b) =>
910 :     multiply(32,
911 :     fn{ra,rb,rc} => I.OPERATE{oper=I.MULL,ra=ra,rb=rb,rc=rc},
912 :     Muls32.multiply,a,b,d,[],an)
913 :     | T.DIVT(32,a,b) => divide(32,P.divlv,Mult32.divide,a,b,d,an)
914 :     | T.DIVU(32,a,b) => divide(32,P.divlu,Mulu32.divide,a,b,d,an)
915 :     | T.DIVS(32,a,b) => divide(32,P.divl,Muls32.divide,a,b,d,an)
916 :     | T.REMT(32,a,b) => pseudo(P.remlv,a,b,d)
917 :     | T.REMU(32,a,b) => pseudo(P.remlu,a,b,d)
918 :     | T.REMS(32,a,b) => pseudo(P.reml,a,b,d)
919 :    
920 : monnier 409 | T.SLL(32,a,b) => sll32(a,b,d,an)
921 :     | T.SRA(32,a,b) => sra32(a,b,d,an)
922 :     | T.SRL(32,a,b) => srl32(a,b,d,an)
923 :    
924 :     (* 64 bit support *)
925 :     | T.ADD(64,a,b) => plus(64,I.ADDQ,I.S4ADDQ,I.S8ADDQ,a,b,d,an)
926 :     | T.SUB(64,a,b) => minus(64,I.SUBQ,I.S4SUBQ,I.S8SUBQ,a,b,d,an)
927 :     | T.ADDT(64,a,b) => commArithTrap(I.ADDQV,a,b,d,an)
928 :     | T.SUBT(64,a,b) => arithTrap(I.SUBQV,a,b,d,an)
929 : monnier 429 | T.MULT(64,a,b) =>
930 : monnier 409 multiply(64,
931 :     fn{ra,rb,rc} => I.OPERATEV{oper=I.MULQV,ra=ra,rb=rb,rc=rc},
932 :     Mult64.multiply,a,b,d,trapb,an)
933 : monnier 429 | T.MULU(64,a,b) =>
934 : monnier 409 multiply(64,
935 :     fn{ra,rb,rc} => I.OPERATE{oper=I.MULQ,ra=ra,rb=rb,rc=rc},
936 :     Mulu64.multiply,a,b,d,[],an)
937 : monnier 429 | T.MULS(64,a,b) =>
938 :     multiply(64,
939 :     fn{ra,rb,rc} => I.OPERATE{oper=I.MULQ,ra=ra,rb=rb,rc=rc},
940 :     Muls64.multiply,a,b,d,[],an)
941 :     | T.DIVT(64,a,b) => divide(64,P.divqv,Mult64.divide,a,b,d,an)
942 :     | T.DIVU(64,a,b) => divide(64,P.divqu,Mulu64.divide,a,b,d,an)
943 :     | T.DIVS(64,a,b) => divide(64,P.divq,Muls64.divide,a,b,d,an)
944 :     | T.REMT(64,a,b) => pseudo(P.remqv,a,b,d)
945 :     | T.REMU(64,a,b) => pseudo(P.remqu,a,b,d)
946 :     | T.REMS(64,a,b) => pseudo(P.remq,a,b,d)
947 :    
948 : monnier 409 | T.SLL(64,a,b) => sll64(a,b,d,an)
949 :     | T.SRA(64,a,b) => sra64(a,b,d,an)
950 :     | T.SRL(64,a,b) => srl64(a,b,d,an)
951 :    
952 :     (* special bit operations with complement *)
953 :     | T.ANDB(_,a,T.NOTB(_,b)) => arith(I.BIC,a,b,d,an)
954 :     | T.ORB(_,a,T.NOTB(_,b)) => arith(I.ORNOT,a,b,d,an)
955 :     | T.XORB(_,a,T.NOTB(_,b)) => commArith(I.EQV,a,b,d,an)
956 :     | T.ANDB(_,T.NOTB(_,a),b) => arith(I.BIC,b,a,d,an)
957 :     | T.ORB(_,T.NOTB(_,a),b) => arith(I.ORNOT,b,a,d,an)
958 :     | T.XORB(_,T.NOTB(_,a),b) => commArith(I.EQV,b,a,d,an)
959 :     | T.NOTB(_,T.XORB(_,a,b)) => commArith(I.EQV,b,a,d,an)
960 :    
961 :     (* bit operations *)
962 :     | T.ANDB(ty,a,b) => andb(ty,a,b,d,an)
963 :     | T.XORB(_,a,b) => commArith(I.XOR,a,b,d,an)
964 :     | T.ORB(_,a,b) => commArith(I.BIS,a,b,d,an)
965 :     | T.NOTB(_,e) => arith(I.ORNOT,zeroT,e,d,an)
966 :    
967 :     (* loads *)
968 : monnier 475 | T.CVTI2I(_,T.SIGN_EXTEND,_,T.LOAD(8,ea,mem)) =>
969 :     load8s(ea,d,mem,an)
970 :     | T.CVTI2I(_,T.SIGN_EXTEND,_,T.LOAD(16,ea,mem)) =>
971 :     load16s(ea,d,mem,an)
972 :     | T.CVTI2I(_,T.SIGN_EXTEND,_,T.LOAD(32,ea,mem)) =>
973 :     load32s(ea,d,mem,an)
974 : monnier 409 | T.LOAD(8,ea,mem) => load8(ea,d,mem,an)
975 :     | T.LOAD(16,ea,mem) => load16(ea,d,mem,an)
976 : monnier 429 | T.LOAD(32,ea,mem) => load32s(ea,d,mem,an)
977 : monnier 409 | T.LOAD(64,ea,mem) => load(I.LDQ,ea,d,mem,an)
978 :    
979 : monnier 429 (* floating -> int conversion *)
980 : monnier 475 | T.CVTF2I(ty,rounding,fty,e) =>
981 :     (case (fty,ty) of
982 : monnier 429 (32,32) => cvtf2i(P.cvtsl,rounding,e,d,an)
983 :     | (32,64) => cvtf2i(P.cvtsq,rounding,e,d,an)
984 :     | (64,32) => cvtf2i(P.cvttl,rounding,e,d,an)
985 :     | (64,64) => cvtf2i(P.cvttq,rounding,e,d,an)
986 :     | _ => doExpr(Gen.compile exp,d,an) (* other cases *)
987 :     )
988 :    
989 : monnier 409 (* conversion to boolean *)
990 :     | T.COND(_,T.CMP(ty,cond,e1,e2),T.LI 1,T.LI 0) =>
991 :     compare(ty,cond,e1,e2,d,an)
992 :     | T.COND(_,T.CMP(ty,cond,e1,e2),T.LI 0,T.LI 1) =>
993 : monnier 475 compare(ty,T.Util.negateCond cond,e1,e2,d,an)
994 : monnier 409 | T.COND(_,T.CMP(ty,cond,e1,e2),x,y) =>
995 :     cmove(ty,cond,e1,e2,x,y,d,an)
996 :    
997 :     | T.SEQ(s,e) => (doStmt s; doExpr(e,d,an))
998 :     | T.MARK(e,a) => doExpr(e,d,a::an)
999 : monnier 475
1000 :     (* On the alpha: all 32 bit values are already sign extended.
1001 :     * So no sign extension is necessary
1002 :     *)
1003 :     | T.CVTI2I(64, _, 32, e) => doExpr(e, d, an)
1004 : monnier 429
1005 :     (* Defaults *)
1006 : monnier 409 | e => doExpr(Gen.compile e,d,an)
1007 :    
1008 :     (* Hmmm... this is the funky thing described in the comments
1009 :     * in at the top of the file. This should be made parametrizable
1010 :     * for other backends.
1011 :     *)
1012 :     and farith(opcode,a,b,d,an) =
1013 :     let val fa = fexpr a
1014 :     val fb = fexpr b
1015 :     in emit(I.DEFFREG d);
1016 :     mark(I.FOPERATEV{oper=opcode,fa=fa,fb=fb,fc=d},an);
1017 :     emit(I.TRAPB)
1018 :     end
1019 :    
1020 : monnier 429 and funary(opcode,e,d,an) = mark(I.FUNARY{oper=opcode,fb=fexpr e,fc=d},an)
1021 :    
1022 :    
1023 : monnier 409 (* generate an floating point expression
1024 :     * return the register that holds the result
1025 :     *)
1026 :     and fexpr(T.FREG(_,r)) = r
1027 :     | fexpr e = let val d = newFreg() in doFexpr(e,d,[]); d end
1028 :    
1029 :     (* generate an external floating point operation *)
1030 : monnier 429 and fcvti2f(pseudo,e,fd,an) =
1031 : monnier 409 let val opnd = opn e
1032 : monnier 429 in app emit (pseudo({opnd=opnd, fd=fd}, reduceOpn))
1033 : monnier 409 end
1034 :    
1035 :     (* generate a floating point store *)
1036 :     and fstore(stOp,ea,data,mem,an) =
1037 :     let val (base,disp) = addr ea
1038 :     in mark(I.FSTORE{stOp=stOp,r=fexpr data,b=base,d=disp,mem=mem},an)
1039 :     end
1040 :    
1041 :     (* generate a floating point expression that targets register d *)
1042 :     and doFexpr(e,d,an) =
1043 :     case e of
1044 :     T.FREG(_,f) => fmove(f,d,an)
1045 :    
1046 :     (* single precision support *)
1047 :     | T.FADD(32,a,b) => farith(ADDS,a,b,d,an)
1048 :     | T.FSUB(32,a,b) => farith(SUBS,a,b,d,an)
1049 :     | T.FMUL(32,a,b) => farith(MULS,a,b,d,an)
1050 :     | T.FDIV(32,a,b) => farith(DIVS,a,b,d,an)
1051 :    
1052 :     (* double precision support *)
1053 :     | T.FADD(64,a,b) => farith(ADDT,a,b,d,an)
1054 :     | T.FSUB(64,a,b) => farith(SUBT,a,b,d,an)
1055 :     | T.FMUL(64,a,b) => farith(MULT,a,b,d,an)
1056 :     | T.FDIV(64,a,b) => farith(DIVT,a,b,d,an)
1057 :    
1058 : monnier 429
1059 : monnier 409 (* generic *)
1060 :     | T.FABS(_,a) =>
1061 :     mark(I.FOPERATE{oper=I.CPYS,fa=zeroFR,fb=fexpr a,fc=d},an)
1062 :     | T.FNEG(_,a) =>
1063 :     let val fs = fexpr a
1064 :     in mark(I.FOPERATE{oper=I.CPYSN,fa=fs,fb=fs,fc=d},an) end
1065 :     | T.FSQRT(_,a) => error "fsqrt"
1066 :    
1067 :     (* loads *)
1068 :     | T.FLOAD(32,ea,mem) => fload(I.LDS,ea,d,mem,an)
1069 :     | T.FLOAD(64,ea,mem) => fload(I.LDT,ea,d,mem,an)
1070 : monnier 429
1071 :     (* floating/floating conversion
1072 :     * Note: it is not necessary to convert single precision
1073 :     * to double on the alpha.
1074 :     *)
1075 : monnier 475 | T.CVTF2F(fty,_,fty',e) => (* ignore rounding mode for now *)
1076 :     (case (fty,fty') of
1077 : monnier 429 (64,64) => doFexpr(e,d,an)
1078 :     | (64,32) => doFexpr(e,d,an)
1079 :     | (32,32) => doFexpr(e,d,an)
1080 :     | (32,64) => funary(I.CVTTS,e,d,an) (* use normal rounding *)
1081 :     | _ => error "CVTF2F"
1082 :     )
1083 : monnier 409
1084 : monnier 429 (* integer -> floating point conversion *)
1085 : monnier 475 | T.CVTI2F(fty,T.SIGN_EXTEND,ty,e) =>
1086 : monnier 429 let val pseudo =
1087 : monnier 475 case (ty,fty) of
1088 : monnier 429 (ty,32) => if ty <= 32 then P.cvtls else P.cvtqs
1089 :     | (ty,64) => if ty <= 32 then P.cvtlt else P.cvtqt
1090 :     | _ => error "CVTI2F"
1091 :     in fcvti2f(pseudo,e,d,an) end
1092 :    
1093 : monnier 409 (* misc *)
1094 :     | T.FSEQ(s,e) => (doStmt s; doFexpr(e,d,an))
1095 :     | T.FMARK(e,a) => doFexpr(e,d,a::an)
1096 :    
1097 :     | _ => error "doFexpr"
1098 :    
1099 :     (* check whether an expression is andb(e,1) *)
1100 :     and isAndb1(T.ANDB(_,e,T.LI 1)) = (true,e)
1101 :     | isAndb1(T.ANDB(_,e,T.LI32 0w1)) = (true,e)
1102 :     | isAndb1(T.ANDB(_,T.LI 1,e)) = (true,e)
1103 :     | isAndb1(T.ANDB(_,T.LI32 0w1,e)) = (true,e)
1104 :     | isAndb1 e = (false,e)
1105 :    
1106 :     and zeroOrOne(T.LI 0) = ZERO
1107 :     | zeroOrOne(T.LI32 0w0) = ZERO
1108 :     | zeroOrOne(T.LI 1) = ONE
1109 :     | zeroOrOne(T.LI32 0w1) = ONE
1110 :     | zeroOrOne _ = OTHER
1111 :    
1112 :     (* compile a branch *)
1113 :     and branch(c,e,lab,an) =
1114 :     case e of
1115 :     T.CMP(ty,cc,e1 as T.LI _,e2) =>
1116 : monnier 475 branchBS(ty,T.Util.swapCond cc,e2,e1,lab,an)
1117 : monnier 409 | T.CMP(ty,cc,e1 as T.LI32 _,e2) =>
1118 : monnier 475 branchBS(ty,T.Util.swapCond cc,e2,e1,lab,an)
1119 : monnier 409 | T.CMP(ty,cc,e1,e2) => branchBS(ty,cc,e1,e2,lab,an)
1120 :     | e => mark(I.BRANCH(I.BNE,ccExpr e,lab),an)
1121 :    
1122 :     and br(opcode,exp,lab,an) = mark(I.BRANCH(opcode,expr exp,lab),an)
1123 :    
1124 :     (* Use the branch on bit set/clear instruction when possible *)
1125 :     and branchBS(ty,cc,a,b,lab,an) =
1126 :     (case (cc,isAndb1 a,zeroOrOne b) of
1127 :     (T.EQ,(true,e),ONE) => br(I.BLBS,e,lab,an)
1128 :     | (T.EQ,(true,e),ZERO) => br(I.BLBC,e,lab,an)
1129 :     | (T.NE,(true,e),ZERO) => br(I.BLBS,e,lab,an)
1130 :     | (T.NE,(true,e),ONE) => br(I.BLBC,e,lab,an)
1131 :     | (cc,_,_) => branchIt(ty,cc,a,b,lab,an)
1132 :     )
1133 :    
1134 :     (* generate a branch instruction.
1135 :     * Check for branch on zero as a special case
1136 :     *)
1137 :     and branchIt(ty,cc,e,T.LI 0,lab,an) = branchIt0(cc,e,lab,an)
1138 :     | branchIt(ty,cc,e,T.LI32 0w0,lab,an) = branchIt0(cc,e,lab,an)
1139 :     | branchIt(ty,cc,e1,e2,lab,an) = branchItOther(ty,cc,e1,e2,lab,an)
1140 :    
1141 :     (* generate a branch instruction.
1142 :     * This function optimizes the special case of comparison with zero.
1143 :     *)
1144 :     and branchIt0(T.EQ,e,lab,an) = br(I.BEQ,e,lab,an)
1145 :     | branchIt0(T.NE,e,lab,an) = br(I.BNE,e,lab,an)
1146 :     | branchIt0(T.GT,e,lab,an) = br(I.BGT,e,lab,an)
1147 :     | branchIt0(T.GE,e,lab,an) = br(I.BGE,e,lab,an)
1148 :     | branchIt0(T.LE,e,lab,an) = br(I.BLE,e,lab,an)
1149 :     | branchIt0(T.LT,e,lab,an) = br(I.BLT,e,lab,an)
1150 :     | branchIt0(T.GTU,e,lab,an) = br(I.BNE,e,lab,an) (* always > 0! *)
1151 :     | branchIt0(T.GEU,e,lab,an) = (* always true! *) goto(lab,an)
1152 :     | branchIt0(T.LTU,e,lab,an) = (* always false! *) ()
1153 :     | branchIt0(T.LEU,e,lab,an) = br(I.BEQ,e,lab,an) (* never < 0! *)
1154 :    
1155 :     (* Generate the operands for unsigned comparisons
1156 :     * Mask out high order bits whenever necessary.
1157 :     *)
1158 :     and unsignedCmpOpnds(ty,e1,e2) =
1159 :     let fun zapHi(r,mask) =
1160 :     let val d = newReg()
1161 :     in emit(I.OPERATE{oper=I.ZAP, ra=r, rb=I.IMMop mask,rc=d});
1162 :     I.REGop d
1163 :     end
1164 :    
1165 :     fun zap(opn as I.REGop r) =
1166 :     (case ty of
1167 :     8 => zapHi(r,0xfd)
1168 :     | 16 => zapHi(r,0xfc)
1169 :     | 32 => zapHi(r,0xf0)
1170 :     | 64 => opn
1171 :     | _ => error "unsignedCmpOpnds"
1172 :     )
1173 :     | zap opn = opn
1174 :     val opn1 = opn e1
1175 :     val opn2 = opn e2
1176 :     in (zap opn1,zap opn2) end
1177 :    
1178 :     (* Generate a branch *)
1179 :     and branchItOther(ty,cond,e1,e2,lab,an) =
1180 :     let val tmpR = newReg()
1181 :     fun signedCmp(cmp,br) =
1182 :     (emit(I.OPERATE{oper=cmp, ra=expr e1, rb=opn e2, rc=tmpR});
1183 :     mark(I.BRANCH(br, tmpR, lab),an)
1184 :     )
1185 :     fun unsignedCmp(ty,cmp,br) =
1186 :     let val (x,y) = unsignedCmpOpnds(ty,e1,e2)
1187 :     in emit(I.OPERATE{oper=cmp,ra=reduceOpn x,rb=y,rc=tmpR});
1188 :     mark(I.BRANCH(br, tmpR, lab),an)
1189 :     end
1190 :     in case cond of
1191 :     T.LT => signedCmp(I.CMPLT,I.BNE)
1192 :     | T.LE => signedCmp(I.CMPLE,I.BNE)
1193 :     | T.GT => signedCmp(I.CMPLE,I.BEQ)
1194 :     | T.GE => signedCmp(I.CMPLT,I.BEQ)
1195 :     | T.EQ => signedCmp(I.CMPEQ,I.BNE)
1196 :     | T.NE => signedCmp(I.CMPEQ,I.BEQ)
1197 :     | T.LTU => unsignedCmp(ty,I.CMPULT,I.BNE)
1198 :     | T.LEU => unsignedCmp(ty,I.CMPULE,I.BNE)
1199 :     | T.GTU => unsignedCmp(ty,I.CMPULE,I.BEQ)
1200 :     | T.GEU => unsignedCmp(ty,I.CMPULT,I.BEQ)
1201 :     end
1202 :    
1203 :     (* This function generates a conditional move:
1204 :     * d = if cond(a,b) then x else y
1205 :     * Apparently, only signed comparisons conditional moves
1206 :     * are supported on the alpha.
1207 :     *)
1208 :     and cmove(ty,cond,a,b,x,y,d,an) =
1209 :     let val _ = doExpr(y,d,[]) (* evaluate false case *)
1210 :    
1211 :     val (cond,a,b) =
1212 :     (* move the immed operand to b *)
1213 :     case a of
1214 :     (T.LI _ | T.LI32 _ | T.CONST _) =>
1215 : monnier 475 (T.Util.swapCond cond,b,a)
1216 : monnier 409 | _ => (cond,a,b)
1217 :    
1218 :     fun sub(a,(T.LI 0 | T.LI32 0w0)) = expr a
1219 :     | sub(a,b) = expr(T.SUB(ty,a,b))
1220 :    
1221 :     fun cmp(cond,e1,e2) =
1222 :     let val d = newReg()
1223 :     in compare(ty,cond,e1,e2,d,[]); d end
1224 :    
1225 :     val (oper,ra,x,y) =
1226 :     case (cond,isAndb1 a,zeroOrOne b) of
1227 :     (* low bit set/clear? *)
1228 :     (T.EQ,(true,e),ONE) => (I.CMOVLBS,expr e,x,y)
1229 :     | (T.EQ,(true,e),ZERO) => (I.CMOVLBC,expr e,x,y)
1230 :     | (T.NE,(true,e),ZERO) => (I.CMOVLBS,expr e,x,y)
1231 :     | (T.NE,(true,e),ONE) => (I.CMOVLBC,expr e,x,y)
1232 :     (* signed *)
1233 :     | (T.EQ,_,_) => (I.CMOVEQ,sub(a,b),x,y)
1234 :     | (T.NE,_,_) => (I.CMOVEQ,cmp(T.EQ,a,b),y,x)
1235 :     | (T.GT,_,_) => (I.CMOVGT,sub(a,b),x,y)
1236 :     | (T.GE,_,_) => (I.CMOVGE,sub(a,b),x,y)
1237 :     | (T.LT,_,_) => (I.CMOVLT,sub(a,b),x,y)
1238 :     | (T.LE,_,_) => (I.CMOVLE,sub(a,b),x,y)
1239 :    
1240 :     (* unsigned: do compare then use the condition code *)
1241 :     | (T.LTU,_,_) => (I.CMOVEQ,cmp(T.GEU,a,b),x,y)
1242 :     | (T.LEU,_,_) => (I.CMOVEQ,cmp(T.GTU,a,b),x,y)
1243 :     | (T.GTU,_,_) => (I.CMOVEQ,cmp(T.LEU,a,b),x,y)
1244 :     | (T.GEU,_,_) => (I.CMOVEQ,cmp(T.LTU,a,b),x,y)
1245 :     in mark(I.CMOVE{oper=oper,ra=ra,rb=opn x,rc=d},an) (* true case *)
1246 :     end
1247 :    
1248 :    
1249 :     (* This function generates a comparion between e1 and e2 and writes
1250 :     * the result to register d.
1251 :     * It'll mask out the high order 32-bits when performing
1252 :     * unsigned 32-bit integer comparisons.
1253 :     *)
1254 :     and compare(ty,cond,e1,e2,d,an) =
1255 :     let fun signedCmp(oper,a,b,d) =
1256 :     mark(I.OPERATE{oper=oper,ra=expr a,rb=opn b,rc=d},an)
1257 :     fun unsignedCmp(ty,oper,a,b,d) =
1258 :     let val (x,y) = unsignedCmpOpnds(ty,a,b)
1259 :     in mark(I.OPERATE{oper=oper,ra=reduceOpn x,rb=y,rc=d},an)
1260 :     end
1261 :     fun eq(a,b,d) =
1262 :     (case (opn a,opn b) of
1263 :     (a,I.REGop r) =>
1264 :     mark(I.OPERATE{oper=I.CMPEQ,ra=r,rb=a,rc=d},an)
1265 :     | (a,b) =>
1266 :     mark(I.OPERATE{oper=I.CMPEQ,ra=reduceOpn a,rb=b,rc=d},an)
1267 :     )
1268 :     fun neq(a,b,d) =
1269 :     let val tmp = newReg()
1270 :     in eq(a,b,tmp);
1271 :     emit(I.OPERATE{oper=I.CMPEQ,ra=tmp,rb=zeroOpn,rc=d})
1272 :     end
1273 :     val (cond,e1,e2) =
1274 :     case e1 of
1275 :     (T.LI _ | T.LI32 _ | T.CONST _) =>
1276 : monnier 475 (T.Util.swapCond cond,e2,e1)
1277 : monnier 409 | _ => (cond,e1,e2)
1278 :     in case cond of
1279 :     T.EQ => eq(e1,e2,d)
1280 :     | T.NE => neq(e1,e2,d)
1281 :     | T.GT => signedCmp(I.CMPLT,e2,e1,d)
1282 :     | T.GE => signedCmp(I.CMPLE,e2,e1,d)
1283 :     | T.LT => signedCmp(I.CMPLT,e1,e2,d)
1284 :     | T.LE => signedCmp(I.CMPLE,e1,e2,d)
1285 :     | T.GTU => unsignedCmp(ty,I.CMPULT,e2,e1,d)
1286 :     | T.GEU => unsignedCmp(ty,I.CMPULE,e2,e1,d)
1287 :     | T.LTU => unsignedCmp(ty,I.CMPULT,e1,e2,d)
1288 :     | T.LEU => unsignedCmp(ty,I.CMPULE,e1,e2,d)
1289 :     end
1290 :    
1291 :     (* generate an unconditional branch *)
1292 :     and goto(lab,an) = mark(I.BRANCH(I.BR,zeroR,lab),an)
1293 :    
1294 :     (* generate an call instruction *)
1295 :     and call(ea,def,use,mem,an) =
1296 :     let val pv = expr ea
1297 :     val returnPtrR = 26
1298 :     fun live([],acc) = acc
1299 :     | live(T.GPR(T.REG(_,r))::regs,acc) = live(regs, C.addReg(r,acc))
1300 :     | live(T.CCR(T.CC cc)::regs,acc) = live(regs, C.addReg(cc,acc))
1301 :     | live(T.FPR(T.FREG(_,f))::regs,acc) = live(regs, C.addFreg(f,acc))
1302 :     | live(_::regs, acc) = live(regs, acc)
1303 :     in mark(I.JSR({r=returnPtrR, b=pv, d=0},
1304 :     live(def, C.addReg(returnPtrR, C.empty)),
1305 :     live(use, C.addReg(pv, C.empty)),mem),an)
1306 :     end
1307 :    
1308 :     (* generate an floating point branch *)
1309 :     and fbranch(_,T.FCMP(fty,cc,e1,e2),lab,an) =
1310 :     let val f1 = fexpr e1
1311 :     val f2 = fexpr e2
1312 :     fun bcc(cmp,br) =
1313 :     let val tmpR = C.newFreg()
1314 :     in emit(I.DEFFREG(tmpR));
1315 :     emit(I.FOPERATE{oper=cmp,fa=f1,fb=f2,fc=tmpR});
1316 :     emit(I.TRAPB);
1317 :     mark(I.FBRANCH(br,tmpR,lab),an)
1318 :     end
1319 :     fun fall(cmp1, br1, cmp2, br2) =
1320 :     let val tmpR1 = newFreg()
1321 :     val tmpR2 = newFreg()
1322 :     val fallLab = Label.newLabel ""
1323 :     in emit(I.DEFFREG(tmpR1));
1324 :     emit(I.FOPERATE{oper=cmp1, fa=f1, fb=f2, fc=tmpR1});
1325 :     emit(I.TRAPB);
1326 :     mark(I.FBRANCH(br1, tmpR1, fallLab),an);
1327 :     emit(I.DEFFREG(tmpR2));
1328 :     emit(I.FOPERATE{oper=cmp2, fa=f1, fb=f2, fc=tmpR2});
1329 :     emit(I.TRAPB);
1330 :     mark(I.FBRANCH(br2, tmpR2, lab),an);
1331 :     defineLabel fallLab
1332 :     end
1333 :     fun bcc2(cmp1, br1, cmp2, br2) = (bcc(cmp1, br1); bcc(cmp2, br2))
1334 :     in case cc of
1335 :     T.== => bcc(I.CMPTEQSU, I.FBNE)
1336 :     | T.?<> => bcc(I.CMPTEQSU, I.FBEQ)
1337 :     | T.? => bcc(I.CMPTUNSU, I.FBNE)
1338 :     | T.<=> => bcc(I.CMPTUNSU, I.FBEQ)
1339 :     | T.> => fall(I.CMPTLESU, I.FBNE, I.CMPTUNSU, I.FBEQ)
1340 :     | T.>= => fall(I.CMPTLTSU, I.FBNE, I.CMPTUNSU, I.FBEQ)
1341 :     | T.?> => bcc(I.CMPTLESU, I.FBEQ)
1342 :     | T.?>= => bcc(I.CMPTLTSU, I.FBEQ)
1343 :     | T.< => bcc(I.CMPTLTSU, I.FBNE)
1344 :     | T.<= => bcc(I.CMPTLESU, I.FBNE)
1345 :     | T.?< => bcc2(I.CMPTLTSU, I.FBNE, I.CMPTUNSU, I.FBNE)
1346 :     | T.?<= => bcc2(I.CMPTLESU, I.FBNE, I.CMPTUNSU, I.FBNE)
1347 :     | T.<> => fall(I.CMPTEQSU, I.FBNE, I.CMPTUNSU, I.FBEQ)
1348 :     | T.?= => bcc2(I.CMPTEQSU, I.FBNE, I.CMPTUNSU, I.FBNE)
1349 :     end
1350 :     | fbranch _ = error "fbranch"
1351 :    
1352 :     (* generate an floating point branch *)
1353 :     and doCCexpr(T.CC r,d,an) = move(r,d,an)
1354 :     | doCCexpr(T.CMP(ty,cond,e1,e2),d,an) = compare(ty,cond,e1,e2,d,an)
1355 :     | doCCexpr(T.FCMP(fty,cond,e1,e2),d,an) = error "doCCexpr"
1356 :     | doCCexpr(T.CCMARK(e,a),d,an) = doCCexpr(e,d,a::an)
1357 :    
1358 :     and ccExpr(T.CC r) = r
1359 :     | ccExpr e = let val d = newReg()
1360 :     in doCCexpr(e,d,[]); d end
1361 :    
1362 :     (* compile a statement *)
1363 :     and stmt(s,an) =
1364 :     case s of
1365 :     T.MV(ty,r,e) => doExpr(e,r,an)
1366 :     | T.FMV(ty,r,e) => doFexpr(e,r,an)
1367 :     | T.CCMV(r,e) => doCCexpr(e,r,an)
1368 :     | T.COPY(ty,dst,src) => copy(dst,src,an)
1369 :     | T.FCOPY(ty,dst,src) => fcopy(dst,src,an)
1370 :     | T.JMP(T.LABEL(LE.LABEL lab),_) => goto(lab,an)
1371 :     | T.JMP(e,labs) => mark(I.JMPL({r=zeroR,b=expr e,d=0},labs),an)
1372 :     | T.BCC(cond,e,lab) => branch(cond,e,lab,an)
1373 :     | T.FBCC(cond,e,lab) => fbranch(cond,e,lab,an)
1374 :     | T.CALL(e,def,use,mem) => call(e,def,use,mem,an)
1375 :     | T.RET => mark(I.RET{r=zeroR,b=26,d=0},an)
1376 :     | T.STORE(8,ea,data,mem) => store8(ea,data,mem,an)
1377 :     | T.STORE(16,ea,data,mem) => store16(ea,data,mem,an)
1378 :     | T.STORE(32,ea,data,mem) => store(I.STL,ea,data,mem,an)
1379 :     | T.STORE(64,ea,data,mem) => store(I.STQ,ea,data,mem,an)
1380 :     | T.FSTORE(32,ea,data,mem) => fstore(I.STS,ea,data,mem,an)
1381 :     | T.FSTORE(64,ea,data,mem) => fstore(I.STT,ea,data,mem,an)
1382 :     | T.ANNOTATION(s,a) => stmt(s,a::an)
1383 :     | _ => error "stmt"
1384 :    
1385 :     and doStmt s = stmt(s,[])
1386 :    
1387 : monnier 429 (* condition code registers are mapped onto general registers *)
1388 :     fun cc(T.CCR(T.CC cc)) = T.GPR(T.REG(32,cc))
1389 :     | cc r = r
1390 : monnier 409
1391 : monnier 429 in S.STREAM
1392 :     { beginCluster= beginCluster,
1393 :     endCluster = endCluster,
1394 :     emit = doStmt,
1395 :     pseudoOp = pseudoOp,
1396 :     defineLabel = defineLabel,
1397 :     entryLabel = entryLabel,
1398 :     comment = comment,
1399 :     annotation = annotation,
1400 :     exitBlock = fn regs => exitBlock(map cc regs),
1401 :     alias = alias,
1402 :     phi = phi
1403 : monnier 409 }
1404 :     end
1405 :    
1406 :     end
1407 :    

root@smlnj-gforge.cs.uchicago.edu
ViewVC Help
Powered by ViewVC 1.0.0