Home My Page Projects Code Snippets Project Openings SML/NJ
Summary Activity Forums Tracker Lists Tasks Docs Surveys News SCM Files

SCM Repository

[smlnj] Annotation of /sml/branches/SMLNJ/src/MLRISC/alpha32/alpha32.sml
ViewVC logotype

Annotation of /sml/branches/SMLNJ/src/MLRISC/alpha32/alpha32.sml

Parent Directory Parent Directory | Revision Log Revision Log


Revision 113 - (view) (download)

1 : monnier 16 (* alpha32.sml
2 :     *
3 :     * COPYRIGHT (c) 1995 AT&T Bell Laboratories.
4 :     *
5 :     * generates machine code from the mltree.
6 :     *
7 :     *)
8 :    
9 :     (** NOTE: f29 and f30 are used as temporaries.
10 :     ** r31 is always zero.
11 :     **)
12 :     functor Alpha32
13 :     (structure Alpha32Instr : ALPHA32INSTR
14 :     structure Alpha32MLTree : MLTREE
15 :     structure Flowgen : FLOWGRAPH_GEN
16 :     structure PseudoInstrs : ALPHA32_PSEUDO_INSTR
17 :     sharing Alpha32Instr.Region = Alpha32MLTree.Region
18 :     sharing Flowgen.I = PseudoInstrs.I = Alpha32Instr
19 :     sharing Flowgen.T=Alpha32MLTree
20 :     sharing Alpha32MLTree.Constant = Alpha32Instr.Constant) : MLTREECOMP =
21 :     struct
22 :     structure F = Flowgen
23 :     structure T = Alpha32MLTree
24 :     structure R = Alpha32MLTree.Region
25 :     structure I = Alpha32Instr
26 :     structure C = Alpha32Instr.C
27 :     structure LE = LabelExp
28 :     structure W32 = Word32
29 : monnier 106
30 : monnier 16 (*********************************************************
31 :    
32 :     Trap Shadows, Floating Exceptions, and Denormalized
33 :     Numbers on the DEC Alpha
34 :    
35 :     Andrew W. Appel and Lal George
36 :     Nov 28, 1995
37 :    
38 :     See section 4.7.5.1 of the Alpha Architecture Reference Manual.
39 :    
40 :     The Alpha has imprecise exceptions, meaning that if a floating
41 :     point instruction raises an IEEE exception, the exception may
42 :     not interrupt the processor until several successive instructions have
43 :     completed. ML, on the other hand, may want a "precise" model
44 :     of floating point exceptions.
45 :    
46 :     Furthermore, the Alpha hardware does not support denormalized numbers
47 :     (for "gradual underflow"). Instead, underflow always rounds to zero.
48 :     However, each floating operation (add, mult, etc.) has a trapping
49 :     variant that will raise an exception (imprecisely, of course) on
50 :     underflow; in that case, the instruction will produce a zero result
51 :     AND an exception will occur. In fact, there are several variants
52 :     of each instruction; three variants of MULT are:
53 :    
54 :     MULT s1,s2,d truncate denormalized result to zero; no exception
55 :     MULT/U s1,s2,d truncate denormalized result to zero; raise UNDERFLOW
56 :     MULT/SU s1,s2,d software completion, producing denormalized result
57 :    
58 :     The hardware treats the MULT/U and MULT/SU instructions identically,
59 :     truncating a denormalized result to zero and raising the UNDERFLOW
60 :     exception. But the operating system, on an UNDERFLOW exception,
61 :     examines the faulting instruction to see if it's an /SU form, and if so,
62 :     recalculates s1*s2, puts the right answer in d, and continues,
63 :     all without invoking the user's signal handler.
64 :    
65 :     Because most machines compute with denormalized numbers in hardware,
66 :     to maximize portability of SML programs, we use the MULT/SU form.
67 :     (and ADD/SU, SUB/SU, etc.) But to use this form successfully,
68 :     certain rules have to be followed. Basically, d cannot be the same
69 :     register as s1 or s2, because the opsys needs to be able to
70 :     recalculate the operation using the original contents of s1 and s2,
71 :     and the MULT/SU instruction will overwrite d even if it traps.
72 :    
73 :     More generally, we may want to have a sequence of floating-point
74 :     instructions. The rules for such a sequence are:
75 :    
76 :     1. The sequence should end with a TRAPB (trap barrier) instruction.
77 :     (This could be relaxed somewhat, but certainly a TRAPB would
78 :     be a good idea sometime before the next branch instruction or
79 :     update of an ML reference variable, or any other ML side effect.)
80 :     2. No instruction in the sequence should destroy any operand of itself
81 :     or of any previous instruction in the sequence.
82 :     3. No two instructions in the sequence should write the same destination
83 :     register.
84 :    
85 :     We can achieve these conditions by the following trick in the
86 :     Alpha code generator. Each instruction in the sequence will write
87 :     to a different temporary; this is guaranteed by the translation from
88 :     ML-RISC. At the beginning of the sequence, we will put a special
89 :     pseudo-instruction (we call it DEFFREG) that "defines" the destination
90 :     register of the arithmetic instruction. If there are K arithmetic
91 :     instructions in the sequence, then we'll insert K DEFFREG instructions
92 :     all at the beginning of the sequence.
93 :     Then, each arithop will not only "define" its destination temporary
94 :     but will "use" it as well. When all these instructions are fed to
95 :     the liveness analyzer, the resulting interference graph will then
96 :     have inteference edges satisfying conditions 2 and 3 above.
97 :    
98 :     Of course, DEFFREG doesn't actually generate any code. In our model
99 :     of the Alpha, every instruction generates exactly 4 bytes of code
100 :     except the "span-dependent" ones. Therefore, we'll specify DEFFREG
101 :     as a span-dependent instruction whose minimum and maximum sizes are zero.
102 :    
103 :     At the moment, we do not group arithmetic operations into sequences;
104 :     that is, each arithop will be preceded by a single DEFFREG and
105 :     followed by a TRAPB. To avoid the cost of all those TRAPB's, we
106 :     should improve this when we have time. Warning: Don't put more
107 :     than 31 instructions in the sequence, because they're all required
108 :     to write to different destination registers!
109 :    
110 :     What about multiple traps? For example, suppose a sequence of
111 :     instructions produces an Overflow and a Divide-by-Zero exception?
112 :     ML would like to know only about the earliest trap, but the hardware
113 :     will report BOTH traps to the operating system. However, as long
114 :     as the rules above are followed (and the software-completion versions
115 :     of the arithmetic instructions are used), the operating system will
116 :     have enough information to know which instruction produced the
117 :     trap. It is very probable that the operating system will report ONLY
118 :     the earlier trap to the user process, but I'm not sure.
119 :    
120 :     For a hint about what the operating system is doing in its own
121 :     trap-handler (with software completion), see section 6.3.2 of
122 :     "OpenVMS Alpha Software" (Part II of the Alpha Architecture
123 :     Manual). This stuff should apply to Unix (OSF1) as well as VMS.
124 :    
125 :     ****************************************************************)
126 :    
127 :     fun error msg = MLRiscErrorMsg.impossible ("Alpha32." ^ msg)
128 :    
129 :     val itow = Word.fromInt
130 :     val wtoi = Word.toIntX
131 :    
132 :     val emit = F.emitInstr
133 :    
134 :     val zeroR = 31
135 :     val zeroOp = I.REGop zeroR
136 :     val zeroEA = I.Direct zeroR
137 :     val argReg0 = 16
138 :    
139 :     val mem = R.memory
140 :    
141 :     fun cond T.LT = I.CC_LT | cond T.LTU = I.CC_LTU
142 :     | cond T.LE = I.CC_LE | cond T.LEU = I.CC_LEU
143 :     | cond T.EQ = I.CC_EQ | cond T.NEQ = I.CC_NEQ
144 :     | cond T.GE = I.CC_GE | cond T.GEU = I.CC_GEU
145 :     | cond T.GT = I.CC_GT | cond T.GTU = I.CC_GTU
146 :    
147 :     fun swapcc I.CC_LT = I.CC_GT | swapcc I.CC_LTU = I.CC_GTU
148 :     | swapcc I.CC_LE = I.CC_GE | swapcc I.CC_LEU = I.CC_GEU
149 :     | swapcc I.CC_EQ = I.CC_EQ | swapcc I.CC_NEQ = I.CC_NEQ
150 :     | swapcc I.CC_GT = I.CC_LT | swapcc I.CC_GEU = I.CC_LEU
151 :     | swapcc I.CC_GE = I.CC_LE | swapcc _ = error "swapcc"
152 :    
153 :     (* NOTE: stack allocation must be multiples of 16 *)
154 :     fun stackAllocate(n) =
155 :     I.OPERATE{oper=I.SUBL, ra=C.stackptrR, rb=I.IMMop n, rc=C.stackptrR}
156 :    
157 :     fun stackDeallocate(n) =
158 :     I.OPERATE{oper=I.ADDL, ra=C.stackptrR, rb=I.IMMop n, rc=C.stackptrR}
159 :    
160 :     fun loadImmed(n, base, rd) =
161 :     if ~32768 <= n andalso n < 32768 then
162 :     emit(I.LDA{r=rd, b=base, d=I.IMMop n})
163 :     else let
164 :     val w = itow n
165 :     val hi = Word.~>>(w, 0w16)
166 :     val lo = Word.andb(w, 0w65535)
167 :     val (hi', lo') =
168 :     if lo < 0w32768 then (hi, lo) else (hi+0w1, lo-0w65536)
169 :     in
170 :     emit(I.LDA{r=rd, b=base, d=I.IMMop(wtoi lo')});
171 :     emit(I.LDAH{r=rd, b=rd, d=I.IMMop(wtoi hi')})
172 :     end
173 :    
174 :     (* loadImmed32 is used to load int32 and word32 constants.
175 :     * In either case we sign extend the 32-bit value. This is compatible
176 :     * with LDL which sign extends a 32-bit valued memory location.
177 :     *)
178 :     fun loadImmed32 (n, base, rd) = let
179 :     val low = W32.andb(n, 0w65535) (* unsigned low 16 bits *)
180 :     val high = W32.div(n, 0w65536) (* Sign-extend *)
181 :     val (lowsgn, highsgn) =
182 :     if W32.<=(low, 0w32767) then (low, high)
183 :     else (W32.-(low,0w65536), W32.+(high,0w1))
184 :     val highsgn' = W32.andb(highsgn, 0w65535)
185 :     in
186 :     emit(I.LDA{r=rd, b=base, d=I.IMMop(W32.toIntX lowsgn)});
187 :     if (highsgn' = 0w0) then ()
188 :     else if highsgn' < 0w32768 then
189 :     emit(I.LDAH{r=rd, b=rd, d=I.IMMop(W32.toIntX highsgn)})
190 :     else
191 :     emit(I.LDAH{r=rd, b=rd,
192 :     d=I.IMMop(W32.toIntX (W32.-(highsgn, 0w65536)))})
193 :     end
194 :    
195 :    
196 :     fun orderedFArith (exp1, exp2, T.LR) = (fregAction exp1, fregAction exp2)
197 :     | orderedFArith (exp1, exp2, T.RL) = let
198 :     val f2 = fregAction exp2
199 :     in (fregAction exp1, f2)
200 :     end
201 :    
202 :     and stmAction exp = let
203 :     fun fbranch(_, T.FCMP(cc, exp1, exp2, order), lab) = let
204 :     val (f1, f2) = orderedFArith(exp1, exp2, order)
205 :     fun bcc(cmp, br) = let
206 : monnier 106 val tmpR = C.newFreg()
207 : monnier 16 in
208 :     emit(I.DEFFREG(tmpR));
209 :     emit(I.FOPERATE{oper=cmp, fa=f1, fb=f2, fc=tmpR});
210 :     emit(I.TRAPB);
211 :     emit(I.FBRANCH(br, tmpR, lab))
212 :     end
213 :    
214 :     fun fall(cmp1, br1, cmp2, br2) = let
215 : monnier 106 val tmpR1 = C.newFreg()
216 :     val tmpR2 = C.newFreg()
217 : monnier 16 val fallLab = Label.newLabel ""
218 :     in
219 :     emit(I.DEFFREG(tmpR1));
220 :     emit(I.FOPERATE{oper=cmp1, fa=f1, fb=f2, fc=tmpR1});
221 :     emit(I.TRAPB);
222 :     emit(I.FBRANCH(br1, tmpR1, fallLab));
223 :     emit(I.DEFFREG(tmpR2));
224 :     emit(I.FOPERATE{oper=cmp2, fa=f1, fb=f2, fc=tmpR2});
225 :     emit(I.TRAPB);
226 :     emit(I.FBRANCH(br2, tmpR2, lab));
227 :     F.defineLabel fallLab
228 :     end
229 :    
230 :     fun bcc2(cmp1, br1, cmp2, br2) = (bcc(cmp1, br1); bcc(cmp2, br2))
231 :     in
232 :     case cc
233 :     of T.== => bcc(I.CMPTEQ, I.BNE)
234 :     | T.?<> => bcc(I.CMPTEQ, I.BEQ)
235 :     | T.? => bcc(I.CMPTUN, I.BNE)
236 :     | T.<=> => bcc(I.CMPTUN, I.BEQ)
237 :     | T.> => fall(I.CMPTLE, I.BNE, I.CMPTUN, I.BEQ)
238 :     | T.>= => fall(I.CMPTLT, I.BNE, I.CMPTUN, I.BEQ)
239 :     | T.?> => bcc(I.CMPTLE, I.BEQ)
240 :     | T.?>= => bcc(I.CMPTLT, I.BEQ)
241 :     | T.< => bcc(I.CMPTLT, I.BNE)
242 :     | T.<= => bcc(I.CMPTLE, I.BNE)
243 :     | T.?< => bcc2(I.CMPTLT, I.BNE, I.CMPTUN, I.BNE)
244 :     | T.?<= => bcc2(I.CMPTLE, I.BNE, I.CMPTUN, I.BNE)
245 :     | T.<> => fall(I.CMPTEQ, I.BNE, I.CMPTUN, I.BEQ)
246 :     | T.?= => bcc2(I.CMPTEQ, I.BNE, I.CMPTUN, I.BNE)
247 :     end
248 :    
249 :     fun branch(cond, exp1, exp2, lab, order) = let
250 :     fun zapHi r = emit(I.OPERATE{oper=I.ZAP, ra=r, rb=I.IMMop 0xf0, rc=r})
251 : monnier 106 val tmpR = C.newReg()
252 : monnier 16 val (r1, o2) =
253 :     case order
254 :     of T.LR => (regAction exp1, opndAction exp2)
255 :     | T.RL => let val o2' = opndAction exp2
256 :     in
257 :     (regAction(exp1), o2')
258 :     end
259 :     fun emitBr(cmp, br) =
260 :     (emit(I.OPERATE{oper=cmp, ra=r1, rb=o2, rc=tmpR});
261 :     emit(I.BRANCH(br, tmpR, lab)))
262 :     fun emitUnsignedBr(cmp, br) =
263 :     (case (r1, o2)
264 :     of (r1, I.REGop r2) => (zapHi r1; zapHi r2; emitBr(cmp, br))
265 :     | (r1, o2) => (zapHi r1; emitBr(cmp, br))
266 :     (*esac*))
267 :     in
268 :     case cond
269 :     of I.CC_LTU => emitUnsignedBr(I.CMPULT, I.BNE)
270 :     | I.CC_LEU => emitUnsignedBr(I.CMPULE, I.BNE)
271 :     | I.CC_GTU => emitUnsignedBr(I.CMPULE, I.BEQ)
272 :     | I.CC_GEU => emitUnsignedBr(I.CMPULT, I.BEQ)
273 :     | I.CC_LT => emitBr(I.CMPLT, I.BNE)
274 :     | I.CC_LE => emitBr(I.CMPLE, I.BNE)
275 :     | I.CC_GT => emitBr(I.CMPLE, I.BEQ)
276 :     | I.CC_GE => emitBr(I.CMPLT, I.BEQ)
277 :     | I.CC_EQ => emitBr(I.CMPEQ, I.BNE)
278 :     | I.CC_NEQ => emitBr(I.CMPEQ, I.BEQ)
279 :     end
280 : monnier 106 fun copyTmp() = SOME(I.Direct(C.newReg()))
281 :     fun fcopyTmp() = SOME(I.FDirect(C.newFreg()))
282 : monnier 16 in
283 :     case exp
284 :     of T.JMP(T.LABEL(LE.LABEL lab), _) => emit(I.BRANCH(I.BR, zeroR, lab))
285 :     | T.JMP(T.LABEL _, _) => error "JMP(T.LABEL _, _)"
286 :     | T.JMP(exp, labs) =>
287 :     emit(I.JMPL({r=C.asmTmpR, b=regAction exp, d=0}, labs))
288 :     | T.BCC(_, T.CMP(T.NEQ, T.ANDB(exp, T.LI 1), T.LI 0, _), lab) =>
289 :     emit(I.BRANCH(I.BLBS, regAction exp, lab))
290 :     | T.BCC(_, T.CMP(cc, exp, T.LI n, ord), lab) =>
291 :     branch(cond cc, exp, T.LI n, lab, ord)
292 :     | T.BCC(_, T.CMP(cc, T.LI n, exp, ord), lab) =>
293 :     branch(swapcc(cond cc), exp, T.LI n, lab, ord)
294 :     | T.BCC(_, T.CMP(cc, e1, e2, ord), lab) =>
295 :     branch(cond cc, e1, e2, lab, ord)
296 :     | T.BCC(_, e, lab) => emit(I.BRANCH(I.BNE, ccAction e, lab))
297 :     | T.FBCC arg => fbranch arg
298 :     | T.CALL(exp, def, use) => let
299 :     val pv = regAction exp
300 :     val returnPtrR = 26
301 :     fun live([],acc) = acc
302 :     | live(T.GPR(T.REG r)::regs,acc) = live(regs, C.addReg(r,acc))
303 :     | live(T.CCR(T.CC cc)::regs,acc) = live(regs, C.addReg(cc,acc))
304 :     | live(T.FPR(T.FREG f)::regs,acc) = live(regs, C.addFreg(f,acc))
305 :     | live(_::regs, acc) = live(regs, acc)
306 :     in
307 :     emit(I.JSR({r=returnPtrR, b=pv, d=0},
308 :     live(def, C.addReg(returnPtrR, C.empty)),
309 :     live(use, C.addReg(pv, C.empty))))
310 :     end
311 :     | T.RET => emit(I.JMPL({r=zeroR, b=26, d=0}, []))
312 :     | T.STORE8(ea, r, region) => let
313 :     val rs = regAction r
314 :     val (rd, disp) = eaAction ea
315 : monnier 106 val t1 = C.newReg()
316 :     val t2 = C.newReg()
317 :     val t3 = C.newReg()
318 : monnier 16 in
319 :     app emit
320 :     [I.LOAD{ldOp=I.LDQ_U, r=t1, b=rd, d=disp,mem=mem},
321 :     I.LDA{r=t2, b=rd, d=disp},
322 :     I.OPERATE{oper=I.INSBL, ra=rs, rb=I.REGop(t2), rc=t3},
323 :     I.OPERATE{oper=I.MSKBL, ra=t1, rb=I.REGop(t2), rc=t1},
324 :     I.OPERATE{oper=I.BIS, ra=t1, rb=I.REGop(t3), rc=t1},
325 :     I.STORE{stOp=I.STQ_U, r=t1, b=rd, d=disp, mem=mem}]
326 :     end
327 :     | T.STORE32(ea, r, region) => let
328 :     val (b, d) = eaAction ea
329 :     in emit(I.STORE{stOp=I.STL, r=regAction r, b=b, d=d, mem=region})
330 :     end
331 :     | T.STORED(ea, f, region) => let
332 :     val (b, d) = eaAction ea
333 :     in emit(I.FSTORE{stOp=I.STT, r=fregAction f, b=b, d=d, mem=region})
334 :     end
335 :     | T.STORECC(ea, cc, region) => error "stmAction.STORECC"
336 :     | T.MV(rd, exp) => let
337 :     fun move(dst, src) = I.OPERATE{oper=I.BIS, ra=src, rb=zeroOp, rc=dst}
338 :     in
339 :     case exp
340 :     of T.REG(rs) => if rs = rd then () else emit(move(rd, rs))
341 :     | T.LI n => loadImmed(n, zeroR, rd)
342 :     | T.LI32 w => loadImmed32(w, zeroR, rd)
343 :     | _ => let val rs = regActionRd(exp, rd)
344 :     in if rs = rd then () else emit(move(rd,rs))
345 :     end
346 :     (*esac*)
347 :     end
348 :     | T.CCMV(cd, exp) => let
349 :     val cs = case exp of T.CC(r) => r | _ => ccActionCd(exp, cd)
350 :     in
351 :     if cs = cd then ()
352 :     else emit(I.OPERATE{oper=I.BIS, ra=cs, rb=zeroOp, rc=cd})
353 :     end
354 :     | T.FMV(fd, exp) => let
355 :     fun fmove(dst, src) = I.FOPERATE{oper=I.CPYS, fa=src, fb=src, fc=dst}
356 :     in
357 :     case exp
358 :     of T.FREG(fs) =>
359 :     if fs = fd then () else emit(fmove(fd, fs))
360 :     | _ => let
361 :     val fs = fregActionFd(exp, fd)
362 :     in if fs = fd then () else emit(fmove(fd, fs))
363 :     end
364 :     (*esac*)
365 :     end
366 :     | T.COPY(rds as [_], rss) =>
367 :     emit(I.COPY{dst=rds, src=rss, impl=ref NONE, tmp=NONE})
368 :     | T.COPY(rds, rss) =>
369 :     emit(I.COPY{dst=rds, src=rss, impl=ref NONE, tmp=copyTmp()})
370 :     | T.FCOPY(fds as [_], fss)=>
371 :     emit(I.FCOPY{dst=fds, src=fss, impl=ref NONE, tmp=NONE})
372 :     | T.FCOPY(fds, fss) =>
373 :     emit(I.FCOPY{dst=fds, src=fss, impl=ref NONE, tmp=fcopyTmp()})
374 :     end
375 :     and ccAction(T.CC r) = r
376 :     | ccAction(e) = ccActionCd(e, C.newCCreg())
377 :    
378 :     and ccActionCd(T.CC r, _) = r
379 :     (* enough for now ... *)
380 :     | ccActionCd(T.CMP(T.GTU, e1, e2, ord), cd) = let
381 :     val (opnd, reg) =
382 :     case ord
383 :     of T.LR => (opndAction e1, regAction e2)
384 :     | T.RL => let
385 :     val r = regAction e2
386 :     in (opndAction e1, r)
387 :     end
388 :     in
389 :     emit(I.OPERATE{oper=I.CMPULT, ra=reg, rb=opnd, rc=cd});
390 :     cd
391 :     end
392 :     | ccActionCd(T.LOADCC _, _) = error "ccAction"
393 :    
394 :     and opndAction (T.LI value) =
395 :     if value <= 255 andalso value >= 0 then I.IMMop value
396 :     else let
397 : monnier 106 val tmpR = C.newReg()
398 : monnier 16 in
399 :     loadImmed (value, zeroR, tmpR);
400 :     I.REGop tmpR
401 :     end
402 :     | opndAction(T.LI32 value) =
403 :     if Word32.<=(value, 0w255) then I.IMMop (Word32.toInt value)
404 :     else let
405 : monnier 106 val tmpR = C.newReg ()
406 : monnier 16 in
407 :     loadImmed32 (value, zeroR, tmpR);
408 :     I.REGop tmpR
409 :     end
410 :     | opndAction(T.CONST const) = I.CONSTop (const)
411 :     | opndAction exp = I.REGop (regAction exp)
412 :    
413 :     and reduceOpnd(I.IMMop i) = let
414 : monnier 106 val rd = C.newReg()
415 : monnier 16 in loadImmed(i, zeroR, rd); rd
416 :     end
417 :     | reduceOpnd(I.REGop rd) = rd
418 :     | reduceOpnd(I.CONSTop c) = regAction(T.CONST c)
419 :     | reduceOpnd(I.LABop le) = regAction(T.LABEL le)
420 :     | reduceOpnd _ = error "reduceOpnd"
421 :    
422 :     and orderedRR(e1, e2, T.LR) = (regAction e1, regAction e2)
423 :     | orderedRR(e1, e2, T.RL) = let
424 :     val r2 = regAction e2
425 :     in (regAction e1, r2)
426 :     end
427 :    
428 :     and regAction (T.REG r) = r
429 : monnier 106 | regAction exp = regActionRd(exp, C.newReg())
430 : monnier 16
431 :     and arithOperands(e1, e2, T.LR) = (regAction e1, opndAction e2)
432 :     | arithOperands(e1, e2, T.RL) = let
433 :     val opnd = opndAction e2
434 :     in (regAction e1, opnd)
435 :     end
436 :    
437 :     and regActionRd(exp, rd) = let
438 :    
439 :     fun orderedArith(oper, arith, e1, e2, ord) = let
440 :     val (reg, opnd) = arithOperands(e1, e2, ord)
441 :     in
442 :     emit(oper{oper=arith, ra=reg, rb=opnd, rc=rd});
443 :     rd
444 :     end
445 :    
446 :     fun commOrderedArith(oper, arith, e1 ,e2, ord) = let
447 :     fun f(e1 as T.LI _, e2) = orderedArith(oper, arith, e2, e1, ord)
448 :     | f(e1 as T.LI32 _, e2) = orderedArith(oper, arith, e2, e1, ord)
449 :     | f(e1, e2) = orderedArith(oper, arith, e1, e2, ord)
450 :     in
451 :     f(e1, e2)
452 :     end
453 :    
454 :    
455 :     fun orderedArithTrap arg = orderedArith arg before emit(I.TRAPB)
456 :     fun commOrderedArithTrap arg = commOrderedArith arg before emit(I.TRAPB)
457 :    
458 :     fun orderedMullTrap (e1, e2, ord, rd) = let
459 :     val (reg, opnd) = case ord
460 :     of T.LR => (regAction e1, opndAction e2)
461 :     | T.RL => let
462 :     val opnd = opndAction e2
463 :     in
464 :     (regAction e1, opnd)
465 :     end
466 :    
467 :     fun emitMulvImmed (reg, 0, rd) =
468 :     emit(I.LDA{r=rd, b=zeroR, d=I.IMMop 0})
469 :     | emitMulvImmed (reg, 1, rd) =
470 :     emit(I.OPERATE{oper=I.ADDL, ra=reg, rb=zeroOp, rc=rd})
471 :     | emitMulvImmed (reg, multiplier, rd) = let
472 :     fun log2 0w1 = 0 | log2 n = 1 + (log2 (Word.>> (n, 0w1)))
473 :    
474 :     fun exp2 n = Word.<<(0w1, n)
475 :    
476 :     fun bitIsSet (x,n) = Word.andb(x,exp2 n) <> 0w0
477 :    
478 :     fun loop (~1) = ()
479 :     | loop n =
480 :     (if bitIsSet(itow multiplier, itow n) then
481 :     emit(I.OPERATEV{oper=I.ADDLV, ra=reg, rb=I.REGop rd, rc=rd})
482 :     else ();
483 :     if n>0 then
484 :     emit(I.OPERATEV{oper=I.ADDLV, ra=rd, rb=I.REGop rd, rc=rd})
485 :     else ();
486 :     loop (n-1))
487 :     in
488 :     emit(I.OPERATEV{oper=I.ADDLV, ra=reg, rb=I.REGop reg, rc=rd});
489 :     loop ((log2 (itow multiplier)) - 1)
490 :     end
491 :     in
492 :     case opnd
493 :     of (I.IMMop multiplier) => emitMulvImmed (reg, multiplier, rd)
494 :     | _ => emit (I.OPERATEV{oper=I.MULLV , ra=reg, rb=opnd, rc=rd})
495 :     (*esac*);
496 :     emit(I.TRAPB);
497 :     rd
498 :     end
499 :    
500 :     fun opndToWord (T.LI i) = SOME (Word32.fromInt i)
501 :     | opndToWord (T.LI32 w) = SOME w
502 :     | opndToWord _ = NONE
503 :     in
504 :     case exp
505 :     of T.LI n => (loadImmed(n, zeroR, rd); rd)
506 :     | T.LI32 w => (loadImmed32(w, zeroR, rd); rd)
507 :     | T.LABEL le => (emit(I.LDA{r=rd, b=zeroR, d=I.LABop le}); rd)
508 :     | T.CONST c => (emit(I.LDA{r=rd, b=zeroR, d=I.CONSTop(c)}); rd)
509 :     | T.ADD(e, T.LABEL le)=>
510 :     (emit(I.LDA{r=rd, b=regAction(e), d=I.LABop(le)}); rd)
511 :     | T.ADD(T.LI i, e) => (loadImmed(i, regAction e, rd); rd)
512 :     | T.ADD(e, T.LI i) => (loadImmed(i, regAction e, rd); rd)
513 :     | T.ADD(T.LI32 i, e) => (loadImmed32(i, regAction e, rd); rd)
514 :     | T.ADD(e, T.LI32 i) => (loadImmed32(i, regAction e, rd); rd)
515 :     | T.ADD(T.CONST c, e) =>
516 :     (emit(I.LDA{r=rd, b=regAction e, d=I.CONSTop c}); rd)
517 :     | T.ADD(e, T.CONST c) =>
518 :     (emit(I.LDA{r=rd, b=regAction e, d=I.CONSTop c}); rd)
519 :     | T.ADD(e1, e2) => commOrderedArith(I.OPERATE, I.ADDL, e1, e2, T.LR)
520 :     | T.SUB(e1, e2, ord) => orderedArith(I.OPERATE, I.SUBL, e1, e2, ord)
521 :     | T.MULU(e1, e2) => commOrderedArith(I.OPERATE, I.MULL, e1, e2, T.LR)
522 :     | T.ADDT(e1, e2) =>
523 :     commOrderedArithTrap(I.OPERATEV, I.ADDLV, e1, e2, T.LR)
524 :     | T.SUBT(e1, e2, ord) => orderedArithTrap(I.OPERATEV, I.SUBLV, e1, e2, ord)
525 :     | T.MULT(e1, e2) => orderedMullTrap(e1, e2, T.LR, rd)
526 :     | T.ANDB(e1, e2) => let
527 :     fun opndToByteMask (SOME (0wx0:Word32.word)) = SOME 0xf
528 :     | opndToByteMask (SOME 0wx000000ff) = SOME 0xe
529 :     | opndToByteMask (SOME 0wx0000ff00) = SOME 0xd
530 :     | opndToByteMask (SOME 0wx0000ffff) = SOME 0xc
531 :     | opndToByteMask (SOME 0wx00ff0000) = SOME 0xb
532 :     | opndToByteMask (SOME 0wx00ff00ff) = SOME 0xa
533 :     | opndToByteMask (SOME 0wx00ffff00) = SOME 0x9
534 :     | opndToByteMask (SOME 0wx00ffffff) = SOME 0x8
535 :     | opndToByteMask (SOME 0wxff000000) = SOME 0x7
536 :     | opndToByteMask (SOME 0wxff0000ff) = SOME 0x6
537 :     | opndToByteMask (SOME 0wxff00ff00) = SOME 0x5
538 :     | opndToByteMask (SOME 0wxff00ffff) = SOME 0x4
539 :     | opndToByteMask (SOME 0wxffff0000) = SOME 0x3
540 :     | opndToByteMask (SOME 0wxffff00ff) = SOME 0x2
541 :     | opndToByteMask (SOME 0wxffffff00) = SOME 0x1
542 :     | opndToByteMask (SOME 0wxffffffff) = SOME 0x0
543 :     | opndToByteMask _ = NONE
544 :    
545 :     val opndToMask = opndToByteMask o opndToWord
546 :     in
547 :     case (opndToMask e1, opndToMask e2) of
548 :     (SOME mask, _) =>
549 :     orderedArith(I.OPERATE, I.ZAP, e2, T.LI mask, T.LR)
550 :     | (_, SOME mask) =>
551 :     orderedArith(I.OPERATE, I.ZAP, e1, T.LI mask, T.LR)
552 :     | _ => commOrderedArith(I.OPERATE, I.AND, e1, e2, T.LR)
553 :     end
554 :     | T.ORB(e1, e2) => commOrderedArith(I.OPERATE, I.BIS, e1, e2, T.LR)
555 :     | T.XORB(e1, e2) => commOrderedArith(I.OPERATE, I.XOR, e1, e2, T.LR)
556 :     | T.SLL(e1, e2, ord) =>
557 :     (case opndToWord e2 of
558 :     SOME 0w1 => let val r = T.REG (regAction e1)
559 :     in orderedArith(I.OPERATE, I.ADDL, r, r, T.LR) end
560 :     | SOME 0w2 => orderedArith(I.OPERATE, I.S4ADDL, e1, T.LI 0, T.LR)
561 :     | SOME 0w3 => orderedArith(I.OPERATE, I.S8ADDL, e1, T.LI 0, T.LR)
562 :     | _ => (orderedArith(I.OPERATE, I.SLL, e1, e2, T.LR);
563 :     emit(I.OPERATE{oper=I.SGNXL, ra=rd, rb=zeroOp, rc=rd});
564 :     rd)
565 :     (*esac*))
566 :     | T.SRA(e1, e2, ord) => let
567 :     val (reg, opnd) = (regAction e1, opndAction e2)
568 :     in
569 :     (* sign extend longword argument *)
570 :     emit(I.OPERATE{oper=I.SGNXL, ra=reg, rb=zeroOp, rc=reg});
571 :     emit(I.OPERATE{oper=I.SRA, ra=reg, rb=opnd, rc=rd});
572 :     rd
573 :     end
574 :     | T.SRL(e1, e2, ord) => let
575 :     val (reg, opnd) = (regAction e1, opndAction e2)
576 :     in
577 :     emit(I.OPERATE{oper=I.ZAP, ra=reg, rb=I.IMMop 0xf0, rc=reg});
578 :     emit(I.OPERATE{oper=I.SRL, ra=reg, rb=opnd, rc=rd});
579 :     rd
580 :     end
581 :     | T.DIVT arg => let
582 :     val (reg, opnd) = arithOperands arg
583 :     in
584 :     app emit (PseudoInstrs.divl({ra=reg, rb=opnd, rc=rd}, reduceOpnd));
585 :     rd
586 :     end
587 :     | T.DIVU arg => let
588 :     val (reg, opnd) = arithOperands arg
589 :     in
590 :     app emit (PseudoInstrs.divlu({ra=reg, rb=opnd, rc=rd}, reduceOpnd));
591 :     rd
592 :     end
593 :     | T.LOAD32(exp, region) => let
594 :     val (b, d) = eaAction exp
595 :     in emit(I.LOAD{ldOp=I.LDL, r=rd, b=b, d=d, mem=region}); rd
596 :     end
597 :     (* Load and sign-extend a byte from a non-aligned address *)
598 :     | T.LOAD8(exp, region) => let
599 : monnier 106 val tmpR0 = C.newReg()
600 :     val tmpR1 = C.newReg()
601 : monnier 16 val (rt, disp) = eaAction exp
602 :     in
603 :     emit(I.LOAD{ldOp=I.LDQ_U, r=tmpR0, b=rt, d=disp, mem=mem});
604 :     emit(I.LDA{r=tmpR1, b=rt, d=disp});
605 :     emit(I.OPERATE{oper=I.EXTBL, ra=tmpR0, rb=I.REGop tmpR1, rc=rd});
606 :     rd
607 :     end
608 :     | T.SEQ(e1, e2) => (stmAction e1; regAction e2)
609 :     | _ => error "regAction"
610 :     end (* regActionRd *)
611 :    
612 :     and eaAction exp = let
613 :     fun makeEA(r, n) =
614 :     if ~32768 <= n andalso n <= 32767 then (r, I.IMMop n)
615 :     else let
616 : monnier 106 val tmpR = C.newReg()
617 : monnier 16 val low = wtoi(Word.andb(itow n, 0w65535))(* unsigned low 16 bits *)
618 :     val high = n div 65536
619 :     val (lowsgn, highsgn) = (* Sign-extend *)
620 :     if low <= 32767 then (low, high) else (low -65536, high+1)
621 :     in
622 :     (emit(I.LDAH{r=tmpR, b=r, d=I.IMMop highsgn});
623 :     (tmpR, I.IMMop lowsgn))
624 :     end
625 :     in
626 :     case exp
627 :     of T.ADD(exp, T.LI n) => makeEA(regAction exp, n)
628 :     | T.ADD(T.LI n, exp) => makeEA(regAction exp, n)
629 :     | T.ADD(T.CONST c, exp) => (regAction exp, I.CONSTop(c))
630 :     | T.ADD(exp, T.CONST c) => (regAction exp, I.CONSTop(c))
631 :     | T.SUB(exp, T.LI n, _) => makeEA(regAction exp, ~n)
632 :     | exp => makeEA(regAction exp, 0)
633 :     end (* eaAction *)
634 :    
635 :     and fregAction (T.FREG f) = f
636 : monnier 106 | fregAction exp = fregActionFd(exp, C.newFreg())
637 : monnier 16
638 :     and fregActionFd(exp, fd) = let
639 :     (* macho comment goes here *)
640 :     fun doFloatArith(farith, e1, e2, fd, order) = let
641 :     val (f1, f2) = orderedFArith(e1, e2, order)
642 :     in
643 :     emit(I.DEFFREG fd);
644 :     emit(I.FOPERATEV{oper=farith, fa=f1, fb=f2, fc=fd});
645 :     emit(I.TRAPB);
646 :     fd
647 :     end
648 :     in
649 :     case exp
650 :     of T.FREG f => f
651 :     | T.FADDD(e1, e2) => doFloatArith(I.ADDT, e1, e2, fd, T.LR)
652 :     | T.FMULD(e1, e2) => doFloatArith(I.MULT, e1, e2, fd, T.LR)
653 :     | T.FSUBD(e1, e2, ord) => doFloatArith(I.SUBT, e1, e2, fd, ord)
654 :     | T.FDIVD(e1, e2, ord) => doFloatArith(I.DIVT, e1, e2, fd, ord)
655 :     | T.FABSD exp =>
656 :     (emit(I.FOPERATE{oper=I.CPYS, fa=zeroR, fb=fregAction exp, fc=fd}); fd)
657 :     | T.FNEGD exp => let
658 :     val fs = fregAction exp
659 :     in
660 :     emit(I.FOPERATE{oper=I.CPYSN, fa=fs, fb=fs, fc=fd}); fd
661 :     end
662 :     | T.CVTI2D exp => let
663 :     val opnd = opndAction exp
664 :     in
665 :     app emit (PseudoInstrs.cvti2d({opnd=opnd, fd=fd},reduceOpnd));
666 :     fd
667 :     end
668 :     | T.LOADD(exp, region) => let
669 :     val (b, d) = eaAction exp
670 :     in emit(I.FLOAD{ldOp=I.LDT, r=fd, b=b, d=d, mem=region}); fd
671 :     end
672 :     | T.FSEQ(e1, e2) => (stmAction e1; fregAction e2)
673 :     end
674 :    
675 :     fun mltreeComp mltree = let
676 :     (* condition code registers are mapped onto general registers *)
677 :     fun cc (T.CCR(T.CC cc)) = T.GPR(T.REG cc)
678 :     | cc x = x
679 :    
680 :     fun mltc(T.PSEUDO_OP pOp) = F.pseudoOp pOp
681 :     | mltc(T.DEFINELABEL lab) = F.defineLabel lab
682 :     | mltc(T.ENTRYLABEL lab) = F.entryLabel lab
683 :     | mltc(T.ORDERED mlts) = F.ordered mlts
684 :     | mltc(T.BEGINCLUSTER) = F.beginCluster()
685 :     | mltc(T.CODE stms) = app stmAction stms
686 :     | mltc(T.ENDCLUSTER regmap)= F.endCluster regmap
687 :     | mltc(T.ESCAPEBLOCK regs) = F.exitBlock (map cc regs)
688 :     in mltc mltree
689 :     end
690 :    
691 :     val mlriscComp = stmAction
692 :     end
693 :    
694 :    
695 :     (*
696 : monnier 113 * $Log$
697 : monnier 16 *)

root@smlnj-gforge.cs.uchicago.edu
ViewVC Help
Powered by ViewVC 1.0.0