Home My Page Projects Code Snippets Project Openings SML/NJ
Summary Activity Forums Tracker Lists Tasks Docs Surveys News SCM Files

SCM Repository

[smlnj] Annotation of /sml/trunk/src/MLRISC/alpha32/alpha32.sml
ViewVC logotype

Annotation of /sml/trunk/src/MLRISC/alpha32/alpha32.sml

Parent Directory Parent Directory | Revision Log Revision Log


Revision 16 - (view) (download)

1 : monnier 16 (* alpha32.sml
2 :     *
3 :     * COPYRIGHT (c) 1995 AT&T Bell Laboratories.
4 :     *
5 :     * generates machine code from the mltree.
6 :     *
7 :     *)
8 :    
9 :     (** NOTE: f29 and f30 are used as temporaries.
10 :     ** r31 is always zero.
11 :     **)
12 :     functor Alpha32
13 :     (structure Alpha32Instr : ALPHA32INSTR
14 :     structure Alpha32MLTree : MLTREE
15 :     structure Flowgen : FLOWGRAPH_GEN
16 :     structure PseudoInstrs : ALPHA32_PSEUDO_INSTR
17 :     sharing Alpha32Instr.Region = Alpha32MLTree.Region
18 :     sharing Flowgen.I = PseudoInstrs.I = Alpha32Instr
19 :     sharing Flowgen.T=Alpha32MLTree
20 :     sharing Alpha32MLTree.Constant = Alpha32Instr.Constant) : MLTREECOMP =
21 :     struct
22 :     structure F = Flowgen
23 :     structure T = Alpha32MLTree
24 :     structure R = Alpha32MLTree.Region
25 :     structure I = Alpha32Instr
26 :     structure C = Alpha32Instr.C
27 :     structure LE = LabelExp
28 :     structure W32 = Word32
29 :     (*********************************************************
30 :    
31 :     Trap Shadows, Floating Exceptions, and Denormalized
32 :     Numbers on the DEC Alpha
33 :    
34 :     Andrew W. Appel and Lal George
35 :     Nov 28, 1995
36 :    
37 :     See section 4.7.5.1 of the Alpha Architecture Reference Manual.
38 :    
39 :     The Alpha has imprecise exceptions, meaning that if a floating
40 :     point instruction raises an IEEE exception, the exception may
41 :     not interrupt the processor until several successive instructions have
42 :     completed. ML, on the other hand, may want a "precise" model
43 :     of floating point exceptions.
44 :    
45 :     Furthermore, the Alpha hardware does not support denormalized numbers
46 :     (for "gradual underflow"). Instead, underflow always rounds to zero.
47 :     However, each floating operation (add, mult, etc.) has a trapping
48 :     variant that will raise an exception (imprecisely, of course) on
49 :     underflow; in that case, the instruction will produce a zero result
50 :     AND an exception will occur. In fact, there are several variants
51 :     of each instruction; three variants of MULT are:
52 :    
53 :     MULT s1,s2,d truncate denormalized result to zero; no exception
54 :     MULT/U s1,s2,d truncate denormalized result to zero; raise UNDERFLOW
55 :     MULT/SU s1,s2,d software completion, producing denormalized result
56 :    
57 :     The hardware treats the MULT/U and MULT/SU instructions identically,
58 :     truncating a denormalized result to zero and raising the UNDERFLOW
59 :     exception. But the operating system, on an UNDERFLOW exception,
60 :     examines the faulting instruction to see if it's an /SU form, and if so,
61 :     recalculates s1*s2, puts the right answer in d, and continues,
62 :     all without invoking the user's signal handler.
63 :    
64 :     Because most machines compute with denormalized numbers in hardware,
65 :     to maximize portability of SML programs, we use the MULT/SU form.
66 :     (and ADD/SU, SUB/SU, etc.) But to use this form successfully,
67 :     certain rules have to be followed. Basically, d cannot be the same
68 :     register as s1 or s2, because the opsys needs to be able to
69 :     recalculate the operation using the original contents of s1 and s2,
70 :     and the MULT/SU instruction will overwrite d even if it traps.
71 :    
72 :     More generally, we may want to have a sequence of floating-point
73 :     instructions. The rules for such a sequence are:
74 :    
75 :     1. The sequence should end with a TRAPB (trap barrier) instruction.
76 :     (This could be relaxed somewhat, but certainly a TRAPB would
77 :     be a good idea sometime before the next branch instruction or
78 :     update of an ML reference variable, or any other ML side effect.)
79 :     2. No instruction in the sequence should destroy any operand of itself
80 :     or of any previous instruction in the sequence.
81 :     3. No two instructions in the sequence should write the same destination
82 :     register.
83 :    
84 :     We can achieve these conditions by the following trick in the
85 :     Alpha code generator. Each instruction in the sequence will write
86 :     to a different temporary; this is guaranteed by the translation from
87 :     ML-RISC. At the beginning of the sequence, we will put a special
88 :     pseudo-instruction (we call it DEFFREG) that "defines" the destination
89 :     register of the arithmetic instruction. If there are K arithmetic
90 :     instructions in the sequence, then we'll insert K DEFFREG instructions
91 :     all at the beginning of the sequence.
92 :     Then, each arithop will not only "define" its destination temporary
93 :     but will "use" it as well. When all these instructions are fed to
94 :     the liveness analyzer, the resulting interference graph will then
95 :     have inteference edges satisfying conditions 2 and 3 above.
96 :    
97 :     Of course, DEFFREG doesn't actually generate any code. In our model
98 :     of the Alpha, every instruction generates exactly 4 bytes of code
99 :     except the "span-dependent" ones. Therefore, we'll specify DEFFREG
100 :     as a span-dependent instruction whose minimum and maximum sizes are zero.
101 :    
102 :     At the moment, we do not group arithmetic operations into sequences;
103 :     that is, each arithop will be preceded by a single DEFFREG and
104 :     followed by a TRAPB. To avoid the cost of all those TRAPB's, we
105 :     should improve this when we have time. Warning: Don't put more
106 :     than 31 instructions in the sequence, because they're all required
107 :     to write to different destination registers!
108 :    
109 :     What about multiple traps? For example, suppose a sequence of
110 :     instructions produces an Overflow and a Divide-by-Zero exception?
111 :     ML would like to know only about the earliest trap, but the hardware
112 :     will report BOTH traps to the operating system. However, as long
113 :     as the rules above are followed (and the software-completion versions
114 :     of the arithmetic instructions are used), the operating system will
115 :     have enough information to know which instruction produced the
116 :     trap. It is very probable that the operating system will report ONLY
117 :     the earlier trap to the user process, but I'm not sure.
118 :    
119 :     For a hint about what the operating system is doing in its own
120 :     trap-handler (with software completion), see section 6.3.2 of
121 :     "OpenVMS Alpha Software" (Part II of the Alpha Architecture
122 :     Manual). This stuff should apply to Unix (OSF1) as well as VMS.
123 :    
124 :     ****************************************************************)
125 :    
126 :     fun error msg = MLRiscErrorMsg.impossible ("Alpha32." ^ msg)
127 :    
128 :     val itow = Word.fromInt
129 :     val wtoi = Word.toIntX
130 :    
131 :     val emit = F.emitInstr
132 :    
133 :     fun newReg () = C.newReg()
134 :     fun newFreg() = C.newFreg()
135 :    
136 :     val zeroR = 31
137 :     val zeroOp = I.REGop zeroR
138 :     val zeroEA = I.Direct zeroR
139 :     val argReg0 = 16
140 :    
141 :     val mem = R.memory
142 :    
143 :     fun cond T.LT = I.CC_LT | cond T.LTU = I.CC_LTU
144 :     | cond T.LE = I.CC_LE | cond T.LEU = I.CC_LEU
145 :     | cond T.EQ = I.CC_EQ | cond T.NEQ = I.CC_NEQ
146 :     | cond T.GE = I.CC_GE | cond T.GEU = I.CC_GEU
147 :     | cond T.GT = I.CC_GT | cond T.GTU = I.CC_GTU
148 :    
149 :     fun swapcc I.CC_LT = I.CC_GT | swapcc I.CC_LTU = I.CC_GTU
150 :     | swapcc I.CC_LE = I.CC_GE | swapcc I.CC_LEU = I.CC_GEU
151 :     | swapcc I.CC_EQ = I.CC_EQ | swapcc I.CC_NEQ = I.CC_NEQ
152 :     | swapcc I.CC_GT = I.CC_LT | swapcc I.CC_GEU = I.CC_LEU
153 :     | swapcc I.CC_GE = I.CC_LE | swapcc _ = error "swapcc"
154 :    
155 :     (* NOTE: stack allocation must be multiples of 16 *)
156 :     fun stackAllocate(n) =
157 :     I.OPERATE{oper=I.SUBL, ra=C.stackptrR, rb=I.IMMop n, rc=C.stackptrR}
158 :    
159 :     fun stackDeallocate(n) =
160 :     I.OPERATE{oper=I.ADDL, ra=C.stackptrR, rb=I.IMMop n, rc=C.stackptrR}
161 :    
162 :     fun loadImmed(n, base, rd) =
163 :     if ~32768 <= n andalso n < 32768 then
164 :     emit(I.LDA{r=rd, b=base, d=I.IMMop n})
165 :     else let
166 :     val w = itow n
167 :     val hi = Word.~>>(w, 0w16)
168 :     val lo = Word.andb(w, 0w65535)
169 :     val (hi', lo') =
170 :     if lo < 0w32768 then (hi, lo) else (hi+0w1, lo-0w65536)
171 :     in
172 :     emit(I.LDA{r=rd, b=base, d=I.IMMop(wtoi lo')});
173 :     emit(I.LDAH{r=rd, b=rd, d=I.IMMop(wtoi hi')})
174 :     end
175 :    
176 :     (* loadImmed32 is used to load int32 and word32 constants.
177 :     * In either case we sign extend the 32-bit value. This is compatible
178 :     * with LDL which sign extends a 32-bit valued memory location.
179 :     *)
180 :     fun loadImmed32 (n, base, rd) = let
181 :     val low = W32.andb(n, 0w65535) (* unsigned low 16 bits *)
182 :     val high = W32.div(n, 0w65536) (* Sign-extend *)
183 :     val (lowsgn, highsgn) =
184 :     if W32.<=(low, 0w32767) then (low, high)
185 :     else (W32.-(low,0w65536), W32.+(high,0w1))
186 :     val highsgn' = W32.andb(highsgn, 0w65535)
187 :     in
188 :     emit(I.LDA{r=rd, b=base, d=I.IMMop(W32.toIntX lowsgn)});
189 :     if (highsgn' = 0w0) then ()
190 :     else if highsgn' < 0w32768 then
191 :     emit(I.LDAH{r=rd, b=rd, d=I.IMMop(W32.toIntX highsgn)})
192 :     else
193 :     emit(I.LDAH{r=rd, b=rd,
194 :     d=I.IMMop(W32.toIntX (W32.-(highsgn, 0w65536)))})
195 :     end
196 :    
197 :    
198 :     fun orderedFArith (exp1, exp2, T.LR) = (fregAction exp1, fregAction exp2)
199 :     | orderedFArith (exp1, exp2, T.RL) = let
200 :     val f2 = fregAction exp2
201 :     in (fregAction exp1, f2)
202 :     end
203 :    
204 :     and stmAction exp = let
205 :     fun fbranch(_, T.FCMP(cc, exp1, exp2, order), lab) = let
206 :     val (f1, f2) = orderedFArith(exp1, exp2, order)
207 :     fun bcc(cmp, br) = let
208 :     val tmpR = newFreg()
209 :     in
210 :     emit(I.DEFFREG(tmpR));
211 :     emit(I.FOPERATE{oper=cmp, fa=f1, fb=f2, fc=tmpR});
212 :     emit(I.TRAPB);
213 :     emit(I.FBRANCH(br, tmpR, lab))
214 :     end
215 :    
216 :     fun fall(cmp1, br1, cmp2, br2) = let
217 :     val tmpR1 = newFreg()
218 :     val tmpR2 = newFreg()
219 :     val fallLab = Label.newLabel ""
220 :     in
221 :     emit(I.DEFFREG(tmpR1));
222 :     emit(I.FOPERATE{oper=cmp1, fa=f1, fb=f2, fc=tmpR1});
223 :     emit(I.TRAPB);
224 :     emit(I.FBRANCH(br1, tmpR1, fallLab));
225 :     emit(I.DEFFREG(tmpR2));
226 :     emit(I.FOPERATE{oper=cmp2, fa=f1, fb=f2, fc=tmpR2});
227 :     emit(I.TRAPB);
228 :     emit(I.FBRANCH(br2, tmpR2, lab));
229 :     F.defineLabel fallLab
230 :     end
231 :    
232 :     fun bcc2(cmp1, br1, cmp2, br2) = (bcc(cmp1, br1); bcc(cmp2, br2))
233 :     in
234 :     case cc
235 :     of T.== => bcc(I.CMPTEQ, I.BNE)
236 :     | T.?<> => bcc(I.CMPTEQ, I.BEQ)
237 :     | T.? => bcc(I.CMPTUN, I.BNE)
238 :     | T.<=> => bcc(I.CMPTUN, I.BEQ)
239 :     | T.> => fall(I.CMPTLE, I.BNE, I.CMPTUN, I.BEQ)
240 :     | T.>= => fall(I.CMPTLT, I.BNE, I.CMPTUN, I.BEQ)
241 :     | T.?> => bcc(I.CMPTLE, I.BEQ)
242 :     | T.?>= => bcc(I.CMPTLT, I.BEQ)
243 :     | T.< => bcc(I.CMPTLT, I.BNE)
244 :     | T.<= => bcc(I.CMPTLE, I.BNE)
245 :     | T.?< => bcc2(I.CMPTLT, I.BNE, I.CMPTUN, I.BNE)
246 :     | T.?<= => bcc2(I.CMPTLE, I.BNE, I.CMPTUN, I.BNE)
247 :     | T.<> => fall(I.CMPTEQ, I.BNE, I.CMPTUN, I.BEQ)
248 :     | T.?= => bcc2(I.CMPTEQ, I.BNE, I.CMPTUN, I.BNE)
249 :     end
250 :    
251 :     fun branch(cond, exp1, exp2, lab, order) = let
252 :     fun zapHi r = emit(I.OPERATE{oper=I.ZAP, ra=r, rb=I.IMMop 0xf0, rc=r})
253 :     val tmpR = newReg()
254 :     val (r1, o2) =
255 :     case order
256 :     of T.LR => (regAction exp1, opndAction exp2)
257 :     | T.RL => let val o2' = opndAction exp2
258 :     in
259 :     (regAction(exp1), o2')
260 :     end
261 :     fun emitBr(cmp, br) =
262 :     (emit(I.OPERATE{oper=cmp, ra=r1, rb=o2, rc=tmpR});
263 :     emit(I.BRANCH(br, tmpR, lab)))
264 :     fun emitUnsignedBr(cmp, br) =
265 :     (case (r1, o2)
266 :     of (r1, I.REGop r2) => (zapHi r1; zapHi r2; emitBr(cmp, br))
267 :     | (r1, o2) => (zapHi r1; emitBr(cmp, br))
268 :     (*esac*))
269 :     in
270 :     case cond
271 :     of I.CC_LTU => emitUnsignedBr(I.CMPULT, I.BNE)
272 :     | I.CC_LEU => emitUnsignedBr(I.CMPULE, I.BNE)
273 :     | I.CC_GTU => emitUnsignedBr(I.CMPULE, I.BEQ)
274 :     | I.CC_GEU => emitUnsignedBr(I.CMPULT, I.BEQ)
275 :     | I.CC_LT => emitBr(I.CMPLT, I.BNE)
276 :     | I.CC_LE => emitBr(I.CMPLE, I.BNE)
277 :     | I.CC_GT => emitBr(I.CMPLE, I.BEQ)
278 :     | I.CC_GE => emitBr(I.CMPLT, I.BEQ)
279 :     | I.CC_EQ => emitBr(I.CMPEQ, I.BNE)
280 :     | I.CC_NEQ => emitBr(I.CMPEQ, I.BEQ)
281 :     end
282 :     fun copyTmp() = SOME(I.Direct(newReg()))
283 :     fun fcopyTmp() = SOME(I.FDirect(newFreg()))
284 :     in
285 :     case exp
286 :     of T.JMP(T.LABEL(LE.LABEL lab), _) => emit(I.BRANCH(I.BR, zeroR, lab))
287 :     | T.JMP(T.LABEL _, _) => error "JMP(T.LABEL _, _)"
288 :     | T.JMP(exp, labs) =>
289 :     emit(I.JMPL({r=C.asmTmpR, b=regAction exp, d=0}, labs))
290 :     | T.BCC(_, T.CMP(T.NEQ, T.ANDB(exp, T.LI 1), T.LI 0, _), lab) =>
291 :     emit(I.BRANCH(I.BLBS, regAction exp, lab))
292 :     | T.BCC(_, T.CMP(cc, exp, T.LI n, ord), lab) =>
293 :     branch(cond cc, exp, T.LI n, lab, ord)
294 :     | T.BCC(_, T.CMP(cc, T.LI n, exp, ord), lab) =>
295 :     branch(swapcc(cond cc), exp, T.LI n, lab, ord)
296 :     | T.BCC(_, T.CMP(cc, e1, e2, ord), lab) =>
297 :     branch(cond cc, e1, e2, lab, ord)
298 :     | T.BCC(_, e, lab) => emit(I.BRANCH(I.BNE, ccAction e, lab))
299 :     | T.FBCC arg => fbranch arg
300 :     | T.CALL(exp, def, use) => let
301 :     val pv = regAction exp
302 :     val returnPtrR = 26
303 :     fun live([],acc) = acc
304 :     | live(T.GPR(T.REG r)::regs,acc) = live(regs, C.addReg(r,acc))
305 :     | live(T.CCR(T.CC cc)::regs,acc) = live(regs, C.addReg(cc,acc))
306 :     | live(T.FPR(T.FREG f)::regs,acc) = live(regs, C.addFreg(f,acc))
307 :     | live(_::regs, acc) = live(regs, acc)
308 :     in
309 :     emit(I.JSR({r=returnPtrR, b=pv, d=0},
310 :     live(def, C.addReg(returnPtrR, C.empty)),
311 :     live(use, C.addReg(pv, C.empty))))
312 :     end
313 :     | T.RET => emit(I.JMPL({r=zeroR, b=26, d=0}, []))
314 :     | T.STORE8(ea, r, region) => let
315 :     val rs = regAction r
316 :     val (rd, disp) = eaAction ea
317 :     val t1 = newReg()
318 :     val t2 = newReg()
319 :     val t3 = newReg()
320 :     in
321 :     app emit
322 :     [I.LOAD{ldOp=I.LDQ_U, r=t1, b=rd, d=disp,mem=mem},
323 :     I.LDA{r=t2, b=rd, d=disp},
324 :     I.OPERATE{oper=I.INSBL, ra=rs, rb=I.REGop(t2), rc=t3},
325 :     I.OPERATE{oper=I.MSKBL, ra=t1, rb=I.REGop(t2), rc=t1},
326 :     I.OPERATE{oper=I.BIS, ra=t1, rb=I.REGop(t3), rc=t1},
327 :     I.STORE{stOp=I.STQ_U, r=t1, b=rd, d=disp, mem=mem}]
328 :     end
329 :     | T.STORE32(ea, r, region) => let
330 :     val (b, d) = eaAction ea
331 :     in emit(I.STORE{stOp=I.STL, r=regAction r, b=b, d=d, mem=region})
332 :     end
333 :     | T.STORED(ea, f, region) => let
334 :     val (b, d) = eaAction ea
335 :     in emit(I.FSTORE{stOp=I.STT, r=fregAction f, b=b, d=d, mem=region})
336 :     end
337 :     | T.STORECC(ea, cc, region) => error "stmAction.STORECC"
338 :     | T.MV(rd, exp) => let
339 :     fun move(dst, src) = I.OPERATE{oper=I.BIS, ra=src, rb=zeroOp, rc=dst}
340 :     in
341 :     case exp
342 :     of T.REG(rs) => if rs = rd then () else emit(move(rd, rs))
343 :     | T.LI n => loadImmed(n, zeroR, rd)
344 :     | T.LI32 w => loadImmed32(w, zeroR, rd)
345 :     | _ => let val rs = regActionRd(exp, rd)
346 :     in if rs = rd then () else emit(move(rd,rs))
347 :     end
348 :     (*esac*)
349 :     end
350 :     | T.CCMV(cd, exp) => let
351 :     val cs = case exp of T.CC(r) => r | _ => ccActionCd(exp, cd)
352 :     in
353 :     if cs = cd then ()
354 :     else emit(I.OPERATE{oper=I.BIS, ra=cs, rb=zeroOp, rc=cd})
355 :     end
356 :     | T.FMV(fd, exp) => let
357 :     fun fmove(dst, src) = I.FOPERATE{oper=I.CPYS, fa=src, fb=src, fc=dst}
358 :     in
359 :     case exp
360 :     of T.FREG(fs) =>
361 :     if fs = fd then () else emit(fmove(fd, fs))
362 :     | _ => let
363 :     val fs = fregActionFd(exp, fd)
364 :     in if fs = fd then () else emit(fmove(fd, fs))
365 :     end
366 :     (*esac*)
367 :     end
368 :     | T.COPY(rds as [_], rss) =>
369 :     emit(I.COPY{dst=rds, src=rss, impl=ref NONE, tmp=NONE})
370 :     | T.COPY(rds, rss) =>
371 :     emit(I.COPY{dst=rds, src=rss, impl=ref NONE, tmp=copyTmp()})
372 :     | T.FCOPY(fds as [_], fss)=>
373 :     emit(I.FCOPY{dst=fds, src=fss, impl=ref NONE, tmp=NONE})
374 :     | T.FCOPY(fds, fss) =>
375 :     emit(I.FCOPY{dst=fds, src=fss, impl=ref NONE, tmp=fcopyTmp()})
376 :     end
377 :     and ccAction(T.CC r) = r
378 :     | ccAction(e) = ccActionCd(e, C.newCCreg())
379 :    
380 :     and ccActionCd(T.CC r, _) = r
381 :     (* enough for now ... *)
382 :     | ccActionCd(T.CMP(T.GTU, e1, e2, ord), cd) = let
383 :     val (opnd, reg) =
384 :     case ord
385 :     of T.LR => (opndAction e1, regAction e2)
386 :     | T.RL => let
387 :     val r = regAction e2
388 :     in (opndAction e1, r)
389 :     end
390 :     in
391 :     emit(I.OPERATE{oper=I.CMPULT, ra=reg, rb=opnd, rc=cd});
392 :     cd
393 :     end
394 :     | ccActionCd(T.LOADCC _, _) = error "ccAction"
395 :    
396 :     and opndAction (T.LI value) =
397 :     if value <= 255 andalso value >= 0 then I.IMMop value
398 :     else let
399 :     val tmpR = newReg()
400 :     in
401 :     loadImmed (value, zeroR, tmpR);
402 :     I.REGop tmpR
403 :     end
404 :     | opndAction(T.LI32 value) =
405 :     if Word32.<=(value, 0w255) then I.IMMop (Word32.toInt value)
406 :     else let
407 :     val tmpR = newReg ()
408 :     in
409 :     loadImmed32 (value, zeroR, tmpR);
410 :     I.REGop tmpR
411 :     end
412 :     | opndAction(T.CONST const) = I.CONSTop (const)
413 :     | opndAction exp = I.REGop (regAction exp)
414 :    
415 :     and reduceOpnd(I.IMMop i) = let
416 :     val rd = newReg()
417 :     in loadImmed(i, zeroR, rd); rd
418 :     end
419 :     | reduceOpnd(I.REGop rd) = rd
420 :     | reduceOpnd(I.CONSTop c) = regAction(T.CONST c)
421 :     | reduceOpnd(I.LABop le) = regAction(T.LABEL le)
422 :     | reduceOpnd _ = error "reduceOpnd"
423 :    
424 :     and orderedRR(e1, e2, T.LR) = (regAction e1, regAction e2)
425 :     | orderedRR(e1, e2, T.RL) = let
426 :     val r2 = regAction e2
427 :     in (regAction e1, r2)
428 :     end
429 :    
430 :     and regAction (T.REG r) = r
431 :     | regAction exp = regActionRd(exp, newReg())
432 :    
433 :     and arithOperands(e1, e2, T.LR) = (regAction e1, opndAction e2)
434 :     | arithOperands(e1, e2, T.RL) = let
435 :     val opnd = opndAction e2
436 :     in (regAction e1, opnd)
437 :     end
438 :    
439 :     and regActionRd(exp, rd) = let
440 :    
441 :     fun orderedArith(oper, arith, e1, e2, ord) = let
442 :     val (reg, opnd) = arithOperands(e1, e2, ord)
443 :     in
444 :     emit(oper{oper=arith, ra=reg, rb=opnd, rc=rd});
445 :     rd
446 :     end
447 :    
448 :     fun commOrderedArith(oper, arith, e1 ,e2, ord) = let
449 :     fun f(e1 as T.LI _, e2) = orderedArith(oper, arith, e2, e1, ord)
450 :     | f(e1 as T.LI32 _, e2) = orderedArith(oper, arith, e2, e1, ord)
451 :     | f(e1, e2) = orderedArith(oper, arith, e1, e2, ord)
452 :     in
453 :     f(e1, e2)
454 :     end
455 :    
456 :    
457 :     fun orderedArithTrap arg = orderedArith arg before emit(I.TRAPB)
458 :     fun commOrderedArithTrap arg = commOrderedArith arg before emit(I.TRAPB)
459 :    
460 :     fun orderedMullTrap (e1, e2, ord, rd) = let
461 :     val (reg, opnd) = case ord
462 :     of T.LR => (regAction e1, opndAction e2)
463 :     | T.RL => let
464 :     val opnd = opndAction e2
465 :     in
466 :     (regAction e1, opnd)
467 :     end
468 :    
469 :     fun emitMulvImmed (reg, 0, rd) =
470 :     emit(I.LDA{r=rd, b=zeroR, d=I.IMMop 0})
471 :     | emitMulvImmed (reg, 1, rd) =
472 :     emit(I.OPERATE{oper=I.ADDL, ra=reg, rb=zeroOp, rc=rd})
473 :     | emitMulvImmed (reg, multiplier, rd) = let
474 :     fun log2 0w1 = 0 | log2 n = 1 + (log2 (Word.>> (n, 0w1)))
475 :    
476 :     fun exp2 n = Word.<<(0w1, n)
477 :    
478 :     fun bitIsSet (x,n) = Word.andb(x,exp2 n) <> 0w0
479 :    
480 :     fun loop (~1) = ()
481 :     | loop n =
482 :     (if bitIsSet(itow multiplier, itow n) then
483 :     emit(I.OPERATEV{oper=I.ADDLV, ra=reg, rb=I.REGop rd, rc=rd})
484 :     else ();
485 :     if n>0 then
486 :     emit(I.OPERATEV{oper=I.ADDLV, ra=rd, rb=I.REGop rd, rc=rd})
487 :     else ();
488 :     loop (n-1))
489 :     in
490 :     emit(I.OPERATEV{oper=I.ADDLV, ra=reg, rb=I.REGop reg, rc=rd});
491 :     loop ((log2 (itow multiplier)) - 1)
492 :     end
493 :     in
494 :     case opnd
495 :     of (I.IMMop multiplier) => emitMulvImmed (reg, multiplier, rd)
496 :     | _ => emit (I.OPERATEV{oper=I.MULLV , ra=reg, rb=opnd, rc=rd})
497 :     (*esac*);
498 :     emit(I.TRAPB);
499 :     rd
500 :     end
501 :    
502 :     fun opndToWord (T.LI i) = SOME (Word32.fromInt i)
503 :     | opndToWord (T.LI32 w) = SOME w
504 :     | opndToWord _ = NONE
505 :     in
506 :     case exp
507 :     of T.LI n => (loadImmed(n, zeroR, rd); rd)
508 :     | T.LI32 w => (loadImmed32(w, zeroR, rd); rd)
509 :     | T.LABEL le => (emit(I.LDA{r=rd, b=zeroR, d=I.LABop le}); rd)
510 :     | T.CONST c => (emit(I.LDA{r=rd, b=zeroR, d=I.CONSTop(c)}); rd)
511 :     | T.ADD(e, T.LABEL le)=>
512 :     (emit(I.LDA{r=rd, b=regAction(e), d=I.LABop(le)}); rd)
513 :     | T.ADD(T.LI i, e) => (loadImmed(i, regAction e, rd); rd)
514 :     | T.ADD(e, T.LI i) => (loadImmed(i, regAction e, rd); rd)
515 :     | T.ADD(T.LI32 i, e) => (loadImmed32(i, regAction e, rd); rd)
516 :     | T.ADD(e, T.LI32 i) => (loadImmed32(i, regAction e, rd); rd)
517 :     | T.ADD(T.CONST c, e) =>
518 :     (emit(I.LDA{r=rd, b=regAction e, d=I.CONSTop c}); rd)
519 :     | T.ADD(e, T.CONST c) =>
520 :     (emit(I.LDA{r=rd, b=regAction e, d=I.CONSTop c}); rd)
521 :     | T.ADD(e1, e2) => commOrderedArith(I.OPERATE, I.ADDL, e1, e2, T.LR)
522 :     | T.SUB(e1, e2, ord) => orderedArith(I.OPERATE, I.SUBL, e1, e2, ord)
523 :     | T.MULU(e1, e2) => commOrderedArith(I.OPERATE, I.MULL, e1, e2, T.LR)
524 :     | T.ADDT(e1, e2) =>
525 :     commOrderedArithTrap(I.OPERATEV, I.ADDLV, e1, e2, T.LR)
526 :     | T.SUBT(e1, e2, ord) => orderedArithTrap(I.OPERATEV, I.SUBLV, e1, e2, ord)
527 :     | T.MULT(e1, e2) => orderedMullTrap(e1, e2, T.LR, rd)
528 :     | T.ANDB(e1, e2) => let
529 :     fun opndToByteMask (SOME (0wx0:Word32.word)) = SOME 0xf
530 :     | opndToByteMask (SOME 0wx000000ff) = SOME 0xe
531 :     | opndToByteMask (SOME 0wx0000ff00) = SOME 0xd
532 :     | opndToByteMask (SOME 0wx0000ffff) = SOME 0xc
533 :     | opndToByteMask (SOME 0wx00ff0000) = SOME 0xb
534 :     | opndToByteMask (SOME 0wx00ff00ff) = SOME 0xa
535 :     | opndToByteMask (SOME 0wx00ffff00) = SOME 0x9
536 :     | opndToByteMask (SOME 0wx00ffffff) = SOME 0x8
537 :     | opndToByteMask (SOME 0wxff000000) = SOME 0x7
538 :     | opndToByteMask (SOME 0wxff0000ff) = SOME 0x6
539 :     | opndToByteMask (SOME 0wxff00ff00) = SOME 0x5
540 :     | opndToByteMask (SOME 0wxff00ffff) = SOME 0x4
541 :     | opndToByteMask (SOME 0wxffff0000) = SOME 0x3
542 :     | opndToByteMask (SOME 0wxffff00ff) = SOME 0x2
543 :     | opndToByteMask (SOME 0wxffffff00) = SOME 0x1
544 :     | opndToByteMask (SOME 0wxffffffff) = SOME 0x0
545 :     | opndToByteMask _ = NONE
546 :    
547 :     val opndToMask = opndToByteMask o opndToWord
548 :     in
549 :     case (opndToMask e1, opndToMask e2) of
550 :     (SOME mask, _) =>
551 :     orderedArith(I.OPERATE, I.ZAP, e2, T.LI mask, T.LR)
552 :     | (_, SOME mask) =>
553 :     orderedArith(I.OPERATE, I.ZAP, e1, T.LI mask, T.LR)
554 :     | _ => commOrderedArith(I.OPERATE, I.AND, e1, e2, T.LR)
555 :     end
556 :     | T.ORB(e1, e2) => commOrderedArith(I.OPERATE, I.BIS, e1, e2, T.LR)
557 :     | T.XORB(e1, e2) => commOrderedArith(I.OPERATE, I.XOR, e1, e2, T.LR)
558 :     | T.SLL(e1, e2, ord) =>
559 :     (case opndToWord e2 of
560 :     SOME 0w1 => let val r = T.REG (regAction e1)
561 :     in orderedArith(I.OPERATE, I.ADDL, r, r, T.LR) end
562 :     | SOME 0w2 => orderedArith(I.OPERATE, I.S4ADDL, e1, T.LI 0, T.LR)
563 :     | SOME 0w3 => orderedArith(I.OPERATE, I.S8ADDL, e1, T.LI 0, T.LR)
564 :     | _ => (orderedArith(I.OPERATE, I.SLL, e1, e2, T.LR);
565 :     emit(I.OPERATE{oper=I.SGNXL, ra=rd, rb=zeroOp, rc=rd});
566 :     rd)
567 :     (*esac*))
568 :     | T.SRA(e1, e2, ord) => let
569 :     val (reg, opnd) = (regAction e1, opndAction e2)
570 :     in
571 :     (* sign extend longword argument *)
572 :     emit(I.OPERATE{oper=I.SGNXL, ra=reg, rb=zeroOp, rc=reg});
573 :     emit(I.OPERATE{oper=I.SRA, ra=reg, rb=opnd, rc=rd});
574 :     rd
575 :     end
576 :     | T.SRL(e1, e2, ord) => let
577 :     val (reg, opnd) = (regAction e1, opndAction e2)
578 :     in
579 :     emit(I.OPERATE{oper=I.ZAP, ra=reg, rb=I.IMMop 0xf0, rc=reg});
580 :     emit(I.OPERATE{oper=I.SRL, ra=reg, rb=opnd, rc=rd});
581 :     rd
582 :     end
583 :     | T.DIVT arg => let
584 :     val (reg, opnd) = arithOperands arg
585 :     in
586 :     app emit (PseudoInstrs.divl({ra=reg, rb=opnd, rc=rd}, reduceOpnd));
587 :     rd
588 :     end
589 :     | T.DIVU arg => let
590 :     val (reg, opnd) = arithOperands arg
591 :     in
592 :     app emit (PseudoInstrs.divlu({ra=reg, rb=opnd, rc=rd}, reduceOpnd));
593 :     rd
594 :     end
595 :     | T.LOAD32(exp, region) => let
596 :     val (b, d) = eaAction exp
597 :     in emit(I.LOAD{ldOp=I.LDL, r=rd, b=b, d=d, mem=region}); rd
598 :     end
599 :     (* Load and sign-extend a byte from a non-aligned address *)
600 :     | T.LOAD8(exp, region) => let
601 :     val tmpR0 = newReg()
602 :     val tmpR1 = newReg()
603 :     val (rt, disp) = eaAction exp
604 :     in
605 :     emit(I.LOAD{ldOp=I.LDQ_U, r=tmpR0, b=rt, d=disp, mem=mem});
606 :     emit(I.LDA{r=tmpR1, b=rt, d=disp});
607 :     emit(I.OPERATE{oper=I.EXTBL, ra=tmpR0, rb=I.REGop tmpR1, rc=rd});
608 :     rd
609 :     end
610 :     | T.SEQ(e1, e2) => (stmAction e1; regAction e2)
611 :     | _ => error "regAction"
612 :     end (* regActionRd *)
613 :    
614 :     and eaAction exp = let
615 :     fun makeEA(r, n) =
616 :     if ~32768 <= n andalso n <= 32767 then (r, I.IMMop n)
617 :     else let
618 :     val tmpR = newReg()
619 :     val low = wtoi(Word.andb(itow n, 0w65535))(* unsigned low 16 bits *)
620 :     val high = n div 65536
621 :     val (lowsgn, highsgn) = (* Sign-extend *)
622 :     if low <= 32767 then (low, high) else (low -65536, high+1)
623 :     in
624 :     (emit(I.LDAH{r=tmpR, b=r, d=I.IMMop highsgn});
625 :     (tmpR, I.IMMop lowsgn))
626 :     end
627 :     in
628 :     case exp
629 :     of T.ADD(exp, T.LI n) => makeEA(regAction exp, n)
630 :     | T.ADD(T.LI n, exp) => makeEA(regAction exp, n)
631 :     | T.ADD(T.CONST c, exp) => (regAction exp, I.CONSTop(c))
632 :     | T.ADD(exp, T.CONST c) => (regAction exp, I.CONSTop(c))
633 :     | T.SUB(exp, T.LI n, _) => makeEA(regAction exp, ~n)
634 :     | exp => makeEA(regAction exp, 0)
635 :     end (* eaAction *)
636 :    
637 :     and fregAction (T.FREG f) = f
638 :     | fregAction exp = fregActionFd(exp, newFreg())
639 :    
640 :     and fregActionFd(exp, fd) = let
641 :     (* macho comment goes here *)
642 :     fun doFloatArith(farith, e1, e2, fd, order) = let
643 :     val (f1, f2) = orderedFArith(e1, e2, order)
644 :     in
645 :     emit(I.DEFFREG fd);
646 :     emit(I.FOPERATEV{oper=farith, fa=f1, fb=f2, fc=fd});
647 :     emit(I.TRAPB);
648 :     fd
649 :     end
650 :     in
651 :     case exp
652 :     of T.FREG f => f
653 :     | T.FADDD(e1, e2) => doFloatArith(I.ADDT, e1, e2, fd, T.LR)
654 :     | T.FMULD(e1, e2) => doFloatArith(I.MULT, e1, e2, fd, T.LR)
655 :     | T.FSUBD(e1, e2, ord) => doFloatArith(I.SUBT, e1, e2, fd, ord)
656 :     | T.FDIVD(e1, e2, ord) => doFloatArith(I.DIVT, e1, e2, fd, ord)
657 :     | T.FABSD exp =>
658 :     (emit(I.FOPERATE{oper=I.CPYS, fa=zeroR, fb=fregAction exp, fc=fd}); fd)
659 :     | T.FNEGD exp => let
660 :     val fs = fregAction exp
661 :     in
662 :     emit(I.FOPERATE{oper=I.CPYSN, fa=fs, fb=fs, fc=fd}); fd
663 :     end
664 :     | T.CVTI2D exp => let
665 :     val opnd = opndAction exp
666 :     in
667 :     app emit (PseudoInstrs.cvti2d({opnd=opnd, fd=fd},reduceOpnd));
668 :     fd
669 :     end
670 :     | T.LOADD(exp, region) => let
671 :     val (b, d) = eaAction exp
672 :     in emit(I.FLOAD{ldOp=I.LDT, r=fd, b=b, d=d, mem=region}); fd
673 :     end
674 :     | T.FSEQ(e1, e2) => (stmAction e1; fregAction e2)
675 :     end
676 :    
677 :     fun mltreeComp mltree = let
678 :     (* condition code registers are mapped onto general registers *)
679 :     fun cc (T.CCR(T.CC cc)) = T.GPR(T.REG cc)
680 :     | cc x = x
681 :    
682 :     fun mltc(T.PSEUDO_OP pOp) = F.pseudoOp pOp
683 :     | mltc(T.DEFINELABEL lab) = F.defineLabel lab
684 :     | mltc(T.ENTRYLABEL lab) = F.entryLabel lab
685 :     | mltc(T.ORDERED mlts) = F.ordered mlts
686 :     | mltc(T.BEGINCLUSTER) = F.beginCluster()
687 :     | mltc(T.CODE stms) = app stmAction stms
688 :     | mltc(T.ENDCLUSTER regmap)= F.endCluster regmap
689 :     | mltc(T.ESCAPEBLOCK regs) = F.exitBlock (map cc regs)
690 :     in mltc mltree
691 :     end
692 :    
693 :     val mlriscComp = stmAction
694 :     end
695 :    
696 :    
697 :     (*
698 :     * $Log: alpha32.sml,v $
699 :     * Revision 1.12 1998/02/16 13:57:41 george
700 :     * A register allocated temp is now associated with parallel COPYs
701 :     * instead of a dedicated register. The temp is used to break cycles.
702 :     *
703 :     * Revision 1.11 1997/09/17 20:00:54 george
704 :     * The AND operation with certain bit patterns is implemented using ZAP,
705 :     * and shifting by 1, 2, and 3, is implemented using ADDL, S4ADDL, and S8ADDL.
706 :     * -- ken cline
707 :     *
708 :     # Revision 1.10 1997/09/17 17:10:10 george
709 :     # added support for MLRisc ENTRYLABEL
710 :     #
711 :     # Revision 1.9 1997/09/12 10:12:54 george
712 :     # provided support for unsigned comparisons in sml/nj checklimit
713 :     #
714 :     # Revision 1.8 1997/08/30 02:09:35 george
715 :     # Changed cvti2d to take an I.operand as argument instead of a register
716 :     #
717 :     # Revision 1.7 1997/08/29 11:19:43 george
718 :     # removed code to generate div-by-zero traps
719 :     #
720 :     # Revision 1.6 1997/08/29 10:57:42 george
721 :     # div, divlu, and cvti2d operations are now implemented using pseudo-
722 :     # instructions instead of floating point registers. The latter approach
723 :     # allocated space on the stack which could conflict with spills.
724 :     #
725 :     # Revision 1.5 1997/08/27 17:49:22 george
726 :     # Addition of immediate constants is performed by LDO instead of ADD.
727 :     # THis allows for larger constants.
728 :     #
729 :     # Revision 1.4 1997/07/28 20:03:42 george
730 :     # Added support for regions
731 :     #
732 :     # Revision 1.3 1997/07/17 12:24:41 george
733 :     # The regmap is now represented as an int map rather than using arrays.
734 :     #
735 :     # Revision 1.2 1997/07/10 03:59:43 george
736 :     # Added translation for MLTree.ORDERED.
737 :     #
738 :     # Revision 1.1.1.1 1997/04/19 18:14:21 george
739 :     # Version 109.27
740 :     #
741 :     *)

root@smlnj-gforge.cs.uchicago.edu
ViewVC Help
Powered by ViewVC 1.0.0