SCM Repository
Annotation of /sml/branches/SMLNJ/src/MLRISC/alpha32/alpha32.sml
Parent Directory
|
Revision Log
Revision 123 - (view) (download)
1 : | monnier | 16 | (* alpha32.sml |
2 : | * | ||
3 : | * COPYRIGHT (c) 1995 AT&T Bell Laboratories. | ||
4 : | * | ||
5 : | * generates machine code from the mltree. | ||
6 : | * | ||
7 : | *) | ||
8 : | |||
9 : | (** NOTE: f29 and f30 are used as temporaries. | ||
10 : | ** r31 is always zero. | ||
11 : | **) | ||
12 : | functor Alpha32 | ||
13 : | (structure Alpha32Instr : ALPHA32INSTR | ||
14 : | structure Alpha32MLTree : MLTREE | ||
15 : | structure Flowgen : FLOWGRAPH_GEN | ||
16 : | structure PseudoInstrs : ALPHA32_PSEUDO_INSTR | ||
17 : | sharing Alpha32Instr.Region = Alpha32MLTree.Region | ||
18 : | sharing Flowgen.I = PseudoInstrs.I = Alpha32Instr | ||
19 : | sharing Flowgen.T=Alpha32MLTree | ||
20 : | sharing Alpha32MLTree.Constant = Alpha32Instr.Constant) : MLTREECOMP = | ||
21 : | struct | ||
22 : | structure F = Flowgen | ||
23 : | structure T = Alpha32MLTree | ||
24 : | structure R = Alpha32MLTree.Region | ||
25 : | structure I = Alpha32Instr | ||
26 : | structure C = Alpha32Instr.C | ||
27 : | structure LE = LabelExp | ||
28 : | structure W32 = Word32 | ||
29 : | monnier | 106 | |
30 : | monnier | 16 | (********************************************************* |
31 : | |||
32 : | Trap Shadows, Floating Exceptions, and Denormalized | ||
33 : | Numbers on the DEC Alpha | ||
34 : | |||
35 : | Andrew W. Appel and Lal George | ||
36 : | Nov 28, 1995 | ||
37 : | |||
38 : | See section 4.7.5.1 of the Alpha Architecture Reference Manual. | ||
39 : | |||
40 : | The Alpha has imprecise exceptions, meaning that if a floating | ||
41 : | point instruction raises an IEEE exception, the exception may | ||
42 : | not interrupt the processor until several successive instructions have | ||
43 : | completed. ML, on the other hand, may want a "precise" model | ||
44 : | of floating point exceptions. | ||
45 : | |||
46 : | Furthermore, the Alpha hardware does not support denormalized numbers | ||
47 : | (for "gradual underflow"). Instead, underflow always rounds to zero. | ||
48 : | However, each floating operation (add, mult, etc.) has a trapping | ||
49 : | variant that will raise an exception (imprecisely, of course) on | ||
50 : | underflow; in that case, the instruction will produce a zero result | ||
51 : | AND an exception will occur. In fact, there are several variants | ||
52 : | of each instruction; three variants of MULT are: | ||
53 : | |||
54 : | MULT s1,s2,d truncate denormalized result to zero; no exception | ||
55 : | MULT/U s1,s2,d truncate denormalized result to zero; raise UNDERFLOW | ||
56 : | MULT/SU s1,s2,d software completion, producing denormalized result | ||
57 : | |||
58 : | The hardware treats the MULT/U and MULT/SU instructions identically, | ||
59 : | truncating a denormalized result to zero and raising the UNDERFLOW | ||
60 : | exception. But the operating system, on an UNDERFLOW exception, | ||
61 : | examines the faulting instruction to see if it's an /SU form, and if so, | ||
62 : | recalculates s1*s2, puts the right answer in d, and continues, | ||
63 : | all without invoking the user's signal handler. | ||
64 : | |||
65 : | Because most machines compute with denormalized numbers in hardware, | ||
66 : | to maximize portability of SML programs, we use the MULT/SU form. | ||
67 : | (and ADD/SU, SUB/SU, etc.) But to use this form successfully, | ||
68 : | certain rules have to be followed. Basically, d cannot be the same | ||
69 : | register as s1 or s2, because the opsys needs to be able to | ||
70 : | recalculate the operation using the original contents of s1 and s2, | ||
71 : | and the MULT/SU instruction will overwrite d even if it traps. | ||
72 : | |||
73 : | More generally, we may want to have a sequence of floating-point | ||
74 : | instructions. The rules for such a sequence are: | ||
75 : | |||
76 : | 1. The sequence should end with a TRAPB (trap barrier) instruction. | ||
77 : | (This could be relaxed somewhat, but certainly a TRAPB would | ||
78 : | be a good idea sometime before the next branch instruction or | ||
79 : | update of an ML reference variable, or any other ML side effect.) | ||
80 : | 2. No instruction in the sequence should destroy any operand of itself | ||
81 : | or of any previous instruction in the sequence. | ||
82 : | 3. No two instructions in the sequence should write the same destination | ||
83 : | register. | ||
84 : | |||
85 : | We can achieve these conditions by the following trick in the | ||
86 : | Alpha code generator. Each instruction in the sequence will write | ||
87 : | to a different temporary; this is guaranteed by the translation from | ||
88 : | ML-RISC. At the beginning of the sequence, we will put a special | ||
89 : | pseudo-instruction (we call it DEFFREG) that "defines" the destination | ||
90 : | register of the arithmetic instruction. If there are K arithmetic | ||
91 : | instructions in the sequence, then we'll insert K DEFFREG instructions | ||
92 : | all at the beginning of the sequence. | ||
93 : | Then, each arithop will not only "define" its destination temporary | ||
94 : | but will "use" it as well. When all these instructions are fed to | ||
95 : | the liveness analyzer, the resulting interference graph will then | ||
96 : | have inteference edges satisfying conditions 2 and 3 above. | ||
97 : | |||
98 : | Of course, DEFFREG doesn't actually generate any code. In our model | ||
99 : | of the Alpha, every instruction generates exactly 4 bytes of code | ||
100 : | except the "span-dependent" ones. Therefore, we'll specify DEFFREG | ||
101 : | as a span-dependent instruction whose minimum and maximum sizes are zero. | ||
102 : | |||
103 : | At the moment, we do not group arithmetic operations into sequences; | ||
104 : | that is, each arithop will be preceded by a single DEFFREG and | ||
105 : | followed by a TRAPB. To avoid the cost of all those TRAPB's, we | ||
106 : | should improve this when we have time. Warning: Don't put more | ||
107 : | than 31 instructions in the sequence, because they're all required | ||
108 : | to write to different destination registers! | ||
109 : | |||
110 : | What about multiple traps? For example, suppose a sequence of | ||
111 : | instructions produces an Overflow and a Divide-by-Zero exception? | ||
112 : | ML would like to know only about the earliest trap, but the hardware | ||
113 : | will report BOTH traps to the operating system. However, as long | ||
114 : | as the rules above are followed (and the software-completion versions | ||
115 : | of the arithmetic instructions are used), the operating system will | ||
116 : | have enough information to know which instruction produced the | ||
117 : | trap. It is very probable that the operating system will report ONLY | ||
118 : | the earlier trap to the user process, but I'm not sure. | ||
119 : | |||
120 : | For a hint about what the operating system is doing in its own | ||
121 : | trap-handler (with software completion), see section 6.3.2 of | ||
122 : | "OpenVMS Alpha Software" (Part II of the Alpha Architecture | ||
123 : | Manual). This stuff should apply to Unix (OSF1) as well as VMS. | ||
124 : | |||
125 : | ****************************************************************) | ||
126 : | |||
127 : | fun error msg = MLRiscErrorMsg.impossible ("Alpha32." ^ msg) | ||
128 : | |||
129 : | val itow = Word.fromInt | ||
130 : | val wtoi = Word.toIntX | ||
131 : | |||
132 : | val emit = F.emitInstr | ||
133 : | |||
134 : | val zeroR = 31 | ||
135 : | val zeroOp = I.REGop zeroR | ||
136 : | val zeroEA = I.Direct zeroR | ||
137 : | val argReg0 = 16 | ||
138 : | |||
139 : | val mem = R.memory | ||
140 : | |||
141 : | fun cond T.LT = I.CC_LT | cond T.LTU = I.CC_LTU | ||
142 : | | cond T.LE = I.CC_LE | cond T.LEU = I.CC_LEU | ||
143 : | | cond T.EQ = I.CC_EQ | cond T.NEQ = I.CC_NEQ | ||
144 : | | cond T.GE = I.CC_GE | cond T.GEU = I.CC_GEU | ||
145 : | | cond T.GT = I.CC_GT | cond T.GTU = I.CC_GTU | ||
146 : | |||
147 : | fun swapcc I.CC_LT = I.CC_GT | swapcc I.CC_LTU = I.CC_GTU | ||
148 : | | swapcc I.CC_LE = I.CC_GE | swapcc I.CC_LEU = I.CC_GEU | ||
149 : | | swapcc I.CC_EQ = I.CC_EQ | swapcc I.CC_NEQ = I.CC_NEQ | ||
150 : | | swapcc I.CC_GT = I.CC_LT | swapcc I.CC_GEU = I.CC_LEU | ||
151 : | | swapcc I.CC_GE = I.CC_LE | swapcc _ = error "swapcc" | ||
152 : | |||
153 : | (* NOTE: stack allocation must be multiples of 16 *) | ||
154 : | fun stackAllocate(n) = | ||
155 : | I.OPERATE{oper=I.SUBL, ra=C.stackptrR, rb=I.IMMop n, rc=C.stackptrR} | ||
156 : | |||
157 : | fun stackDeallocate(n) = | ||
158 : | I.OPERATE{oper=I.ADDL, ra=C.stackptrR, rb=I.IMMop n, rc=C.stackptrR} | ||
159 : | |||
160 : | fun loadImmed(n, base, rd) = | ||
161 : | if ~32768 <= n andalso n < 32768 then | ||
162 : | emit(I.LDA{r=rd, b=base, d=I.IMMop n}) | ||
163 : | else let | ||
164 : | val w = itow n | ||
165 : | val hi = Word.~>>(w, 0w16) | ||
166 : | val lo = Word.andb(w, 0w65535) | ||
167 : | val (hi', lo') = | ||
168 : | if lo < 0w32768 then (hi, lo) else (hi+0w1, lo-0w65536) | ||
169 : | in | ||
170 : | emit(I.LDA{r=rd, b=base, d=I.IMMop(wtoi lo')}); | ||
171 : | emit(I.LDAH{r=rd, b=rd, d=I.IMMop(wtoi hi')}) | ||
172 : | end | ||
173 : | |||
174 : | (* loadImmed32 is used to load int32 and word32 constants. | ||
175 : | * In either case we sign extend the 32-bit value. This is compatible | ||
176 : | * with LDL which sign extends a 32-bit valued memory location. | ||
177 : | *) | ||
178 : | monnier | 123 | fun loadImmed32(0w0, base, rd) = |
179 : | emit(I.OPERATE{oper=I.ADDL, ra=base, rb=zeroOp, rc=rd}) | ||
180 : | | loadImmed32(n, base, rd) = let | ||
181 : | val low = W32.andb(n, 0w65535) (* unsigned (0 .. 65535) *) | ||
182 : | val high = W32.~>>(n, 0w16) (* signed (~32768 .. 32768] *) | ||
183 : | fun loadimmed(0, high) = emit(I.LDAH{r=rd, b=base, d=I.IMMop(high)}) | ||
184 : | | loadimmed(low, high) = | ||
185 : | (emit(I.LDA{r=rd, b=base, d=I.IMMop(low)}); | ||
186 : | emit(I.LDAH{r=rd, b=rd, d=I.IMMop(high)})) | ||
187 : | in | ||
188 : | if W32.<(low, 0w32768) then loadimmed(W32.toInt low, W32.toIntX high) | ||
189 : | else let (* low = (32768 .. 65535) *) | ||
190 : | val lowsgn = W32.-(low, 0w65536) (* signed (~1 .. ~32768) *) | ||
191 : | val highsgn = W32.+(high, 0w1) (* (~32768 .. 32768) *) | ||
192 : | val ilow = W32.toIntX lowsgn | ||
193 : | val ihigh = W32.toIntX highsgn | ||
194 : | in | ||
195 : | if ihigh <> 32768 then loadimmed(ilow, ihigh) | ||
196 : | else let | ||
197 : | val tmpR = C.newReg() | ||
198 : | in | ||
199 : | (* you gotta do what you gotta do! *) | ||
200 : | emit(I.LDA{r=rd, b=base, d=I.IMMop(ilow)}); | ||
201 : | emit(I.OPERATE{oper=I.ADDL, ra=zeroR, rb=I.IMMop 1, rc=tmpR}); | ||
202 : | emit(I.OPERATE{oper=I.SLL, ra=tmpR, rb=I.IMMop 31, rc=tmpR}); | ||
203 : | emit(I.OPERATE{oper=I.ADDL, ra=tmpR, rb=I.REGop rd, rc=rd}) | ||
204 : | end | ||
205 : | end | ||
206 : | end | ||
207 : | monnier | 16 | |
208 : | fun orderedFArith (exp1, exp2, T.LR) = (fregAction exp1, fregAction exp2) | ||
209 : | | orderedFArith (exp1, exp2, T.RL) = let | ||
210 : | val f2 = fregAction exp2 | ||
211 : | in (fregAction exp1, f2) | ||
212 : | end | ||
213 : | |||
214 : | and stmAction exp = let | ||
215 : | fun fbranch(_, T.FCMP(cc, exp1, exp2, order), lab) = let | ||
216 : | val (f1, f2) = orderedFArith(exp1, exp2, order) | ||
217 : | fun bcc(cmp, br) = let | ||
218 : | monnier | 106 | val tmpR = C.newFreg() |
219 : | monnier | 16 | in |
220 : | emit(I.DEFFREG(tmpR)); | ||
221 : | emit(I.FOPERATE{oper=cmp, fa=f1, fb=f2, fc=tmpR}); | ||
222 : | emit(I.TRAPB); | ||
223 : | emit(I.FBRANCH(br, tmpR, lab)) | ||
224 : | end | ||
225 : | |||
226 : | fun fall(cmp1, br1, cmp2, br2) = let | ||
227 : | monnier | 106 | val tmpR1 = C.newFreg() |
228 : | val tmpR2 = C.newFreg() | ||
229 : | monnier | 16 | val fallLab = Label.newLabel "" |
230 : | in | ||
231 : | emit(I.DEFFREG(tmpR1)); | ||
232 : | emit(I.FOPERATE{oper=cmp1, fa=f1, fb=f2, fc=tmpR1}); | ||
233 : | emit(I.TRAPB); | ||
234 : | emit(I.FBRANCH(br1, tmpR1, fallLab)); | ||
235 : | emit(I.DEFFREG(tmpR2)); | ||
236 : | emit(I.FOPERATE{oper=cmp2, fa=f1, fb=f2, fc=tmpR2}); | ||
237 : | emit(I.TRAPB); | ||
238 : | emit(I.FBRANCH(br2, tmpR2, lab)); | ||
239 : | F.defineLabel fallLab | ||
240 : | end | ||
241 : | |||
242 : | fun bcc2(cmp1, br1, cmp2, br2) = (bcc(cmp1, br1); bcc(cmp2, br2)) | ||
243 : | in | ||
244 : | case cc | ||
245 : | of T.== => bcc(I.CMPTEQ, I.BNE) | ||
246 : | | T.?<> => bcc(I.CMPTEQ, I.BEQ) | ||
247 : | | T.? => bcc(I.CMPTUN, I.BNE) | ||
248 : | | T.<=> => bcc(I.CMPTUN, I.BEQ) | ||
249 : | | T.> => fall(I.CMPTLE, I.BNE, I.CMPTUN, I.BEQ) | ||
250 : | | T.>= => fall(I.CMPTLT, I.BNE, I.CMPTUN, I.BEQ) | ||
251 : | | T.?> => bcc(I.CMPTLE, I.BEQ) | ||
252 : | | T.?>= => bcc(I.CMPTLT, I.BEQ) | ||
253 : | | T.< => bcc(I.CMPTLT, I.BNE) | ||
254 : | | T.<= => bcc(I.CMPTLE, I.BNE) | ||
255 : | | T.?< => bcc2(I.CMPTLT, I.BNE, I.CMPTUN, I.BNE) | ||
256 : | | T.?<= => bcc2(I.CMPTLE, I.BNE, I.CMPTUN, I.BNE) | ||
257 : | | T.<> => fall(I.CMPTEQ, I.BNE, I.CMPTUN, I.BEQ) | ||
258 : | | T.?= => bcc2(I.CMPTEQ, I.BNE, I.CMPTUN, I.BNE) | ||
259 : | end | ||
260 : | |||
261 : | fun branch(cond, exp1, exp2, lab, order) = let | ||
262 : | fun zapHi r = emit(I.OPERATE{oper=I.ZAP, ra=r, rb=I.IMMop 0xf0, rc=r}) | ||
263 : | monnier | 106 | val tmpR = C.newReg() |
264 : | monnier | 16 | val (r1, o2) = |
265 : | case order | ||
266 : | of T.LR => (regAction exp1, opndAction exp2) | ||
267 : | | T.RL => let val o2' = opndAction exp2 | ||
268 : | in | ||
269 : | (regAction(exp1), o2') | ||
270 : | end | ||
271 : | fun emitBr(cmp, br) = | ||
272 : | (emit(I.OPERATE{oper=cmp, ra=r1, rb=o2, rc=tmpR}); | ||
273 : | emit(I.BRANCH(br, tmpR, lab))) | ||
274 : | fun emitUnsignedBr(cmp, br) = | ||
275 : | (case (r1, o2) | ||
276 : | of (r1, I.REGop r2) => (zapHi r1; zapHi r2; emitBr(cmp, br)) | ||
277 : | | (r1, o2) => (zapHi r1; emitBr(cmp, br)) | ||
278 : | (*esac*)) | ||
279 : | in | ||
280 : | case cond | ||
281 : | of I.CC_LTU => emitUnsignedBr(I.CMPULT, I.BNE) | ||
282 : | | I.CC_LEU => emitUnsignedBr(I.CMPULE, I.BNE) | ||
283 : | | I.CC_GTU => emitUnsignedBr(I.CMPULE, I.BEQ) | ||
284 : | | I.CC_GEU => emitUnsignedBr(I.CMPULT, I.BEQ) | ||
285 : | | I.CC_LT => emitBr(I.CMPLT, I.BNE) | ||
286 : | | I.CC_LE => emitBr(I.CMPLE, I.BNE) | ||
287 : | | I.CC_GT => emitBr(I.CMPLE, I.BEQ) | ||
288 : | | I.CC_GE => emitBr(I.CMPLT, I.BEQ) | ||
289 : | | I.CC_EQ => emitBr(I.CMPEQ, I.BNE) | ||
290 : | | I.CC_NEQ => emitBr(I.CMPEQ, I.BEQ) | ||
291 : | end | ||
292 : | monnier | 106 | fun copyTmp() = SOME(I.Direct(C.newReg())) |
293 : | fun fcopyTmp() = SOME(I.FDirect(C.newFreg())) | ||
294 : | monnier | 16 | in |
295 : | case exp | ||
296 : | of T.JMP(T.LABEL(LE.LABEL lab), _) => emit(I.BRANCH(I.BR, zeroR, lab)) | ||
297 : | | T.JMP(T.LABEL _, _) => error "JMP(T.LABEL _, _)" | ||
298 : | | T.JMP(exp, labs) => | ||
299 : | emit(I.JMPL({r=C.asmTmpR, b=regAction exp, d=0}, labs)) | ||
300 : | | T.BCC(_, T.CMP(T.NEQ, T.ANDB(exp, T.LI 1), T.LI 0, _), lab) => | ||
301 : | emit(I.BRANCH(I.BLBS, regAction exp, lab)) | ||
302 : | | T.BCC(_, T.CMP(cc, exp, T.LI n, ord), lab) => | ||
303 : | branch(cond cc, exp, T.LI n, lab, ord) | ||
304 : | | T.BCC(_, T.CMP(cc, T.LI n, exp, ord), lab) => | ||
305 : | branch(swapcc(cond cc), exp, T.LI n, lab, ord) | ||
306 : | | T.BCC(_, T.CMP(cc, e1, e2, ord), lab) => | ||
307 : | branch(cond cc, e1, e2, lab, ord) | ||
308 : | | T.BCC(_, e, lab) => emit(I.BRANCH(I.BNE, ccAction e, lab)) | ||
309 : | | T.FBCC arg => fbranch arg | ||
310 : | | T.CALL(exp, def, use) => let | ||
311 : | val pv = regAction exp | ||
312 : | val returnPtrR = 26 | ||
313 : | fun live([],acc) = acc | ||
314 : | | live(T.GPR(T.REG r)::regs,acc) = live(regs, C.addReg(r,acc)) | ||
315 : | | live(T.CCR(T.CC cc)::regs,acc) = live(regs, C.addReg(cc,acc)) | ||
316 : | | live(T.FPR(T.FREG f)::regs,acc) = live(regs, C.addFreg(f,acc)) | ||
317 : | | live(_::regs, acc) = live(regs, acc) | ||
318 : | in | ||
319 : | emit(I.JSR({r=returnPtrR, b=pv, d=0}, | ||
320 : | live(def, C.addReg(returnPtrR, C.empty)), | ||
321 : | live(use, C.addReg(pv, C.empty)))) | ||
322 : | end | ||
323 : | | T.RET => emit(I.JMPL({r=zeroR, b=26, d=0}, [])) | ||
324 : | | T.STORE8(ea, r, region) => let | ||
325 : | val rs = regAction r | ||
326 : | val (rd, disp) = eaAction ea | ||
327 : | monnier | 106 | val t1 = C.newReg() |
328 : | val t2 = C.newReg() | ||
329 : | val t3 = C.newReg() | ||
330 : | monnier | 16 | in |
331 : | app emit | ||
332 : | [I.LOAD{ldOp=I.LDQ_U, r=t1, b=rd, d=disp,mem=mem}, | ||
333 : | I.LDA{r=t2, b=rd, d=disp}, | ||
334 : | I.OPERATE{oper=I.INSBL, ra=rs, rb=I.REGop(t2), rc=t3}, | ||
335 : | I.OPERATE{oper=I.MSKBL, ra=t1, rb=I.REGop(t2), rc=t1}, | ||
336 : | I.OPERATE{oper=I.BIS, ra=t1, rb=I.REGop(t3), rc=t1}, | ||
337 : | I.STORE{stOp=I.STQ_U, r=t1, b=rd, d=disp, mem=mem}] | ||
338 : | end | ||
339 : | | T.STORE32(ea, r, region) => let | ||
340 : | val (b, d) = eaAction ea | ||
341 : | in emit(I.STORE{stOp=I.STL, r=regAction r, b=b, d=d, mem=region}) | ||
342 : | end | ||
343 : | | T.STORED(ea, f, region) => let | ||
344 : | val (b, d) = eaAction ea | ||
345 : | in emit(I.FSTORE{stOp=I.STT, r=fregAction f, b=b, d=d, mem=region}) | ||
346 : | end | ||
347 : | | T.STORECC(ea, cc, region) => error "stmAction.STORECC" | ||
348 : | | T.MV(rd, exp) => let | ||
349 : | fun move(dst, src) = I.OPERATE{oper=I.BIS, ra=src, rb=zeroOp, rc=dst} | ||
350 : | in | ||
351 : | case exp | ||
352 : | of T.REG(rs) => if rs = rd then () else emit(move(rd, rs)) | ||
353 : | | T.LI n => loadImmed(n, zeroR, rd) | ||
354 : | | T.LI32 w => loadImmed32(w, zeroR, rd) | ||
355 : | | _ => let val rs = regActionRd(exp, rd) | ||
356 : | in if rs = rd then () else emit(move(rd,rs)) | ||
357 : | end | ||
358 : | (*esac*) | ||
359 : | end | ||
360 : | | T.CCMV(cd, exp) => let | ||
361 : | val cs = case exp of T.CC(r) => r | _ => ccActionCd(exp, cd) | ||
362 : | in | ||
363 : | if cs = cd then () | ||
364 : | else emit(I.OPERATE{oper=I.BIS, ra=cs, rb=zeroOp, rc=cd}) | ||
365 : | end | ||
366 : | | T.FMV(fd, exp) => let | ||
367 : | fun fmove(dst, src) = I.FOPERATE{oper=I.CPYS, fa=src, fb=src, fc=dst} | ||
368 : | in | ||
369 : | case exp | ||
370 : | of T.FREG(fs) => | ||
371 : | if fs = fd then () else emit(fmove(fd, fs)) | ||
372 : | | _ => let | ||
373 : | val fs = fregActionFd(exp, fd) | ||
374 : | in if fs = fd then () else emit(fmove(fd, fs)) | ||
375 : | end | ||
376 : | (*esac*) | ||
377 : | end | ||
378 : | | T.COPY(rds as [_], rss) => | ||
379 : | emit(I.COPY{dst=rds, src=rss, impl=ref NONE, tmp=NONE}) | ||
380 : | | T.COPY(rds, rss) => | ||
381 : | emit(I.COPY{dst=rds, src=rss, impl=ref NONE, tmp=copyTmp()}) | ||
382 : | | T.FCOPY(fds as [_], fss)=> | ||
383 : | emit(I.FCOPY{dst=fds, src=fss, impl=ref NONE, tmp=NONE}) | ||
384 : | | T.FCOPY(fds, fss) => | ||
385 : | emit(I.FCOPY{dst=fds, src=fss, impl=ref NONE, tmp=fcopyTmp()}) | ||
386 : | end | ||
387 : | and ccAction(T.CC r) = r | ||
388 : | | ccAction(e) = ccActionCd(e, C.newCCreg()) | ||
389 : | |||
390 : | and ccActionCd(T.CC r, _) = r | ||
391 : | (* enough for now ... *) | ||
392 : | | ccActionCd(T.CMP(T.GTU, e1, e2, ord), cd) = let | ||
393 : | val (opnd, reg) = | ||
394 : | case ord | ||
395 : | of T.LR => (opndAction e1, regAction e2) | ||
396 : | | T.RL => let | ||
397 : | val r = regAction e2 | ||
398 : | in (opndAction e1, r) | ||
399 : | end | ||
400 : | in | ||
401 : | emit(I.OPERATE{oper=I.CMPULT, ra=reg, rb=opnd, rc=cd}); | ||
402 : | cd | ||
403 : | end | ||
404 : | | ccActionCd(T.LOADCC _, _) = error "ccAction" | ||
405 : | |||
406 : | and opndAction (T.LI value) = | ||
407 : | if value <= 255 andalso value >= 0 then I.IMMop value | ||
408 : | else let | ||
409 : | monnier | 106 | val tmpR = C.newReg() |
410 : | monnier | 16 | in |
411 : | loadImmed (value, zeroR, tmpR); | ||
412 : | I.REGop tmpR | ||
413 : | end | ||
414 : | | opndAction(T.LI32 value) = | ||
415 : | if Word32.<=(value, 0w255) then I.IMMop (Word32.toInt value) | ||
416 : | else let | ||
417 : | monnier | 106 | val tmpR = C.newReg () |
418 : | monnier | 16 | in |
419 : | loadImmed32 (value, zeroR, tmpR); | ||
420 : | I.REGop tmpR | ||
421 : | end | ||
422 : | | opndAction(T.CONST const) = I.CONSTop (const) | ||
423 : | | opndAction exp = I.REGop (regAction exp) | ||
424 : | |||
425 : | and reduceOpnd(I.IMMop i) = let | ||
426 : | monnier | 106 | val rd = C.newReg() |
427 : | monnier | 16 | in loadImmed(i, zeroR, rd); rd |
428 : | end | ||
429 : | | reduceOpnd(I.REGop rd) = rd | ||
430 : | | reduceOpnd(I.CONSTop c) = regAction(T.CONST c) | ||
431 : | | reduceOpnd(I.LABop le) = regAction(T.LABEL le) | ||
432 : | | reduceOpnd _ = error "reduceOpnd" | ||
433 : | |||
434 : | and orderedRR(e1, e2, T.LR) = (regAction e1, regAction e2) | ||
435 : | | orderedRR(e1, e2, T.RL) = let | ||
436 : | val r2 = regAction e2 | ||
437 : | in (regAction e1, r2) | ||
438 : | end | ||
439 : | |||
440 : | and regAction (T.REG r) = r | ||
441 : | monnier | 106 | | regAction exp = regActionRd(exp, C.newReg()) |
442 : | monnier | 16 | |
443 : | and arithOperands(e1, e2, T.LR) = (regAction e1, opndAction e2) | ||
444 : | | arithOperands(e1, e2, T.RL) = let | ||
445 : | val opnd = opndAction e2 | ||
446 : | in (regAction e1, opnd) | ||
447 : | end | ||
448 : | |||
449 : | and regActionRd(exp, rd) = let | ||
450 : | |||
451 : | fun orderedArith(oper, arith, e1, e2, ord) = let | ||
452 : | val (reg, opnd) = arithOperands(e1, e2, ord) | ||
453 : | in | ||
454 : | emit(oper{oper=arith, ra=reg, rb=opnd, rc=rd}); | ||
455 : | rd | ||
456 : | end | ||
457 : | |||
458 : | fun commOrderedArith(oper, arith, e1 ,e2, ord) = let | ||
459 : | fun f(e1 as T.LI _, e2) = orderedArith(oper, arith, e2, e1, ord) | ||
460 : | | f(e1 as T.LI32 _, e2) = orderedArith(oper, arith, e2, e1, ord) | ||
461 : | | f(e1, e2) = orderedArith(oper, arith, e1, e2, ord) | ||
462 : | in | ||
463 : | f(e1, e2) | ||
464 : | end | ||
465 : | |||
466 : | |||
467 : | fun orderedArithTrap arg = orderedArith arg before emit(I.TRAPB) | ||
468 : | fun commOrderedArithTrap arg = commOrderedArith arg before emit(I.TRAPB) | ||
469 : | |||
470 : | fun orderedMullTrap (e1, e2, ord, rd) = let | ||
471 : | val (reg, opnd) = case ord | ||
472 : | of T.LR => (regAction e1, opndAction e2) | ||
473 : | | T.RL => let | ||
474 : | val opnd = opndAction e2 | ||
475 : | in | ||
476 : | (regAction e1, opnd) | ||
477 : | end | ||
478 : | |||
479 : | fun emitMulvImmed (reg, 0, rd) = | ||
480 : | emit(I.LDA{r=rd, b=zeroR, d=I.IMMop 0}) | ||
481 : | | emitMulvImmed (reg, 1, rd) = | ||
482 : | emit(I.OPERATE{oper=I.ADDL, ra=reg, rb=zeroOp, rc=rd}) | ||
483 : | | emitMulvImmed (reg, multiplier, rd) = let | ||
484 : | fun log2 0w1 = 0 | log2 n = 1 + (log2 (Word.>> (n, 0w1))) | ||
485 : | |||
486 : | fun exp2 n = Word.<<(0w1, n) | ||
487 : | |||
488 : | fun bitIsSet (x,n) = Word.andb(x,exp2 n) <> 0w0 | ||
489 : | |||
490 : | fun loop (~1) = () | ||
491 : | | loop n = | ||
492 : | (if bitIsSet(itow multiplier, itow n) then | ||
493 : | emit(I.OPERATEV{oper=I.ADDLV, ra=reg, rb=I.REGop rd, rc=rd}) | ||
494 : | else (); | ||
495 : | if n>0 then | ||
496 : | emit(I.OPERATEV{oper=I.ADDLV, ra=rd, rb=I.REGop rd, rc=rd}) | ||
497 : | else (); | ||
498 : | loop (n-1)) | ||
499 : | in | ||
500 : | emit(I.OPERATEV{oper=I.ADDLV, ra=reg, rb=I.REGop reg, rc=rd}); | ||
501 : | loop ((log2 (itow multiplier)) - 1) | ||
502 : | end | ||
503 : | in | ||
504 : | case opnd | ||
505 : | of (I.IMMop multiplier) => emitMulvImmed (reg, multiplier, rd) | ||
506 : | | _ => emit (I.OPERATEV{oper=I.MULLV , ra=reg, rb=opnd, rc=rd}) | ||
507 : | (*esac*); | ||
508 : | emit(I.TRAPB); | ||
509 : | rd | ||
510 : | end | ||
511 : | |||
512 : | fun opndToWord (T.LI i) = SOME (Word32.fromInt i) | ||
513 : | | opndToWord (T.LI32 w) = SOME w | ||
514 : | | opndToWord _ = NONE | ||
515 : | in | ||
516 : | case exp | ||
517 : | of T.LI n => (loadImmed(n, zeroR, rd); rd) | ||
518 : | | T.LI32 w => (loadImmed32(w, zeroR, rd); rd) | ||
519 : | | T.LABEL le => (emit(I.LDA{r=rd, b=zeroR, d=I.LABop le}); rd) | ||
520 : | | T.CONST c => (emit(I.LDA{r=rd, b=zeroR, d=I.CONSTop(c)}); rd) | ||
521 : | | T.ADD(e, T.LABEL le)=> | ||
522 : | (emit(I.LDA{r=rd, b=regAction(e), d=I.LABop(le)}); rd) | ||
523 : | | T.ADD(T.LI i, e) => (loadImmed(i, regAction e, rd); rd) | ||
524 : | | T.ADD(e, T.LI i) => (loadImmed(i, regAction e, rd); rd) | ||
525 : | | T.ADD(T.LI32 i, e) => (loadImmed32(i, regAction e, rd); rd) | ||
526 : | | T.ADD(e, T.LI32 i) => (loadImmed32(i, regAction e, rd); rd) | ||
527 : | | T.ADD(T.CONST c, e) => | ||
528 : | (emit(I.LDA{r=rd, b=regAction e, d=I.CONSTop c}); rd) | ||
529 : | | T.ADD(e, T.CONST c) => | ||
530 : | (emit(I.LDA{r=rd, b=regAction e, d=I.CONSTop c}); rd) | ||
531 : | | T.ADD(e1, e2) => commOrderedArith(I.OPERATE, I.ADDL, e1, e2, T.LR) | ||
532 : | | T.SUB(e1, e2, ord) => orderedArith(I.OPERATE, I.SUBL, e1, e2, ord) | ||
533 : | | T.MULU(e1, e2) => commOrderedArith(I.OPERATE, I.MULL, e1, e2, T.LR) | ||
534 : | | T.ADDT(e1, e2) => | ||
535 : | commOrderedArithTrap(I.OPERATEV, I.ADDLV, e1, e2, T.LR) | ||
536 : | | T.SUBT(e1, e2, ord) => orderedArithTrap(I.OPERATEV, I.SUBLV, e1, e2, ord) | ||
537 : | | T.MULT(e1, e2) => orderedMullTrap(e1, e2, T.LR, rd) | ||
538 : | | T.ANDB(e1, e2) => let | ||
539 : | fun opndToByteMask (SOME (0wx0:Word32.word)) = SOME 0xf | ||
540 : | | opndToByteMask (SOME 0wx000000ff) = SOME 0xe | ||
541 : | | opndToByteMask (SOME 0wx0000ff00) = SOME 0xd | ||
542 : | | opndToByteMask (SOME 0wx0000ffff) = SOME 0xc | ||
543 : | | opndToByteMask (SOME 0wx00ff0000) = SOME 0xb | ||
544 : | | opndToByteMask (SOME 0wx00ff00ff) = SOME 0xa | ||
545 : | | opndToByteMask (SOME 0wx00ffff00) = SOME 0x9 | ||
546 : | | opndToByteMask (SOME 0wx00ffffff) = SOME 0x8 | ||
547 : | | opndToByteMask (SOME 0wxff000000) = SOME 0x7 | ||
548 : | | opndToByteMask (SOME 0wxff0000ff) = SOME 0x6 | ||
549 : | | opndToByteMask (SOME 0wxff00ff00) = SOME 0x5 | ||
550 : | | opndToByteMask (SOME 0wxff00ffff) = SOME 0x4 | ||
551 : | | opndToByteMask (SOME 0wxffff0000) = SOME 0x3 | ||
552 : | | opndToByteMask (SOME 0wxffff00ff) = SOME 0x2 | ||
553 : | | opndToByteMask (SOME 0wxffffff00) = SOME 0x1 | ||
554 : | | opndToByteMask (SOME 0wxffffffff) = SOME 0x0 | ||
555 : | | opndToByteMask _ = NONE | ||
556 : | |||
557 : | val opndToMask = opndToByteMask o opndToWord | ||
558 : | in | ||
559 : | case (opndToMask e1, opndToMask e2) of | ||
560 : | (SOME mask, _) => | ||
561 : | orderedArith(I.OPERATE, I.ZAP, e2, T.LI mask, T.LR) | ||
562 : | | (_, SOME mask) => | ||
563 : | orderedArith(I.OPERATE, I.ZAP, e1, T.LI mask, T.LR) | ||
564 : | | _ => commOrderedArith(I.OPERATE, I.AND, e1, e2, T.LR) | ||
565 : | end | ||
566 : | | T.ORB(e1, e2) => commOrderedArith(I.OPERATE, I.BIS, e1, e2, T.LR) | ||
567 : | | T.XORB(e1, e2) => commOrderedArith(I.OPERATE, I.XOR, e1, e2, T.LR) | ||
568 : | | T.SLL(e1, e2, ord) => | ||
569 : | (case opndToWord e2 of | ||
570 : | SOME 0w1 => let val r = T.REG (regAction e1) | ||
571 : | in orderedArith(I.OPERATE, I.ADDL, r, r, T.LR) end | ||
572 : | | SOME 0w2 => orderedArith(I.OPERATE, I.S4ADDL, e1, T.LI 0, T.LR) | ||
573 : | | SOME 0w3 => orderedArith(I.OPERATE, I.S8ADDL, e1, T.LI 0, T.LR) | ||
574 : | | _ => (orderedArith(I.OPERATE, I.SLL, e1, e2, T.LR); | ||
575 : | emit(I.OPERATE{oper=I.SGNXL, ra=rd, rb=zeroOp, rc=rd}); | ||
576 : | rd) | ||
577 : | (*esac*)) | ||
578 : | | T.SRA(e1, e2, ord) => let | ||
579 : | val (reg, opnd) = (regAction e1, opndAction e2) | ||
580 : | in | ||
581 : | (* sign extend longword argument *) | ||
582 : | emit(I.OPERATE{oper=I.SGNXL, ra=reg, rb=zeroOp, rc=reg}); | ||
583 : | emit(I.OPERATE{oper=I.SRA, ra=reg, rb=opnd, rc=rd}); | ||
584 : | rd | ||
585 : | end | ||
586 : | | T.SRL(e1, e2, ord) => let | ||
587 : | val (reg, opnd) = (regAction e1, opndAction e2) | ||
588 : | in | ||
589 : | emit(I.OPERATE{oper=I.ZAP, ra=reg, rb=I.IMMop 0xf0, rc=reg}); | ||
590 : | emit(I.OPERATE{oper=I.SRL, ra=reg, rb=opnd, rc=rd}); | ||
591 : | rd | ||
592 : | end | ||
593 : | | T.DIVT arg => let | ||
594 : | val (reg, opnd) = arithOperands arg | ||
595 : | in | ||
596 : | app emit (PseudoInstrs.divl({ra=reg, rb=opnd, rc=rd}, reduceOpnd)); | ||
597 : | rd | ||
598 : | end | ||
599 : | | T.DIVU arg => let | ||
600 : | val (reg, opnd) = arithOperands arg | ||
601 : | in | ||
602 : | app emit (PseudoInstrs.divlu({ra=reg, rb=opnd, rc=rd}, reduceOpnd)); | ||
603 : | rd | ||
604 : | end | ||
605 : | | T.LOAD32(exp, region) => let | ||
606 : | val (b, d) = eaAction exp | ||
607 : | in emit(I.LOAD{ldOp=I.LDL, r=rd, b=b, d=d, mem=region}); rd | ||
608 : | end | ||
609 : | (* Load and sign-extend a byte from a non-aligned address *) | ||
610 : | | T.LOAD8(exp, region) => let | ||
611 : | monnier | 106 | val tmpR0 = C.newReg() |
612 : | val tmpR1 = C.newReg() | ||
613 : | monnier | 16 | val (rt, disp) = eaAction exp |
614 : | in | ||
615 : | emit(I.LOAD{ldOp=I.LDQ_U, r=tmpR0, b=rt, d=disp, mem=mem}); | ||
616 : | emit(I.LDA{r=tmpR1, b=rt, d=disp}); | ||
617 : | emit(I.OPERATE{oper=I.EXTBL, ra=tmpR0, rb=I.REGop tmpR1, rc=rd}); | ||
618 : | rd | ||
619 : | end | ||
620 : | | T.SEQ(e1, e2) => (stmAction e1; regAction e2) | ||
621 : | | _ => error "regAction" | ||
622 : | end (* regActionRd *) | ||
623 : | |||
624 : | and eaAction exp = let | ||
625 : | fun makeEA(r, n) = | ||
626 : | if ~32768 <= n andalso n <= 32767 then (r, I.IMMop n) | ||
627 : | else let | ||
628 : | monnier | 106 | val tmpR = C.newReg() |
629 : | monnier | 16 | val low = wtoi(Word.andb(itow n, 0w65535))(* unsigned low 16 bits *) |
630 : | val high = n div 65536 | ||
631 : | val (lowsgn, highsgn) = (* Sign-extend *) | ||
632 : | if low <= 32767 then (low, high) else (low -65536, high+1) | ||
633 : | in | ||
634 : | (emit(I.LDAH{r=tmpR, b=r, d=I.IMMop highsgn}); | ||
635 : | (tmpR, I.IMMop lowsgn)) | ||
636 : | end | ||
637 : | in | ||
638 : | case exp | ||
639 : | of T.ADD(exp, T.LI n) => makeEA(regAction exp, n) | ||
640 : | | T.ADD(T.LI n, exp) => makeEA(regAction exp, n) | ||
641 : | | T.ADD(T.CONST c, exp) => (regAction exp, I.CONSTop(c)) | ||
642 : | | T.ADD(exp, T.CONST c) => (regAction exp, I.CONSTop(c)) | ||
643 : | | T.SUB(exp, T.LI n, _) => makeEA(regAction exp, ~n) | ||
644 : | | exp => makeEA(regAction exp, 0) | ||
645 : | end (* eaAction *) | ||
646 : | |||
647 : | and fregAction (T.FREG f) = f | ||
648 : | monnier | 106 | | fregAction exp = fregActionFd(exp, C.newFreg()) |
649 : | monnier | 16 | |
650 : | and fregActionFd(exp, fd) = let | ||
651 : | (* macho comment goes here *) | ||
652 : | fun doFloatArith(farith, e1, e2, fd, order) = let | ||
653 : | val (f1, f2) = orderedFArith(e1, e2, order) | ||
654 : | in | ||
655 : | emit(I.DEFFREG fd); | ||
656 : | emit(I.FOPERATEV{oper=farith, fa=f1, fb=f2, fc=fd}); | ||
657 : | emit(I.TRAPB); | ||
658 : | fd | ||
659 : | end | ||
660 : | in | ||
661 : | case exp | ||
662 : | of T.FREG f => f | ||
663 : | | T.FADDD(e1, e2) => doFloatArith(I.ADDT, e1, e2, fd, T.LR) | ||
664 : | | T.FMULD(e1, e2) => doFloatArith(I.MULT, e1, e2, fd, T.LR) | ||
665 : | | T.FSUBD(e1, e2, ord) => doFloatArith(I.SUBT, e1, e2, fd, ord) | ||
666 : | | T.FDIVD(e1, e2, ord) => doFloatArith(I.DIVT, e1, e2, fd, ord) | ||
667 : | | T.FABSD exp => | ||
668 : | (emit(I.FOPERATE{oper=I.CPYS, fa=zeroR, fb=fregAction exp, fc=fd}); fd) | ||
669 : | | T.FNEGD exp => let | ||
670 : | val fs = fregAction exp | ||
671 : | in | ||
672 : | emit(I.FOPERATE{oper=I.CPYSN, fa=fs, fb=fs, fc=fd}); fd | ||
673 : | end | ||
674 : | | T.CVTI2D exp => let | ||
675 : | val opnd = opndAction exp | ||
676 : | in | ||
677 : | app emit (PseudoInstrs.cvti2d({opnd=opnd, fd=fd},reduceOpnd)); | ||
678 : | fd | ||
679 : | end | ||
680 : | | T.LOADD(exp, region) => let | ||
681 : | val (b, d) = eaAction exp | ||
682 : | in emit(I.FLOAD{ldOp=I.LDT, r=fd, b=b, d=d, mem=region}); fd | ||
683 : | end | ||
684 : | | T.FSEQ(e1, e2) => (stmAction e1; fregAction e2) | ||
685 : | end | ||
686 : | |||
687 : | fun mltreeComp mltree = let | ||
688 : | (* condition code registers are mapped onto general registers *) | ||
689 : | fun cc (T.CCR(T.CC cc)) = T.GPR(T.REG cc) | ||
690 : | | cc x = x | ||
691 : | |||
692 : | fun mltc(T.PSEUDO_OP pOp) = F.pseudoOp pOp | ||
693 : | | mltc(T.DEFINELABEL lab) = F.defineLabel lab | ||
694 : | | mltc(T.ENTRYLABEL lab) = F.entryLabel lab | ||
695 : | | mltc(T.ORDERED mlts) = F.ordered mlts | ||
696 : | | mltc(T.BEGINCLUSTER) = F.beginCluster() | ||
697 : | | mltc(T.CODE stms) = app stmAction stms | ||
698 : | | mltc(T.ENDCLUSTER regmap)= F.endCluster regmap | ||
699 : | | mltc(T.ESCAPEBLOCK regs) = F.exitBlock (map cc regs) | ||
700 : | in mltc mltree | ||
701 : | end | ||
702 : | |||
703 : | val mlriscComp = stmAction | ||
704 : | end | ||
705 : | |||
706 : | |||
707 : | (* | ||
708 : | monnier | 113 | * $Log$ |
709 : | monnier | 16 | *) |
root@smlnj-gforge.cs.uchicago.edu | ViewVC Help |
Powered by ViewVC 1.0.0 |