Home My Page Projects Code Snippets Project Openings SML/NJ
Summary Activity Forums Tracker Lists Tasks Docs Surveys News SCM Files

SCM Repository

[smlnj] Annotation of /sml/trunk/src/MLRISC/Doc/mlrisc-ir.html
ViewVC logotype

Annotation of /sml/trunk/src/MLRISC/Doc/mlrisc-ir.html

Parent Directory Parent Directory | Revision Log Revision Log


Revision 410 - (view) (download) (as text)
Original Path: sml/branches/SMLNJ/src/MLRISC/Doc/mlrisc-ir.html

1 : monnier 409 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
2 :     <HTML>
3 :     <HEAD>
4 :     <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
5 :     <META NAME="GENERATOR" CONTENT="Mozilla/4.07 [en] (X11; I; Linux 2.2.7 i686) [Netscape]">
6 :     </HEAD>
7 :     <BODY bgcolor="#FFFFFF">
8 :    
9 :     <CENTER>
10 :     <H1>
11 :     <FONT COLOR="#aa0000">The MLRISC IR</FONT></H1></CENTER>
12 :     <h2> Introduction </h2>
13 :    
14 :     In this section we will describe the MLRISC intermediate representation.
15 :    
16 :     <h3>Control Flow Graph</h3>
17 :    
18 :     The control flow graph is the main view of the IR.
19 :     A control flow graph satisfies the following signature:
20 :     <pre>
21 :     signature <a href="../IR/mlrisc-cfg.sig" target=code>CONTROL_FLOW_GRAPH</a> = sig
22 :     structure I : INSTRUCTIONS
23 :     structure B : BLOCK_NAMES
24 :     structure P : PSEUDO_OPS
25 :     structure C : CELLS
26 :     structure W : FIXED_POINT
27 :     sharing I.C = C
28 :    
29 :     {\it definition of type} weight
30 :     {\it definition of type} block_kind
31 :     {\it definition of type} data
32 :     {\it definition of type} block
33 :     {\it definition of type} edge_kind
34 :     {\it definition of type} edge_info
35 :     {\it definition of type} cfg
36 :     {\it operations on a cfg}
37 :     end
38 :     </pre>
39 :    
40 :     The following structures nested within a CFG:
41 :     <ul>
42 :     <li> <tt>I : INSTRUCTIONS</tt> is the instruction structure.
43 :     <li> <tt>B : BLOCK_NAMES</tt> is the block name structure
44 :     <li> <tt>P : PSEUDO_OPS</tt> is the structure with the definition
45 :     of pseudo ops.
46 :     <li> <tt>C : CELLS</tt> is the cells structure describing the
47 :     register conventions of the architecture.
48 :     <li> <tt>W : FIXED_POINT</tt> is a structure that contains
49 :     a fixed point type used in execution frequency annotations.
50 :     </ul>
51 :    
52 :     The type <tt>weight</tt> below is used in execution frequency annotations:
53 :     <pre>
54 :     type weight = W.fixed_point
55 :     </pre>
56 :    
57 :     There are a few different kinds of basic blocks, described
58 :     by the type <tt>block_kind</tt> below:
59 :     <pre>
60 :     datatype block_kind =
61 :     START <em></em>
62 :     | STOP <em></em>
63 :     | FUNCTION_ENTRY <em></em>
64 :     | NORMAL <em></em>
65 :     | HYPERBLOCK <em></em>
66 :     </pre>
67 :    
68 :     A basic block is defined as the datatype <tt>block</tt>, defined below:
69 :     <pre>
70 :     and data = LABEL of Label.label
71 :     | PSEUDO of P.pseudo_op
72 :    
73 :     and block =
74 :     BLOCK of
75 :     { id : int, <em></em>
76 :     kind : block_kind, {\it(* block kind *)}
77 :     name : B.name, {\it(* block name *)}
78 :     freq : weight ref, {\it(* execution frequency *) }
79 :     data : data list ref, {\it(* data preceding block *) }
80 :     labels : Label.label list ref, {\it(* labels on blocks *) }
81 :     insns : I.instruction list ref, {\it(* in rev order *)}
82 :     annotations : Annotations.annotations ref {\it(* annotations *)}
83 :     }
84 :     </pre>
85 :    
86 :     Edges in a CFG are annotated with the type <tt>edge_info</tt>,
87 :     defined below:
88 :     <pre>
89 :     and edge_kind = ENTRY <em></em>
90 :     | EXIT <em></em>
91 :     | JUMP <em></em>
92 :     | FALLSTHRU <em></em>
93 :     | BRANCH of bool <em></em>
94 :     | SWITCH of int <em></em>
95 :     | SIDEEXIT of int <em></em>
96 :    
97 :     and edge_info =
98 :     EDGE of { k : edge_kind, {\it(* edge kind *)}
99 :     w : weight ref, {\it(* edge freq *)}
100 :     a : Annotations.annotations ref {\it(* annotations *)}
101 :     }
102 :     </pre>
103 :    
104 :     Type <tt>cfg</tt> below defines a control flow graph:
105 :     <pre>
106 :     type edge = edge_info edge
107 :     type node = block node
108 :    
109 :     datatype info =
110 :     INFO of { regmap : C.regmap,
111 :     annotations : Annotations.annotations ref,
112 :     firstBlock : int ref,
113 :     reorder : bool ref
114 :     }
115 :     type cfg = (block,edge_info,info) graph
116 :     </pre>
117 :    
118 :     \paragraph{Low-level Interface}
119 :     The following subsection describes the low-level interface to a CFG.
120 :     These functions should be used with care since they do not
121 :     always maintain high-level structural invariants imposed on
122 :     the representation. In general, higher level interfaces exist
123 :     so knowledge of this interface is usually not necessary for customizing
124 :     MLRISC.
125 :    
126 :     Various kinds of annotations on basic blocks are defined below:
127 :     <pre>
128 :     exception LIVEOUT of C.cellset <em></em>
129 :     exception CHANGED of unit -> unit
130 :     exception CHANGEDONCE of unit -> unit
131 :     </pre>
132 :     The annotation <tt>LIVEOUT</tt> is used record live-out information
133 :     on an escaping block.
134 :     The annotations <tt>CHANGED</tt> and <tt>CHANGEDONCE</tt> are used
135 :     internally for maintaining views on a CFG. These should not be used
136 :     directly.
137 :    
138 :     The following are low-level functions for building new basic blocks.
139 :     The functions <tt>new</tt><em>XXX</em> build empty basic blocks of a specific
140 :     type. The function <tt>defineLabel</tt> returns a label to a basic block;
141 :     and if one does not exist then a new label will be generated automatically.
142 :     The functions <tt>emit</tt> and <tt>show_block</tt> are low-level
143 :     routines for displaying a basic block.
144 :     <pre>
145 :     val newBlock : int * B.name -> block <em></em>
146 :     val newStart : int -> block <em></em>
147 :     val newStop : int -> block <em></em>
148 :     val newFunctionEntry : int -> block <em></em>
149 :     val copyBlock : int * block -> block <em></em>
150 :     val defineLabel : block -> Label.label <em></em>
151 :     val emit : C.regmap -> block -> unit <em></em>
152 :     val show_block : C.regmap -> block -> string
153 :     </pre>
154 :    
155 :     Methods for building a CFG are listed as follows:
156 :     <pre>
157 :     val cfg : info -> cfg
158 :     val new : C.regmap -> cfg
159 :     val subgraph : cfg -> cfg
160 :     val init : cfg -> unit
161 :     val changed : cfg -> unit
162 :     val removeEdge : cfg -> edge -> unit
163 :     </pre>
164 :     Again, these methods should be used only with care.
165 :    
166 :     The following functions allow the user to extract low-level information
167 :     from a flowgraph. Function <tt>regmap</tt> returns the current register map.
168 :     Function <tt>regmap</tt> returns a function that lookups the current register
169 :     map. Function <tt>liveOut</tt> returns liveOut information from a block;
170 :     it returns the empty cellset if the block is not an escaping block.
171 :     Function <tt>fallsThruFrom</tt> takes a node id <em>v</em> and locates the
172 :     block <em>u</em> (if any) that flows into <em>v</em> without going through a branch
173 :     instruction. Similarly, the function <tt>fallsThruTo</tt> takes
174 :     a node id <em>u</em> and locates the block (if any) that <em>u</em> flows into
175 :     with going through a branch instruction. If <em>u</em> falls through to
176 :     <em>v</em> in any feasible code layout <em>u</em> must preceed <em>v</em>.
177 :     <pre>
178 :     val regmap : cfg -> C.regmap
179 :     val reglookup : cfg -> C.register -> C.register
180 :     val liveOut : block -> C.cellset
181 :     val fallsThruFrom : cfg * node_id -> node_id option
182 :     val fallsThruTo : cfg * node_id -> node_id option
183 :     </pre>
184 :    
185 :     To support graph viewing of a CFG, the following low-level
186 :     primitives are provided:
187 :     <pre>
188 :     val viewStyle : cfg -> (block,edge_info,info) GraphLayout.style
189 :     val viewLayout : cfg -> GraphLayout.layout
190 :     val headerText : block -> string
191 :     val footerText : block -> string
192 :     val subgraphLayout : { cfg : cfg, subgraph : cfg } -> GraphLayout.layout
193 :     </pre>
194 :    
195 :     Finally, a miscellany function for control dependence graph building.
196 :     <pre>
197 :     val cdgEdge : edge_info -> bool <em></em>
198 :     </pre>
199 :    
200 :     <h3>IR</h3>
201 :     The MLRISC intermediate representation is a composite
202 :     view of various compiler data structures, including the control
203 :     flow graph, (post-)dominator trees, control dependence graph, and
204 :     loop nesting tree. Basic compiler optimizations in MLRISC
205 :     operate on this data structure; advance optimizations
206 :     operate on more complex representations which use this
207 :     representation as the base layer.
208 :     \begin{wrapfigure}{r}{4.5in}
209 :     \begin{Boxit}
210 :     \psfig{figure=Figures/mlrisc-ir.eps,width=4.5in}
211 :     \end{Boxit}
212 :     \caption{The MLRISC IR}
213 :     \end{wrapfigure}
214 :    
215 :     This IR provides a few additional functionalities:
216 :     <ul>
217 :     <li> {\bf Edge frequencies} -- execution frequencies
218 :     are maintained on all control flow edges.
219 :     <li> {\bf Extensible annotations} -- semantics information can be
220 :     represented as annotations on the graph.
221 :     <li> {\bf Multiple facets} --
222 :     Facets are high-level views that automatically keep themselves
223 :     up-to-date. Computed facets are cached and out-of-date facets
224 :     are recomputed by demand.
225 :     The IR defines a mechanism to attach multiple facets to the IR.
226 :     </ul>
227 :    
228 :     The signature of the IR is listed in Figure~\ref{fig:mlrisc-ir}.
229 :     \begin{Figure}
230 :     <pre>
231 :     signature <a href="../IR/mlrisc-ir.sig" target=code>MLRISC_IR</a> = sig
232 :     structure I : INSTRUCTIONS
233 :     structure CFG : CONTROL_FLOW_GRAPH
234 :     structure Dom : DOMINATOR_TREE
235 :     structure CDG : CONTROL_DEPENDENCE_GRAPH
236 :     structure Loop : LOOP_STRUCTURE
237 :     structure Util : CFG_UTIL
238 :     sharing Util.CFG = CFG
239 :     sharing CFG.I = I
240 :     sharing Loop.Dom = CDG.Dom = Dom
241 :    
242 :     type cfg = CFG.cfg
243 :     type IR = CFG.cfg
244 :     type dom = (CFG.block,CFG.edge_info,CFG.info) Dom.dominator_tree
245 :     type pdom = (CFG.block,CFG.edge_info,CFG.info) Dom.postdominator_tree
246 :     type cdg = (CFG.block,CFG.edge_info,CFG.info) CDG.cdg
247 :     type loop = (CFG.block,CFG.edge_info,CFG.info) Loop.loop_structure
248 :    
249 :     val dom : IR -> dom
250 :     val pdom : IR -> pdom
251 :     val cdg : IR -> cdg
252 :     val loop : IR -> loop
253 :    
254 :     val changed : IR -> unit
255 :     val memo : (IR -> 'facet) -> IR -> 'facet
256 :     val addLayout : string -> (IR -> GraphLayout.layout) -> unit
257 :     val view : string -> IR -> unit
258 :     val views : string list -> IR -> unit
259 :     val viewSubgraph : IR -> cfg -> unit
260 :     end
261 :     </pre>
262 :     \caption{\label{fig:mlrisc-ir}MLRISC IR}
263 :     \end{Figure}
264 :    
265 :     The following facets are predefined: dominator, post-dominator tree,
266 :     control dependence graph and loop nesting structure.
267 :     The functions <tt>dom</tt>, <tt>pdom</tt>, <tt>cdg</tt>, <tt>loop</tt>
268 :     are <font color="#ff0000">facet extraction</font> methods that
269 :     compute up-to-date views of these facets.
270 :    
271 :     The following protocol is used for facets:
272 :     <ul>
273 :     <li> {\bf When the IR is changed},
274 :     the function <tt>changed</tt> should be called to
275 :     signal that all facets attached to the IR should be updated.
276 :     <li> {\bf To add a new facet} of type <tt>F</tt> that is computed by demand,
277 :     the programmer has to provide a facet construction
278 :     function <tt>f : IR -> F</tt>. Call the function <tt>mem</tt>
279 :     to register the new facet. For example, let <tt>val g = memo f</tt>.
280 :     Then the function <tt>g</tt> can be used to as a new facet extraction
281 :     function for facet <tt>F</tt>.
282 :     <li> {\bf To register a graph viewing function}, call
283 :     the function <tt>addLayout</tt> and provide an appropriate
284 :     graph layout function. For example, we can say
285 :     <tt>addLayout "F" layoutF</tt> to register a graph layout function
286 :     for a facet called ``F''.
287 :     </ul>
288 :    
289 :     To view an IR, the functions <tt>view</tt>, <tt>views</tt> or
290 :     <tt>viewSubgraph</tt> can be used. They have the following interpretation:
291 :     <ul>
292 :     <li> <tt>view</tt> computes a layout for one facet of the IR and displays
293 :     it. The predefined facets are called
294 :     ``dom'', ``pdom'', ``cdg'', ``loop.'' The IR can be
295 :     viewed as the facet ``cfg.'' In addition, there is a layout
296 :     named ``doms'' which displays the dominator tree and the post-dominator
297 :     tree together, with the post-dominator inverted.
298 :     <li> <tt>views</tt> computes a set of facets and displays it together
299 :     in one single picture.
300 :     <li> <tt>viewSubgraph</tt> layouts a subgraph of the IR.
301 :     This creates a picture with the subgraph highlighted and embedded
302 :     in the whole IR.
303 :     </ul>
304 :    
305 :     <h3>Building a CFG</h3>
306 :    
307 :     There are two basic methods of building a CFG:
308 :     <ul>
309 :     <li> convert a sequence of machine instructions
310 :     into a CFG through the emitter interface, described below, and
311 :     <li> convert it from a <font color="#ff0000">cluster</font>, which is
312 :     the basic linearized representation used in the \MLRISC{} system.
313 :     </ul>
314 :     The first method requires you to perform instruction selection
315 :     from a compiler front-end, but allows you to bypass all other
316 :     \MLRISC{} phases if desired. The second method allows you
317 :     to take advantage of various \MLRISC's instruction selection modules
318 :     currently available. We describe these methods in this section.
319 :    
320 :     \paragraph{Directly from Instructions}
321 :     Signature <tt>CODE_EMITTER</tt> below describes an abstract emitter interface
322 :     for accepting a linear stream of instructions from a source
323 :     and perform a sequence of actions based on this
324 :     stream\footnote{Unlike the signature {\tt EMITTER\_NEW} or
325 :     {\tt FLOWGRAPH\_GEN}, it has the advantage that it is not
326 :     tied into any form of specific flowgraph representation.}.
327 :    
328 :     <pre>
329 :     signature <a href="../extensions/code-emitter.sig" target=code>CODE_EMITTER</a> = sig
330 :     structure I : INSTRUCTIONS
331 :     structure C : CELLS
332 :     structure P : PSEUDO_OPS
333 :     structure B : BLOCK_NAMES
334 :     sharing I.C = C
335 :    
336 :     type emitter =
337 :     { defineLabel : Label.label -> unit,
338 :     entryLabel : Label.label -> unit,
339 :     exitBlock : C.cellset -> unit,
340 :     pseudoOp : P.pseudo\_op -> unit,
341 :     emitInstr : I.instruction -> unit,
342 :     blockName : B.name -> unit,
343 :     comment : string -> unit,
344 :     init : int -> unit,
345 :     finish : unit -> unit
346 :     }
347 :     end
348 :     </pre>
349 :    
350 :     The code emitter interface has the following informal protocol.
351 :     \begin{methods}
352 :     init(<em>n</em>) & Initializes the emitter and signals that
353 :     the back-end should
354 :     allocate space for <em>n</em> bytes of machine code.
355 :     The number is ignored for non-machine code back-ends. \\
356 :     defineLabel(<em>l</em>) & Defines a new label <em>l</em> at the current position.\\
357 :     entryLabel(<em>l</em>) & Defines a new entry label <em>l</em> at the current position.
358 :     An entry label defines an entry point into the current flow graph.
359 :     Note that multiple entry points are allowed\\
360 :     exitBlock(<em>c</em>) & Defines an exit at the current position.
361 :     The cellset <em>c</em> represents the live-out information \\
362 :     pseudOp(<em>p</em>) & Emits an pseudo op <em>p</em> at the current position \\
363 :     emitInstr(<em>i</em>) & Emits an instruction <em>i</em> at the current position \\
364 :     blockName(<em>b</em>) & Changes the block name to <em>b</em> \\
365 :     comment(<em>msg</em>) & Emits a comment <em>msg</em> at the current position \\
366 :     finish & Signals that the use of the emitter is finished.
367 :     The emitter is free to perform its post-processing functions.
368 :     When this is finished the CFG is built.
369 :     \end{methods}
370 :    
371 :     The functor <tt>ControlFlowGraphGenFn</tt> below can be
372 :     used to create a CFG builder that uses the <tt>CODE_EMITTER</tt> interface.
373 :     <pre>
374 :     signature <a href="../IR/mlrisc-cfg-gen.sig" target=code>CONTROL_FLOW_GRAPH_GEN</a> = sig
375 :     structure CFG : CONTROL_FLOW_GRAPH
376 :     structure Emitter : CODE_EMITTER
377 :     sharing Emitter.I = CFG.I
378 :     sharing Emitter.P = CFG.P
379 :     val emitter : CFG.cfg -> Emitter.emitter
380 :     end
381 :     functor <a href="../IR/mlrisc-cfg-gen.sml" target=code>ControlFlowGraphGenFn</a>
382 :     (structure CFG : CONTROL_FLOW_GRAPH
383 :     structure Emitter : CODE_EMITTER
384 :     structure P : INSN_PROPERTIES
385 :     sharing CFG.I = Emitter.I = P.I
386 :     sharing CFG.P = Emitter.P
387 :     sharing CFG.B = Emitter.B
388 :     ) : CONTROL_FLOW_GRAPH_GEN
389 :     </pre>
390 :    
391 :     \paragraph{Cluster to CFG}
392 :    
393 :     The core \MLRISC{} system implements many instruction selection
394 :     front-ends. The result of an instruction selection module is a linear
395 :     code layout block called a cluster. The functor <tt>Cluster2CFGFn</tt> below
396 :     generates a translator that translates a cluster into a CFG:
397 :     <pre>
398 :     signature <a href="../IR/mlrisc-cluster2cfg.sig" target=code>CLUSTER2CFG</a> = sig
399 :     structure CFG : CONTROL_FLOW_GRAPH
400 :     structure F : FLOWGRAPH
401 :     sharing CFG.I = F.I
402 :     sharing CFG.P = F.P
403 :     sharing CFG.B = F.B
404 :     val cluster2cfg : F.cluster -> CFG.cfg
405 :     end
406 :     functor <a href="../IR/mlrisc-cluster2cfg.sml" target=code>Cluster2CFGFn</a>
407 :     (structure CFG : CONTROL_FLOW_GRAPH
408 :     structure F : FLOWGRAPH
409 :     structure P : INSN_PROPERTIES
410 :     sharing CFG.I = F.I = P.I
411 :     sharing CFG.P = F.P
412 :     sharing CFG.B = F.B
413 :     ) : CLUSTER2CFG
414 :     </pre>
415 :    
416 :     \paragraph{CFG to Cluster}
417 :    
418 :     The basic \MLRISC{} system also implements many back-end functions
419 :     such as register allocation, assembly output and machine code output.
420 :     These modules all utilize the cluster representation. The
421 :     functor <a href="../IR/mlrisc-cfg2cluster.sml" target=code>CFG2ClusterFn</a>
422 :     below generates a translator
423 :     that converts a CFG into a cluster. With the previous functor,
424 :     the CFG and the cluster presentation can be freely inter-converted.
425 :     <pre>
426 :     signature <a href="../IR/mlrisc-cfg2cluster.sig" target=code>CFG2CLUSTER</a> = sig
427 :     structure CFG : CONTROL_FLOW_GRAPH
428 :     structure F : FLOWGRAPH
429 :     sharing CFG.I = F.I
430 :     sharing CFG.P = F.P
431 :     sharing CFG.B = F.B
432 :     val cfg2cluster : { cfg : CFG.cfg, relayout : bool } -> F.cluster
433 :     end
434 :     functor <a href="../IR/mlrisc-cfg2cluster.sml" target=code>CFG2ClusterFn</a>
435 :     (structure CFG : CONTROL_FLOW_GRAPH
436 :     structure F : FLOWGRAPH
437 :     sharing CFG.I = F.I
438 :     sharing CFG.P = F.P
439 :     sharing CFG.B = F.B
440 :     val patchBranch : {instr:CFG.I.instruction, backwards:bool} ->
441 :     CFG.I.instruction list
442 :     ) : CFG2CLUSTER
443 :     </pre>
444 :    
445 :     When a CFG originates from a cluster, we try to preserve
446 :     the same code layout through out all optimizations when possible.
447 :     The function <tt>cfg2cluster</tt> takes an optional flag
448 :     that specifies we should force the recomputation of
449 :     the code layout of a control flow graph when translating a CFG
450 :     back into a cluster.
451 :    
452 :     <h3>Basic CFG Transformations</h3>
453 :    
454 :     Basic CFG transformations are implemented in the functor
455 :     <tt>CFGUtilFn</tt>. These transformations include splitting edges, merging
456 :     edges, removing unreachable code and tail duplication.
457 :     <pre>
458 :     functor <a href="../IR/mlrisc-cfg-util.sml" target=code>CFGUtilFn</a>
459 :     (structure CFG : CONTROL_FLOW_GRAPH
460 :     structure P : INSN_PROPERTIES
461 :     sharing P.I = CFG.I
462 :     ) : CFG_UTIL
463 :     </pre>
464 :    
465 :     The signature of <tt>CFGUtilFn</tt> is defined below:
466 :     <pre>
467 :     signature <a href="../IR/mlrisc-cfg-util.sig" target=code>CFG_UTIL</a> = sig
468 :     structure CFG : CONTROL_FLOW_GRAPH
469 :     val updateJumpLabel : CFG.cfg -> node_id -> unit
470 :     val mergeEdge : CFG.cfg -> CFG.edge -> bool
471 :     val eliminateJump : CFG.cfg -> node_id -> bool
472 :     val insertJump : CFG.cfg -> node_id -> bool
473 :     val splitEdge : CFG.cfg -> { edge : CFG.edge, jump : bool }
474 :     -> { edge : CFG.edge, node : CFG.node }
475 :     val isMerge : CFG.cfg -> node_id -> bool
476 :     val isSplit : CFG.cfg -> node_id -> bool
477 :     val hasSideExits : CFG.cfg -> node_id -> bool
478 :     val isCriticalEdge : CFG.cfg -> CFG.edge -> bool
479 :     val splitAllCriticalEdges : CFG.cfg -> unit
480 :     val ceed : CFG.cfg -> node_id * node_id -> bool
481 :     val tailDuplicate : CFG.cfg -> { subgraph : CFG.cfg, root : node_id }
482 :     -> { nodes : CFG.node list,
483 :     edges : CFG.edge list }
484 :     val removeUnreachableCode : CFG.cfg -> unit
485 :     val mergeAllEdges : CFG.cfg -> unit
486 :     end
487 :     </pre>
488 :    
489 :     These functions perform the following functions:
490 :     <ul>
491 :     <li> <tt>updateJumpLabel</tt> <em>G u</em>. This function
492 :     updates the label of the branch instruction in a block <em>u</em>
493 :     to be consistent with the control flow edges with source <em>u</em>.
494 :     This is an nop if the CFG is already consistent.
495 :     <li> <tt>mergeEdge</tt> <em>G e</em>. This function merges edge <em>e \equiv u \edge{} v</em>
496 :     in the graph <em>G</em> if possible. This is successful only if
497 :     there are no other edges flowing into <em>v</em> and no other edges
498 :     flowing out from <em>u</em>. It returns true if the merge
499 :     operation is successful. If successful, the nodes <em>u</em> and <em>v</em>
500 :     will be coalesced into the block <em>u</em>. The jump instruction (if any)
501 :     in the node <em>u</em> will also be elided.
502 :     <li> <tt>eliminateJump</tt> <em>G u</em>. This function eliminate the
503 :     jump instruction at the end of block <em>u</em> if it is feasible.
504 :     <li> <tt>insertJump</tt> <em>G u</em>. This function inserts a jump
505 :     instruction in block <em>u</em> if it is feasible.
506 :     <li> <tt>splitEdge</tt> <em>G e</em>. This function
507 :     split the control flow edge <em>e</em>, and return a new edge <em>e'</em> and the
508 :     new block <em>u</em> as return values. It addition, it takes as
509 :     argument a flag <tt>jump</tt>. If this flag is true,
510 :     then a jump instruction is always placed in the
511 :     split; otherwise, we try to eliminate the jump when feasible.
512 :     <li> <tt>isMerge</tt> <em>G u</em>. This function tests whether block <em>u</em>
513 :     is a <font color="#ff0000">merge</font> node. A merge node is a node that
514 :     has two or more incoming flow edges.
515 :     <li> <tt>isSplit</tt> <em>G u</em>. This function tests whether block <em>u</em>
516 :     is a <font color="#ff0000">split</font> node. A split node is a node that
517 :     has two or more outgoing flow edges.
518 :     <li> <tt>hasSideExits</tt> <em>G u</em>. This function tests whether
519 :     a block has side exits <em>G</em>. This assumes that <em>u</em>
520 :     is a hyperblock.
521 :     <li> <tt>isCriticalEdge</tt> <em>G e</em>. This function tests whether
522 :     the edge <em>e</em> is a <font color="#ff0000">critical</font> edge. The
523 :     edge <em>e \equiv u \edge{} v</em> is critical iff
524 :     there are <em>u</em> is merge node and <em>v</em> is a split node.
525 :     <li> <tt>splitAllCriticalEdges</tt> <em>G</em>. This function goes
526 :     through the CFG <em>G</em> and splits
527 :     all critical edges in the CFG.
528 :     This can introduce extra jumps and basic blocks in the program.
529 :     <li> <tt>mustPreceed</tt> <em>G (u,v)</em>. This function
530 :     checks whether two blocks <em>u</em> and <em>v</em> are necessarily adjacent.
531 :     Blocks <em>u</em> and <em>v</em> must be adjacent iff <em>u</em> must preceed <em>v</em>
532 :     in any feasible code layout.
533 :     <li> <tt>tailDuplicate</tt>.
534 :     <pre>
535 :     val tailDuplicate : CFG.cfg -> { subgraph : CFG.cfg, root : node_id }
536 :     -> { nodes : CFG.node list,
537 :     edges : CFG.edge list }
538 :     </pre>
539 :     \begin{Figure}
540 :     %\begin{tabular}{|c|} \hline
541 :     \begin{boxit}
542 :     \cpsfig{figure=Figures/tail-duplication.eps,width=3in}
543 :     \end{boxit}
544 :     %\hline
545 :     %\end{tabular}
546 :     \caption{\label{fig:tail-duplication} Tail-duplication}
547 :     \end{Figure}
548 :    
549 :     This function tail-duplicates the region <tt>subgraph</tt>
550 :     until it only has a single entry <tt>root</tt>.
551 :     Return the set of new nodes and new edges.
552 :     The region is represented as a subgraph view of the CFG.
553 :     Figure~\ref{fig:tail-duplication} illustrates
554 :     this transformation.
555 :    
556 :     <li> <tt>removeUnreachableCode</tt> <em>G</em>. This function
557 :     removes all unreachable code from the graph.
558 :     <li> <tt>mergeAllEdges</tt> <em>G</em>. This function tries to merge all
559 :     the edges in the flowgraph <em>G</em>. Merging is performed in the
560 :     non-increasing order of edge frequencies.
561 :     </ul>
562 :    
563 :     <h3>Dataflow Analysis</h3>
564 :     MLRISC provides a simple customizable module for performing
565 :     iterative dataflow analysis. A dataflow analyzer
566 :     has the following signature:
567 :    
568 :     <pre>
569 :     signature <a href="../IR/dataflow.sig" target=code>DATAFLOW_ANALYZER</a> = sig
570 :     structure CFG : CONTROL_FLOW_GRAPH
571 :     type dataflow_info
572 :     val analyze : CFG.cfg * dataflow_info -> dataflow_info
573 :     end
574 :     </pre>
575 :    
576 :     A dataflow problem is described by the signature <tt>DATAFLOW_PROBLEM</tt>,
577 :     described below:
578 :     <pre>
579 :     signature <a href="../IR/dataflow.sig" target=code>DATAFLOW_PROBLEM</a> = sig
580 :     structure CFG : CONTROL_FLOW_GRAPH
581 :     type domain
582 :     type dataflow_info
583 :     val forward : bool
584 :     val bot : domain
585 :     val == : domain * domain -> bool
586 :     val join : domain list -> domain
587 :     val prologue : CFG.cfg * dataflow_info ->
588 :     CFG.block node ->
589 :     { input : domain,
590 :     output : domain,
591 :     transfer : domain -> domain
592 :     }
593 :     val epilogue : CFG.cfg * dataflow_info ->
594 :     { node : CFG.block node,
595 :     input : domain,
596 :     output : domain
597 :     } -> unit
598 :     end
599 :     </pre>
600 :     This description contains the following items
601 :     <ul>
602 :     <li> <tt>type domain</tt> is the abstract lattice domain <em>D</em>.
603 :     <li> <tt>type dataflow_info</tt> is where the dataflow information
604 :     is stored.
605 :     <li> <tt>forward</tt> is true iff the dataflow problem is in the
606 :     forward direction
607 :     <li> <tt>bot</tt> is the bottom element of <em>D</em>.
608 :     <li> <tt>==</tt> is the equality function on <em>D</em>.
609 :     <li> <tt>join</tt> is the least-upper-bound function on <em>D</em>.
610 :     <li> <tt>prologue</tt> is a user-supplied function that performs
611 :     pre-processing and setup. For each CFG node <em>X</em>, this function
612 :     computes
613 :     <ul>
614 :     <li> <tt>input</tt> -- which is the initial input value of <em>X</em>
615 :     <li> <tt>output</tt> -- which is the initial output value of <em>X</em>
616 :     <li> <tt>transfer</tt> -- which is the transfer function on <em>X</em>.
617 :     </ul>
618 :     <li> <tt>epilogue</tt> is a function that performs post-processing.
619 :     It visits each node <em>X</em> in the flowgraph and return the resulting
620 :     <tt>input</tt> and <tt>output</tt> value for <em>X</em>.
621 :     </ul>
622 :    
623 :     To generate a new dataflow analyzer from a dataflow problem,
624 :     the functor <tt>DataflowFn</tt> can be used:
625 :     <pre>
626 :     functor <a href="../IR/dataflow.sml" target=code>DataflowFn</a>(P : DATAFLOW_PROBLEM) : DATAFLOW_ANALYZER =
627 :     </pre>
628 :    
629 :     <h3>Static Branch Prediction</h3>
630 :    
631 :     <h3>Branch Optimizations </h3>
632 :    
633 :    
634 :    
635 :     <HR>
636 :     <FONT SIZE="-2">
637 :     <ADDRESS>
638 :     <A HREF="mailto:leunga@cs.nyu.edu">Allen Leung</A></ADDRESS>
639 :     <BR>
640 :    
641 :     </BODY>
642 :     </HTML>

root@smlnj-gforge.cs.uchicago.edu
ViewVC Help
Powered by ViewVC 1.0.0