Home My Page Projects Code Snippets Project Openings SML/NJ
 Summary Activity Forums Tracker Lists Tasks Docs Surveys News SCM Files

# SCM Repository

[smlnj] Annotation of /sml/trunk/src/MLRISC/Doc/graphs.html
 [smlnj] / sml / trunk / src / MLRISC / Doc / graphs.html

# Annotation of /sml/trunk/src/MLRISC/Doc/graphs.html

 1 : monnier 409 2 : 3 : 4 : 5 : 6 : 7 : 8 : 9 :
10 :

11 : The Graph Library

12 : 13 :

Overview

14 : 15 : Graphs are the most fundamental data structure in the MLRISC system. 16 : MLRISC now contains an extensive library for working with graphs. 17 :

18 : All graphs in MLRISC 19 : are modeled as edge- and node-labeled directed multi-graphs. 20 : Briefly, this means that nodes and edges can carry user supplied data, and 21 : multiple directed edges can be attached between any two nodes. 22 : Self-loops are also allowed. 23 : 24 :

25 : A node is uniquely identified by its >node_id, which is 26 : simply an integer. Node ids can be assigned externally 27 : by the user, or else generated automatically by a graph. All graphs 28 : keep track of all node ids that are currently used, 29 : and the method new_id : unit -> node_id generates a new unused id. 30 : 31 :

32 : A node is modeled as a node id and node label pair, (i,l). 33 : An edge is modeled as a triple i -> j, which contains 34 : the source and target node ids i and j, 35 : and the edge label l. These types are defined as follows: 36 :

37 :                  type 'n node = node_id * 'n
38 :                  type 'e edge = node_id * node_id * 'e
39 :
40 : 41 :

The graph signature

42 : 43 : All graphs are accessed through an abstract interface 44 : of the polymorphic type ('n,'e,'g) graph. 45 : Here, 'n is the type of the node labels, 'e is the type 46 : of the edge labels, and 'g is the type of any extra information 47 : embedded in a graph. We call the latter graph info. 48 : 49 : Formally, a graph G is a quadruple (V,L,E,I) 50 : where V is a set of node ids, L : V \to \alpha is a node labeling 51 : function from vertices to node labels, E is a multi-set 52 : of labeled-edges of type V \times V \times \beta, and I: \gamma 53 : is the graph info. 54 : 55 : The interface of a graph is packaged into a 56 : record of methods that manipulate the base representation: 57 :
58 :                  signature GRAPH = sig
59 :                  type node_id = int
60 :                  type 'n node = node_id * 'n
61 :                  type 'e edge = node_id * node_id * 'e
62 :
63 :                  exception Graph of string
64 :                  exception Subgraph
65 :                  exception NotFound
66 :                  exception Unimplemented
68 :
69 :                  datatype ('n,'e,'g) graph = GRAPH of ('n,'e,'g) graph_methods
70 :                  withtype ('n,'e,'g) graph_methods =
71 :                  {  name            : string,
72 :                  graph_info      : 'g,
73 :                  (* selectors *)
74 :                  (* mutators *)
75 :                  (* iterators *)
76 :                  }
77 :                  end
78 :
79 : 80 : A few exceptions are predefined in this signature, which have 81 : the following informal interpretation. 82 : Exception Graph is raised when a bug is encountered. 83 : The exception Subgraph is raised if certain semantics constraints 84 : imposed on a graph are violated. 85 : The exception NotFound is raised if lookup of a node id fails. 86 : The exception Unimplemented is raised if a certain feature 87 : is accessed but is undefined on the graph. The exception 88 : Readonly is raised if the graph is readonly and an update operation 89 : is attempted. 90 : 91 :

Selectors

92 : 93 : Methods that access the structure of a graph are listed below: 94 :
95 :                  nodes : unit -> \alpha node list &
96 :                  Return a list of all nodes in a graph \\
97 :                  edges : unit -> \beta edge list &
98 :                  Return a list of all edges in a graph \\
99 :                  order : unit -> int &
100 :                 Return the number of nodes in a graph.  The graph is empty
101 :                 if its order is zero \\
102 :                 size : unit -> int &
103 :                 Return the number of edges in a graph \\
104 :                 capacity : unit -> int &
105 :                 Return the maximum node id in the graph, plus 1.
106 :                 This can be used as a new id  \\
107 :                 succ : node_id -> node_id list &
108 :                 Given a node id i, return the node ids of all its successors,
109 :                 i.e. { j | i \edge{l} j \in E}. \\
110 :                 pred : node_id -> node_id list &
111 :                 Given a node id j, return the node ids of all its predecessors,
112 :                 i.e. { i | i \edge{l} j \in E}. \\
113 :                 out_edges : node_id -> \beta edge list &
114 :                 Given a node id i, return all the out-going edges from node i,
115 :                 i.e. all edges whose source is i. \\
116 :                 in_edges : node_id -> \beta edge list &
117 :                 Given a node id j, return all the in-coming edges from node j,
118 :                 i.e. all edges whose target is j. \\
119 :                 has_edge : node_id * node_id -> bool &
120 :                 Given two node ids i and j, find out if an edge
121 :                 with source i and target j exists. \\
122 :                 has_node : node_id -> bool &
123 :                 Given a node id i, find out if a node of id i exists. \\
124 :                 node_info : node_id -> \alpha &
125 :                 Given a node id, return its node label.  If the node does not
126 :                 exist, raise exception NotFound. \\
127 :
128 : 129 :

Graph hierarchy

130 : 131 : A graph G may in fact be a subgraph of a base graph G', or 132 : obtained from G' via some transformation T. 133 : In such cases the following methods may be used to determine of the 134 : relationship between G and G'. 135 : An entry edge is an edge 136 : in G' that terminates at a node in G, but is not an edge in G. 137 : Similarly, an exit edge is an edge in G' that originates from 138 : a node in G, but is not an edge in G. An entry node 139 : is a node in G that has an incoming entry edge. An \define{exit 140 : node} is a node in G that has an out-going exit edge. If G is not 141 : a subgraph, all these methods will return nil. 142 : \begin{methods} 143 : entries : unit -> node_id list & 144 : Return the node ids of all the entry nodes. \\ 145 : exits : unit -> node_id list & 146 : Return the node ids of all the exit nodes. \\ 147 : entry_edges : node_id -> \beta edge list & 148 : Given a node id i, return all the entry edges whose sources are 149 : i. \\ 150 : exit_edges : node_id -> \beta edge list & 151 : Given a node id i, return all the exit edges whose targets are i. 152 : \end{methods} 153 : 154 : \pagebreak 155 :

Mutators

156 : 157 : Methods to update a graph are listed below: 158 : \begin{methods} 159 : new_id : unit -> node_id & 160 : Return a unique node id guaranteed to be 161 : absent in the current graph. \\ 162 : add_node : \alpha node -> unit & 163 : Insert node into the graph. If a node of the same node id 164 : already exists, replace the old node with the new. \\ 165 : add_edge : \beta edge -> unit & 166 : Insert an edge into the graph. \\ 167 : remove_node : node_id -> unit & 168 : Given a node id n, remove the node with the node id from the graph. 169 : This also automatically removes all edges with source or target n. \\ 170 : set_out_edges : node_id * \beta edge list -> unit & 171 : Given a node id n, and a list of edges e_1,\ldots,e_n 172 : with sources n, replace all out-edges of n by e_1,\ldots,e_n. \\ 173 : set_in_edges : node_id * \beta edge list -> unit & 174 : Given a node id n, and a list of edges e_1,\ldots,e_n 175 : with targets n, replace all in-edges of n by e_1,\ldots,e_n. \\ 176 : set_entries : node_id list -> unit & 177 : Set the entry nodes of a graph. \\ 178 : set_exits : node_id list -> unit & 179 : Set the exit nodes of a graph. \\ 180 : garbage_collect : unit -> unit & 181 : Reclaim all node ids of nodes that have been removed by 182 : remove_node}. Subsequent \sml{new_id will reuse these 183 : node ids. \\ 184 : \end{methods} 185 : 186 :

Iterators

187 : 188 : Two primitive iterators are supported in the graph interface. 189 : Method forall_nodes iterates over all the nodes in a graph, 190 : while method forall_edges iterates over all the edges. 191 : Other more complex iterators can be found in other modules. 192 : \begin{methods} 193 : forall_nodes : (\alpha node -> unit) -> unit & 194 : Given a function f on nodes, apply f on all the nodes in the graph. \\ 195 : forall_edges : (\beta edge -> unit) -> unit & 196 : Given a function f on edges, apply f on all the edges in the graph. 197 : \end{methods} 198 : 199 :

Manipulating a graph

200 : 201 : Since operations on the graph type are packaged into 202 : a record, an object oriented'' style of graph manipulation should be used. 203 : For example, if G is a graph object, then we can obtain all the 204 : nodes and edges of G as follows. 205 :
206 :                 val GRAPH g = G
207 :                 val edges = #edges g ()
208 :                 val nodes = #nodes g ()
209 :
210 : We can view #edges g} as sending the message to \sml{G. 211 : While all this seems like mere syntactic deviation from the usual 212 : signature/structure approach, there are two crucial differences which 213 : we will exploit: 214 : {\em (i)} records are first class objects 215 : while structures are not (consequently 216 : late binding of methods'' and cannot be easily simulated on the 217 : structure level); {\em (ii)} recursion 218 : is possible on the type level, while recursive structures are not available. 219 : The extra flexibility of this choice becomes apparent with the 220 : introduction of views later. 221 : 222 :

Creating a Graph

223 : 224 : A graph implementation has the following signature 225 :
226 :                 signature GRAPH_IMPLEMENTATION = sig
227 :                 val graph : string * 'g * int -> ('n,'e,'g) graph
228 :                 end
229 :
230 : The function graph takes a string (the name of the graph), 231 : graph info, and a default size as arguments and create an empty graph. 232 : 233 : The functor DirectedGraphFn: 234 :
235 :                 functor DirectedGraphFn(A : ARRAY_SIG) : GRAPH_IMPLEMENTATION
236 :
237 : implements a graph using adjacency lists as internal representation. 238 : It takes an array type as a parameter. For graphs with 239 : node ids that are dense enumerations, the DynamicArray structure 240 : should be used as the parameter to this functor. 241 : The structure DirectedGraph is predefined as follows: 242 :
243 :                 structure DirectedGraph = DirectedGraphFn(DynamicArray)
244 :
245 : 246 : For node ids that are sparse enumerations, the structure HashArray, 247 : which implements integer-keyed hash tables 248 : with the signature of arrays, can be used 249 : as argument to DirectedGraphFn. 250 : For graphs with fixed sizes determined at creation time, 251 : the structure Array can be used (see also 252 : functor /tt UndoableArrayFn, 253 : which creates arrays with undoable updates, and transaction-like semantics.) 254 : 255 :

Basic Graph Algorithms

256 : 257 :

Depth-/Breath-First Search

258 : 259 :
260 :                 val dfs : ('n,'e,'g) graph  ->
261 :                 (node_id -> unit) ->
262 :                 ('e edge -> unit) ->
263 :                 node_id list -> unit
264 :
265 : The function dfs takes as arguments a graph, 266 : a function f : node_id -> unit, a function 267 : g : 'e edge -> unit, and a 268 : set of source vertices. It performs depth first search on the 269 : graph. The function f is invoked 270 : whenever a new node is being visited, while the function g 271 : is invoked whenever a new edge is being traversed. 272 : This algorithm has running time O(|V|+|E|). 273 : 274 :
275 :                 val dfsfold : ('n,'e,'g) graph  ->
276 :                 (node_id * 'a -> 'a) ->
277 :                 ('e edge * 'b -> 'a) ->
278 :                 node_id list -> 'a * 'b -> 'a * 'b
279 :                 val dfsnum :  ('n,'e,'g) graph  ->
280 :                 (node_id * 'a -> 'a) ->
281 :                 { dfsnum : int array, compnum : int array }
282 :
283 : 284 : The function bfs} is similar to \sml{dfs 285 : except that breath first search is performed. 286 :
287 :                 val bfs : ('n,'e,'g) graph  ->
288 :                 (node_id -> unit) ->
289 :                 ('e edge -> unit) ->
290 :                 node_id list -> unit
291 :                 val bfsdist : ('n,'e,'g) graph -> node_id list -> int array
292 :
293 : 294 :

Preorder/Postorder numbering

295 :
296 :                 val preorder_numbering  : ('n,'e,'g) graph -> int -> int array
297 :                 val postorder_numbering : ('n,'e,'g) graph -> int -> int array
298 :
299 : Both these functions take a tree T and a root v, and return 300 : the preorder numbering and the postorder numbering of the tree respectively. 301 : 302 :

Topological Sort

303 :
304 :                 val topsort : ('n,'e,'g) graph -> node_id list -> node_id list
305 :
306 : The function 307 : topsort takes a graph G and a set of source vertices S 308 : as arguments. It returns a topological sort of all the nodes reachable from 309 : the set S. 310 : This algorithm has running time O(|S|+|V|+|E|). 311 : 312 :

Strongly Connected Components

313 :
314 :                 val strong_components : ('n,'e,'g) graph ->
315 :                 (node_id list * 'a -> 'a) -> 'a -> 'a
316 :
317 : The function strong_components takes a graph G and 318 : an aggregate function f with type 319 :
320 :                 node_id list * 'a -> 'a
321 :
322 : \noindent and an identity element x : 'a as arguments. 323 : Function f is invoked with a strongly connected component 324 : (represented as a list of node ids) as each is discovered. 325 : That is, the function strong_components computes 326 : $f(SCC_n,f(SCC_{n-1},\ldots f(SCC_1,x)))$ 327 : where SCC_1,\ldots,SCC_n are the strongly connected components 328 : in topological order. This algorithm has running time O(|V|+|E|). 329 : 330 :

Biconnected Components

331 :
332 :                 val biconnected_components : ('n,'e,'g) graph ->
333 :                 ('e edge list * 'a -> 'a) -> 'a -> 'a
334 :
335 : The function biconnected_components takes a graph G and 336 : an aggregate function f with type 337 :
338 :                 'e edge list * 'a -> 'a
339 :
340 : \noindent and an identity element x : 'a as arguments. 341 : Function f is invoked with a biconnected component 342 : (represented as a list of edges) as each is discovered. 343 : That is, the function biconnected_components computes 344 : $f(BCC_n,f(BCC_{n-1},\ldots f(BCC_1,x)))$ 345 : where BCC_1,\ldots,BCC_n are the biconnected components. 346 : This algorithm has running time O(|V|+|E|). 347 : 348 :

Cyclic Test

349 :
350 :                 val is_cyclic : ('n,'e,'g) graph -> bool
351 :
352 : Function is_cyclic tests if a graph is cyclic. 353 : This algorithm has running time O(|V|+|E|). 354 : 355 :

Enumerate Simple Cycles

356 :
357 :                 val cycles : ('n,'e,'g) graph -> ('e edge list * 'a -> 'a) -> 'a ->'a
358 :
359 : A simple cycle is a circuit that visits each vertex only once. 360 : The function cycles enumerates all simple cycles in a graph G. 361 : It takes as argument an aggregate function f of type 362 :
363 :                 'e edge list * 'a -> 'a
364 :
365 : and an identity element e, and computes as result the expression 366 : $f(c_n,f(c_{n-1},f(c_{n-2},\ldots f(c_1,e))))$ 367 : where c_1,\ldots,c_n are all the simple cycles in the graph. 368 : All cycles c_1,\ldots,c_n are guaranteed to be distinct. 369 : A cycle is represented as a sequence of 370 : adjacent edges, i.e. in the order of 371 : $v_1 \to v_2, v_2 \to v_3, v_3 \to v_4, \ldots, v_{n-1} \to v_n, 372 : v_n \to v_1$ 373 : Our implementation works by first decomposing the graph into 374 : its strongly connected components, then uses backtracking to enumerate 375 : simple cycles in each component. 376 :

Minimal Cost Spanning Tree

377 :
378 :                 signature MIN_COST_SPANNING_TREE = sig
379 :                 exception Unconnected
380 :
381 :                 val spanning_tree : { weight    : 'e edge -> 'a,
382 :                 <         : 'a * 'a -> bool
383 :                 } -> ('n, 'e, 'g) graph
384 :                 -> ('e edge * 'a -> 'a) -> 'a -> 'a
385 :                 end
386 :                 structure Kruskal : MIN_COST_SPANNING_TREE
387 :
388 : 389 : Structure Kruskal implements Kruskal's algorithm for 390 : computing a minimal cost spanning tree of a graph. 391 : The function spanning_tree takes as arguments: 392 :
393 :
• a weight function which when given an edge returns its weight 394 :
• an ordering function <, which is used to compare the weights 395 :
• a graph G 396 :
• an accumulator function f, and 397 :
• an identity element x 398 :
399 : The function spanning_tree computes 400 : $f(e_{n},f(e_{n-1},\ldots f(e_1,x)))$ 401 : where e_1,\ldots,e_n are the edges in a minimal cost spanning tree 402 : of the graph. 403 : The exception Unconnected is raised if the graph is unconnected. 404 : 405 :

Abelian Groups

406 : Graph algorithms that deal with numeric weights or distances 407 : are parameterized with respect to the signatures 408 : ABELIAN_GROUP} or \sml{ABELIAN_GROUP_WITH_INF. 409 : These are defined as follows: 410 :
411 :                 signature ABELIAN_GROUP = sig
412 :                 type elem
413 :                 val +    : elem * elem -> elem
414 :                 val -    : elem * elem -> elem
415 :                 val      : elem -> elem
416 :                 val zero : elem
417 :                 val <    : elem * elem -> bool
418 :                 val ==   : elem * elem -> bool
419 :                 end
420 :                 signature ABELIAN_GROUP_WITH_INF = sig
421 :                 include ABELIAN_GROUP
422 :                 val inf : elem
423 :                 end
424 :
425 : Signature ABELIAN_GROUP specifies an ordered commutative group, 426 : while signature ABELIAN_GROUP_WITH_INF specifies an ordered commutative 427 : group with an infinity element inf which satisfies 428 : {\em(i)} -\mbox{inf}} \le x \le \mbox{inf} for all x, and {\em (ii) 429 : x + \mbox{inf}} = \mbox{inf for all x. 430 : 431 :

Single Source Shortest Paths

432 :
433 :                 signature SINGLE_SOURCE_SHORTEST_PATHS = sig
434 :                 structure Num : ABELIAN_GROUP_WITH_INF
435 :                 val single_source_shortest_paths :
436 :                 { graph : ('n,'e,'g) graph,
437 :                 weight : 'e edge -> Num.elem,
438 :                 s : node_id
439 :                 } ->
440 :                 { dist : Num.elem array,
441 :                 pred :  node_id array
442 :                 }
443 :                 end
444 :                 functor DijkstraFn(Num : ABELIAN_GROUP_WITH_INF)
445 :                 : SINGLE_SOURCE_SHORTEST_PATHS
446 :
447 : The functor DijkstraFn implements Dijkstra's algorithm 448 : for single source shortest paths. The function \linebreak 449 : single_source_shortest_paths takes as arguments: 450 :
451 :
• a graph G, 452 :
• a weight function on edges, and 453 :
• the source vertex s. 454 :
455 : It returns two arrays dist and pred 456 : indexed by vertices. These arrays have the following 457 : interpretation. Given a vertex v, 458 :
459 :
• dist[v] contains the distance of v from the source s 460 :
• pred[v] contains the predecessor of v in the shortest 461 : path from s to v, or -1 if v=s. 462 :
463 : 464 : Dijkstra's algorithm fails to work on graphs that have 465 : negative edge weights. 466 : To handle negative weights, Bellman-Ford's algorithm can be used. 467 : The exception NegativeCycle is raised if a cycle of 468 : negative total weight is detected. 469 :
470 :                 functor BellmanFordFn(Num : ABELIAN_GROUP_WITH_INF) : sig
471 :                 include SINGLE_SOURCE_SHORTEST_PATHS
472 :                 exception NegativeCycle
473 :                 end
474 :
475 :

All Pairs Shortest Paths

476 :
477 :                 signature ALL_PAIRS_SHORTEST_PATHS = sig
478 :                 structure Num : ABELIAN_GROUP_WITH_INF
479 :                 val all_pairs_shortest_paths :
480 :                 { graph : ('n,'e,'g) graph,
481 :                 weight : 'e edge -> Num.elem
482 :                 } ->
483 :                 { dist : Num.elem Array2.array,
484 :                 pred :  node_id Array2.array
485 :                 }
486 :                 end
487 :                 functor FloydWarshallFn(Num : ABELIAN_GROUP_WITH_INF)
488 :                 : ALL_PAIRS_SHORTEST_PATHS
489 :
490 : The functor FloydWarshallFn implements Floyd-Warshall's algorithm 491 : for all pairs shortest paths. The function 492 : all_pairs_shortest_paths takes as arguments: 493 :
494 :
• G, and 495 :
• a weight function on edges 496 :
497 : It returns two 2-dimensional arrays dist} and \sml{pred 498 : indexed by vertices (u,v). These arrays have the following 499 : interpretation. Given a pair (u,v), 500 :
501 :
• dist[u,v] contains the distance from u to v. 502 : pred[u,v] contains the predecessor of v in the shortest 503 : path from u to v, or -1 if u=v. 504 :
505 : This algorithm runs in time O(|V|^3+|E|). 506 : 507 : An alternative implementation is available that uses Johnson's algorithm, 508 : which works better for sparse graphs: 509 :
510 :                 functor JohnsonFn(Num : ABELIAN_GROUP_WITH_INF)
511 :                 : sig include ALL_PAIRS_SHORTEST_PATHS
512 :                 exception Negative Cycle
513 :                 end
514 :
515 : 516 :

Transitive Closure

517 :
518 :                 signature TRANSITIVE_CLOSURE = sig
519 :                 val acyclic_transitive_closure : {  + : ('e * 'e -> 'e), simple : bool }
520 :                 -> ('n,'e,'g) graph -> unit
521 :                 val acyclic_transitive_closure2 :
522 :                 {  + : 'e * 'e -> 'e,
523 :                 max : 'e * 'e -> 'e
524 :                 }  -> ('n,'e,'g) graph -> unit
525 :                 val transitive_closure : ('e * 'e -> 'e) -> ('n,'e,'g) graph -> unit
526 :                 structure TransitiveClosure : TRANSITIVE_CLOSURE
527 :
528 : Structure TransitiveClosure implements 529 : in-place transitive closures on graphs. Three functions are implemented. 530 : Functions acyclic_transitive_closure and 531 : acyclic_transitive_closure2 can be used 532 : to compute the transitive closure of an acyclic graph, whereas the 533 : function transitive_closure computes the transitive closure of 534 : a cyclic graph. All take a binary function 535 :
536 :                 + : 'e * 'e -> 'e
537 :
538 : defined on edge labels. 539 : Transitive edges are inserted in the following manner: 540 : 541 :
542 :
• acyclic_transitive_closure: 543 : given u \edge{l} v and v \edge{l'} w, 544 : if the flag simple is false or if 545 : the transitive edge u \edge{} w does not exists, 546 : then u \edge{l + l'} w is added to the graph. 547 :
• acyclic_transitive_closure2: 548 : given u \edge{l} v and v \edge{l'} w, 549 : the transitive u \edge{l + l'} w is added to the graph. 550 : Furthermore, all parallel edges 551 : $u \edge{l_1} w, \ldots, u \edge{l_n} w$ 552 : are coalesced into a single edge u \edge{l} w, where 553 : l = {/tt max}_{i = 1\ldots n} l_i 554 :
555 : 556 :

Max Flow

557 : 558 : The function max_flow computes the 559 : maximum flow between the source vertex s and the sink vertex 560 : t} in the graph when given a \sml{capacity function. 561 :
562 :                 signature MAX_FLOW = sig
563 :                 structure Num : ABELIAN_GROUP
564 :                 val max_flow : { graph    : ('n,'e,'g) graph,
565 :                 s        : node_id,
566 :                 t        : node_id,
567 :                 capacity : 'e edge -> Num.elem,
568 :                 flows    : 'e edge * Num.elem -> unit
569 :                 } -> Num.elem
570 :                 end
571 :                 functor MaxFlowFn(Num : ABELIAN_GROUP) : MAX_FLOW
572 :
573 : The function max_flow returns its result in the follow manner: 574 : The function returns the total flow as its result value. 575 : Furthermore, the function flows is called once for each edge e in the 576 : graph with its associated flow f_e. 577 : 578 : This algorithm uses Goldberg's preflow-push approach, and runs 579 : in O(|V|^2|E|) time. 580 :

Min Cut

581 : The function min_cut computes the 582 : minimum (undirected) cut in a graph} when given a \sml{weight function on 583 : its edges. 584 :
585 :                 signature MIN_CUT = sig
586 :                 structure Num : ABELIAN_GROUP
587 :                 val min_cut : { graph    : ('n,'e,'g) graph,
588 :                 weight : 'e edge -> Num.elem
589 :                 } -> node_id list * Num.elem
590 :                 end
591 :                 functor MinCutFn(Num : ABELIAN_GROUP) : MIN_CUT
592 :
593 : The function min_cut returns a list of node ids denoting 594 : one side of the cut C (the other side of the cut is V - C) and 595 : the weight cut. 596 : 597 :

Max Cardinality Matching

598 : 599 :
600 :                 val matching : ('n,'e,'g) graph -> ('e edge * 'a -> 'a) -> 'a -> 'a * int
601 :
602 : 603 : The function BipartiteMatching.matching computes the 604 : maximal cardinality matching of a bipartite graph. As result, 605 : the function iterates over all the matched edges and returns the 606 : number of matched edges. The algorithm runs in time O(|V||E|). 607 : 608 :

Node Partition

609 :
610 :                 signature NODE_PARTITION = sig
611 :                 type 'n node_partition
612 :
613 :                 val node_partition : ('n,'e,'g) graph -> 'n node_partition
614 :                 val !!    : 'n node_partition -> node_id -> 'n node
615 :                 val ==    : 'n node_partition -> node_id * node_id -> bool
616 :                 val union : 'n node_partition -> ('n node * 'n node -> 'n node) ->
617 :                 node_id * node_id -> bool
618 :                 val union': 'n node_partition -> node_id * node_id -> bool
619 :
620 :                 end
621 :
622 : 623 :

Node Priority Queue

624 :
625 :                 signature NODE_PRIORITY_QUEUE = sig
626 :                 type node_priority_queue
627 :
628 :                 exception EmptyPriorityQueue
629 :
630 :                 val create         : (node_id * node_id -> bool) -> node_priority_queue
631 :                 val fromGraph      : (node_id * node_id -> bool) ->
632 :                 ('n,'e,'g) graph -> node_priority_queue
633 :                 val isEmpty        : node_priority_queue -> bool
634 :                 val clear          : node_priority_queue -> unit
635 :                 val min            : node_priority_queue -> node_id
636 :                 val deleteMin      : node_priority_queue -> node_id
637 :                 val decreaseWeight : node_priority_queue * node_id -> unit
638 :                 val insert         : node_priority_queue * node_id -> unit
639 :                 val toList         : node_priority_queue -> node_id list
640 :                 end
641 :
642 : 643 :

Views}\label{sec:views

644 : Simply put, a view is an alternative presentation 645 : of a data structure to a client. A graph, such as the control flow 646 : graph, frequently has to be presented in different ways in a compiler. 647 : For example, when global scheduling is applied on a region 648 : (a subgraph of the CFG), 649 : we want to be able to concentrate on just the region and ignore all 650 : nodes and edges that are not part of the current focus. 651 : All transformations that are applied on the current region view should be 652 : automatically reflected back to the entire CFG as a whole. 653 : Furthermore, we want to be able to freely intermix 654 : graphs and subgraphs of the same type in our program, without having 655 : to introducing sums in our type representations. 656 : 657 : The subgraph_view} view combinator accomplishes this. \sml{Subgraph 658 : takes a list of nodes and produces a graph object which is a view of the 659 : node induced subgraph of the original graph. 660 : All modification to the subgraph are automatically 661 : reflected back to the original graph. From the client point of view, 662 : a graph and a subgraph are entirely indistinguishable, and furthermore, 663 : graphs and subgraphs can be freely mixed together (they are the same 664 : type from ML's point of view.) 665 : 666 : This transparency is obtained by selective method overriding, composition, 667 : and delegation. For example, a generic graph object provides the 668 : following methods for setting and looking up the entries and exits 669 : from a graph. 670 :
671 :                 set_entries  : node_id list -> unit
672 :                 set_exits    : node_id list -> unit
673 :                 entries      : unit -> node_id list
674 :                 exits        : unit -> node_id list
675 :
676 : 677 : For example, a CFG usually has a single entry and a single exit. 678 : These methods allow the client to destinate one node as the 679 : entry and another as 680 : the exit. In the case of subgraph view, these methods are overridden so 681 : that the proper conventions are preserved: 682 : a node in a subgraph is an entry (exit) iff there is an in-edge (out-edge) 683 : from (to) outside the (sub-)graph. 684 : Similarly, the methods entry_edges} and \sml{exit_edges can be used 685 : return the entry and exit edges associated with a node in a subgraph. 686 :
687 :                 entry_edges  : node_id -> 'e edge list
688 :                 exit_edges   : node_id -> 'e edge list
689 :
690 : These methods are initially defined to return [] in a graph and 691 : subsequently overridden in a subgraph. 692 : 693 :

Update Transparency

694 : 695 : Suppose a view G' is created from some base graphs or views. 696 : Update transparency refers to the fact that G' behaves 697 : consistently according to its conventions and semantics when updates 698 : are performed. There are 4 different type of update transparencies: 699 :
700 :
• update opaque A update opaque view disallows updates to both 701 : itself and its base graphs. 702 :
• globally update transparent A globally update transparent 703 : view allows updates to its base graphs but not to itself. Changes 704 : will then be automatically reflected in the view. 705 :
• locally update transparent A locally update transparent 706 : view allows updates to itself but not to its base graphs. 707 : Changes will be automatically reflected to the base graphs. 708 :
• fully update transparent A fully update transparent 709 : view allows updates through its methods or through its base 710 : graphs'. 711 :
712 : 713 :

Structural Views}\label{sec:structural-views

714 : 715 :

Reversal

716 :
717 :                 ReversedGraphView.rev_view : ('n,'e,'g) graph -> ('n,'e,'g) graph
718 :
719 : This combinator takes a graph G and produces a view G^R 720 : which reverses the direction 721 : of all its edges, including entry and exit edges. Thus 722 : the edge i \edge{l} j in G becomes the edge 723 : j \edge{l} i in G^R. This view is fully update transparent. 724 : 725 :

726 :
728 :
729 : This function takes a graph G and produces a view G' 730 : in which no mutator methods can be used. Invoking a mutator 731 : method raises the exception Readonly. 732 : This view is globally update transparent. 733 : 734 :

Snapshot

735 :
736 :                 functor GraphSnapShotFn(GI : GRAPH_IMPLEMENTATION) : GRAPH_SNAPSHOT
737 :                 signature GRAPH_SNAPSHOT = sig
738 :                 val snapshot : ('n,'e,'g) graph ->
739 :                 { picture : ('n,'e,'g) graph, button : unit -> unit }
740 :                 end
741 :
742 : 743 : The function snapshot can be used to keep a cached copy 744 : of a view a.k.a the picture. This cached copy 745 : can be updated locally but the modification will not be reflected back 746 : to the base graph. The function button can be used to 747 : keep the view and the base graph up-to-date. 748 : 749 :

Map

750 :
751 :                 IsomorphicGraphView.map :
752 :                 ('n node -> $$\alpha'$$) -> ('e edge -> $$\beta'$$) -> ('g -> $$\gamma'$$) ->
753 :                 ('n,'e,'g) graph -> ($$\alpha',\beta',\gamma'$$) graph
754 :
755 : The function map} is a generalization of the \sml{map 756 : function on lists. It takes three functions 757 : $$f$$ : 'n node -> $$\alpha'$$}, \sml{$$g$$ : 'e edge -> $$\beta'$$, 758 : $$h$$ : 'g -> $$\gamma$$' and a graph G=(V,L,E,I) as arguments. 759 : It computes the view G'=(V,L',E',I') where 760 : \begin{eqnarray*} 761 : L'(v) & = & f(v,L(v)) \mbox{\ for all v \in V} \\ 762 : E' & = & { i \edge{g(i,j,l)} j | i \edge{l} j \in E } \\ 763 : I' & = & h(I) 764 : \end{eqnarray*} 765 : 766 :

Singleton

767 :
768 :                 SingletonGraphView.singleton_view : ('n,'e,'g) graph -> node_id -> ('n,'e,'g) graph
769 :
770 : Function singleton_view 771 : takes a graph G and a node id v (which must exists in G) 772 : and return an edge-free graph with only one node (v). 773 : This view is opaque. 774 : 775 :

Node id renaming

776 :
777 :                 RenamedGraphView.rename_view : int -> ('n,'e,'g) graph -> ($$\alpha',\beta',\gamma'$$) graph
778 :
779 : The function rename_view takes an integer n and 780 : a graph G and create a fully update transparent 781 : view where all node ids are incremented by n. Formally, 782 : given graph G=(V,E,L,I) it computes the view G'=(V',E',L',I) 783 : where 784 : \begin{eqnarray*} 785 : V' & = & { v + n | v \in V } \\ 786 : E' & = & { i+n \edge{l} j+n | i \edge{l} j \in E } \\ 787 : L' & = & \lambda v. L(v-n) 788 : \end{eqnarray*} 789 : 790 :

Union and Sum

791 :
792 :                 UnionGraphView.union_view : ('g * $$\gamma'$$) -> $$\gamma''$$) ->
793 :                 ('n,'e,'g) graph * ($$\alpha,\beta,\gamma'$$) graph -> ($$\alpha',\beta',\gamma''$$) graph
794 :                 GraphCombinations.unions : ('n,'e,'g) graph list -> ($$\alpha',\beta',\gamma$$) graph
795 :                 GraphCombinations.sum : ('n,'e,'g) graph * ('n,'e,'g) graph -> ($$\alpha',\beta',\gamma$$) graph
796 :                 GraphCombinations.sums : ('n,'e,'g) graph list -> ($$\alpha',\beta',\gamma$$) graph
797 :
798 : 799 : Function union_view takes as arguments 800 : a function f, and two graphs 801 : G=(V,L,E,I) and G'=(V',L',E',I'), it computes the union G+G' of 802 : these graphs. Formally, G \union G'=(V'',L'',E'',I'') where 803 : \begin{eqnarray*} 804 : V'' & = & V \union V' \\ 805 : L'' & = & L \overrides L' \\ 806 : E'' & = & E \union E' \\ 807 : I'' & = & f(I,I') 808 : \end{eqnarray*} 809 : 810 : The function sum} constructs a \define{disjoint sum of two 811 : graphs. 812 :

Simple Graph View

813 :
814 :                 SimpleGraph.simple_graph : (node_id * node_id * 'e list -> 'e) ->
815 :                 ('n,'e,'g) graph -> ('n,'e,'g) graph
816 :
817 : Function simple_graph takes a merge function f 818 : and a multi-graph G as arguments and return a view in which 819 : all parallel multi-edges (edges with the same source and target) are combined 820 : into a single edge: i.e. any collection of multi-edges between 821 : the same source s and target t and with labels l_1,\ldots,l_n, 822 : are replaced by the edge s \edge{l_{st}} t in the view, where 823 : l_{st} = f(s,t,[l_1,\ldots,l_n]). The function f is assumed 824 : to satisfy the equality l = f(s,t,[l]) for all l, s and t. 825 : 826 :

No Entry or No Exit

827 :
828 :                 NoEntryView.no_entry_view : ('n,'e,'g) graph -> ('n,'e,'g) graph
829 :                 NoEntryView.no_exit_view : ('n,'e,'g) graph -> ('n,'e,'g) graph
830 :
831 : 832 : The function no_entry_view creates a view in which 833 : all entry edges (and thus entry nodes) are removed. The function 834 : no_exit_view is the dual of this and creates a view in which 835 : all exit edges are removed. This view is fully update transparent. 836 : It is possible to remove all entry and exit edges by composing these 837 : two functions. 838 : 839 :

Subgraphs

840 :
841 :                 SubgraphView.subgraph_view : node_id list -> ('e edge -> bool) ->
842 :                 ('n,'e,'g) graph -> ($$\alpha',\beta',\gamma'$$) graph
843 :
844 : 845 : The function subgraph_view takes as arguments a set of node ids 846 : S, an edge predicate p and a graph G=(V,L,E,I). It 847 : returns a view in which only the visible nodes are S and 848 : the only visible edges e are those that satisfy p(e) and 849 : with sources and targets in S. S must be a subset of V. 850 : 851 :
852 :                 Subgraph_P_View.subgraph_p_view : node_id list ->
853 :                 (node_id -> bool) -> (node_id * node_id -> bool) ->
854 :                 ('n,'e,'g) graph -> ($$\alpha',\beta',\gamma'$$) graph
855 :
856 : 857 : The function subgraph_view takes as arguments a set of node ids 858 : S, a node predicate p, an edge predicate q and a graph G=(V,L,E,I). It 859 : returns a view in which only the visible nodes v are those 860 : in S satisfying p(v), and 861 : the only visible edges e are those that satisfy q(e) and 862 : with sources and targets in S. S must be a subset of V. 863 : 864 :

Trace

865 :
866 :                 TraceView.trace_view : node_id list -> ('n,'e,'g) graph -> ($$\alpha',\beta',\gamma'$$) graph
867 :
868 : 869 : \begin{wrapfigure}{r}{3in} 870 : \begin{Boxit} 871 : \psfig{figure=trace.eps,width=2.8in} 872 : \end{Boxit} 873 : \caption{\label{fig:trace-view} A trace view} 874 : \end{wrapfigure} 875 : A trace is an acyclic path in a graph. 876 : The function trace_view takes a trace of node ids 877 : v_1,\ldots,v_n and a graph G and 878 : returns a view in which only the nodes are visible. 879 : Only the edges that connected two adjacent nodes on the trace, i.e. 880 : v_i \to v_{i+1} for some i = 1 \ldots n-1 are considered be within 881 : the view. Thus if there is an edge v_i \to v_j in G where 882 : j \ne i+1 this edge is not considered to be within the view --- it 883 : is considered to be an exit edge from v_i and an entry edge 884 : from v_j however. Trace views can be used to construct a CFG region 885 : suitable for trace scheduling \cite{trace-scheduling,bulldog}. 886 : 887 : Figure \ref{fig:trace-view} illustrates this concept graphically. 888 : Here, the trace view is formed from the 889 : nodes A, C, D, F} and \sml{G. The 890 : solid edges linking the trace is visible within the view. All other 891 : dotted edges are considered to be either entry of exit edges into 892 : the trace. The edge from node G} to \sml{A is considered to 893 : be both since it exits from G} and enters into \sml{A. 894 : 895 :

Acyclic Subgraph

896 :
897 :                 AcyclicSubgraphView.acyclic_view :
898 :                 node_id list ->
899 :                 ('n,'e,'g) graph -> ('n,'e,'g) graph
900 :
901 : \begin{wrapfigure}{r}{3in} 902 : \begin{Boxit} 903 : \psfig{figure=subgraph.eps,width=2.8in} 904 : \end{Boxit} 905 : \caption{\label{fig:acyclic-subgraph-view} An acyclic subgraph} 906 : \end{wrapfigure} 907 : The function acyclic_view takes an ordered 908 : list of node ids v_1,\ldots,v_n and a graph G as arguments 909 : and return a view G' such that only the nodes v_1,\ldots,v_n 910 : are visible. In addition, only the edges with directions consistent 911 : with the order list are considered to be within the view. 912 : Thus an edge v_i \to v_j from G is in G' iff 1 \le i < j \le n. 913 : Acyclic views can be used to construct a CFG region suitable 914 : for DAG scheduling. 915 : Figure \ref{fig:acyclic-subgraph-view} illustrates this concept graphically. 916 : 917 :

Start and Stop

918 :
919 :                 StartStopView.start_stop_view :
920 :                 { start : 'n node,
921 :                 stop  : 'n node,
922 :                 edges : 'e edge list
923 :                 } -> ('n,'e,'g) graph -> ($$\alpha',\beta',\gamma'$$) graph
924 :
925 : 926 : The function start_stop_view 927 : 928 :

Single-Entry/Multiple-Exits

929 :
930 :                 SingleEntryMultipleExit.SEME
931 :                 exit : 'n node -> ('n,'e,'g) graph -> ('n,'e,'g) graph
932 :
933 : 934 : The function SEME converts a single-entry/multiple-exits 935 : graph G into a single entry/single exit graph. 936 : It takes an exit node e and a graph G and returns 937 : a view G'. Suppose i \edge{l} j be an exit edge in G. 938 : In view G this edge is replaced by a new normal edge i \edge{l} e 939 : and a new exit edge e \edge{l} j. Thus e becomes the sole exit 940 : node in the new view. 941 : 942 :

Behavioral Views

943 : 944 :

Behavioral Primitives

945 : 946 : Figure \ref{fig:behavioral-view-primitives} lists 947 : the set of behavioral primitives defined 948 : in structure /tt GraphWrappers. 949 : These functions allow the user 950 : to attach an action a to a mutator method m such that whenever m 951 : is invoked so does a. Given a graph G, the combinator 952 :
953 :                 do_before_$$xxx$$ : f -> ('n,'e,'g) graph -> ('n,'e,'g) graph
954 :
955 : \noindent returns a view G' such that whenever method xxx is invoked 956 : in G', the function f is called. 957 : Similarly, the combinator 958 :
959 :                 do_after_$$xxx$$ : f -> ('n,'e,'g) graph -> ('n,'e,'g) graph
960 :
961 : \noindent creates a new view G'' such that the function f 962 : is called after the method is invoked. 963 : \begin{Figure} 964 : \begin{boxit} 965 : \small 966 :
967 :                 do_before_new_id : (unit -> unit) -> ('n,'e,'g) graph -> ('n,'e,'g) graph
968 :                 do_after_new_id : (node_id -> unit) -> ('n,'e,'g) graph -> ('n,'e,'g) graph
969 :                 do_before_add_node : ('n node -> unit) -> ('n,'e,'g) graph -> ('n,'e,'g) graph
970 :                 do_after_add_node : ('n node -> unit) -> ('n,'e,'g) graph -> ('n,'e,'g) graph
971 :                 do_before_add_edge : ('e edge -> unit) -> ('n,'e,'g) graph -> ('n,'e,'g) graph
972 :                 do_after_add_edge : ('e edge -> unit) -> ('n,'e,'g) graph -> ('n,'e,'g) graph
973 :                 do_before_remove_node : (node_id -> unit) -> ('n,'e,'g) graph -> ('n,'e,'g) graph
974 :                 do_after_remove_node : (node_id -> unit) -> ('n,'e,'g) graph -> ('n,'e,'g) graph
975 :                 do_before_set_in_edges : (node_id * 'e edge list -> unit) ->
976 :                 ('n,'e,'g) graph -> ('n,'e,'g) graph
977 :                 do_after_set_in_edges : (node_id * 'e edge list -> unit) ->
978 :                 ('n,'e,'g) graph -> ('n,'e,'g) graph
979 :                 do_before_set_out_edges : (node_id * 'e edge list -> unit) ->
980 :                 ('n,'e,'g) graph -> ('n,'e,'g) graph
981 :                 do_after_set_out_edges : (node_id * 'e edge list -> unit) ->
982 :                 ('n,'e,'g) graph -> ('n,'e,'g) graph
983 :                 do_before_set_entries : (node_id list -> unit) -> ('n,'e,'g) graph -> ('n,'e,'g) graph
984 :                 do_after_set_entries : (node_id list -> unit) -> ('n,'e,'g) graph -> ('n,'e,'g) graph
985 :                 do_before_set_exits : (node_id list -> unit) -> ('n,'e,'g) graph -> ('n,'e,'g) graph
986 :                 do_after_set_exits : (node_id list -> unit) -> ('n,'e,'g) graph -> ('n,'e,'g) graph
987 :
988 : \end{boxit} 989 : \caption{\label{fig:behavioral-view-primitives} Behavioral view primitives} 990 : \end{Figure} 991 : 992 : Frequently it is not necessary to know precisely by which method a graph's 993 : structure has been modified, only that it is. The following two methods 994 : take a notification function f and returns a new view. f is 995 : invoked before a modification is attempted in a view created 996 : by do_before_changed. It is invoked after the modification in 997 : a view created by do_after_changed. 998 :
999 :                 do_before_changed : (('n,'e,'g) graph -> unit) -> ('n,'e,'g) graph -> ('n,'e,'g) graph
1000 :                do_after_changed : (('n,'e,'g) graph -> unit) -> ('n,'e,'g) graph -> ('n,'e,'g) graph
1001 :
1002 : 1003 : Behavioral views created by the above functions are all fully update 1004 : transparent. 1005 :
1006 : 1007 : 1008 :