Finding Dominators Revisited Extended Abstract Loukas Georgiadis1 Abstract Robert E. Tarjan1,2 Tarjan [13] gave an O(m(m, n))-time algorithm, where The problem of finding dominators in a flowgraph arises in (m, n) is an extremely slow-growing functional inverse many kinds of global code optimization and other settings. of the Ackermann function. A complicated linear-time In 1979 Lengauer and Tarjan gave an almost-linear-time algorithm was announced by Harel [11], but the descripalgorithm to find dominators. In 1985 Harel claimed a linear- tion is incomplete and contains some false arguments time algorithm, but this algorithm was incomplete; Alstrup (see [3]). Based on Harel's work, Alstrup et al. [3] gave a et al. [1999] gave a complete and "simpler" linear-time complete linear-time algorithm for the RAM mo del, but algorithm on a random-access machine. In 1998, Buchsbaum using complicated data structures, in particular Fredet al. claimed a "new, simpler" linear-time algorithm with man and Willard's Q-heaps [8]. Buchsbaum et al. [6] implementations both on a random access machine and on a attempted to give the first simple linear-time dominapointer machine. In this paper, we begin by noting that the tors algorithm by applying a new special-case analykey lemma of Buchsbaum et al. does not in fact apply to sis of the path compression technique used in disjoint their algorithm, and their algorithm does not run in linear set union and other problems [16]. We show, however, time. Then we provide a complete, correct, simpler linear- that their analysis do es not apply to their dominators time dominators algorithm. One key result is a linear-time algorithm which means that despite being more comreduction of the dominators problem to a nearest common plicated, the Buchsbaum et al. algorithm is not asympancestors problem, implementable on either a random-access totically faster than the Lengauer-Tarjan algorithm. We give an overview of the Buchsbaum et al. algorithm in machine or a pointer machine. Section 3 together with a discussion of the reason it is not linear. Our new algorithm is based on the Buchs1 Intro duction baum et al. algorithm and can be implemented either We consider a flowgraph G = (V , A, r), which is a on a random-access or on a pointer machine. directed graph with |V | = n vertices and |A| = m arcs such that every vertex is reachable from a distinguished 2 Preliminaries start vertex r V . A vertex w dominates a vertex v if every path from r to v includes w. Our problem 2.1 Partitioning the DFS tree Let D be a depthis to find for each vertex v in V the set dom(v ) which first search (DFS) tree [15] of the control flow graph consists of the vertices that dominate v . The immediate G = (V , A, r). We assume that G is represented by dominator of v , denoted by idom(v ), is the unique adjacency lists. The DFS tree D is rooted at r and vertex w that dominates v and is dominated by all the parent(v ) is the parent of vertex v in D. For simplicity vertices in dom(v ) - {w}. The immediate dominators we refer to the vertices by their DFS numbers, so v < u tree is a directed tree I rooted at r which is formed means that the DFS number of v is less than the DFS by the edges {(idom(w), w) | w V - {r}}. A vertex number of u. As in [7] we partition D into trivial and v dominates w if and only if v is a proper ancestor of nontrivial microtrees. Each nontrivial microtree is a w in I [2]. Thus it suffices to compute idom(v ) for all maximal subtree of D that has at most g vertices and contains at least one leaf of D, where g is a parameter vertices. The dominators problem appears in program op- that we will fix appropriately. Each trivial microtree is timization and code generation [2, 4]. Lengauer and a singleton internal vertex of D. For any vertex v we denote by micro(v ) the microtree that contains v . Also we denote by M the set of nontrivial microtrees. 1 Department of Computer Science, Princeton University, The trivial microtrees constitute a tree with relaPrinceton NJ, 08540. E-mail: {lgeorgia,ret}@cs.princeton.edu. tively few leaves, the core C of the depth-first search Research at Princeton University partially supported by the Natree. Let be any microtree of D and let root( ) be tional Science Foundation, under the Aladin Pro ject, Grant No its root. If root( ) = r then v = parent(root( )) is a CCR-0122581, Carnegie Mellon. 2 Hewlett-Packard, Palo Alto CA. Copyright © 2004 by the Association for Computing Machinery, Inc. and the Society for industrial and Applied Mathematics. All Rights reserved. Printed in The United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Association for Computing Machinery, 1515 Broadway, New York, NY 10036 and the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688 869 1 3 2 1 16 into a single vertex we get a tree C with L vertices. Let l and q be two vertices in C that correspond to the lines (v1 , . . . , vi ) and (u1 , . . . , uj ). Then l is the parent of q in C if and only if parent(u1 ) = vi . In this case we say l = parentC (q). 27 3 30 4 17 20 2.2 Semidominators and External Dominators In this section we review some of the definitions and F 8 19 8 29 1 5 results that are given in [13] and [7]. A path P = 2 2 8 (u = v0 , v1 , . . . , vk-1 , vk = v ) is a semidominator 4 15 path (SDOM path) if vi > v for 1 i k - 1. The semidominator of vertex v is sdom(v ) = 24 1 3 2 9 min({u | there is an SDOM path from u to v }). From [13] we know that for any vertex w = r, idom(w) 52 1 0 + + sdom(w) and sdom(w) w1 . Moreover, if v w then either v idom(w) or idom(w) idom(v ). 6 2 1 12 The following result [13] allows efficient computation of semidominators and computation of immediate domigure 1: Partition of the DFS tree. There are 10 inators from semidominators. nontrivial microtrees and 3 lines, which are encircled. The nontrivial microtrees have size at most 3 and are Lemma 2.1. Let w = r and let u be a vertex for which sdom(u) is minimum among vertices u satisfying shown with dark vertices. + sdom(w) u w. Then sdom(u) sdom(w) and idom(u) = idom(w). Moreover, if sdom(u) = sdom(w) vertex in C . Also any vertex of the core has at least g then idom(w) = sdom(w). proper descendants in D. We note that v is a leaf of the Let micro(v ) be the microtree (either trivial or nontree C if and only if all of its children in D are roots trivial) that contains v . A path P = (u = of nontrivial microtrees. Therefore C has at most n/g v0 , v1 , . . . , vk-1 , vk = v ) is an external dominator path leaves. We can group the vertices of C into a set L of (XDOM path) if P is an SDOM path and vi micro(v ) / paths with the following properties: if v1 , v2 , . . . , vk is for 0 i k - 1. The external dominator of vertex v is a path in L then vi+1 is the unique child of vi in C for 1 i k - 1, and vk has either zero or more than one xdom(v ) = min({v } children in C . We call such a path a line. We denote {u | there is an XDOM path from u to v }). by line(v ) the line that contains vertex v . Let L be the number of lines. It is easy to verify that L is bounded by twice the number of leaves in C i.e., L 2n/g . Figure A path P = (u = v0 , v1 , . . . , vk-1 , vk = v ) is a pushed external dominator path (PXDOM path) if vi 1 shows an example of this decomposition. Let l = (v1 = s, v2 , . . . , vk-1 , vk = t) be a line in root(micro(v )) for 1 i k - 1. The pushed external L. The subtree Tt rooted at t is a union of microtrees dominator of vertex v is in M and a part of the core. For example in Figure pxdom(v ) = 2.1, T1 is D and T9 = {9, 10, 11, 12}. For each vertex i, 1 i k - 1, we define a right subtree Ti and a left min({u | there is a PXDOM path from u to v }). subtree Ti of Tvi . Ti contains vi and all descendants of vi that are greater than vi+1 and are not descendants of An XDOM path is also an SDOM path and an vi+1 ; Ti contains vi and all descendants of vi that are SDOM path is also a PXDOM path, thus pxdom(v ) less than vi+1 . The trees Ti , Ti and Tvi+1 partition the sdom(v ) xdom(v ). If micro(u) = {u}, which is true if proper descendants of vi . All the children of vi , except u is in C , then pxdom(u) = xdom(u) = sdom(u). The for vi+1 , are roots of nontriviat microtrees. We also next result from [7] allows computation of immediate l k-1 T define T (l) = i=1 o be the set of vertices dominators from pushed external dominators. i - {vi } t k-1 Ti in the right subtrees of l and T (l) = i=1 - {vi } 1 Throughout this pap er the notation "v u" means that v T + o be the set of vertices in the left subtrees of l. Now is an ancestor of u in the tree T . "v T u" means that v is a consider the core tree C ; if we contract every line in L proper ancestor of u in T . We omit the subscript T when T = D. 7 6 1 13 21 870 Lemma 2.2. For any v , there exists a w micro(v ) such that (1) (2) (3) (4) w v; pxdom(v ) = pxdom(w); pxdom(w) = sdom(w); idom(w) micro(v ). / micro(v ). We start by constructing from G( ) an augmented graph aug ( ) as follows. We add a vertex t and for each arc (u, v ) such that v and u we add / the arc (t, v ). Let iidom(v ) be the internal immediate dominator of a vertex v , which is defined as the immediate dominator of v in aug ( ). From [6, 7] we have: Moreover, if idom(v ) micro(v ) then idom(v ) = / Lemma 3.1. Let v be any vertex of a microtree . Then idom(w). idom(v ) is in if and only if iidom(v ) = t. 3 The Buchsbaum et al. algorithm 3.1 Overview of the algorithm The Buchsbaum et al. algorithm preprocesses each nontrivial microtree and finds the immediate dominators for the vertices whose immediate dominators are inside the microtree. Then it computes pxdom(v ) for each vertex v in the graph and uses Lemma 2.2 to find idom(v ) for each v such that idom(v ) micro(v ). The pxdom's are / computed by processing the microtrees in reverse preorder. Let NCA(D, u, v ) be the nearest common ancestor of u and v in the DFS tree D. Also let E N (v ) = {u | (u, v ) A, u micro(v )} be the set of external / neighbors of v . We associate each vertex v V with a value label(v ) which is initially set equal to v . After processing micro(v ) we will have label(v ) = pxdom(v ). This is accomplished by executing the following procedure for each microtree . for v do a = parent(root( )) for u E N (v ) do if micro(u) is trivial then b = u else b = parent(root(micro(u))) endif if (b > a) then l ) + x = min( abel(z ) | NCA(D, a, b) z b else x = endif label(v ) = min({label(u), label(v ), x}) done /* at this point label(v ) = xdom(v ) / done P ushLabels( ) If b < a then b is a proper ancestor of a and it is assumed that label(b) = b. When b > a then all the + vertices z = v that satisfy NCA(D, a, b) z b have been processed before and the corresponding pxdom's are known, i.e., label(z ) = pxdom(z ). Procedure P ushLabels( ) completes the computation of pxdom(v ) for all v after all the xdom's are known. This is accomplished in linear time by finding the strongly connected components of the graph G( ) induced by the vertices of and and processing them in topological order (see [7] for the details). In order to compute the immediate dominators from the pxdom's we need to know whether idom(v ) We can compute the iidom's using any simple (but superlinear) dominators algorithm. Since there may be too many microtrees in D we can't afford to run the simple algorithm for each microtree individually. The way to get around this problem is to use memoization to avoid repeating the same computations for identical graphs. In the worst case we will compute the iidom's for every possible aug ( ) of size O(g 2 ) and use table lookups to retrieve the iidom values in constant time. If g = O(log1/3 n) this computation can be done in linear time. Although this process seems to require a RAM it can in fact be implemented on a pointer machine. This is accomplished in [6] by sorting the graph encodings using a variation of radix sort. After the iidom's are available we need to compute the immediate dominators of the every vertex v that satisfies iidom(v ) micro(v ). / By Lemma 2.2 all we need to do is find the vertex + y on the path pxdom(v ) a such that pxdom(y ) = + min({pxdom(z ) | pxdom(v ) z a}). If pxdom(y ) = pxdom(v ) then idom(v ) = pxdom(v ). Otherwise y is a relative dominator of v , i.e., idom(v ) = idom(y ). For vertices with relative dominators we complete the calculation of immediate dominators in a preorder pass. 3.2 The problem in the analysis The Buchsbaum et al. algorithm uses the same link-eval structure as the Lengauer-Tarjan algorithm (Section 5 of [16]) to implement the computation of x and y above (taking minima on tree paths), but these computations involve only nodes in the core. In fact the two algorithms are essentially the same when the size bound on the nontrivial microtrees is g = 1. In [6, 7] it is concluded that the operations of the link-eval data structure take linear time because the links are performed bottom-up. That is we link v to parent(v ) only after all the vertices in the subtree rooted at v have been linked. Let T1 and T2 be the trees in the link-eval forest that contain parent(v ) and v respectively. It is straightforward to verify that T2 always contains at least one leaf in C and T1 is either a singleton or also contains at least one leaf in C . The analysis assumes that all the non-singleton trees in the link-eval forest have roots which are leaves in C . In that case we only need 871 to link two leaves or a single vertex to a leaf, which implies O(m(m, L) + n) = O(m + n) running time. The problem with this argument is that Tarjan's data structure [16] is not sufficiently flexible, in the sense that we are not free to choose the roots of the trees in the link-eval forest suitably so that the disjoint set union result of [6] would apply. For example consider the case where D is just a path (v1 = r, v2 , . . . , vn ). Then C = (v1 , v2 , . . . , vk ), where k = n - g = O(n). Each successive link is of vi+1 to vi , and vi becomes the root of the tree that contains vj , i j k . Suppose we have pxdom(vk ) = v1 . Then a careful look at the implementation of the link-eval structure in [16, 13] reveals that vk will only participate in the first link and all the other links will involve internal vertices of C , while the analysis in [6, 7] assumes that every vertex will eventually be linked to vk . What we really get is the Lengauer-Tarjan algorithm applied on a graph with O(n) vertices and O(m) arcs, so the running time is not linear. We note that the special-case analysis in [6, 7] for disjoint set union does apply to their offline nearest common ancestors algorithm. In fact, the pointer machine version of our algorithm requires this result in order to achieve linear running time. ent states that the vertices go through during the course of the algorithm. Initially none of the vertices in D is marked. In Phase A we start visiting the vertices in reverse preorder and compute the pxdom's of the vertices that haven't been processed yet. We use the mark bit to keep track of the vertices in the core whose pxdom's have already been computed. For the vertices in M we use the mark bit for a different purpose that we describe later. When we find a vertex v in the core that is not marked we process all the vertices in line(v ) and mark them. When we visit an already marked vertex we proceed to the next vertex in reverse preorder. This means that if l = (v1 = s, v2 , . . . , vk-1 , vk = t) is a line in L, t will be the first vertex of l that we visit. After visiting t we will process all the vertices in l bottom-up. Note that when we start processing l all the vertices in T (l) and Tt and none of the vertices in T (l) have been processed. To process a nontrivial microtree we can use the procedure of Section 3.1, but we use a new data structure to compute x in the inner loop. The second phase is similar to the corresponding part of Buchsbaum et al. algorithm [6, 7]. We start by finding the internal immediate dominators in the microtrees of M. Then we process all the vertices in reverse preorder and locate the immediate dominators 4 Our linear-time algorithm for the vertices in C and for the vertices v M that have immediate dominators outside micro(v ). Again we 4.1 Overview of our approach The conclusion from the previous section is that we need a different way use our new data structure to evaluate minima on tree to evaluate minima on paths inside a line. In particular paths. we show that we can use nearest common ancestor queries on a fixed tree instead of using the standard data 4.2 Range Minimum Queries Our first algorithm structure of [16] for evaluating minima inside a single will use as a subroutine a solution to the offline Range line. We note that our definition of lines is similar to Minimum Query (RMQ) problem. In an instance of this a a the I-paths of [3], but we perform different operations problem we are given an array A = 1 a2 . . . an on the lines. To get a linear time bound the nearest nd a set QA of queries. A query is a pair of array common ancestor queries have to be offline. We can indices (i, j ), such that 1 i j n. For each pair n assure this if we deviate from the usual reverse preorder (i, j ) QA we want to fia d the index of a ; smallest element in the subarray i ai+1 . . . aj that is processing of the vertices when it is necessary to do so. Our results require neither complicated data structures RMQ(A, i, j ) = arg mink[i...j] ak is the answer to query nor a linear-time disjoint set union algorithm, but they (i, j ). In [5] it is shown that the RMQ problem can be do require a linear-time solution to the offline nearest reduced in linear time to the offline Nearest Common common ancestors problem as well as the memoization Ancestor (NCA) problem and vice versa. An instance for small subproblems proposed by Buchsbaum et al. of the NCA problem consists of a tree T and a set QT [7, 6]. For clarity we present an algorithm that consists of queries. A query in QT is a pair (u, v ) where u and of two phases. As in [13] and [7] the two phases can be v are vertices of T . If we refer to the vertices of T by integrated into one, which gives a more efficient solution their preorder numbers then, although the asymptotic running time is the same. In NCA(T , u, v ) = Phase A we compute pushed external dominators for each vertex and in Phase B we find the immediate max({w | w T such that w u and w v }). dominators. In Section 5 we show how to remove an intermediate preprocessing step to get a more efficient Such queries can be answered offline on a pointer algorithm. machine in O(|T | + |QT |)-time [6]. Also there are We will use a mark bit to distinguish among differ- solutions to the offline NCA problem on a random- 872 access machine that answer each query in constant time using O(|T |) time for preprocessing [12, 14, 5]. We note that the linear-time pointer machine algorithm requires QT to be known a priori, while the RAM algorithms don't have this restriction. For our algorithm we want to be able to as swer range minimum , queries for the array n dom(v1 ) . . . sdom(vk ) where l = (v1 , v2 , . . . , vk ) is a line in L. This problem can be solved offline if the semidominators of the line have been computed before we need to answer any query. The order of the pxdom calculations ensures that this is the case. In later sections we use the notation RMQ(vi , vj ) = arg vj V Link (v ) performs a virtual link of v and u = parent(v ). When both v and u belong to the same line l = (v1 , . . . , vk ) then we only have to set lineLabel(v1) = u, if sdom(u) < sdom(lineLabel(v1)). Our algorithm will process all the vertices in l before any of them is linked to its parent, therefore V Link () maintains the correct result for lineLabel. In the case where v = v1 we also need to link v1 with u1 , where u1 is the top vertex of line(u). This is accomplished by executing the real link Link (v1 , u1 ). The implementation of V E v al() assumes that for each vertex v C we store a value up(v ) = arg min u line(v) and uv min vi vj sdom(vj sdom(u). ). In Section 5 we show how to use nearest common These values can be calculated in a straightforward ancestor calculations on a fixed tree directly; that is, manner by a top-down pass over the line, after all the we remove the intermediate step that reduces RMQ to sdom's in the line are known. We further assume that NCA. the lines whose semidominators have been computed are preprocessed so that all the necessary RMQ's are 4.3 A data structure for Link () and E v al() in answered in linear time. The role of V E v al() is to C In this section we implement a data structure that return a vertex with minimum sdom among the vertices can perform links and evaluations of path minima on + a forest F induced by the vertices of C . We start by z that satisfy rootF (b) z b, where rootF (b) is the implementing a data structure that performs the same root of the virtual tree built by V Link () that contains operations on a forest F induced by the vertices of C . b. If both rootF (b) and b are in the same line then Initially each node in C is a singleton tree in F . For the value that we need is RMQ(c, b), where c is such any vertex l C let rootF (l) be the root of the tree that parentD (c) = rootF (b). Otherwise the required value is the vertex with minimum sdom among up(b), the contains l in F . The operations are: lineLabel(E v al(parentC (b ))) and lineLabel(c ), where t Link (l, q ): Link the tree rooted at l in F o the tree b is the top vertex of line(b) and c is the top vertex of rooted at q in F . In our algorithm we will line(c). Since C has L = O(n/g ) vertices and we call always have q = parentC (l). V E v al() for less than m arcs, the total time that will be spent on Link () and E v al() is O(m(m, L) + L) = 1/3 E v al(l): If l = rootF (l) return l. Otherwise, return O(m), for g = O(log n). This implies that n calls to a vertex of minimum value among the ver- V Link () and O(n + m) calls to V E v al() take O(n + m) + tices q that satisfy rootF (l) C q C l. time. We implement these operations using the data structure of Section 5 in [16], but our algorithm will not call Link () and E v al(). Instead it will call two procedures V Link () (Virtual Link ) and V E v al (Virtual E v al) which are built on top of Link () and E v al() and operate on C . A line l = (v1 , v2 , . . . , vk ) is represented in F by its top vertex v1 . We associate with l a label which, after all the vertices in l are linked, will be equal to a vertex with minimum semidominator among the vertices of l. To that end we use a field lineLabel that is only defined for the top vertices of the lines and is initialized to . After linking vj to vj-1 we will have lineLabel(v1) = arg ul min sdom(u). and uvj 4.4 Pushed external dominators of the nontrivial microtrees We process a nontrivial microtree as in Section 3.1, but we use the data structure of the previous section to maintain the link-eval forest in C . A vertex v C can be linked to parent(v ) only after line(v ) has been processed. Specifically v is linked when it becomes the next vertex in reverse preorder and is not processed again because it was marked after processing line(v ). Let c be the last vertex of the core that was linked to its parent. Also suppose v is the next vertex to be processed. Figure 2 shows the various situations that may happen when we examine the arc (u, v ) for u E N (v ). Here b is the nearest ancestor of u in C and similarly a is the nearest ancestor of v in C . Also t is the bottom vertex of l = line(a). Since is 873 c v ine(a) = line(b) = rootF(b) a av u c b vb c ase (a) ase (b1) v in left microtree u F a = rootF(b) l l = root (b) F a b v u b case (b2) v in right microtree or Tt v ootF(b) c r ine(a) ineva) ( c ase (c) ine(b) c c ) = rootF(ba c b v v ui u i i s t u u c t s = vj s b c t ase (i) v vi ase (ii) v i T ase (iii) v j= b F c T' j u j= b i u b s t v s t u j t s c ase (vi) ase (iv) case (v) igure 3: Computing the semidominators of the line l = (v1 = s, v2 , . . . , vk = t) ootF(b) v c l ine(b) b ul a l c ine(b) u l b nontrivial, a = parent(root( )). Then v is in T (l) if and only if parent(c) = a. This implies a = rootF (b) when line(a) = line(b) (case (a) of Figure 2) or b Tt + (case (c) of Figure 2). Also when a < b but not a b ( then v T l) if and only if parent(c) = NCA(D, a, b) / (case (d2) in Figure 2). Case (b) is handled separately without a call to V E v al() because it doesn't require evaluations of path minima and moreover b may have already been processed so we must be careful not to set sdom(v ) = sdom(b) when b = u. 4.5 Semidominators in the core In this section we give an algorithm P rocessLine(t) that computes the semidominators of a line l = (s = v1 , v2 , . . . , vk-1 , vk = t) in L. The algorithm assumes that we have label(u) = pxdom(u) for all vertices u > t, and label(u) = u for any u t. In particular we don't know the pxdom's in T (l). The computation of the semidominators relies on finding strongly connected components on l as we walk upwards from t to s. At vertex vi we examine the incoming arcs (u, vi ). The actions that we need to execute depend on the location of u. Again b is the nearest ancestor of u in C . We consider the following cases (see Figure 3): ase (d1) v in left microtree ase (d2) v in right microtree or Tt r igure 2: Four possible locations of u E N (v ). Any of the four cases may apply when v is in T (l), but only (b) or (d) may apply when v is in T (l) or in Tt M. + In all cases we may have u = b. In case (a), a b and + a, b l. In case (b), b a. In case (c), a b and + b Tt . Finally, in case (d) we have neither a b nor b a. Cases (a), (c) and (d) are handled in V E v al() and case (b) is handled separately. 874 (i). b vi . We update min({label(vi), label(u)}). (ii). (iii). (iv). (v). = w P . Since we don't actually remove the vertices of P we stop the backwards walk once we have reached a vertex z such that it is either marked or in the core. u = vj for j > i. We contract the vertices of Note that if z has been marked then every vertex in the + the path vi vj into a single vertex vi and set path root( ) z has also been marked. We can prove by induction that P rocessLine() correctly computes label(vi) to be the minimum label in this path. semidominators in lines. + u Tt . We find x = min({label(w) | t w b)}). Theorem 4.1. Let l = (v1 = s, v2 , . . . , vk-1 , vk = + Then we contract the path vi t as t) in L. Assume that just before the execution of if there were an arc (t, vi ) and update P rocessLine(t) we have label(u) = pxdom(u) for any label(vi) as done in (ii). Finally we update vertex u such that u > t. Then after the execution label(vi) = min({x, label(vi ), label(u)}). of P rocessLine(t) we have label(vi) = sdom(vi ) for 1 i k. u Tj . It must be that j i. We handle this Proofs and implementation details are omitted in this + + case by contracting the cycle (vj u, vi vj ) extended abstract. The x values that we use in this + into a single vertex vi . The vi vj part is done procedure correspond to path minima and are computed ( + as in (ii). For the vj u part we start from u by V E v al(). After a vertex u in T l) is marked, every and walk backwards until we reach vj or a marked subsequent visit to u will cost us O(1) time. Moreover, vertex. For each vertex w = vj that we visit in the number of stack operations is O(k ). Therefore, it is L this walk, we mark w and for each arc (z , w) we not hard to verify that the execution of P rocess( ine() takes linear time with respect to the size of l T l) and add (z , vi ) at the end of the list of incoming arcs ( of vi . (The added arcs will eventually be examined the number of predecessors of the vertices in l T l), plus the time spent on V E v al(). A similar statement while processing vi .) holds for the procedure of Section 4.4 which processes u Tj . We update label(vi) = nontrivial microtrees. Since the total time time spent min({label(u), label(vi)}). If vj > vi we con- on V Link () and V E v al() is linear the whole algorithm runs in linear time. + tract the path vi vj as in (ii). label(vi) + + + 5 An NCA-based algorithm (vi). Neither vi b nor b vi . Then vi < b. + Our first algorithm uses RMQ to evaluate minima on We find x = min({label(w) | NCA(D, u, vi ) paths inside a line. We have already mentioned that in w b}). Then we update label(vi) = order to solve the offline RMQ for an array A we can min({x, label(vi ), label(u)}). use a linear-time reduction to the offline NCA problem + The process that we use to contract a path vi vj for a tree TA that is constructed based on the values of is similar to Gabow's linear-time algorithm for strong A. This seems to be a redundant step, which we wish components [9]. A stack S is sufficient to keep track to avoid. In this section we modify our algorithm so of the contracted paths in l. Each element of the that it directly uses NCA queries on a tree I (l) that is stack corresponds to a strongly connected component constructed based on the immediate dominators tree of scc of successive vertices in l and can be represented a line l. This method is more desirable also because in by the highest vertex v in the component i.e., v = contrast to TA the immediate dominators tree has to be min({u | u scc}). Then it is also true that sdom(v ) = constructed by the dominators algorithm anyway. min({sdom(u) | u scc}). Suppose that when we reach vertex v the current status of the stack is S = (ud , ud-1 , . . . , u1 ), where v = parent(ud ) and ui < ui-1 for 2 i d. Then we keep removing the top element of the stack until we reach a proper descendant of u or empty the stack. As we pop each element w < u in S we update label(v ) = min({label(v ), label(w)}). The other subroutine that we need carries out the backwards walk from a vertex u in a left nontrivial microtree to root( ). To achieve the effect of contracting the path P = (root( ) u) we set the mark bit of w for all 5.1 Line dominators tree The next lemma suggests a method to construct the immediate dominators tree I once we have computed the semidominators. (We omit the proof in this extended abstract.) We note that this process resembles the Aho, Hopcroft and Ullman algorithm for finding dominators in reducible graphs [1]. Lemma 5.1. For any vertex w = r, idom(w) is the nearest common ancestor of sdom(w) and parent(w) in the immediate dominators tree I , i.e., idom(w) = NCA(I , parent(w), sdom(w)). 875 We can locate NCA(I , parent(w), sdom(w)) easily by using the fact that when idom(w) = sdom(w), parent(w) is not dominated by any vertex v that sat+ isfies idom(w) v sdom(w). If we start from parent(w) and go up on the I -tree path from parent(w) to r until we meet the first vertex x such that x sdom(w), then x = idom(w). In the simple case where D is just a path (1, 2, . . . , n) we have idom(w) = NCA(I , w - 1, sdom(w)). So after the calculation of the semidominators we can construct the dominators tree I incrementally in linear time. For a general graph, however, this procedure can take O(n2 ) time. Now we apply this method to a line l = (v1 , v2 , . . . , vk ) in L and construct a tree I (l), which we call line dominators tree of l. We denote by parentI (l) (v ) the parent of v in I (l). The construction goes as follows. Initially I (l) consists of the vertices v1 and root = sdom(v1 ) where sdom(v1 ) = parentI (l) (v1 ) and is the current root of the tree. When we process vi we first check whether sdom(vi ) < root. If this is true we set parentI (l) (root) = sdom(vi ) and make sdom(vi ) the new root of I (l). Otherwise we traverse the path in I (l) from vi-1 to root until we get to a vertex x such that x sdom(w). Then we set parentI (l) (root) = x. An example is shown in Figure 4. Note that unless v1 = r, I (l) contains vertices outside + l and in particular root v1 . I (l) can be constructed in the same top-down pass of l in which the up values are computed and requires O(k ) time. The next Lemma describes the properties of I (l). Lemma 5.2. Let I (l) be the line dominators tree of l = (v1 , . . . , vk ). For any vi l, if parentI (l) (vi ) l then parentI (l) (vi ) = idom(vi ). Otherwise, let z be a vertex that satisfies parentI (l) (vi ) z v1 and sdom(z ) = min({sdom(w) | parentI (l) (vi ) w v1 }). If sdom(z ) parentI (l) (vi ) then idom(vi ) = parentI (l) (vi ); otherwise idom(vi ) = idom(z ). We also note that the previous lemma implies that all the vertices that point to the same vertex w l have the / same dominators. So it suffices to locate the immediate dominator of u = childI (l) (w) and set idom(v ) = idom(u) for all v l such that parentI (l) (v ) = w. 5.2 Pro cessing the nontrivial microtrees We process a vertex in a nontrivial microtree as in Section 4.4, but we evaluate paths using procedure M V E v al() which is a modified version of V E v al(). The difference between these two version is that in case (a) of Figure 2, M V E v al() uses the result of NCA(I (l), a, b) instead of RMQ(c, a). Also in cases (c) and (d), M V E v al() returns the minimum label itself of any vertex z that + satisfies rootF (b) z b instead of a vertex with the + + + + v vu v 1 [w] u ( v v 4 v 6 8 v 2 [v1 ] v 4 [w] v 3 [v2 ] v 1 5 v 7 v 2 v5 [v4 ] v6 [v3 ] v7 [v6 ] 8 [v2 ] a) u (b) 3 w Figure 4: (a) l = (v1 , . . . , v8 ) is a line of the core. w and u are semidominators of v4 and v1 respectively that lie outside l. The dashed arcs connect sdom(vi ) = parent(vi ) to v and the values inside the brackets correspond to semidominators. (b) The line dominators tree I (l). A solid arc (vi , z ) indicates that parentI (l) (vi ) = z . A dashed arc (z , vi ) indicates that childI (l) (z ) = vi , which means that during the process of constructing I (l), z was once the root of the tree and vi was the first child of z in I (l). 876 minimum label in the same path, as done in V E v al(). The consequence of replacing the range minimum query by a nearest common ancestors calculation is that may get label(v ) = pxdom(v ), for v T (l). Nevertheless, the labels we compute using our modified procedure give sufficient information to find the external dominators of vertices outside l T (l) T (l) and the immediate dominators of vertices in T (l). This fact is stated in the next theorem. Theorem 5.1. Let be a microtree in M and a = parent(root( )). After processing al l the vertices v we have min({label(v ), label(up(a))}) = min({pxdom(v ), label(up(a))}). Moreover, if is not a left microtree then label(v ) = pxdom(v ). Otherwise, if idom(v ) then at least one / of the fol lowing statements is true: (a) label(v ) = pxdom(v ); (b) label(v ) = idom(v ); (c) label(v ) = w , where w is such that idom(v ) = / + + idom(z ) and z satisfies w z v1 and sdom(z ) = + + min({sdom(u) | w u v1 }). Theorem 5.1 implies that the algorithm of Section 4.5 will still be correct, so for any v C we get label(v ) = sdom(v ). Since our goal is to eliminate the need for range minimum queries we need a way to compute idom(v ) when pxdom(v ) line(a). The next result, which is derived from Lemma 2.2, Lemma 5.1, and Lemma 5.2 gives such an alternative solution. Lemma 5.3. Let l = (v1 , v2 , . . . , vk ) be a line at the core of the DFS tree and v = vi be any vertex in either Ti or Ti . If idom(v ) is not in micro(v ) then idom(v ) = NCA(I , vi , pxdom(v )). Moreover, let y = NCA(I (l), vi , pxdom(v )). Then one of the fol lowing statements is true: (a) y = idom(v ) l; (b) y l and either idom(v ) = y or idom(v ) = idom(z ) / + + where z is such that y z v1 and sdom(z ) = + + min({sdom(u) | y u v1 }). 5.3 Immediate dominators In order to complete the description of our modified algorithm we now give the details of computing the immediate dominators. Suppose that u is the parent of the last vertex that was linked. We process the vertices v C such that parentI (line(v)) (v ) = u, and the vertices v such that M, idom(v ) and NCA(I (line(a)), a, label(v )) = / u, where a = parent(root( )). First we consider the case v C . If u line(v ) then idom(v ) = u by Lemma 5.2. Otherwise we compute a vertex z with minimum + + label among the vertices that satisfy u z v1 . This is done using the appropriate call to V E v al(). We note that since u min({sdom(z ) | v1 z v }) we get the same result if we evaluate the minimum label of the + path u v . In this case we know that line(v ) = line(u) therefore V E v al() will not make a range minimum query. By applying Lemma 5.2 we conclude that the result is either the immediate dominator of v or a relative dominator of v . If v belongs to a nontrivial microtree then we assume we have first computed u = NCA(I (line(a)), a, label(v )). Again this calculation is done on a fixed tree so we can use a linear-time offline NCA algorithm. Now we consider the cases listed in Theorem 5.1. If label(v ) = pxdom(v ) then we apply Lemma 5.3 and process v the same way we processed the vertices of the core. In the other cases we have NCA(I (line(a)), a, label(v )) = label(v ) and we can apply Theorem 5.1. Therefore in any case we have either idom(v ) = u or we need an additional call to V E v al() in order to compute the immediate or a relative dominator of v . References [1] A. V. Aho, J. E. Hopcroft and J. D. Ullman. On finding lowest common ancestors in trees. SIAM J. Comput. 5, 1, 1976, Pages 115-133, 2000. [2] A. V. Aho, R. Sethi and J. D. Ullman. Compilers: Principles, techniques and tools. Addison-Wesley, Reading, Mass., 1986. [3] S. Alstrup, D. Harel, P. W. Lauridsen and M. Thorup: Dominators in Linear Time. SIAM J. Comput. 28(6), Pages 2117-2132, 1999. [4] A. Appel. Modern compiler implementation in ML. Cambridge University Press, Cambridge, Cambridge UK, 1998. [5] M. A. Bender and M. Farach-Colton. The LCA problem revisited. LATIN, Pages 88-94, 2000. [6] A. L. Buchsbaum, H. Kaplan, A. Rogers and J. R. Westbrook. Linear-Time Pointer-Machine Algorithms for Least Common Ancestors, MST Verification, and Dominators. STOC 1998, Pages 279-288. [7] A. L. Buchsbaum, H. Kaplan, A. Rogers and J. R. Westbrook. A New, Simpler Linear-Time Dominators Algorithm. ACM Transactions on Programming Languages and Systems 20, 6, November 1998, Pages 12651296. [8] M. L. Fredman and D. E. Willard. Trans-dichotomous algorithms for minimum spanning trees and shortest paths. Journal of Computer and System Sciences, 48(3), Pages 533­551, 1994. 877 [9] H. N. Gabow. Path-based depth-first search for strong and biconnected components. Information Processing Letters 74, 2000, Pages 107-114. [10] H. N. Gabow and R. E. Tarjan. A Linear-Time Algorithm for a Special Case of Disjoint Set Union. JCSS 30(2), Pages 209-221, 1985. [11] D. Harel. A linear time algorithm for finding dominators in flow graphs and related problems. In Proceedings of the Seventeenth Annual ACM Symposium on Theory of Computing, Pages 185­194, May 1985. [12] D. Harel and R. E. Tarjan. Fast Algorithms for Finding Nearest Common Ancestors. SIAM J. Comput. 13(2), Pages 338-355, 1984. [13] T. Lengauer and R. E. Tarjan, A fast algorithm for finding dominators in a flowgraph. ACM Trans. Program. Lang. Syst. 1, 1, 1979, Pages 121-141. [14] B. Schieber and U. Vishkin. On Finding Lowest Common Ancestors: Simplification and Parallelization. SIAM J. Comput. 17(6), Pages 1253-1262, 1988. [15] R. E. Tarjan, Depth-first search and linear graph algorithms. SIAM J. Comput., 1,(2), Pages 146-160, 1972. [16] R. E. Tarjan, Applications of path compression on balanced trees. J. ACM 26, 4, 1979, Pages 690-715. 878