Dynamizing Static Algorithms, with Applications to Dynamic Trees and History Independence

Umut A. Acar   Guy E. Blelloch   Robert Harper   Jorge L. Vittes   Shan Leung Maverick Woo
Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213
{umut,blelloch,rwh,jvittes,maverick}@cs.cmu.edu

This work was supported in part by the National Science Foundation under grant CCR-9706572 and through the Aladdin Center (www.aladdin.cs.cmu.edu) under grants CCR-0085982 and CCR-0122581.

Abstract

We describe a machine model for automatically dynamizing static algorithms and apply it to history-independent data structures. Static programs expressed in this model are dynamized automatically by keeping track of dependences between code and data in the form of a dynamic dependence graph. To study the performance of such automatically dynamized algorithms we present an analysis technique based on trace stability. As an example of the use of the model, we dynamize the Parallel Tree Contraction Algorithm of Miller and Reif to obtain a history-independent data structure for the dynamic trees problem of Sleator and Tarjan.

1 Introduction

We describe techniques for automatically generating dynamic algorithms from static ones, and the application of these techniques to history-independent data structures [15, 19, 12]. Our approach is based on having a system track dependences as the algorithm executes. When the input is changed, the system propagates changes through the dependence graph, rerunning code where necessary. Since the dependences can change during propagation, the dependence graph is itself dynamic.
To allow the system to track both control and data dependences, the static algorithm must be written for a variant of a standard RAM model. This model is an extension of a model we have previously considered that was based on purely functional programs [1].

To analyze the performance of automatically dynamized algorithms, we use a notion of trace stability based on traces and a distance between traces. We define a trace $A_T(I)$ for an algorithm $A$ on input $I$ and a distance metric $\delta(\cdot,\cdot)$ between traces. The trace captures, in a particular way, the operations executed by the algorithm. We say that an algorithm is $O(f(n))$-stable for a class of input changes (e.g., insertions of edges in a forest of size $n$) if $\max_{(I,I') \in \Delta_n} \delta(A_T(I), A_T(I')) \in O(f(n))$, where $\Delta_n$ is the relation mapping inputs to changed inputs for instances of size $n$. We then prove that if an algorithm is $O(f(n))$-stable, then the input changes can be automatically propagated in $O(f(n) \log f(n))$ time. In a certain special case we show how this can be improved to $O(f(n))$ time.

We apply our approach to the problem of maintaining dynamic trees [24]. We show that a simple variant of the random-mate parallel tree contraction algorithm of Miller and Reif [16] is expected $O(\log n)$-stable for edge insertions and deletions. Since the algorithm falls into the special case mentioned above, it can be automatically dynamized to support updates in expected $O(\log n)$ time. The algorithm uses $O(\log^2 n)$ random bits, and the expectation is over the choices of these bits. The basic algorithm uses $O(n \log n)$ space, but we describe a variant which uses $O(n)$ space. The algorithm is significantly simpler than current $O(\log n)$ worst-case algorithms [24, 11, 14]. In related work [2], we implemented the data structure, which we call RC-Trees (Rake-and-Compress Trees), and applied it to a broad set of applications.

Our automatic dynamization technique builds a dynamic dependence graph to represent an execution and uses a change-propagation algorithm to update the output and the dependences whenever the input changes [1]. A dynamic dependence graph maintains the relationship between code and data in a way that makes it easy for the change-propagation algorithm to identify code, called a reader, that depends on a changed value. Readers are stored as closures, i.e., functions with environments. This enables re-execution of a reader in the same state as before--modulo, of course, the changed values. The change-propagation algorithm maintains a queue of readers affected by changes and re-executes them in sequential execution order. When a reader is re-executed, the dependences that it created in the previous execution are deleted and dependences created during re-execution are added to the dependence graph. It is in this way that the dependence graphs are dynamic. Re-executing a reader may change other data values, whose readers are then inserted into the queue. Change propagation terminates when the reader queue becomes empty.

The dynamic dependence graphs maintained by our approach depend only on the current input and not on the modifications that lead to the current input. Dynamic dependence graphs are therefore history independent [15, 19, 12]. For change propagation, however, we maintain a topological sort of the dynamic dependence graph by using the Dietz-Sleator order-maintenance data structure [8], which is not history independent. For a class of algorithms we show that the topological sort as well as the dynamic dependence graphs can be maintained so that we obtain strongly history-independent dynamic algorithms [19, 12]. The tree-contraction algorithm falls in this class and therefore yields an efficient, dynamic, strongly history-independent data structure for dynamic trees. The data structure is randomized. Generating a history-independent version of existing deterministic dynamic-tree algorithms [24, 25, 11, 14] remains an open problem.

This paper extends our previous dynamization technique [1] by relaxing its one-write-per-read restriction. The more important contribution, however, is trace stability, its application to the dynamic-trees problem, and history independence.

Basch, Guibas, and Hershberger [3] use a similar approach to implement Kinetic Data Structures. They maintain certificates that correspond to certain predicates. As kinetic values change, these predicates can also change value, requiring the changes to be propagated through future code that depends on them. Again, predicates, dependences, and the propagation of changes are implemented on a per-problem basis. None of these approaches address how, in general, changes can be propagated. Several researchers have tried techniques that can be applied automatically to a reasonably broad set of problems.
Demers, Reps, and Teitelbaum [6] introduced the idea of maintaining a dependence graph of a computation, so that any change in the input can be propagated through the graph. Their approach, however, can only be applied if the dependence graph is static (oblivious to the input). Pugh and Teitelbaum [22], showed how to use memoization to dynamize static algorithms for a class of divide-and-conquer algorithms. The approach is similar to the C (n)-decomposable framework of Overmars [20], but works automatically. They also develop an analysis technique based on stable decompositions. The notion of stability we define in this paper is motivated by their idea. The class of problems that can be handled by memoization, however, is very different from what can be handled by our approach. Dynamic Trees. Dynamic-tree data structures have found applications in many areas and have been studied extensively [24, 25, 11, 26, 13, 14]. Sleator and Tarjan first considered the dynamic-trees problem and presented a data structures that supports edge insertions and deletions (aka, the link and cut operations), re-rooting, and path queries in both amortized and worst-case O(log n) time, for a forest with n vertices [24, 25]. Cohen and Tamassia extended their technique to maintain dynamic expression trees [5]. Frederickson developed Topology Trees and applied it to various problems [10, 11]. Topology trees provide worst-case O(log n) update times for various dynamic tree operations in bounded degree trees. Holm and Lichtenberg modified Frederickson's data structure to support unbounded degree trees [14]. Henzinger and King [13] and independently Tarjan [26] developed simpler dynamictree data structure based on Euler-tours for a more limited interface supporting subtree queries. History Indep endence. History-independent data structures preserve privacy by making it difficult or impossible to infer information about the past states of a data structure based on its current state. History-independent data structures were first studied by Micciancio, who showed that the shape of a variant of 2-3 trees is independent of the history of opera- 2 Related Work Many researchers have previously considered generating dynamic solutions to problems based on their static solutions. This approach is often called dynamization. Bentley and Saxe [4] dynamized a class of search problems by decomposing them into parts so that an update is applied to one part (statically), and a search is applied to all parts. The approach trades off search-time for update-time. Overmars [20, 21] dynamized a class of C (n)-order decomposable problems by maintaining partial static solutions in a tree such that an update only propagates along a path of the tree. He used this approach, for example, to generate a O(log2 n)-time dynamic algorithm for convex hulls. Mulmuley and independently Schwarzkopf proposed a dynamic shuffling approach that maintains a history of operations and dependences among them [18, 23]. The approach allows insertions/deletions in arbitrary positions back in "history". The effects of the update are then propagated forward in time. History, the dependences, and propagation are implemented and analyzed on a per-problem basis. Basch, Guibas, and 532 tions [15]. Naor and Teague [19] extended Micciancio's n be an class of input changes parameterized by n and model to encompass the memory and defined weak let (·) be a probability-density function on random bit and strong history independence. Strong history strings {0, 1} . 
Define independence ensures privacy even under unlimited R number of encroachments. Hartline et al [12], simplified d(n) = max (AT (I , R), AT (I , R)) · (R). (I ,I )n Naor and Teague's definitions and showed that strong {0,1} history independence requires canonical representation. Then AT is expected S -stable for n and if d(n) S . 3 Input Changes and Trace Stability We formalize the notions input changes, and trace stability. The stability definition is parameterized over the choice of a trace and the distance metric for traces. In Section 4, we define the particular trace and distance metric for the model we use in this paper. Define a trace generator for an algorithm A, written AT , as a version of the algorithm that outputs a trace and let (·, ·) be a distance metric on traces. For example, the trace can be chosen as the function-call tree of the execution, and the trace distance can be defined as the sum of the costs of function calls that are different in two traces. We study the similarity of traces generated by an algorithm under various classes of input changes. Formally, we define a class of input changes as a relation , whose domain and codomain are both the input i space. If (I , I ) , then the modification of I to I s a valid input change for . To facilitate analysis, we partition by an integer parameter n, and denote the ith partition as i . Typically the parameter represents the size of the input I . For output-sensitive algorithms, can be partitioned according to the output change. Definition 3.1 (Worst-Case Stability) Let AT be a trace generator for an algorithm A and let n be a class of input changes parameterized by n. Define d(n) = (I ,I )n 4 Machine Mo del and Traces We describe a general-purpose machine model for automatically dynamizing static algorithms. The model is Turing-complete and any static program expressed in this model can be dynamized automatically, but the performance of the dynamized version depends on the trace stability of the program. The main results are Theorems 4.1 and 4.2, which bound the time for dynamic-updates in terms of trace stability and show the relationship to strongly history-independent data structures. The machine model is a variant of the RAM model that places certain restriction on reading from and writing to memory. These restrictions are motivated by the need to (1) determine data dependences, (2) determine the scope of the code that needs to be reexecuted when a data value changes, i.e., the control dependences, and (3) re-execute code in the same state as it was previously executed. Note that when code is re-executed it can take, because of conditionals, a different path than it previously did. Machine Mo del. As in the RAM model the machine consists of a random-access memory and a finite set of registers R, where R = r0 , . . . rk . The memory consists of words, which are large enough to hold a pointer. In addition to the registers, a stack pointer and a program counter are also maintained in registers that are invisible to the user. A program is a set of function definitions containing a main function where the execution starts. Functions take no explicit parameters--all values are passed through registers. The body of a function consists of a sequence of instructions, which include the usual instructions for arithmetic (add, mul, etc), branches, a write instruction for writing to memory, and a calln instruction for reading from memory and making function calls. 
Branches are restricted to jump to a location within the function body where they appear. The instruction write ri , rj stores the value of rj at memory location specified by ri . Memory can be read only during a function call, calln, which takes the number of locations to be read and a function. Execution of calln n, f saves the registers and the return address onto the stack, dereferences the first n registers (i.e., ri M em[ri ] for 0 i n - 1), and max (AT (I ), AT (I )). Then AT is S -stable for n if d(n) S . Note that O(f (n))-stable, (f (n))-stable, and {f (n)}stable are all valid uses of the stability notation. For randomized algorithms, it is important that we use the same random bits on different inputs-- because otherwise, traces will certainly not be stable. We therefore separate the random bits used by the algorithm from the input and denote the trace as AT (I , R), where R is the sequence of random bits. When analyzing stability we assume the same random bits are used on different inputs, and the expectation is taken over all possible choices of the random bits. Definition 3.2 (Exp ected-Case Stability) Let AT be a trace generator for a randomized algorithm A, let 533 whose read-values are the same have the same number of children. We say that a vertex v is a guard if the 0 h 8 read-value of its cognate v is different than the readj7d d values of v . In Figure 1 vertices e and g are guards. For our model, we define the distance between two traces as Figure 1: The traces of a program on two inputs. the sum of the weights of all guards in both traces. For jumps to the first instruction of the function. Return example, the distance between the traces in Figure 1 is from the function restores the registers by popping the (2 + 3) + (4 + 2) = 11. Trace distance can be defined stack and passes the control back to return point. recursively as follows. The motivation for using functions for reading is to identify the code that depends on the value being Definition 4.1 (Trace Distance) The distance beread. We note that since the registers are restored when tween traces T and T , (T , T ), of a program is returning from a function call, the only way to return w (T ) + w(T ) if rv al(T ) = rv al(T ) ) results is through memory. This ensures that all data k (T , T = i dependences are tracked. i=1 (Ti , T ) otherwise, In this paper, we consider valid only those programs i . that write to each memory location at most once (aka, where Ti and T are the ith subtree of T and T single-assignment) and read no memory location before Stability and Automatic Dynamization. We it is written--all purely-functional programs satisfy present our main results for this section. The theorems these restrictions. This restriction is motivated by the rely on results from Section 6. Theorem 4.1 only applies need to re-execute code in the same context as originally in the worst case, whereas Theorem 4.2 applies in both executed. An alternative is to use a persistent data- expected and the worst case. structure to represent the memory [9, 7]. Theorem 4.1 (Stability & Dynamization) Let AT Traces. The trace of a program on some input is an be the trace-generator for an algorithm A. If AT is ordered, rooted tree corresponding to the function-call O(f (n))-stable for a class of input changes , then the tree of the execution of the program on that input. Each output of A can be updated in O(f (n) log f (n)) time for vertex represents a function call (calln), with the root any change from . 
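To make the read/write discipline of the machine model concrete, the following is a minimal Python sketch of our own (not the paper's register machine): memory is single-assignment, and every read goes through a calln-style call, so the system can record which call (the reader) depends on which location. The class and method names here are illustrative only.

    class Memory:
        def __init__(self):
            self.cells = {}        # location -> value (written at most once)
            self.readers = {}      # location -> calls that dereferenced it

        def write(self, loc, value):
            # single-assignment: each location is written at most once
            assert loc not in self.cells, "location already written"
            self.cells[loc] = value

        def calln(self, locs, fun):
            # Dereference the requested locations, record the data dependences,
            # then run the body; results must flow back through write.
            vals = [self.cells[l] for l in locs]
            for l in locs:
                self.readers.setdefault(l, []).append((fun.__name__, tuple(locs)))
            return fun(*vals)

    mem = Memory()
    mem.write("a", 3)
    mem.write("b", 4)
    mem.calln(["a", "b"], lambda a, b: mem.write("sum", a + b))
    print(mem.cells["sum"], mem.readers["a"])   # 7  [('<lambda>', ('a', 'b'))]

Because reads are confined to calln, deleting and re-adding the dependences of a re-executed call is exactly the bookkeeping that change propagation needs.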
representing the main function. Edges represent the Proof. Follows directly from Theorem 6.1 by using a caller-callee relationship--there is an edge from u to v logarithmic-time priority queue. F if u calls v . Outgoing edges of a vertex are ordered with or r-round parallel computations, we improve on respect to the execution order. Each vertex is annotated with a weight equal to the running time of the function this bound and also achieve strong history independence call and a list of the values read, called read-values, by as defined by Hartline et al. [12]. We say that an algothat call. The weight of a trace T , written w(T ), is the rithm is r-round paral lel if it operates in r rounds where weight of its root vertex. The read-values of a trace T , a calln in round i only reads memory locations written in the body of a calln from a round j < i. We say that written rv al(T ), is the read-values of its root. Figure 1 shows two traces of some program. Each a dynamic algorithm is strongly history independent if vertex is labeled with a letter and its read-value. The after any number of input changes and output updates weight of a vertex v is the number of vertices in the the memory layout of the implementation depends only subtree rooted at v --for example, the weight of b is five on the current input I . on the left and six on the right. Theorem 4.2 (R-Round-Parallel) Let AT be the Given a trace T , we call the path from the parent trace-generator for an r-round paral lel algorithm A. If of a vertex v to the root of T as the cal l path of v ; the AT is (expected) O(f (n))-stable for some class of incall path of the root is empty. When comparing two put changes , then the output of A can be updated in traces of some program, we will deem certain vertices (expected) O(f (n) + r) time for any change from . equivalent--throughout this paper, we only compare Furthermore, if each location is read by at most a contraces belonging to the same program. Given traces stant times, then there is a strongly history independent a T and T , we say that vertices v of T and v of T implementation of the resulting dynamic algorithm. re cognates if they have the same call path relative to Proof. History independence follows directly from Thethe root of their traces, including the read-values. For orem 6.2. The worst case time bound follows directly example, in Figure 1 the vertex a on the left and a on from Corollary 6.1. The expected bound follows from the right are cognates, and so are the b, c, d, e, g 's. Since 5 Corollary 6.1 by simple algebra. programs in our model are deterministic, two cognates 3 c 4e 8 f9 7 b g 5 8i a 1 a 1 3 c 4e 4 7k 9f 1 b g 7 i 34 tree contract MR (F ) { tree contract { while (#edg es(F ) > 0) for i = 1 to k log n do for each v v ertices(F ) for j = 1 to n do contract(v ) } // r0 &A[i, j ].liv e; r1 &A[i, j ].deg ree contract(v ) { // r2 &A[i, j ].ng h[0] . . . rt+2 &A[i, j ].ng h[t] if (v .deg ree = 1) // rake calln (t + 3), contract } u = neighbor(v ) contract { if (u.deg ree > 1 or u > v ) if (not liv e) delete v write &A[i + 1, j ].liv e,false else if (v .deg ree = 2) // compress else if (deg ree = 1) (u1, u2) = neighbors(v ) rake ... if (u1.deg ree > 1 and u2.deg ree > 1 and else if (deg ree = 2) flips(u1, v , u2)=(T , H, T )) compress ... connect u1 and u2 else delete v } find degree & singleton status, copy ... } Figure 2: Randomized Tree Contraction (left) and Tree Contraction in our model (right) 5 Tree Contraction and Dynamic Trees number of rounds in the expected case. 
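For concreteness, the randomized contraction of Figure 2 (left) can be rendered as the short Python sketch below. This is our transliteration for illustration only: it uses Python's random module for the per-round coin flips rather than the hash-based flips of Section 5.1, and it omits the per-round copying, live flags, and register discipline of the single-assignment implementation on the right of the figure.

    import random

    def tree_contract(adj):
        # adj maps each vertex (an integer) to the set of its neighbors in a
        # forest; the forest is contracted in place until no edges remain.
        while any(adj[v] for v in adj):
            flips = {v: random.random() < 0.5 for v in adj}   # True = heads
            degree = {v: len(adj[v]) for v in adj}
            raked, compressed = set(), set()
            for v in adj:                      # decisions use start-of-round state
                if degree[v] == 1:             # rake a leaf
                    u = next(iter(adj[v]))
                    if degree[u] > 1 or u > v:  # keep one endpoint of a lone edge
                        raked.add(v)
                elif degree[v] == 2:           # compress an independent degree-2 vertex
                    u1, u2 = adj[v]
                    if (degree[u1] > 1 and degree[u2] > 1
                            and flips[v] and not flips[u1] and not flips[u2]):
                        compressed.add(v)
            for v in raked:                    # apply all deletions "in parallel"
                for u in adj[v]:
                    adj[u].discard(v)
                adj[v] = set()
            for v in compressed:
                u1, u2 = adj[v]
                adj[u1].discard(v); adj[u2].discard(v)
                adj[u1].add(u2);    adj[u2].add(u1)
                adj[v] = set()
        return adj

    forest = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}   # a path on four vertices
    tree_contract(forest)                              # all edges eventually removed

The compress condition (heads for the vertex, tails for both neighbors, neither neighbor a leaf) guarantees that the compressed vertices form an independent set, so all rakes and compresses of a round can be applied simultaneously.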
Using tree contraction the programmer can perform various computations on trees by associating data with edges and vertices and defining how data is accumulated during rake and compress [16, 17]. In a related paper [2], we describe how various properties of trees can be maintained dynamically using RC-Trees. We consider a broad set of applications including path queries, subtree queries, centers/medians/diameters of trees, shortest distance to a set of marked nodes, and least common ancestor queries [2]. We also describe how to separate handling of application-specific data and queries from structural changes (edge insertions and deletions). We therefore consider only structural changes in this paper. Implementation. Figure 2 shows the pseudo code for an implementation of tree contraction in our model. Since the model requires that a location be written at most once, we copy the forest in each round. For a t-ary forest with n vertices, we keep the vertices in an array A with n × (k log n + 1) cells (k is a constant to be specified in Theorem 5.2). The input forest is stored in the first row, and tree contraction is performed for k log n rounds--the ith round reads the ith row and writes to (i + 1)st row. Each cell of the array represents an instance of a vertex and consists of a live flag, a degree field, a singleton-status flag, and neighbor pointers. The singleton status of a vertex is a binary value indicating if that vertex is a leaf or not. Each round calls the function contract on all vertices. The call to contract reads the liveness flag, the degree, and the neighbor pointers for that vertex into registers. Rake and compress operations are implemented We dynamize the parallel tree-contraction algorithm of Miller and Reif [16] to obtain a data structure for dynamic-trees [24, 25, 11, 14], which we call RC (Rakeand-Compress) Trees. We first describe parallel tree contraction, and then show that it is expected O(log n)stable with respect to edge insertions and deletions. Theorem 4.2 implies an expected O(log n) time, historyindependent data structure for dynamic trees. 5.1 Parallel Tree Contraction Given a t-ary forest, the parallel tree-contraction algorithm operates in rounds. In each round, each tree is contracted by applying the rake and compress operations. The rake operations delete all leaves of the tree (if the tree is a single edge, then only one leaf is deleted). The compress operations delete an independent set of degree-two vertices. All rake and compress operations within a round are local and are applied "in parallel" (i.e., all decisions are based on the state when the round starts). The algorithm terminates when no edges remain. Various versions of tree contraction have been proposed depending on how they select the independent set of degree-two vertices to compress. There are two basic approaches, one deterministic and the other randomized. We use a randomized version--each vertex flips a coin in each round and a degree-two vertex is compressed if it flips a head, its two neighbors both flip tails, and neither neighbor is a leaf. This compress rule is a slight generalization of the original rule by Miller and Reif [16], which only applies to rooted trees. Figure 2 shows pseudo-code for randomized tree contraction. It is not hard to show that the algorithm takes logarithmic 535 as with the original algorithm except that we make another calln in order to read the singleton status of each neighbor. 
If a vertex is live and is not deleted in this round, we compute its degree based on the singleton status of its neighbors, compute its singleton-status, and copy it to the next round. The trace for tree-contraction is a tree with height three. The root has nk log n children, each of which corresponds to a call to contract and is either a leaf or has one child. Each depth-three vertex corresponds to a call for reading the singleton status of neighbors. Each vertex except for the root has a constant weight. To generate random coin flips, we use a family H of 3-wise independent hash functions mapping {1 . . . n} to {0,1}. We randomly select k log n members of H , one per round. Since there are families H with |H | polynomial in n [27], we need O(log2 n) random bits. As discussed in Section 3 we analyze trace stability assuming fixed selection of random bits and take expectations over all selections of bits. For fixed random bits, vertex i will generate the same coin flip on round j for any input. This ensures that the trace remains stable in different executions. After k log n rounds, it is possible, though with small probability, that not all edges are deleted. When that happens, we can use any linear-time tree evaluation algorithm to compute the result. This implementation is k log n-round parallel as defined in Section 4, and every location is read at most (t + 1) times, therefore Theorem 4.2 applies. The implementation as described requires O(n log n) memory. Since with high probability only O(n) locations will contain live vertices [16] we can use a hash table of size O(n) to implement the memory in our model. Furthermore we don't need to store the dependences between non-live locations since they all go from A[i, j ] to A[i + 1, j ]. Therefore the algorithm can be implemented in O(n) space--it is not clear, however, if this implementation would be strongly history independent. We define the configuration of a vertex v at round i 1 { (u, (u)) | (v , u) E i } v V i i F (v ) = deleted v Vi Here (u) is the singleton status of vertex u (whether u has degree one or not). Configuration of a vertex represents the values read by the call to contract. We define a contraction XF of F to be k 1 1 2 XF = F (v1 ) . . . F (vn ) F (v1 ) . . . F log n (vn ). A contraction serves as an abstraction of the trace. Consider some forest G = (V , E \ {(u, v )}) obtained from F by deleting the edge (u, v ) and let XG be its contraction. Let TF and TG be traces for the contractions XF and XG . The distance between TF and TG is within a factor of two of the difference between XF and XG , that k , n log n i i neq(F (vj ), G (vj )) is (TF , TG ) = O i=1 j =1 where neq(·, ·) is one if the configurations are not equal and zero otherwise. Thus to bound the trace distance it suffices to count the places where two configurations do not match in XF and XG . We say that a vertex v becomes affected in round i, i i+ if i is the earliest round for which F+1 (v ) = G 1 (v ). Once a vertex becomes affected, it remains affected in any subsequent round. Only input changes will make a vertex affected at round one. We say that a vertex v is a frontier at round i, if v is affected and is adjacent to a unaffected vertex at round i. We denote the set of affected vertices at round i as Ai --note that Ai includes all affected vertices live or deleted. For any A Ai , we i denote the forests induced by A on F i and Gi as FA and i GA respectively. Since all deletedkvertices have the same . 
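The per-round coin flips described above can be realized, for example, with a degree-two polynomial over a prime field, a textbook 3-wise independent family; the sketch below is our choice for illustration and is not necessarily the construction used in the implementation. With the seeds fixed, vertex v receives the same flip in round j on every input, which is what the stability argument relies on.

    import random

    class RoundFlips:
        # Per-round coin flips for vertices 1..n drawn from a 3-wise independent
        # family: a random degree-2 polynomial over F_p, one polynomial per round.
        # Three field elements per round give O(log^2 n) random bits overall.
        def __init__(self, n, rounds, p=(1 << 61) - 1, rng=random):
            self.p = p
            self.coeffs = [(rng.randrange(p), rng.randrange(p), rng.randrange(p))
                           for _ in range(rounds)]

        def flip(self, v, rnd):
            a, b, c = self.coeffs[rnd]
            # Low bit of the hash value; essentially unbiased since p is ~2^61.
            return ((a * v * v + b * v + c) % self.p) & 1   # 0 = tails, 1 = heads

    flips = RoundFlips(n=1000, rounds=30)
    print(flips.flip(42, 0), flips.flip(42, 0))   # same vertex, same round, same flip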
configuration (TF , TG ) = O log n i=1 i |FAi | + |Gi i | A An affected component at round i, AC i is defined as i a maximal set satisfying (1) AC i Ai , and (2) FAC i i and GAC i are trees. Analysis. To prove that tree contraction is expected O(log n) stable we will show that the expected i size of FAi and Gi i is constant at any round i. The A 5.2 Trace Stability of Parallel Tree Contraction first lemma establishes two useful properties that we We analyze the trace stability of tree contraction under use throughout the analysis. a single edge deletion or insertion. Data changes and queries can be handled separately with support trees [2]. Lemma 5.1 (1) A frontier is live and is adjacent to the Definitions. Consider a forest F = (V , E ) with n vertices and execute tree-contraction on F . Order vertices in the order that they are processed in each round V = (v1 , . . . , vn ). In what follows, the term "at round i" means, "at the beginning of round i". We denote the contracted forest at round i as F i = (V i , E i ). A vertex v is live at round i, if v V i --it has not been deleted (compressed or raked) in some previous round. same set of unaffected vertices at any round i in both XF and XG . (2) If a vertex becomes affected in any round i, then it is either adjacent to a frontier or it has neighbor that is adjacent to a frontier at that round. Proof. The first property follows from the facts that (A) a frontier is adjacent to a unaffected vertex at that round, and (B) unaffected vertices have the same configuration in both XF and XG . For the second 536 least one. Furthermore x cannot have degree one, because otherwise its leaf status would be different in two contraction making its neighbor affected. Thus x has degree two or greater. Suppose that x is compressed in round i in XF or XG . Then x has at most two unaffected neighbors y and z , which may become affected. Since x is compressed, y will have degree at least one after the compress, and thus no (unaffected) neighbor of y will become affected. e now partition Ai into two affected components Same argument holds for z . Now suppose that x is AC i and AC i , Ai = AC i AC i , such that Gi i = Gi C i not deleted in either X or X . If x does not become u v u v A Au F G i i Gi C i and FAi = FAC i AC i . Affected components a leaf, then no vertices become affected. If x becomes Av u v correspond to the end-points of the deleted edge (u, v ). a leaf, then it has at most one unaffected neighbor, which may become affected. Thus, at most two vertices Note Gi C i and Gi C i are disconnected. Au Av L become affected or frontier. 1 1 Lemma 5.2 At round one, each of AC u and AC v emma 5.5 Suppose there are exactly two frontiers in contain at most two vertices and at most one frontier. AC i . At most two vertices become affected in round i Proof. Deletion of (u, v ) makes u and v affected. If i rt u does not become a leaf, then its neighbors remain due to the contraction of vei+ices in AC and there are 1 ( unaffected. If u becomes a leaf, then its neighbor u at most two frontiers in AC . if it exists) becomes affected. Since u and v are not Proof. Let x and y be two frontiers. Since x and y are connected in G, the set {u} or if u is also affected in the same affected component, they are connected {u, u } is an affected component, this set will be AC 1 . by a path of affected vertices and each is also adjacent u Either u or if u exists, then u may be a frontier. The to a unaffected vertex. 
Thus each has degree at least two in both contractions at round i. Assume that x L set AC 1 is built by a similar argument. v is compressed in either contraction. Then x has at most one unaffected neighbor w, which may become emma 5.3 Assume that AC i and AC i are the only affected. Since w has degree at least one after the v u affected components in round i. There are exactly two compress, no (unaffected) neighbor of w will become affected components in round i + 1, AC i+1 and AC i+1 . affected. If x is not deleted in either contraction, then v u Proof. By Lemma 5.1, we consider vertices that become no vertices will become affected, because x cannot affected due to each frontier. Let U and V be the set of become a leaf--a path to y remains, because y cannot vertices that become affected due to a frontier in AC i be raked. Therefore, at most one vertex will become u and AC i respectively and define AC i+1 = AC i U affected and will possibly become a frontier. The same u u v L and AC i+1 = AC i V . This accounts for all affected argument applies to y . v v vertices Ai+1 = Ai U V . By Lemma 5.1 the vertices of U and V are connected to some frontier by emma 5.6 The total number of affected vertices in F i a path. Since tree-contraction preserves connectivity, and Gi is O(1) in the expected case. i+ i+ FAC1+1 , FAC1+1 and Gi+1i+1 , Gi+1i+1 are trees, and Proof. Consider applying tree contraction on F i and i i AC v AC u v u i i+1 i+1 note that FAi will contract by a constant factor when GACB 1 , GAC i+1 remain disconnected. i+ u v we disregard the frontier vertices, by an argument y induction on i, using Lemmas 5.2 and 5.3, the similar to that of Miller and Reif [16]--based on innumber of affected components is exactly two. We now dependence of randomness between different rounds. show that each affected component has at most two The number of affected components is exactly two frontiers. The argument applies to both components, and each component has at most two frontiers and thus we consider some affected component AC . cause two vertices become affected (by Lemmas 5.2, 5.4, and 5.5). Thus, there are at most four fronLemma 5.4 Suppose there is exactly one frontier in tiers and four new vertices may become affected in AC i . At most two vertices become affected in round i round i. Thus, we have E [|F i+1 |] (1 - c)|F i | + 8 Ai Ai+1 due to contraction of vertices in AC i and there are at i+1 i and E [|FAi+1 |] (1 - c)E [|FAi |] + 8. most two frontiers in AC i+1 . Since the number of affected vertices in the first i Proof. Let x be the sole frontier at round i. Since round is at most 4, E [|FA |] = O(1), for any i. A similar x is adjacent to a unaffected vertex, it has degree a5 argument holds for Gi . t property, consider some vertex v that is unaffected at round i. If v 's neighbors are all unaffected, then v 's neighbors at round i + 1 will be the same in both XF and XG . Thus, if v becomes affected in some round, then either it is adjacent to an affected vertex u, or v has neighbor u whose singleton status differs in XF and XG in round (i + 1), which can happen only if u is adjaWnt to an affected vertex. ce 37 Theorem 5.1 Tree contraction is expected O(log n) dependence graph along with memory M and labels V stable for a single edge insertion or deletion. constitutes all the state needed by change propagation. Time stamps maintain a topological ordering of the Proof. By Lemma 5.6, the expected number of live trace (V , E ). 
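For reference, the expected-size recurrence at the heart of the proof of Lemma 5.6 can be written in one place as follows; this restates the displayed bound, with c > 0 the constant contraction factor and the additive 8 accounting for the at most four frontiers and the at most four newly affected vertices per round.

    % Lemma 5.6, restated: the affected forest contracts geometrically in expectation.
    E\bigl[\,|F^{i+1}_{A^{i+1}}|\,\bigr] \;\le\; (1-c)\,E\bigl[\,|F^{i}_{A^{i}}|\,\bigr] + 8,
    \qquad E\bigl[\,|F^{1}_{A^{1}}|\,\bigr] \le 4
    \;\;\Longrightarrow\;\;
    E\bigl[\,|F^{i}_{A^{i}}|\,\bigr] \;\le\; \max\!\bigl(4,\, 8/c\bigr) \;=\; O(1)\ \text{for every } i,

and symmetrically for $G^{i}_{A^{i}}$.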
During change propagation, functions afaffected vertices is constant in every round. Since tree-contraction takes k log n rounds, the result follows fected by the changes are re-executed in this topological order. Since both control and data dependences go T by the linearity of sums of expectations. from earlier to later function calls, propagating changes heorem 5.2 Tree contraction algorithm yields a in topological order is critical for correctness. Since dystrongly history-independent, expected O(log n)-time namic dependence graphs change during change propadata structure for the dynamic trees problem for inser- gation, order of time-stamps must be maintained dynamically. For this, we use the Dietz-Sleator ordertions and deletions of edges. maintenance algorithm [8], which supports constant Proof. The tree contraction algorithm is k log n-round- time insertion, deletion, and comparison operations. parallel as defined in Section 4. Since each location is read at most a constant number of times, the algorithm Building a Dynamic Dep endence Graph. A is strongly history independent and takes expected dynamic dependence graph is built by execution and O(log n) time by Theorems 4.2 and 5.1. kept up to date by change propagation. In addition to Since a constant fraction of all vertices are deleted the set of location L and the memory M , we keep track in each round with a nonzero constant probability, of the set of changed locations LC ; LC will be used by the probability that the contraction is not complete the change propagation algorithm. We assume that a (the forest contains an edge) after k log n rounds set of memory locations are given as inputs I . Initially is at most 1/n for some constant k [16]. There- L = I and M maps each input location to a value. Each fore running a linear time tree evaluation algorithm to write l, v instruction sets M (l) to v . If l L, then L 6 finish off the contraction takes O(1) expected time. is extended with l (L = L {l}); otherwise, l is added to the set of changed locations (LC = LC {l}). Since programs in our model are single-assignment, a location Automatic Dynamization This section presents a constructive proof of Theorems will never be added to LC in the initial execution. The executed calln instructions build the trace 4.1 and 4.2 by using dynamic dependence graphs and and the dependence edges. Initially V , E , D are all change propagation. set to the empty set and the current time-stamp is Dynamic Dep endence Graphs. A dynamic de- initialized to the "beginning of time". Execution of pendence graph is built by the execution of a program a calln n, f instruction (1) extends V with a new in order to represent the data and control dependences vertex v (V = V {v }), (2) inserts a trace edge from in that execution. It contains two kinds of edges, trace the caller u to v (E = E {(u, v )}), (3) inserts a edges, which correspond to control dependences, and dependence edge from locations being read LR to v data-dependence edges. Each vertex of a dynamic de- (D = D {(l, v ) | l LR }) (4) advances the time by pendence graph is also tagged with a time stamp that creating a new time-stamp t after the current time and corresponds to the execution time of that vertex and setting the current time to t, (5) extends V by mapping also other information to facilitate change propagation. 
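Concretely, the state built by execution can be pictured as in the following minimal rendering of our own (not the paper's implementation): each executed calln becomes a vertex record carrying its time stamp, function, register contents, and read count, alongside the trace edges and dependence edges.

    from dataclasses import dataclass, field

    @dataclass
    class CallVertex:
        # One executed calln: everything needed to re-execute it during propagation.
        time: int      # time stamp (a plain counter here; the paper uses Dietz-Sleator order maintenance [8])
        fun: object    # the function that was called
        regs: tuple    # register contents immediately before the call
        nreads: int    # number of locations dereferenced by this calln

    @dataclass
    class DDG:
        V: dict = field(default_factory=dict)   # vertex id -> CallVertex
        E: set = field(default_factory=set)     # trace edges (caller, callee): control dependences
        D: set = field(default_factory=set)     # dependence edges (location, vertex id): data dependences

        def record_calln(self, vid, caller, locs, fun, regs, clock):
            # Steps (1)-(5) of the construction: new vertex, trace edge, one
            # dependence edge per location read, and a fresh time stamp.
            self.V[vid] = CallVertex(time=clock, fun=fun, regs=regs, nreads=len(locs))
            if caller is not None:
                self.E.add((caller, vid))
            for l in locs:
                self.D.add((l, vid))

    ddg = DDG()
    ddg.record_calln("main", None, [], fun=None, regs=(), clock=0)
    ddg.record_calln("c1", "main", ["a", "b"], fun=None, regs=(1, 2), clock=1)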
v to the current time, the value of the registers, the Formally, a dynamic dependence graph DDG = function being called, and the number of locations read, ((V , L), (E , D)) consists of vertices partitioned into (6) dereferences the registers and makes the call. trace vertices V and locations L, and a set of edges partitioned into trace edges E V × V and dependences Change Propagation. The pseudo-code for D L × V . The graph (V , E ), which we call a trace, is change propagation is shown in Figure 3. The algoa function call tree and is structurally equivalent to the rithm takes as input a set of input changes C , the trace defined in Section 4--unlike the original definition, memory M , and a dynamic dependence graph DDG. however, we do not require that the tree be ordered. The input changes C maps a set of locations to new The dependences D corresponds to data dependences values. The algorithm updates the values of changed between locations and function calls. The function V locations in M (line 2), and builds a priority queue of maps each vertex in V to a quadruple consisting of (1) affected vertices reading the changed locations (line 3). a time-stamp, (2) the function being called, (3) the val- A vertex is called affected if the value of a location that ues of registers immediately before the function call, it reads has changed since the last execution. and (4) the number of reads. The function M , called Each iteration of the while loop re-executes a guard the memory maps each location to a value. A dynamic vertex v . First the subtree T - for the previous execu- 538 propagate (M , C, DDG as ((V , L), (E , D))) 1 T = (V , E ) 2 M = (M \ {(l, M (l)) | l C }) C 3 Q = {v | (l, v ) C, (l, v ) D} 4 while Q = 5 v = deleteMin(Q) 6 T - = subtree rooted at v 7 (n, f un, time, R) = V (v ) 8 ((V + , E + ), D+ , LC ) = calln n, f un at time with regs R and Memory M 9 T + = (V + , E + ) 10 T = replace T - with T + in T 11 D- = {(l, u) | u vertices(T - ) (l, u) D} 12 D = D D+ \ D- 13 A = {u | l LC (l, u) D} 14 Q = Q A \ vertices(T - ) 15 u vertices(T - ), delete time-stamp(u) Figure 3: The change-propagation algorithm. tion of the call is determined. Then the function of v is re-executed starting at the same time and with the same registers as it was previously executed (line 8). Let T + = (V + , E + ) denote the trace and D+ denote the data-dependences for re-execution, and LC be the set of locations changed during re-execution. The trace is updated by replacing the subtree T - rooted at v with T + (line 10) and dependences are updated by adding D+ to and deleting D- from D (line 12)--D- is the dependences pertaining to T - . The queue is updated by removing the vertices of T - and inserting the newly affected vertices A (line 14). Finally the time-stamps of the vertices of T - are deleted. Re-executing function calls in topologically sorted order according to their time stamps serves two purposes. First, it ensures that function calls that are not relevant to the computation are not taken. For example, in Figure 1, change propagation that starts with the trace on the left and yields the trace on the right should not re-execute the call h; indeed h is removed from the queue when g is re-executed (line 14). Second, it ensures that a value read by a calln is up-to-date, because such a value may depend on a previous call. Although the sequential execution order gives a topological sort of a dynamic dependence graph, other forms of topological ordering can be more efficient. 
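The propagate loop of Figure 3 can be transliterated almost line for line. The sketch below is ours, with time stamps as plain integers and the re-execution of a call abstracted behind a rerun callback (a stand-in for the machine's calln machinery); it is meant to show the bookkeeping, not to be a complete system.

    import heapq

    def subtree(E, root):
        # Vertices of the trace subtree rooted at `root` (T- in Figure 3, line 6).
        out, frontier = {root}, [root]
        while frontier:
            u = frontier.pop()
            for (a, b) in E:
                if a == u and b not in out:
                    out.add(b); frontier.append(b)
        return out

    def propagate(changes, M, V, E, D, rerun):
        # V: vertex -> (time, fun, regs, nreads); E: trace edges (caller, callee);
        # D: dependence edges (location, vertex); rerun(v, M) re-executes the calln
        # recorded at v and returns (V_plus, E_plus, D_plus, changed_locations).
        M.update(changes)                                          # line 2
        Q, queued = [], set()
        for (l, u) in D:                                           # line 3
            if l in changes and u not in queued:
                heapq.heappush(Q, (V[u][0], u)); queued.add(u)
        while Q:                                                   # lines 4-5
            _, v = heapq.heappop(Q)
            if v not in queued:
                continue                                           # stale entry
            queued.discard(v)
            T_minus = subtree(E, v)                                # line 6
            V_plus, E_plus, D_plus, changed = rerun(v, M)          # lines 7-9
            E = {e for e in E if e[1] not in T_minus or e[1] == v} | E_plus   # line 10
            for u in T_minus - {v}:                                # lines 10, 15
                V.pop(u, None); queued.discard(u)                  # drop old subtree + time stamps
            V.update(V_plus)
            D = {d for d in D if d[1] not in T_minus} | D_plus     # lines 11-12
            for (l, u) in D:                                       # lines 13-14
                if l in changed and u not in T_minus and u not in queued:
                    heapq.heappush(Q, (V[u][0], u)); queued.add(u)
        return M, V, E, D

A real implementation keys the queue on order-maintenance time stamps rather than integers; for r-round-parallel computations the heap can be replaced by an array of round-indexed buckets, as sketched after the next paragraph.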
For example, in an r-round-parallel computation, all edges are from earlier to later rounds. Thus we can use round numbers as time stamps and use array-based, constanttime priority queues. tree representation and by maintaining for each memory location a list of pointers to vertices reading that location and back. For change propagation, we obtain different bounds depending on the priority queue we employ. Our main lemma is therefore parameterized over the time for priority queue operations. Theorem 6.1 Consider the traces T (I ) and T (I ) of a program on inputs I and I and let d = (T (I ), T (I )). Change propagation takes time O(d + d · p(d)) with a priority queue that supports insertion, deletion, and delete-minimum operations in O(p(·)) time. Proof. Consider an iteration of the change propagation loop and let v be the vertex removed from the queue. It is a property of the algorithm v is a guard with cognate v . Let w and w denote weights of v and v respectively. Determining T - takes constant time (line 6). Re-executing the function call takes time w (line 8). Updating the trace takes constant time (line 10). Since each function call performs a constant number of reads, computing D- takes O(w) time. Since D+ has O(w ) dependences, updating the dependences takes in O(w + w ) time (line 12). Deleting the timestamps for vertices of T - takes time O(w). Thus, an iteration of the loop takes time O(w + w ) time plus time for finding the affected vertices (line 13) and priority queue operations (lines 5 and 14). Since re-execution of a vertex removes its descendants from the queue (line 14), only guards are reexecuted. Since the trace-distance is the total weights of the guards, the total time is O(d) plus time for finding the affected vertices and priority queue operations. Let m be the total number of affected vertices. Finding all affected vertices (line 13) takes time O(m). Since each vertex is added to and removed from the priority queue once, the time for all priority queue operations is O(m · p(m)). Since the weight of a vertex is no less than the number of its descendants, m d. Therefore the total time is O(d + dp(d)). F or r-round parallel computations we use round numbers as time stamps and implement priority queues with r buckets supporting all operations in O(1) time. Corollary 6.1 Let T (I ) and T (I ) be the traces of an r-round paral lel program on inputs I and I and d = (T (I ), T (I )). Change propagation takes O(d + r) time. Theorem 6.2 Consider an r-round-paral lel algorithm and suppose the algorithm reads each memory location some constant number of times on al l inputs. The Implementation and Performance. We imple- algorithm can then be dynamized to yield a strongly ment dynamic dependence graphs by using a standard history-independent dynamic algorithm. 539 Proof. Hartline et al [12] show that a data structure is strongly history independent if and only if it has a canonical memory representation--every input has a unique layout in memory. Thus, to show that an automatically dynamized algorithm is strongly history independent, we show that dynamic dependence graphs have a canonical representation. The structure of the dynamic dependence graph is clearly canonical, but the actual memory layout might not be. Since the number of reads of each location is constant, we can store the information associated with any vertex v (calln) in memory associated with the first location that it reads (we inline calls with no reads so that they don't have a vertices in the trace). 
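For the r-round special case, the constant-time priority queue used in the proof of Corollary 6.1 can be as simple as an array of buckets indexed by round number; the following sketch (ours, for illustration) relies on the fact that during propagation the minimum round only moves forward.

    class BucketQueue:
        # Priority queue keyed by round number 0..r-1 for r-round-parallel
        # computations: insert, delete, and delete-min in O(1) amortized time.
        def __init__(self, r):
            self.buckets = [set() for _ in range(r)]
            self.cur = 0                    # rounds before `cur` are already empty

        def insert(self, rnd, item):
            self.buckets[rnd].add(item)

        def delete(self, rnd, item):
            self.buckets[rnd].discard(item)

        def delete_min(self):
            while self.cur < len(self.buckets) and not self.buckets[self.cur]:
                self.cur += 1               # total advance over a run is at most r
            if self.cur == len(self.buckets):
                return None
            return self.buckets[self.cur].pop()

    q = BucketQueue(r=4)
    q.insert(2, "contract(v7)"); q.insert(1, "contract(v3)")
    print(q.delete_min(), q.delete_min())   # contract(v3) then contract(v7)

The at most r bucket advances over an entire propagation are what give the O(d + r) bound of Corollary 6.1.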
To maintain the children of v , we will have a list that runs through the children themselves. Since a location is read constant times, the dependence edges and the vertices can be stored in sorted order based on, for example, the bit representation. Finally, the structure of the order maintenance algorithm we rely on [8] is not canonical. For r-round parallel computations, however, A we can use the round numbers as time stamps. cknowledgments We thank Bob Tarjan and Renato Werneck for many discussions and feedback. References [1] U. A. Acar, G. E. Blelloch, and R. Harper. Adaptive functional programming. In Proceedings of the 29th Annual ACM Symposium on Principles of Programming Languages, pages 247­259, 2002. [2] U. A. Acar, G. E. Blelloch, and J. L. Vittes. Separating structure from data in dynamic trees. Technical Report CMU-CS-03-189, Department of Computer Science, Carnegie Mellon University, 2003. [3] J. Basch, L. J. Guibas, and J. Hershberger. Data structures for mobile data. Journal of Algorithms, 31(1):1­28, 1999. [4] J. L. Bentley and J. B. Saxe. Decomposable searching problems I: Static-to-dynamic transformation. Journal of Algorithms, 1(4):301­358, 1980. [5] R. F. Cohen and R. Tamassia. Dynamic expression trees and their applications. In Proceedings of the 2nd Annual ACM-SIAM Symposium on Discrete Algorithms, pages 52­61, 1991. [6] A. Demers, T. Reps, and T. Teitelbaum. Incremental evaluation of attribute grammars with application to syntax directed editors. In Proceedings of the 8th Annual ACM Symposium on Principles of Programming Languages, pages 105­116, 1981. [7] P. F. Dietz. Fully persistent arrays. In Workshop on Algorithms and Data Structures, volume 382 of Lecture Notes in Computer Science, pages 67­74. SpringerVerlag, August 1989. [8] P. F. Dietz and D. D. Sleator. Two algorithms for maintaining order in a list. In Proceedings of the 19th ACM Symposium on Theory of Computing, pages 365­ 372, 1987. [9] J. R. Driscoll, N. Sarnak, D. D. Sleator, and R. E. Tarjan. Making data structures persistent. Journal of Computer and System Sciences, 38(1):86­124, Feb. 1989. [10] G. N. Frederickson. Data structures for on-line updating of minimum spanning trees, with applications. SIAM Journal on Computing, 14:781­798, 1985. [11] G. N. Frederickson. A data structure for dynamically maintaining rooted trees. Journal of Algorithms, 24(1):37­65, 1997. [12] J. D. Hartline, E. S. Hong, A. E. Modht, W. R. Pentney, and E. C. Rocke. Characterizing history independent data structures. In Proceedings of Thirteenth International Symposium on Algorithms and Computation (ISAAC), volume 2518 of Lecture Notes in Computer Science. Springer, 2002. [13] M. R. Henzinger and V. King. Randomized fully dynamic graph algorithms with polylogarithmic time per operation. Journal of the ACM, 46(4):502­516, 1999. [14] J. Holm and K. de Lichtenberg. Top-trees and dynamic graph algorithms. Technical Report DIKU-TR-98/17, Department of Computer Science, University of Copenhagen, Aug. 1998. [15] D. Micciancio. Oblivious data structures: applications to cryptography. In Proceedings of the 29th Annual ACM Symposium on Theory of Computing, pages 456­ 464, 1997. [16] G. L. Miller and J. H. Reif. Parallel tree contraction and its application. In Proceedings of the 26th Annual IEEE Symposium on Foundations of Computer Science, pages 487­489, 1985. [17] G. L. Miller and J. H. Reif. Parallel tree contraction, part 2: Further applications. SIAM Journal on Computing, 20(6):1128­1147, 1991. [18] K. Mulmuley. 
Randomized multidimensional search trees: Lazy balancing and dynamic shuffling (extended abstract). In Proceedings of the 32nd Annual IEEE Symposium on Foundations of Computer Science, pages 180­196, 1991. [19] M. Naor and V. Teague. Anti-presistence: history independent data structures. In Proceedings of the thirtythird annual ACM symposium on Theory of computing, pages 492­501. ACM Press, 2001. [20] M. H. Overmars. Dynamization of order decomposable set problems. Journal of Algorithms, 2:245­260, 1981. [21] M. H. Overmars. The Design of Dynamic Data Structures. Springer, 1983. [22] W. Pugh and T. Teitelbaum. Incremental computation via function caching. In Proceedings of the 16th Annual ACM Symposium on Principles of Programming Languages, pages 315­328, 1989. [23] O. Schwarzkopf. Dynamic maintenance of geometric structures made easy. In Proceedings of the 32nd Annual IEEE Symposium on Foundations of Computer Science, pages 197­206, 1991. [24] D. D. Sleator and R. E. Tarjan. A data structure for dynamic trees. Journal of Computer and System Sciences, 26(3):362­391, 1983. [25] D. D. Sleator and R. E. Tarjan. Self-adjusting binary search trees. Journal of the ACM, 32(3):652­686, 1985. [26] R. E. Tarjan. Dynamic trees as search trees via euler tours, applied to the network simplex algorithm. Mathematical Programming, 78:167­177, 1997. [27] M. N. Wegman and L. Carter. New classes and applications of hash functions. In Proceedings of the 20th Annual IEEE Symposium on Foundations of Computer Science, pages 175­182, 1979. 540