Variable Length Path Coupling

Thomas P. Hayes    Eric Vigoda

Department of Computer Science, University of Chicago, Chicago, IL 60637, {hayest,vigoda}@cs.uchicago.edu. Part of this work was done while the second author was visiting the Isaac Newton Institute for Mathematical Sciences, Cambridge, UK. This work was partially supported by NSF Grant CCR-0237834.

Copyright © 2004 by the Association for Computing Machinery, Inc. and the Society for Industrial and Applied Mathematics. All rights reserved.

Abstract

We present a new technique for constructing and analyzing couplings to bound the convergence rate of finite Markov chains. Our main theorem is a generalization of the path coupling theorem of Bubley and Dyer, allowing the defining partial couplings to have length determined by a random stopping time. Unlike the original path coupling theorem, our version can produce multi-step (non-Markovian) couplings. Using our variable length path coupling theorem, we improve the upper bound on the mixing time of the Glauber dynamics for randomly sampling colorings.

1 Introduction

1.1 Overview

Analysis of the convergence rate of finite Markov chains has applications in a variety of fields, including Theoretical Computer Science, Statistical Physics, and Probability Theory. The Coupling Method for proving upper bounds on convergence rate dates back to the seminal work of Doeblin [6]. In computer science, coupling has seen many recent applications in the analysis of Markov Chain Monte Carlo algorithms (e.g., [4, 15]).

An important new tool for simplifying and extending the coupling method is the Path Coupling approach of Bubley and Dyer [3] (e.g., see [17]). Briefly, they reduce the problem of proving an upper bound on the coupling rate to proving a contraction condition for one step of the evolution of a "partial coupling," defined on a much smaller subset of pairs of configurations. The simple proof constructs a full coupling via iterated composition of partial couplings.

We present a natural generalization of this technique, which constructs a full coupling via iterated composition of partial couplings whose lengths may themselves be random variables. (See Section 3 for a precise statement of our theorem.) The potential usefulness of our result lies in providing a technique for simpler construction and analysis of multi-step (non-Markovian) couplings, which are known to be more powerful than one-step (Markovian) couplings [11, 14, 13]. Moreover, it enables the partial couplings to be variable-length, which may be more natural in some settings. There are several previous works which use a multi-step coupling, see [5, 13], or analyze a single-step coupling over many steps, see [8, 9]. Our main theorem is a general technique which, in many cases, will simplify and improve applications of multi-step couplings and analyses of single-step couplings over many steps.

We use our new technique to analyze a simple Markov chain for randomly sampling k-colorings of a graph on n vertices with maximum degree Δ. Roughly speaking, we can prove the chain has mixing time O(n log n) when k ≥ 1.953Δ. This improves a result of Dyer et al. [8] which holds when k ≥ (2 − ε)Δ where ε = 8 × 10⁻⁷.

1.2 Variable Length Path Coupling

Consider a finite ergodic Markov chain with state space Ω, transition matrix P, and stationary distribution π. A coupling is a joint stochastic process (X_t, Y_t) on Ω × Ω such that each process, viewed individually in isolation from the other chain, evolves according to P. However, the transitions of the two processes can be highly correlated; this is wherein the power of the method lies. The probability of coalescence, i.e., of reaching the same state, from an arbitrary pair of initial states bounds the distance from stationarity. This is easy to prove by choosing the initial state for one of the chains from the stationary distribution.

Typically, the probability of coalescence is analyzed by defining a metric Φ : Ω × Ω → {0, 1, 2, ..., D}. For β < 1, defining a one-step coupling (X_t, Y_t) → (X_{t+1}, Y_{t+1}) such that

(1.1)    E( Φ(X_{t+1}, Y_{t+1}) | X_t, Y_t ) ≤ β Φ(X_t, Y_t),

for all (X_t, Y_t) ∈ Ω × Ω where X_t ≠ Y_t, immediately implies that an arbitrary pair of states coalesces with probability at least 3/4 after ln(4D)/(1 − β) steps. This is called a one-step (or Markovian) coupling since the coupled process forms a Markov chain on Ω × Ω with stationary distribution π × π. There always exists a coupling whose probability of non-coalescence equals the distance from stationarity (cf. [11]). However, this coupling is often a multi-step (or non-Markovian) coupling. Kumar and Ramesh [14] give an interesting example where one-step couplings are insufficient, and Hayes and Vigoda [13] recently illustrated the power of multi-step couplings.

Defining and analyzing a coupling for all pairs X_t, Y_t is often a difficult task. The path coupling technique simplifies the approach by restricting attention to pairs in a subset S ⊆ Ω × Ω (assuming the graph (Ω, S) is connected). It then suffices to define a one-step coupling such that (1.1) holds for all (X_t, Y_t) ∈ S. The path coupling theorem then constructs, via simple compositions, a one-step coupling satisfying (1.1) for all X_t, Y_t.

In many settings, it is natural to expect improvements by considering multiple transitions simultaneously. For a fixed length ℓ, the path coupling theorem still applies to the Markov chain defined by P^ℓ (i.e., the ℓ-step evolution of the original chain). However, we would often like to consider variable length ℓ. More precisely, ℓ may be a random stopping time (cf. [8]) which depends on the evolution and the initial pair (X_t, Y_t). Such an analysis was used by Dyer et al. [8] to prove improved convergence rates of a Markov chain for generating a random k-coloring. Our main theorem generalizes the path coupling technique by allowing partial couplings whose length is a random stopping time. Thus, our main result is a variable length path coupling theorem. This work can be viewed as an improvement (and simplification) of a result of Dyer et al. [8, Theorem 2.2]. The proof of our theorem relies on a novel method for composing variable length couplings. Our composition technique can produce multi-step couplings, unlike the methods of Dyer et al. [8] and Bubley and Dyer [3]. For proving upper bounds on the coalescence time, one can always avoid the use of a variable length coupling by analyzing a sufficiently large fixed length coupling. However, our approach is often simpler and more natural. This is illustrated in the proof of the following result on randomly sampling graph colorings.

1.3 Applications to Graph Colorings

We apply our variable length path coupling theorem to improve convergence results of a simple Markov chain, known as the Glauber dynamics, for randomly sampling k-colorings of a graph with n vertices and maximum degree Δ. Transitions of the Glauber dynamics randomly recolor a randomly chosen vertex at each step (see Section 5 for a precise definition). The stationary distribution of the chain is uniformly distributed over (proper) k-colorings of the input graph whenever k > Δ + 1. We are interested in upper bounding the mixing time, which is the number of transitions until the chain is close to stationarity (see Section 2 for a formal definition).

The first significant result was by Jerrum [15], who proved the mixing time is O(n log n) whenever k > 2Δ. Vigoda [17] later proved the mixing time is O(n²) whenever k > (11/6)Δ, via analysis of a different Markov chain. Dyer et al. [8] subsequently proved O(n log n) mixing time of the Glauber dynamics for k ≥ (2 − ε)Δ where ε = 8 × 10⁻⁷, assuming the input graph is regular with girth g ≥ 4 and Δ ≥ 14. Their result relied on a variable length path coupling theorem which is significantly weaker than our theorem. Using our new technique, we considerably improve the result of Dyer et al. For any graph with girth g ≥ 5 and Δ ≥ Δ₀, where Δ₀ is a sufficiently large constant, we prove the mixing time of the Glauber dynamics is O(n log n) whenever k ≥ 1.953Δ.

There are significantly stronger results known when Δ = Ω(log n). Using a multi-step coupling, Hayes and Vigoda [13] recently proved the following result: for all ε > 0, all graphs with girth g ≥ 11 and Δ = Ω(log n), and all k ≥ (1 + ε)Δ, the mixing time of the Glauber dynamics is O(n log n). That result builds upon earlier work of Dyer and Frieze [7], Molloy [16], and Hayes [12].

2 Preliminaries

Throughout the text, for a finite ergodic Markov chain, we use Ω to denote the set of states, P to denote the transition matrix, and π the stationary distribution. The total variation distance for a pair of distributions µ and ν on Ω is defined as

d_TV(µ, ν) = (1/2) Σ_{x∈Ω} |µ(x) − ν(x)|.

Our interest is the mixing time τ_mix of the chain:

τ_mix = max_x min{ t : d_TV(P^t(x, ·), π) ≤ 1/4 }.

We use the coupling method to bound the mixing time. A t-step coupling is defined as follows. For every (x₀, y₀) ∈ Ω², let (X, Y) = (X^{(x₀,y₀)}, Y^{(x₀,y₀)}) be a random variable taking values in Ω^t × Ω^t. We say (X, Y) is a valid t-step coupling if for all (x₀, y₀) ∈ Ω², the distribution of X_t is P^t(x₀, ·) and the distribution of Y_t is P^t(y₀, ·). In other words, viewed individually, the t-step evolutions of the chains are faithful copies of the original Markov chain.

A valid coupling gives the following bound, known as the Coupling Inequality, on the convergence rate [6] (or, e.g., [1]). For all x₀,

d_TV(P^t(x₀, ·), π) ≤ max_{y₀} Pr(X_t ≠ Y_t).

Therefore, by defining a valid t-step coupling where all initial pairs have coalesced (i.e., are at the same state) with probability at least 3/4, we have proved the mixing time is at most t.

3 Variable-Length Partial Coupling

Statement of Results. We call S ⊆ Ω² a path-generating set if the graph (Ω, S) is connected. We assume S has an associated function d : S → ℕ⁺. We extend this to a metric on Ω² by setting d(x, y) to be the weight of the shortest path between x and y in the weighted graph (Ω, S). Also, let d_max denote the maximum over all (x, y) ∈ Ω² of d(x, y).

Definition 1. Let S be a path-generating set. For every (x₀, y₀) ∈ S, let (X, Y, T) = (X^{(x₀,y₀)}, Y^{(x₀,y₀)}, T_{x₀,y₀}) be a random variable taking values in Ω* × Ω* × ℕ, where Ω* denotes the set of finite sequences of states. When the subscript (x₀, y₀) is clear from context, we will omit it. We say (X, Y, T) is a variable length partial coupling when the following hold.

· Length preservation. |X| = |Y| = T holds with probability one, for every (x₀, y₀) ∈ S.

· Faithful copies. For all t ≥ 0 and all (x₀, y₀) ∈ S, define a new random variable X̂_t = X̂^{x₀,y₀}_t by the following experiment. Sample (X, Y) = ((X₁, ..., X_T), (Y₁, ..., Y_T)). If T ≥ t, then "truncate" by setting X̂_t = X_t. Otherwise, "extend" by choosing X̂_t from the distribution P^{t−T}(X_T, ·). We say X is a faithful copy if the distribution of X̂_t is P^t(x₀, ·). Define "Y is a faithful copy" analogously. We require that, for every (x₀, y₀) ∈ S, X and Y, considered separately, are each faithful copies.

Remark 2.
A. Since our proofs will only examine the distance from stationarity at one time, we do not require that the sequence (X₀, X₁, ..., X_t) be a Markov chain, only that the distribution of X_t be correct for every t. We note, however, that when a partial coupling does possess this additional structure, then so will the full coupling which we construct in Section 4.1.
B. In the above definition, the random variable T is a function of the evolution of the chains up to time T, and thus is a stopping time. Therefore, we are defining couplings with respect to a stopping time; most previous works only consider couplings of fixed length.
C. The definition allows the partial couplings to be multi-step. Specifically, we can allow T to be a function of the initial states (x₀, y₀) and the first T steps of only one of the chains (X₁, ..., X_T). The evolution of the other chain (Y₁, ..., Y_T) is decided upon reaching the stopping time. In particular, the stopping time is a set of conditions on the evolution of X_t; as soon as one of these conditions is met, we stop. The specific stopping condition which is satisfied determines the precise coupled sequence for Y. In this way, we have no knowledge of Y₁ until observing the evolution of X up to the stopping time T.

We can now state our main result.

Theorem 3. For a variable length partial coupling (X, Y, T), let

β := max_{(x₀,y₀)∈S} E( d(X_T, Y_T) )   and   M := max_{(x₀,y₀)∈S} max T,

where M is infinite if the stopping time is unbounded. If β < 1, then the mixing time satisfies

τ_mix ≤ M ⌈ ln(4 d_max) / (1 − β) ⌉.

The path coupling theorem of Bubley and Dyer [3] corresponds to the special case when T is always 1, i.e., M = 1.

Even when there is no good natural upper bound on the stopping time, we can always define a truncated version of the given partial coupling, which leads to the following corollary (proved in Section 4.3).

Corollary 4. With the notation in Theorem 3, let W denote the maximum of d(X_t, Y_t) over all (x₀, y₀) ∈ S and t ≤ T. If β < 1, then the mixing time satisfies

τ_mix ≤ 2M′ ⌈ ln(4 d_max) / (1 − β) ⌉,

where M′ satisfies Pr(T > M′) ≤ (1 − β)/2W. In particular, if µ = E(T) is finite, then

τ_mix ≤ 2 ⌈ 2µW / (1 − β) ⌉ ⌈ ln(4 d_max) / (1 − β) ⌉.

4 Proof of Theorem 3

Our proof is via the coupling method. Let

N := ⌈ ln(4 d_max) / (1 − β) ⌉.

Observe that

(4.2)    N > log_β ( 1/(4 d_max) ).

In Section 4.1, we construct an MN-step coupling, denoted Φ^{MN}, and prove that it is valid. Then in Section 4.2 we prove that Φ^{MN} coalesces with high probability. The Coupling Inequality establishes the MN upper bound on the mixing time.

4.1 Construction of the full coupling

Denote the variable-length coupling from the hypothesis of the theorem by µ_{x₀,y₀}, (x₀, y₀) ∈ S. Using this variable-length coupling, we will construct a sequence of fixed-length couplings. For all (x₀, y₀) ∈ Ω² and all 0 ≤ i ≤ N, we define an (iM)-step coupling denoted by Φ^{iM}_{x₀,y₀}.

We will convert variable-length couplings into fixed-length couplings by adding a trivial coupling at the end. From (X_t, Y_t) ∈ Ω², the trivial one-step coupling (X_t, Y_t) → (X_{t+1}, Y_{t+1}) is defined as follows. Evolve X_t → X_{t+1} according to the Markov chain of interest. If X_t = Y_t, set Y_{t+1} = X_{t+1}; otherwise, independently evolve Y_t → Y_{t+1}. We call t steps of this process the t-step trivial coupling, denoted, for initial states (x₀, y₀) ∈ Ω², by Ψ^t_{x₀,y₀}.

By converting the variable-length couplings into fixed-length couplings, we can compose the couplings along a path and define a coupling for all (x₀, y₀) ∈ Ω². We construct our sequence of fixed-length couplings in an inductive manner which keeps the trivial coupling at the end. The goal is to coalesce (with sufficiently large probability) before we use the trivial coupling.

The coupling Φ⁰_{x₀,y₀} is a 0-step coupling, and thus trivial. For i > 0, we construct Φ^{iM}_{x₀,y₀} inductively. Suppose Φ^{(i−1)M}_{x₀,y₀} has been defined for all pairs (x₀, y₀) ∈ Ω². For (x₀, y₀) ∈ S, we form Φ^{iM}_{x₀,y₀} by running the variable-length coupling µ_{x₀,y₀} until the stopping time T (recall T ≤ M). From the resulting states (X_T, Y_T), we then run the coupling Φ^{(i−1)M}_{X_T,Y_T} for a further (i − 1)M steps. Finally, from (X_ℓ, Y_ℓ), where ℓ = T + (i − 1)M, we evolve the chains using the trivial coupling Ψ^{M−T}_{X_ℓ,Y_ℓ} for the final M − T steps. This defines the coupling Φ^{iM}_{x₀,y₀} for all initial pairs (x₀, y₀) ∈ S.

We extend the coupling from pairs in S to all pairs in Ω² by compositions. More precisely, for (x₀, y₀) ∈ Ω² \ S, the coupling Φ^{iM}_{x₀,y₀} is constructed by composing the couplings along a shortest (x₀, y₀)-path in the graph (Ω, S). Since the couplings on the path are of identical lengths, such a composition is standard (e.g., see [3, 2]). This completes the definition of the couplings Φ^{iM}.

We prove by induction on i that for all 0 ≤ i ≤ N, Φ^{iM} is a valid coupling. For i = 0, this is trivial. Fix 0 < i ≤ N. Assume by induction that Φ^{(i−1)M} is a valid coupling from any pair of states in Ω². Now, fix any (x₀, y₀) ∈ S. It will suffice to show that Φ^{iM} is a valid coupling for this initial pair, since the composition of couplings is a well-defined coupling for the same Markov chain (e.g., see [3, 2]). Focus attention on the random sequence (X₁, ..., X_t) where t = iM. The sequence is generated via the variable-length coupling up to the stopping time and then is extended via the original Markov chain (since Φ^{(i−1)M} is valid and so is the trivial coupling). Therefore, by the definition of the validity of the variable-length coupling, the distribution of X_t is identical to the distribution of our Markov chain. The same argument holds for (Y₁, ..., Y_t) as well. This proves that Φ^{iM} is a valid coupling for (x₀, y₀), which completes the proof.

4.2 Analysis of the full coupling

It remains to bound the probability of coalescence by time t = NM. For (x₀, y₀) ∈ Ω², let ρ_i(x₀, y₀) denote the probability that Φ^{iM}_{x₀,y₀} does not coalesce. Also, let

ρ_i(S) = max_{(x₀,y₀)∈S} ρ_i(x₀, y₀).

Note that the maximum is over initial pairs whose partial coupling is defined. For any (x₀, y₀) ∈ Ω², we have

(4.3)    ρ_i(x₀, y₀) ≤ d(x₀, y₀) ρ_i(S),

by a union bound over the edges of a shortest (x₀, y₀)-path. Our goal is to bound ρ_N(S). We inductively bound ρ_{i+1}(S) by conditioning on the distance of (X_T, Y_T). Observe ρ₀(S) = 1. For (x₀, y₀) ∈ S and i ≥ 0,

ρ_{i+1}(x₀, y₀) ≤ Σ_{X_T,Y_T} ρ_i(X_T, Y_T) Pr(X_T, Y_T | x₀, y₀)    [by the definition of Φ^{(i+1)M}]
             ≤ Σ_{X_T,Y_T} d(X_T, Y_T) ρ_i(S) Pr(X_T, Y_T | x₀, y₀)    [by inequality (4.3)]
             = ρ_i(S) E( d(X_T, Y_T) | x₀, y₀ )
             ≤ β ρ_i(S),    [by the definition of β].

By induction, ρ_N(S) ≤ β^N. Hence, by (4.2),

ρ_N(S) ≤ β^N < (4 d_max)^{−1}.

For a pair (x₀, y₀) ∈ Ω² and for t = NM,

Pr(X_t ≠ Y_t) ≤ d_max ρ_N(S) ≤ 1/4.

Thus the coupling time is at most t, and by the Coupling Inequality, this bound also applies to the mixing time.

4.3 Proof of Corollary 4

Suppose we are given a partial coupling (X, Y, T) and an integer M′. Define a "truncated" partial coupling (X̄, Ȳ, T̄) by

(X̄, Ȳ, T̄) = (X, Y, T)  when T ≤ M′,
(X̄, Ȳ, T̄) = ((X₁, ..., X_{M′}), (Y₁, ..., Y_{M′}), M′)  otherwise.

Clearly, (X̄, Ȳ, T̄) inherits the property of being a partial coupling from (X, Y, T). Recall that by definition,

β := max_{(x₀,y₀)∈S} E( d(X_T, Y_T) )   and   W := max d(X_t, Y_t).

Hence the corresponding quantities β̄, W̄ for (X̄, Ȳ, T̄) satisfy W̄ ≤ W and

β̄ = max_{(x₀,y₀)∈S} E( d(X̄_{T̄}, Ȳ_{T̄}) ) ≤ β + W Pr(T > M′).

Assuming Pr(T > M′) ≤ (1 − β)/2W, we have β̄ ≤ (1 + β)/2, so 1 − β̄ ≥ (1 − β)/2. Applying Theorem 3 to the partial coupling (X̄, Ȳ, T̄) proves the first half of Corollary 4. The second half of the corollary is now easy. For µ = E(T), define

M′ := ⌈ 2µW / (1 − β) ⌉.

By Markov's inequality,

Pr(T > M′) ≤ µ/M′ ≤ (1 − β)/2W.

The result follows by the first half of the corollary.

5 Sampling Colorings

For a graph G = (V, E) with maximum degree Δ, let Ω be the set of all assignments σ : V → K, where K = [k] = {1, ..., k}. The Glauber dynamics (heat-bath version) has state space Ω and transitions defined as follows. From X_t:

· Choose vertex z uniformly at random from V.
· Choose c uniformly at random from [k] \ X_t(N(z)), i.e., c is a random color not appearing in the neighborhood of z.
· Set X_{t+1}(w) = X_t(w) for all w ≠ z.
· Set X_{t+1}(z) = c.

It is straightforward to verify that the stationary distribution is uniform over the proper k-colorings whenever k > Δ + 1 (e.g., see Jerrum [15]). In practice, we can consider the chain defined only on proper k-colorings. The extension to assignments is for technical reasons, and its mixing time upper bounds the mixing time of the chain defined only on proper colorings. In the remainder of the section, we use the term colorings to refer to assignments in Ω.

We can now state our theorem on the Glauber dynamics.

Theorem 5. There exists Δ₀ such that, for every graph G = (V, E) on n vertices having maximum degree Δ ≥ Δ₀ and girth g ≥ 5, and for every k ≥ 1.953Δ, the Glauber dynamics on k-colorings of G has mixing time O(n log n).

We begin with a few definitions. Let S denote the pairs X, Y ∈ Ω which differ at exactly one vertex. For X_t, Y_t ∈ Ω, denote their Hamming distance by

H(X_t, Y_t) = |{ z : X_t(z) ≠ Y_t(z) }|.

Let A_{X_t}(z) = k − |X_t(N(z))| denote the number of available colors for vertex z in coloring X_t.

We use the maximal one-step coupling, originally used by Jerrum [15]. More precisely, consider a pair of colorings (X₀, Y₀) ∈ S. Let v be the single vertex of disagreement (X₀(v) ≠ Y₀(v)). At every step, both chains update the same vertex z. The coupling at time t is as follows. If z is not a neighbor of v, then both chains choose the same new color X_{t+1}(z) = Y_{t+1}(z). Suppose z is a neighbor of v. Without loss of generality, assume A_{X_t}(z) ≥ A_{Y_t}(z). Thus, if z ∈ N(v) and X₀(v) ∉ Y_t(N(z)), then Y₀(v) ∉ X_t(N(z)). First, choose the random new color X_{t+1}(z) = c_X uniformly at random from [k] \ X_t(N(z)), and choose a color c uniformly at random from [k] \ Y_t(N(z)). The new color Y_{t+1}(z) = c_Y is defined as follows:

c_Y = X₀(v)   if z ∈ N(v), c_X = Y₀(v), and X₀(v) ∉ Y_t(N(z)),
c_Y = c       if z ∈ N(v), c_X = Y₀(v), and X₀(v) ∈ Y_t(N(z)),
c_Y = c_X     otherwise.
We define T to be the first time the Hamming distance changes; i.e., the least t > 0 at which either of the following occurs:

· Vertex v is selected for recoloring.
· A neighbor z of v is selected for recoloring and (c_X = Y₀(v) or c_Y = X₀(v)).

Clearly T is a stopping time, so Corollary 4 applies. A simpler version of our stopping time was analyzed by Dyer et al. [8]. It remains to prove E(H(X_T, Y_T)) ≤ 1 − δ for some fixed δ > 0. Our proof will rely on a high-probability lower bound for the number of available colors at a vertex w after the Glauber dynamics has run for a certain amount of time, an idea first exploited by Dyer and Frieze [7]. We will require the following time-dependent version, whose proof is a straightforward extension of methods in [12].

Lemma 6. For every ε > 0, for all sufficiently large Δ ≥ Δ₀(ε), for every graph G = (V, E) having girth g ≥ 5 and maximum degree Δ, for k > (1 + ε)Δ, for every t > 0 and w ∈ V,

Pr( A < (1 − ε) k̂ exp(−Δ̂/k̂) ) ≤ exp(−ε²Δ/100),

where A = A_{X_t}(w) = k − |X_t(N(w))|, Δ̂ = Δ − Δ exp(−t/n), and k̂ = k − Δ exp(−t/n).

Lemma 6 is a slight generalization of an earlier result of Hayes [12, Lemmas 11, 12], which assumed t = Θ(n) and Δ = Ω(log n). Although the proof is essentially the same, we include it in Section 5.1 for completeness. We now continue with the proof of Theorem 5.

Definition 7. With the notation of Lemma 6, we define the "bad" event

B := { (∃ t ≤ T, w ∈ N(v))  A < (1 − ε) k̂ exp(−Δ̂/k̂) }.

Denote the complementary "good" event by B̄.

Corollary 8. For every ε > 0, for all sufficiently large Δ ≥ Δ₁(ε),

Pr(B) ≤ exp(−ε²Δ/200).

Proof. Since B can only occur when the updated vertex is within distance 2 of v, we can restrict our attention to such times. With probability at least 1 − ε, T ≤ n/ε, and at most O(Δ²/ε) such times occur before T. By Lemma 6, the probability that the bad event occurs at a particular time t and vertex w is at most exp(−ε²Δ/100), assuming Δ is sufficiently large with respect to ε. Taking a union bound over the times of interest and over w ∈ N(v) gives the desired result.

Proof of Theorem 5. Recall that the stopping time T occurs either when v is recolored, which always reduces the Hamming distance to zero, or when some w ∈ N(v) is selected and either Y_{T−1}(v) is chosen for X_T(w) or X_{T−1}(v) is chosen for Y_T(w), which increases the Hamming distance to two. It follows that

E( H(X_T, Y_T) ) = 2 Pr( H(X_T, Y_T) = 2 ).

Decomposing into the "good" and "bad" cases, we have

Pr( H(X_T, Y_T) = 2 ) ≤ Pr(B) + Pr( H(X_T, Y_T) = 2, B̄ ).

By Corollary 8, Pr(B) ≤ exp(−ε²Δ/200). We can expand

Pr( H(X_T, Y_T) = 2, B̄ ) ≤ Σ_{t>0} Pr( H(X_t, Y_t) = 2 | T ≥ t, B̄ ) Pr( T ≥ t, B̄ ).

Observe that Pr( H(X_t, Y_t) = 0 | X_{t−1}, Y_{t−1} ) = 1/n. Since Jerrum's coupling pairs the color choices Y_{T−1}(v) and X_{T−1}(v) as often as possible on N(v), we have

Pr( H(X_t, Y_t) = 2 | X_{t−1}, Y_{t−1} ) ≤ Σ_{w∈N(v)} 1/(n A(w)),

where

A(w) := k − max{ |X_{t−1}(N(w))|, |Y_{t−1}(N(w))| }.

Note that T ≥ t if and only if H(X_s, Y_s) = 1 for all s < t. Conditioned on B̄, the probability that the update at time s creates a second disagreement is some q_s ≤ g(s)/((1 − ε)n), where

g(s) := Δ / ( k̂(s) exp(−Δ̂(s)/k̂(s)) ),

with k̂(s) and Δ̂(s) as in Lemma 6 (taking t = s). Since the quantity Σ_{t>0} q_t Π_{s<t} (1 − 1/n − q_s) is nondecreasing in each q_s, we may substitute the upper bound for every q_s. Putting this all together, and applying the definition of B̄, we have

Pr( H(X_T, Y_T) = 2, B̄ ) ≤ Σ_{t>0} ( g(t)/((1 − ε)n) ) Π_{s=1}^{t−1} ( 1 − 1/n − g(s)/((1 − ε)n) ).

Taking limits as ε tends to 0 and Δ tends to infinity, the right-hand side tends to

∫₀^∞ f(x) exp( −x − ∫₀^x f(y) dy ) dx,

where

f(x) = exp( (1 − exp(−x)) / (α − exp(−x)) ) / (α − exp(−x))

and α = k/Δ. Using Maple, we find that this integral is strictly less than 1/2 when α > 1.953. Hence for any α ≥ 1.953, there exist suitably small values of ε such that for all sufficiently large values of Δ and k > αΔ, we have shown E( H(X_T, Y_T) ) ≤ 1 − δ. Applying Corollary 4 to this partial coupling completes the proof. Note that in the notation of Corollary 4 we have β = 1 − δ, µ ≤ n, W = 2, and d_max = n.

5.1 Proof of Lemma 6

Following [12], we define the following experiment.

Definition 9. Let F denote the experiment which reveals the entire sequence of selected vertices, as well as all selected colors except those on neighbors of w. Let c = c(α) be a constant to be specified later, and let

𝒯 := [ max{1, t − c(n − 1)}, t ].

Let M denote the set of neighbors of w that are selected for recoloring at least once during 𝒯. For every x ∈ M, let r(x) denote the size of the range of X_{t−1}(x), conditioned on the outcome of F. In other words, r(x) is the number of colors missed by the sets X_s(N(x)), s ∈ [t₀, t], where t₀ is the time of the last recoloring of x.

We will use the following lemma of Hayes [12, Lemma 21], which we have restated for our application.

Lemma 10. Using the notation of Lemma 6,

Pr( A < L(1 − p)^{|M|} − εΔ | F ) ≤ exp( R/(k − Δ) − ε²Δ/2 ),

where

L := k − (Δ − |M|),
p := max_{z∈M} (1/r(z)) exp( I(z)/(k − Δ) ),
I(z) := the number of recolorings of vertices of N(z) during 𝒯,
R := the number of recolorings of w during 𝒯.

For almost all outcomes of F, we will show that, conditioned on that outcome, the variables X_t(z), z ∈ N(w), become nearly independent, and for most z ∈ N(w), the range of X_t(z) remains large and its distribution approximately uniform. Then we will apply Lemma 10.

Proof of Lemma 6. Let c = ln(1/ε). Since the events "vertex z is not selected at any time in 𝒯" are negatively associated, and the probability of each is at most exp(−min{t/n, c}), Chernoff's bound applies (cf. [10, 12]), implying

Pr( Δ − |M| > Δ( exp(−min{t/n, c}) + ε ) ) < exp(−ε²Δ/2).

Since there are always at least k − Δ colors available for any vertex being recolored, it follows that the probability that any particular color is missed by all neighbors of x during the time interval [t₀ + 1, t] is at least

( 1 − Δ/((n − 1)(k − Δ)) )^{t−t₀} ≥ exp( −cΔ/(k − Δ) ).

Since these events are negatively associated, Chernoff's bound implies

Pr( r(x) < exp(−cΔ/(k − Δ))(k − Δ) − εΔ ) < exp(−ε²Δ/2).

An easy counting argument shows, for every ℓ,

Pr_F( R ≥ ℓ ) ≤ C(c(n − 1), ℓ) (1/n)^ℓ ≤ (ec/ℓ)^ℓ.

For ℓ = ε²(k − Δ)/4 and Δ sufficiently large as a function of ε, this probability is < exp(−ε²Δ). Similarly, for every x ∈ N(w),

Pr_F( I(x) ≥ ℓ ) ≤ (ecΔ/ℓ)^ℓ.

For ℓ = e²cΔ and Δ sufficiently large, this probability is < exp(−cΔ). Taking a union bound over all x ∈ N(w), we find that with probability of error at most 4Δ exp(−ε²Δ/2), the outcome of F satisfies

|M| ≥ Δ( 1 − exp(−t/n) − 2ε ),
(∀x ∈ N(w))  r(x) ≥ exp(−cΔ/(k − Δ))(k − Δ) − εΔ,
(∀x ∈ N(w))  I(x) ≤ e²cΔ,
R ≤ ε²(k − Δ)/4.

Plugging these bounds into Lemma 10 and simplifying gives the claimed result.

References

[1] D. J. Aldous. Random walks on finite groups and rapidly mixing Markov chains. In Séminaire de Probabilités XVII, 243–297. Springer-Verlag, 1983. Lecture Notes in Mathematics 986.
[2] J. van den Berg and R. Brouwer. Random sampling for the monomer-dimer model on a lattice. J. Math. Phys., 41(3):1585–1597, 2000.
[3] R. Bubley and M. Dyer. Path coupling: a technique for proving rapid mixing in Markov chains. In 38th Annual Symposium on Foundations of Computer Science, 223–231, 1997.
[4] R. Bubley, M. E. Dyer, and M. R. Jerrum. An elementary analysis of a procedure for sampling points in a convex body. Random Structures & Algorithms, 12(3):213–235, 1998.
[5] A. Czumaj and M. Kutylowski. Delayed path coupling and generating random permutations. Random Structures & Algorithms, 17(3-4):238–259, 2000.
[6] W. Doeblin. Exposé de la théorie des chaînes simples constantes de Markov à un nombre fini d'états. Rev. Math. Union Interbalkanique, 2:77–105, 1938.
[7] M. Dyer and A. Frieze. Randomly colouring graphs with lower bounds on girth and maximum degree. In 42nd Annual Symposium on Foundations of Computer Science, 579–587, 2001.
[8] M. Dyer, L. Goldberg, C. Greenhill, M. Jerrum, and M. Mitzenmacher. An extension of path coupling and its application to the Glauber dynamics for graph colorings. SIAM J. Comput., 30(6):1962–1975, 2001.
[9] M. Dyer, C. Greenhill, and M. Molloy. Very rapid mixing of the Glauber dynamics for proper colorings on bounded-degree graphs. Random Structures & Algorithms, 20(1):98–114, 2002.
[10] D. Dubhashi and D. Ranjan. Balls and bins: a study in negative dependence. Random Structures & Algorithms, 13(2):99–124, 1998.
[11] D. Griffeath. A maximal coupling for Markov chains. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 31:95–106, 1974/75.
[12] T. P. Hayes. Randomly coloring graphs with girth at least five. In Proceedings of the 35th Annual ACM Symposium on Theory of Computing, 2003.
[13] T. P. Hayes and E. Vigoda. A non-Markovian coupling for randomly sampling colorings. To appear in 44th Annual Symposium on Foundations of Computer Science, 2003.
[14] V. S. Anil Kumar and H. Ramesh. Markovian coupling vs. conductance for the Jerrum-Sinclair chain. In 40th Annual Symposium on Foundations of Computer Science, 241–252, 1999.
[15] M. R. Jerrum. A very simple algorithm for estimating the number of k-colorings of a low-degree graph. Random Structures & Algorithms, 7(2):157–165, 1995.
[16] M. Molloy. The Glauber dynamics on colorings of a graph with high girth and maximum degree. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing, 91–98, 2002.
[17] E. Vigoda. Improved bounds for sampling colorings. In 40th Annual Symposium on Foundations of Computer Science, 51–59, 1999.
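Editorial appendix (ours, not part of the paper). The 1.953 threshold in the proof of Theorem 5 is where the integral I(α) = ∫₀^∞ f(x) exp(−x − ∫₀^x f(y) dy) dx crosses 1/2, with f(x) = exp((1 − e^{−x})/(α − e^{−x}))/(α − e^{−x}) and α = k/Δ. That display was partially garbled in our source and the form above is our reconstruction; as a sanity check, the stdlib-only sketch below evaluates I(α) and bisects for the crossing point, which should land near the stated 1.953 if the reconstruction is faithful. All function names are our own.

```python
import math

def f(x, alpha):
    """f(x) = exp((1 - e^-x)/(alpha - e^-x)) / (alpha - e^-x)."""
    e = math.exp(-x)
    return math.exp((1.0 - e) / (alpha - e)) / (alpha - e)

def threshold_integral(alpha, x_max=50.0, steps=50_000):
    """Trapezoid-rule estimate of I(alpha) = ∫_0^∞ f(x) exp(-x - F(x)) dx,
    where F(x) = ∫_0^x f(y) dy is accumulated alongside the outer integral.
    The integrand decays at least like e^-x, so truncating at x_max = 50
    loses far less than the accuracy we need."""
    h = x_max / steps
    total, F = 0.0, 0.0
    f_prev = f(0.0, alpha)
    g_prev = f_prev  # integrand at x = 0: f(0) * exp(-0 - 0)
    for i in range(1, steps + 1):
        x = i * h
        fx = f(x, alpha)
        F += 0.5 * h * (f_prev + fx)          # running inner integral
        gx = fx * math.exp(-x - F)
        total += 0.5 * h * (g_prev + gx)      # running outer integral
        f_prev, g_prev = fx, gx
    return total

def threshold_alpha(lo=1.6, hi=2.4, iters=30):
    """Bisect for the alpha where I(alpha) crosses 1/2.  I is decreasing in
    alpha: integrating by parts, I(alpha) = ∫_0^∞ e^-x (1 - e^-F(x)) dx,
    and F decreases pointwise as alpha grows, so bisection is justified."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if threshold_integral(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With this reconstruction, I(α) sits above 1/2 for α well below 1.9 and below 1/2 for α well above 2.0, bracketing a crossing point consistent with the paper's constant.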