On the diameter of the symmetric group: polynomial bounds L´szl´ Babai Robert Beals Akos Seress ao , ,´ Abstract We address the long-standing conjecture that all permutations have polynomially bounded word length in terms of any set of generators of the symmetric group. The best available bound on the maximum required word length is exponential in n log n. Polynomial bounds on the word length have previously been established for very special classes of generating sets only. In this paper we give a polynomial bound on the word length under the sole condition that one of the generators fix at least 67% of the domain. Words of the length claimed can be found in Las Vegas polynomial time. The proof involves a Markov chain mixing estimate which permits us, apparently for the first time, to break the "element order bottleneck." As a corollary, we obtain the following averagecase result: for a 1 - fraction of the pairs of generators for the symmetric group, the word length is polynomially bounded. It is known that for almost all pirs of generators, the word length is less than a exp( n ln n(1 + o(1))). 1 Intro duction which has G as its vertex set; the pairs {{g , g s} : g G, s S } are the edges. Let diam(G, S ) denote the diameter of (G, S ). The diameter of Cayley graphs has been studied in a number of contexts, including interconnection networks, expanders, puzzles such as Rubik's cube and Rubik's rings, card shuffling and rapid mixing, random generation of group elements, combinatorial group theory. Even and Goldreich [EG] proved that finding the diameter of a Cayley graph of a permutation group is NP-hard even for the basic case when the group is an elementary abelian 2-group (every element has order 2). Jerrum [Je] proved that for directed Cayley graphs of permutation groups, to find the directed distance between two permutations is PSPACE-hard. No approximation algorithm is known for distance in and the diameter of Cayley graphs of permutation groups. Strikingly, the question of the diameter of the Rubik's cube Cayley graph appears to be wide open (cf. [Ko]). We refer to [BHKLS] for more information about the history of the diameter problem and related results and to the survey [He] for applications of Cayley graphs to interconnection networks. For G = Sn and G = An we conjecture that diam(G, S ) is polynomially bounded in n for all sets S of generators: Conjecture 1.1. [BS1] There exists a constant C such that for al l n and al l sets S of generators of G = Sn or An , diam(G, S ) nC . This conjecture would imply a quasipolynomial upper bound on the diameters of all Cayley graphs of all transitive permutation groups [BS2]. The Conjecture has been verified for very special classes of generating sets. Driscoll and Furst [DF] proved an O(n2 ) bound for the case when all generators are cycles of bounded lengths and McKen- Let G be a finite group and S a set of generators of G. We consider the undirected Cayley graph (G, S ) of Computer Science, University of Chicago, 1100 East 58th Street, Chicago, IL 60637, and Mathematical Institute of the Hungarian Academy of Science. Email: laci@cs.uchicago.edu. This research was partially supp orted by the NSF. IDA Center for Communications Research - Princeton, 805 Bunn Drive, Princeton, NJ 08540. Email: beals@idaccr.org. Department of Mathematics, Ohio State University, 231 W. 18th Avenue, Columbus, OH 43210. Email: akos@math.ohio-state.edu. Partially supp orted by the NSF. Department Copyright © 2004 by the Association for Computing Machinery, Inc. and the Society for industrial and Applied Mathematics. All Rights reserved. Printed in The United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Association for Computing Machinery, 1515 Broadway, New York, NY 10036 and the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688 1108 zie [McK] gave a polynomial bound for the case when the generators have bounded degree. (The degree of a permutation is the number of elements actually moved.) We note that McKenzie's exponent is (k ) where k is the bound on the degree. We extend these results to a large class of generating sets in the hope that this may lead to settling the Conjecture. Theorem 1.1. Let be a positive constant. Let S be a set of generators for G = Sn or G = An . Assume that S contains an element of degree n/(3+ ). Then diam(G, S ) = O(nC ) where C is an absolute constant (does not depend on ). Moreover, between any pair of elements of G, a path of length O(nC ) can be found in Las Vegas polynomial time. now able to overcome this obstacle. Corollary 1.1. Fix k 2 and > 0. Let G be either Sn or An . Let S be a random subset of size k in G and let G1 be the group generated by S . Then, with probability > 1 - we have diam(G1 , S ) nC for a constant C = C ( ). 2 Pro of of the worst-case b ound: a Markov chain argument We shall use the following fact. We explain the concepts involved after the statement. Fact 2.1. Let X be a connected undirected graph on n vertices, with maximum degree and diameter d. Then the strong mixing time for a lazy random walk on X (starting from any vertex) is O(dn log n). We did not attempt to optimize the exponent C ; without effort, our proof yields a diameter upA lazy random walk on the graph means a per bound of n7 (log n)O(1) . The bound reduces to stochastic process where a particle moves from vertex n6 (log n)O(1) if the number of generators is bounded. to vertex; at each step, with probability 1/2 the The best upper bound known to hold for the particle does not move; the remaining probability 1/2 diameter of al l Cayley graphs of An and Sn is is split evenly between the neighbors. On a connected exp( n ln n(1 + o(1))). Except for the function im- graph, lazy random walks are ergodic and therefore the plicit in the o(1) term, this bound is the same as the distribution of the particle approaches the stationary largest order of elements of Sn [La]; indeed one of the distribution. operations used in the proof of the diameter bound is In the stationary distribution, each vertex is visraising an element to an arbitrary power, which makes ited with a frequency proportional to its degree. So the order of elements a bottleneck. for regular graphs (as will always be the case in our Theorem 1.1 seems to be the first result to overcome the "element order bottleneck." To achieve this, we have to avoid taking powers of permutations as a degree-reduction tool. Instead we increase the degreereducing power of the commutator operation by applying it to carefully chosen conjugates and -1 ; the conjugating permutation is obtained as the result of a polynomial-length random walk. applications), the stationary distribution is uniform. The strong mixing time of a random walk means the number of steps needed for the random walk to reach a distribution close to the stationary distribution in the following sense: each state x is reached with probability p(x)e± where p(x) is the stationary probability. Here 0 < 1/2; the influence of the specific value of on the strong mixing time Our worst-case result brings us close to proving a estimates is a factor of log(1/ ) which we can ignore polynomial upper bound for the "typical case." First as long as is bounded away from zero. we need to recall Dixon's celebrated result: almost The Fact remains true if we allow loops and all pairs of permutations in Sn generate either Sn or multiple edges. The Fact follows from a result of An [Dix] (cf. [Bo, Ba2]). Landau and Odlyzko [LO] stating that the eigenvalue For random pairs of generators, the best diameter bound to-date states that the diameter is almost always at most nln n(1/2+o(1)) [BH]. This expression also describes the order of almost all permutations [ET], so this result again suffers from the "element order bottleneck." In 99% (but not in almost all) cases we are gap of the transition matrix is at least (d( + 1)n)-1 . The eigenvalue gap refers to the quantity 1 - max || where the maximum is taken over all eigenvalues = 1 of the transition matrix. Note that 1 is a simple eigenvalue and the lazyness of the walk guarantees that || < 1 for all other eigenvalues. 1109 By standard arguments, the Landau-Odlyzko eigenvalue gap implies an O(dn log n) strong mixing time. Choose with at most average value of |A A |. A permutation group G is t-transitive if for any The proof of Theorem 1.1 is based on Lemma 2.2 pair of ordered t-tuples of distinct elements of the below. We first state Lemma 2.1 in order to illustrate permutation domain, (x1 , . . . , xt ) and (u1 , . . . , ut ), the technique on a simple instance. there exists G such that for all i, x = ui . i Lemma 2.1. Let G = S be a transitive permutation group of degree n. Let > 0 be fixed. Let A [n], |A| = k . Then there exists a word of length O(n2 |S | log n) over S S -1 such that |A A | (k 2 /n)(1 + ). Lemma 2.2. Let G = S be a (t + 1)-transitive permutation group of degree n. Fix > 0. Let (x1 , . . . , xt ) and (u1 , . . . , ut ) be t-tuples of distinct elements of [n]. Let A [n], |A| = k . Then there exists a permutation G expressible as a word of length O(tn2t+2 |S | log n) over S S -1 such that |A A | t + (k 2 /n)(1 + ); Note that if is chosen at random uniformly from (a) 2 G then the expected size of A A is exactly k /n x = ui (i = 1, ..., t), (cf. [BE]). The content of the Lemma is that we (b) i can produce an intersection nearly this small by a permutation which is a short word in the generators. where t is the number of pairs (xi , ui ) which both We shall use a short random walk to find . belong to A. First we note that without loss of generality, here Note that in our application below, t = 2. and later we may assume |S | < 2n by throwing away redundant generators. (It is shown in [Ba1] that the Proof. Let be the following graph: the vertices of are the ordered (t+1)-tuples of distinct elements of [n]; longest subgroup chain in Sn has length < 2n). an edge corresponds to a transition between vertices Proof. Consider the connected undirected graph under some member of S S -1 . This is a connected (possibly with loops and multiple edges) consisting of undirected graph. the vertex set [n] and the edges {i, i }, where i [n] Let s be the mixing time of the lazy random and S S -1 . Note that this graph is regular of walk on , starting from any vertex with parameter degree 2|S |. This graph is also connected because G = ln(1 + )/2, so /2. By the Fact, s = is transitive. O(tn2t+2 |S | log n) since is a regular graph of degree Let s denote the strong mixing time of the lazy 2|S | with n(n - 1) · · · (n - t) < nt+1 vertices). random walk on . So the lazy random walk on Let be a lazy random word of length s over , starting from any vertex, is -nearly uniformly S S -1 . Let B denote the event that satisfies distributed over [n] after s steps, i. e., each vertex has probability (1/n)e± chance to be visited at time condition (b). Then s. Let us choose := ln(1 + ), so . We have s = O(n2 |S | log n) by the Fact. Prob(B ) = (1/n(n - 1) · · · (n - t + 1))e± , Let be a lazy random word of length s over S S -1 . Then for any x [n], and for any x {x1 , . . . , xt } and u {u1 , . . . , ut }, Prob(x A) (k /n)e = (k /n)(1 + ). Prob(B and x = u) = (1/n(n - 1) · · · (n - t))e± . Adding these probabilities over all x A, we find that E (|A A |) (k 2 /n)(1 + ). Therefore for the conditional probability we have Prob(x = u|B ) = (1/(n - t))e±2 . 1110 Hence, as in the proof of Lemma 2.1, we obtain the total word length bounded by n6 |S |(log n)O(1) , the following bound on the conditional expectation of completing the proof of the diameter bound. |A A | : The justification of the algorithmic claim in Theorem 1.1 follows the lines of the proof of the diameter bound. The first step is to discard redundant E (|A = A ||B ) t + ((k - t )2 /n)e2 generators; this can be done in polynomial time using Sims's algorithm ([Si, FHL, Kn], cf. Chaper 4 in t + ((k - t )2 /n)(1 + ). [Se]). Let denote the intersection size produced by our random walk method. For any > 0, the probaTherefore we may choose to satisfy B as well as bility that > (1 + )E ( ) is less than 1/(1 + ) by the inequality Markov's inequality. Let us repeat the random walk until (1 + )E ( ) holds; it follows from the preceding observation that we shall succeed in an expected |A A | t + ((k - t )2 /n)(1 + ). O(1/ ) trials. So the degree reduction will proceed at the same rate as in the proof of existence above. i For a permutation we use supp( ) to denote the More precisely, we need to consder the conditional expectation of conditioned on xi = ui (i = 1, 2); the supp ort of , i. e., the set of elements moved by . probability that this happens is O(n-2 ), so the cost Note that |supp( )| is the degree of . of taking this condition into account is an additional Proposition 2.1. Let , be permutations and A = factor of O(n2 ) in the running time. -1 supp( ). Let x A and y = x . If x A and -1 y A then and do not commute. 3 Average case Proof. Let = . Since y A, we have -1 -1 y = y , i.e., y = y . In other words, x = x . Assume now that and commute. We infer that -1 x = x and therefore x = x = x . Hence -1 -1 -1 =x , i.e., x A, contrary assumption. x -1 Lemma 3.1. For every , > 0 there exists C = C ( , ) such that al l but an fraction of the n! permutations of [n] have a set of at most C cycles whose combined length is greater than n(1 - ). The proof of Theorem 1.1 now follows. We use Lemma 2.2 with A = supp( ), t = 2, x1 , u1 A, Proof. Follow the argument of the proof of Lemma u2 := u A, and x2 A. Then, by Proposition 2.1, 3.1 in [BH]; replace the reference to the Central Limit 1 condition (b) implies that and do not commute. Theorem by Chebyshev's Inequality. Note that we have t = 1. To prove Theorem 1.1, we use Lemma 2.2 repeatedly, starting from some of degree < n/(3 + ). In each round we replace by the commutator [ , ] where satisfies conditions (a) and (b) with A = supp( ); we choose xi and ui as in the preceding paragraph. Condition (b) guarantees that the new is not the identity. Condition (a) implies that the degree of decreases rapidly: if the degree of the old is k and the degree of the new is then /n < 3/n+3(k /n)2 (1+ ) for an arbitrarily small fixed . Since at the start, k /n < 1/(3 + ) for a fixed > 0, in O(log log n) rounds, we shall reach k = 3. Let now and be two random permutations and let G1 be the group generated by and . Setting = 0.33, we find that by the Lemma, with probability > 1 - , the permutation has cycles of lengths n1 , . . . , nk whiere k C for some constant C = C ( ) n C such that ni 0.67n. Let N = i