GENERIC Q UA N T U M F O U R I E R T R A N S F O R M S
Cristopher Moore University of New Mexico moore@cs.unm.edu Daniel Rockmore Dartmouth College rockmore@cs.dartmouth.edu

Alexander Russell University of Connecticut acr@cse.uconn.edu

Abstract The quantum Fourier transform (QFT) is the principal ingredient of most efficient quantum algorithms. We present a generic framework for the construction of efficient quantum circuits for the QFT by "quantizing" the highly successful separation of variables technique for the construction of efficient classical Fourier transforms. Specifically, we use Bratteli diagrams, Gel'fand-Tsetlin bases, and strong generating sets of small adapted diameter to provide efficient quantum circuits for the QFT over a wide variety of finite Abelian and non-Abelian groups, including all group families for which efficient QFTs are currently known and many new group families. Moreover, our method provides the first subexponential-size quantum circuits for the QFT over the linear groups GLk (q), SLk (q), and the finite groups of Lie type, for any fixed prime power q. 1 Introduction Peter Shor's seminal discovery of efficient quantum algorithms for factoring and discrete logarithm [25] relies crucially on the fact that the Fourier transform over the cyclic group Zn can be carried out efficiently on a quantum computer, even when n is exponentially large. This has motivated broad interest in the problem of efficient quantum computation over arbitrary groups; see e.g., [3, 9, 11, 13, 14, 20, 21, 27]. While this research effort has already become quite ramified, two related themes have emerged: (i.) development of efficient quantum Fourier transforms, and (ii.) development of efficient quantum algorithms for the hidden subgroup problem. The complexity of these two problems appears to be intimately related to the structure of the group in question: while quantum Fourier transforms and hidden subgroup
Copyright © 2004 by the Association for Computing Machinery, Inc. and the Society for industrial and Applied Mathematics. All Rights reserved. Printed in The United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Association for Computing Machinery, 1515 Broadway, New York, NY 10036 and the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688

problems over Abelian groups are well-understood, for non-Abelian groups our understanding of these problems remains embarrassingly sporadic. Aside from their natural appeal, these lines of research are motivated by their direct relationship to the graph isomorphism problem: an efficient solution to the hidden subgroup problem over the (non-Abelian) symmetric groups would yield an efficient quantum algorithm for graph isomorphism. Over the cyclic group Zn , the quantum Fourier transform is the unitary transformation taking the state zZn f (z) |z to the state Zn f^() | , where f : Zn  C is a function with f 2 = 1 and f^() = 1  z f (z) e2iz/n is the familiar discrete Fourier transn form at the frequency . Over an arbitrary finite group G, this analogously refers to the transformation taking the state zG f (z) |z to the state G f^()i j |, i, j , where ^ f : G  C, as before, is a function with f 2 = 1 and f^()i j denotes the i, j entry of the Fourier transform at the representation . This is explained further in Section 2. While there is no known explicit relationship between the quantum Fourier transform and the hidden subgroup problem over a group G, all known efficient hidden subgroup algorithms rely on an efficient quantum Fourier transform. Indeed, it is fair to say that the quantum Fourier transform--the so-called transform and measure approach--is the only known non-trivial quantum algorithmic paradigm for such problems. In this article we focus on the construction of efficient quantum Fourier transforms. Our research is motivated by dramatic progress over the last decade in the theory of efficient classical Fourier transforms, e.g. [4, 5, 8, 18, 22]. These developments have provided a collection of techniques which, taken together, yield a uniform framework for the efficient computation of Fourier transforms over a wide variety of important families of groups. These include, for example, the finite groups of Lie type (properly parametrized) and the symmetric groups.

778


Our main result is an adaptation to the quantum setting of the most successful and general of these techniques, the "separation of variables" approach [17, 18]. While almost all efficient classical Fourier transforms are divide-and-conquer algorithms, which recursively define the Fourier transform in terms of the transform over a tower of subgroups, this approach uses adapted bases and generating sets of small adapted diameter to streamline this process considerably. Specifically, we define a broad class of polynomially uniform groups and show T H E O R E M 1 . 1 . If G is a polynomially uniform group with a subgroup tower G = Gm > Gm-1 > · · · > {1} with adapted diameter D, maximum multiplicity M , and maximum index I = maxi [Gi : Gi-1 ], then there is a quantum circuit of size poly(I × D × M × log |G|) which computes the quantum Fourier transform over G. This quantifies the complexity of the quantum Fourier transform in exactly the same fashion as Corollary 3.1 of [17] does for the classical case. In fact, for many of the group families we study, the quantum and classical circuit complexities of the Fourier transform differ by a factor of |G|. We extend this class further by showing that it is closed under a certain type of Abelian extension which may have exponential index. Our results give efficient QFTs--that is, circuits of polylog(|G|) size--for many groups, including: (i.) the Clifford groups CLn ; (ii.) symmetric groups, recovering Beals' algorithm [3]; (iii.) wreath products G Sn where |G| = poly(n); (iv.) metabelian groups (semidirect products of two Abelian groups) including metacyclic groups such as the dihedral and affine groups, recovering the algorithm of Høyer [13]; (v.) bounded extensions of Abelian groups such as the generalized quaternions, recovering the algorithm of ¨ Puschel et al. [21]. Thus we provide QFTs for a number of new group families, and place existing QFTs in a uniform framework. In addition, our methods give the first subexponential size quantum circuits for the linear groups GLk (q), SLk (q), PGLk (q), and PSLk (q) for fixed prime power q, finite groups of Lie type, and the Chevalley and Weyl groups. The paper is structured as follows. Sections 2 and 3 briefly summarize the representation theory of finite groups, the Bratteli diagram, and adapted bases. We give our algorithms in Section 4 along with a list of group families for which our techniques provide efficient circuits for the QFT. We conclude with open problems in Section 5.

2 Representation theory background Fourier analysis over a group G consists of expressing arbitrary functions f : G  C as linear combinations of basis functions which reflect the group's structure and symmetries. If G is Abelian, these are the characters of G, i.e., the homomorphisms of G into C; for a general group, they are the irreducible matrix elements. Then the Fourier transform is the change of basis from the basis of delta functions to the basis of irreducible matrix elements. In order to be precise we need the language of (finite) group representation theory; see, e.g., Serre [24] for an excellent introduction. A representation  of a finite group G is a homomorphism  : G  U(V ), where U(V ) denotes the group of unitary linear operators on a finitedimensional vector space V whose dimension we denote d . Once we fix an orthonormal basis for V , each (g) can be associated with a d × d unitary matrix, which is called a matrix representation of G; we will blur the distinction between  and the corresponding matrix representation where this does not cause confusion, but in fact as we will see the choice of basis is extremely 2 important. Each of the d functions i j (g) = [(g)]i j is called a matrix element of ; note that while  is a homomorphism, in general i j is not. A matrix representation  of G on V is called irreducible if the only subspaces it preserves are the trivial one, {0}, and V itself. This is equivalent to the statement that there is no change of basis that simultaneously gives a block diagonalization (of a given shape) of (g) for all g. Otherwise the representation is said to be reducible. The irreducible representations will play a role in the theory analogous to that of the characters of an Abelian group. Two representations  and  are equivalent if they differ only by a change of basis, so that for some fixed unitary matrix U , (g) = U -1 (g)U for all g  G. Up to equivalence, a finite group G has a finite number of irreducible representations equal to the number of its conju^ gacy classes. For a group G, we let G denote a collection of representations of G containing exactly one from each isomorphism class of irreducible representations. For two complex-valued functions f1 and f2 on a group G, there is a natural inner product f1 , f2 given 1 by |G| g f1 (g) f2 (g) . With respect to this inner product, and once we have chosen a basis for each V , the matrix elements of the irreducible matrix representations form an orthonormal basis for the vector space of complex-valued functions on G. Since this space is |G|-dimensional, this implies the following important relationship between |G| and the dimensions of the irreducible representations:
2  d = |G|

.

^ G

779


We are now equipped to give the general definition of the Fourier transform over arbitrary groups. Marvelously, this definition possesses many of the properties of the Fourier transform over Zn that we know and love; for instance, it transforms convolution into (matrix) product. D E FI N I T I O N 1 . Let f : G  C and let  : G  U(V ) be a matrix representation of G. The Fourier transform of f at , denoted f^(), is the matrix d f^() = |G| 


e0 e1 e2 e3 e4 e5

·NNN ·MMM

NNN g NN MMM NNe0O f ss · O g ·KKK MMMMMs OOOO O KKssss M o O1 Kye · sss KKqqqqe1 owoooo· K s ·s qqxqq KKK ooo qq pp· q ·q pppp e2 w ppp ·p
·UUUUU(3) ii·UUUUU(2) LLi ·iiL rr·LLL ·

(4) (3,1) (2,2)

f (g)(g) .

gG

We typically restrict our attention to f^() where  is irreducible. The Fourierd transform is clearly linear in f ; with the constants  / |G| we use here, it is in fact unitary, taking the |G| complex numbers f (g) gG to a total of 2 ^  d = |G| complex numbers organized into |G| matrices with varying dimensions d . We can express the orthonormality of the matrix elements as follows. For any pair of matrix representations ^ ,   G, one form of Schur's lemma [24] gives 0  = if    , = (2.1) i j , kl 1 ik  jl if  =  . d

LL rrr LL(1) 2, r·L(LL1) rrr· LL rr rr Ur (2,1,1) ·rUUUU · iiiii1,1) ( iiiii· (1,1,1,1) ·
(1,1,1)

·


Figure 1: The Bratteli diagrams for the subgroup towers Z6 > Z3 > 1 (top) and S4 > S3 > S2 > 1 (bottom). Cyclic groups of order n have representations indexed by the integers mod n, and (assuming m|n) the representation corresponding to j restricts to the representation corresponding to j mod m. The lower diagram uses the well-known correspondence between irreducible representations of Sn and partitions of n. In this case restrictions from Sn to Sn-1 are determined by those partitions obtained via the decrement of a part of the original partition.

3 Making divide-and-conquer feasible: Bratteli Thus allows us to invert the Fourier transform, giving the diagrams, Gel'fand-Tsetlin bases, and adapted Fourier inversion formula: diameters d The classic Cooley-Tukey Fast Fourier Transform relies  .  on the fact that the cyclic group Z2k can be decomposed f (s) =  tr (s) f^()-1 |G| ^ into a tower of subgroups: G A reducible matrix representation  : G  U (V ) can always be decomposed into a direct product of irreducible representations. Specifically, there is a basis of V in which (g) is block diagonal with the same block structure for all g, where the ith block of (g) is precisely i (g) for some irreducible matrix representation i . In this L case we write  = i i . The number of times a given ^ i  G appears in this decomposition is the multiplicity of i in ; denoting this multiplicity wi , we will write  = w1 1 . . . wr r . A representation  of a group G is also automatically a representation of any subgroup H . We refer to this restricted representation on H as |H . Note that in general, representations that are irreducible over G may be reducible when restricted to H . Z2k > Z2k-1 > · · · > Z4 > Z2 > Z1 = {1}

The algorithm works recursively, by calculating the FFT for each subgroup in the tower, and then combining the results from that subgroup's two cosets with a "twiddle factor" to form the FFT at the next level up. Almost all efficient classical algorithms for the Fourier transform work in this way. However, in the nonAbelian case, making this divide-and-conquer approach concrete is far from trivial. Even if the group has a natural subgroup tower, we need to choose a set of bases for the representations which allows us to embed the cosets of each subgroup in the one above it in an efficient way. Furthermore, we need to choose a set of generators into which we can factor group elements efficiently, and our choice of bases should make the matrix representations Remark. The familiar Discrete Fourier Transform (DFT) for these generators sparse and highly structured, so that corresponds to the case G = Zn . Here representations are they can be multiplied together efficiently. Finally, in the all one-dimensional, and the Fourier transform is an n × n quantum setting, we will have to write the resulting transVandermonde matrix whose entries are nth roots of unity. form as a product of elementary unitary operations.
3

780


Luckily, there are principled ways to choose these bases and these generating sets. These techniques allow us to construct an efficient classical Fourier transform from the following ingredients: (i.) a tower of subgroups, by which the Fourier transform on G can be built recursively as an accumulation of Fourier transforms on increasingly larger subgroups; (ii.) a natural indexing scheme for the representations given by paths in the Bratteli diagram corresponding to that subgroup tower, which in turn provides a convenient basis for each representation; and finally (iii.) a factorization of group elements in terms of a basic set of generators, which, when judiciously chosen, provide a factorization of the Fourier transform as a product of structured (direct sums of tensor products) and sparse matrices. The complexity of the resulting algorithm can then be derived in terms of the basic representation-theoretic and combinatorial data of the subgroup tower, the Bratteli diagram, and the generating set. We describe the recipe by which these ingredients are made into efficient classical transforms in the next two sections. 3.1 Bratteli diagrams and Gel'fand-Tsetlin bases Much of Abelian Fourier analysis is simplified by the fact ^ that the set of characters G = { : G  C}, also called the dual, forms a group isomorphic to the original group G. Furthermore, this isomorphism provides a natural indexing of the irreducible representations, and thus the matrix elements of the Fourier transform. However, in the general case there is no immediate indexing scheme for the ^ dual G and the landscape is further complicated by the absence of a canonical basis for the (now multidimensional) representations. Indeed, where efficient Fourier analysis is concerned, not all bases are created alike! A fairly general methodology for the construction of group FFTs, the "separation of variables" approach [17, 18] relies on the use of Gel'fand-Tsetlin or adapted bases. These allow us to carry out the recursive divide-andconquer approach described above, building the transform efficiently at each level of the subgroup tower. To construct these bases, we need a natural indexing scheme for the representations, and for their matrix elements. Happily, such an indexing scheme is provided by the Bratteli diagram formalism, which we now present. Given a finite group G, let G = Gm > Gm-1 > · · · > G1 > G0 = {1} be a tower of subgroups of length m for G. The corresponding Bratteli diagram, denoted B, is a leveled directed multigraph whose nodes at level i = 0, . . . , m are

in one-to-one correspondence with the (inequivalent) irreducible representations of Gi . For convenience, we refer to vertices in the diagram by the representation with which they are associated. The number of edges from an irreducible representation  of Gi to  of Gi+1 is equal to the multiplicity of  in the restriction of  to Gi . Since there is a unique irreducible representation of the trivial group, a Bratteli diagram for a given tower is in fact a rooted tree. Bratteli diagrams for the cyclic group Z6 and the symmetric group S4 are shown in Figure 1. We now describe how paths in the Bratteli diagram index the rows and columns of each representation, and thus provide a natural set of bases. Each edge, from a node ^ ^  : Gi  U (V ) of Gi to a node  : Gi+1  U (V ) of Gi+1 , represents an embedding of V into V . Thus the edges into  describe a decomposition of V into a direct sum of orthogonal subspaces V , each of which are invariant under the (restricted) action of Gi ; and conversely, the edges out from  correspond to embeddings of V into orthogonal subspaces V . Thus these edges describe how the subspaces acted on by the representations of Gi+1 are decomposed into smaller subspaces acted on by representations of Gi , and conversely how the subspaces of Gi are embedded in the subspaces of Gi+1 . Since the only representation of the trivial group {1} is one-dimensional, composing these edges into paths ^ from the root to a given node   Gi gives a decomposition of V into a direct sum of orthogonal one-dimensional subspaces; but this is tantamount to providing a basis for V . Moreover, since paths from {1} to  consist of paths from {1} to various , composed with paths from  to ^ , where   G j for some G j < Gi , this basis has the following property: for any G j < Gi , there is a partition of the basis vectors into subsets, each of which spans an irreducible G j -invariant subspace. Therefore, in this basis the matrix representation  is block diagonal according to this partition when restricted to G j and, moreover, the blocks corresponding to some  which appears in  with multiplicity greater than 1 are identical. Such a basis is said to be G j -adapted or Gel'fand-Tsetlin. This implies that the number of paths to a node  is equal to d (so, for instance, the Bratteli diagram of an Abelian group is a directed tree). Furthermore, each ordered pair of paths with common endpoint  indexes an irreducible matrix element of , since one path indexes a row and the other indexes a column. Following the divide and conquer approach, the Fourier transform on G = Gm can be written a sum of Fourier transforms on Gm-1 , each of which is translated from a different coset. Specifically, if T  G is a transversal, i.e., a set of representatives for the left cosets of Gm-1 in Gm , we define f : Gm-1  C by f (x) = f (x). Then

781


f^() = (3.2) =

T T

 () 

(x) f (x)

xGm-1

 () · f^ ( |Gm-1 ).

These matrices () are called the "twiddle factors" in analogy with the Cooley-Tukey FFT. Note that the number of terms in this sum is |T | = [Gm : Gm-1 ] = |Gm |/|Gm-1 |, the index of Gm-1 in Gm . As we will see below, the recursion of (3.2) will be greatly simplified by the fact that in the adapted basis, the restricted representations |G j become block diagonal, where the blocks are simply the matrices . 3.2 Strong generating sets and adapted diameters Adapted representations are only part of the story for the construction of efficient Fourier transforms. In general, the twiddle factors () in Equation (3.2) could be an arbitrary matrices of exponential size, so an algorithm which simply performs the sum in (3.2) could be costly. Luckily, under fairly mild assumptions, these twiddle factors can be factored into polylog(|G|) sparse, highly structured matrices, and can therefore be implemented with polylog(|G|) elementary quantum operations. We say that S is a strong generating set for the tower of subgroups {Gi } if S  Gi generates Gi . Say that we have chosen a transversal Ti for each i indexing the cosets of Gi-1 in Gi . Now define

can condition on the i to determine which subspace of  we are in, we can write () as a series of poly(M ) elementary quantum operations where M = maxi mi in (3.3). Therefore, the total number of elementary quantum operators we need to implement () is D × poly(M ). Moreover, if  is itself in a subgroup H > K , and  is adapted to both H and K , then () possesses the same block structure that |H does. This places an upper bound on M , namely the maximum multiplicity with which representations of K appear in restrictions of representations of H . Thus we can minimize M by choosing generators  which (1) are inside subgroups as low on the tower as possible, and (2) centralize subgroups as high on the tower as possible. For instance, for the symmetric group Sn we take the tower to be Sn > Sn-1 > · · · > {1} ,

where Si fixes all elements of {1, . . . , n} greater than i. Let S be the set of pairwise adjacent transpositions ( j, j + 1); each of these is contained in S j+1 and centralizes S j-1 . The maximum multiplicity with which a representation of S j-1 appears in a representation of S j+1 is 2, corresponding to the two orders in which we can remove two cells from a Young diagram. In this case the adapted basis defined by the Bratteli diagram is exactly the Young orthogonal basis, in which each block of (( j, j + 1)) differs from the identity only by a 2 × 2 minor. Since the adapted diameter is easily seen to be O(n2 ), this means that the twiddle factors () can be carried out in O(n2 ) = polylog(|Sn |) elementary quantum operations [3]. We will see in the Di = min{ > 0 :  j (S  Gi ) j  Ti } , next section that a similar situation obtains for a large i.e., the length of words over S  Gi we need to generate class of groups. every representative in T , and define the adapted diameter D = i Di . Then clearly any group element can be 4 Efficient quantum Fourier transforms factored as a series of coset representatives, which in turn We describe our quantum algorithms in this section. As in the classical case, we perform the Fourier transform can be factored as a total of at most D elements of S. Of course, to perform the QFT efficiently we would inductively on the tower of subgroups, using the structure like () to have a simple form for each   S. Given a of the Bratteli diagram to construct the transform at each subgroup K < G, recall that the centralizer of K is the level from the transform at the previous level. Recall that for each level of our tower of subgroups subgroup Z (K ) = {g  G : gk = kg for all k  K }. The G = Gm > Gm-1 > · · · > G0 = {1} we have chosen a following is implicit in the oft-cited lemma of Schur: transversal Ti for the left cosets of Gi-1 in Gi . At the L E M M A 3 . 1 . (Schur, [17, Lemma 5.1]) Let K < G, let beginning of the computation, we represent each group   Z (K ), and let  be a K -adapted representation of G. element g as a product  = m · · · 1 where i  Ti . This Suppose that |K = m1 1 · · · mr r . Then () has the string becomes shorter as we work our way up the tower, form and after having performed the Fourier transform for Gi the remaining string  = m · · · i+1 indexes the coset of (3.3) (GLm1 (C)  Id1 )  · · ·  (GLmr (C)  Idr ) Gi in G in which g lies. At the end of the computation, we have a pair of paths where Ik is the k × k identity matrix and di = di . in the Bratteli diagram, s = s1 · · · sm and t = t1 · · · tm , which Since any unitary operator in GLm (C) can be carried out index the rows and columns of the representations  of G. with poly(m) elementary quantum gates [2], and since we These paths begin empty and grow as we work our way up
5

782


the tower; after having performed the Fourier transform for Gi , the paths p = p1 · · · pi and q = q1 · · · qi of length i index the rows and columns of representations  of Gi . With a compact encoding, one could store  in the same registers as s and t , at each step replacing a coset representative i with a pair of edges si , ti . (This is how Coppersmith's circuit for the QFT over Z2k works; see below.) However, our algorithm is simpler to describe if we double the number of qubits and store  and s, t in separate registers. Padding out , s, and t to length m with zeroes, our computational basis consists of unit vectors of the form  s . m-i | |s,t = m · · · i+1 0i , t1 · · · ti 0m-i 1 · · · si 0 Keep in mind that the basis {|s, t }, where s and t have length i and end in the same representation, is just a permutation of our adapted Gel'fand-Tsetlin basis {|, j, k } ^ for Gi , where  ranges over the representations of Gi and 1  j, k  d index its rows and columns. Therefore, we will sometimes abuse notation by writing f^(s, t ) and f^() j,k for the Fourier transform over Gi indexed in these two different ways. Each stage of the algorithm consists of calculating the Fourier transform over Gi+1 from that over Gi . By induction it suffices to consider the last stage, where we go from H = Gm-1 to G = Gm . Specifically, choose a transversal T of H in G such that every g  G can be written h where   T and h  H . As in (3.2), for each   T we define a function f on H as f (h) = f (h); this is the restriction of f to the coset H , translated into H . After having performed the Fourier transform on H , our state will be = (4.4)
T T

Recall that the matrix f^ ( |H ) is a direct sum of submatrices of the form f^ (), summed over the  appearing in . We will construct f^( |H ) via an embedding operation which reverses the restriction to H , w (4.7) | 
:  appears in |H


A, |

here this "scale factor" is A, = | H | d . |G| d

 |  |

 
s,t of length m-1 ^ (, j,k)H


f^ (s, t ) |s, t f^ () j,k |, j, k .


Our goal is to transform this state into the Fourier basis of G, namely = |0  (4.5) |0 
s,t of length m ^ (, j,k)G


f^(s, t ) |s, t f^() j,k |, j, k .


where |0 occupies the register that held the coset representative  before. As described in Equation (3.2) above, f^ can be written as a sum over contributions from f 's values on each coset H , giving (4.6) f^() =
T

 () · f^ ( |H )

.

Note that  |A, |2 = 1. Thus our algorithm consists of (i.) embedding the  in the appropriate , (ii.) applying the "twiddle factors" (), and (iii.) summing over the cosets of H . However, as discussed above, doing these things efficiently is no simple matter. First, a given  might appear in a given  with an arbitrary change of basis; second, the twiddle factor () could be an arbitrary unitary matrix of exponential size; and finally, if [G : H ] is exponentially large, summing over the cosets will take exponential time unless parallelized in some way. The Bratteli diagram, and the adapted basis it provides, allow us to accomplish (i) and (ii) above with a minimum of trouble. For (i), the embedding operation, note that f^ (s, t ) is nonzero only when s and t end in the same representation  of Gt , i.e., in the same vertex of the diagram. Moreover, recall that the Bratteli diagram indexes an adapted basis in which |H is block-diagonal with the  j as its blocks. This means that the  appear in the  in an extremely simple way: namely, where s and t are extended by appending the same edge e to both. The only change of basis required is to literally pick the matrix elements of  up and place them in the appropriate block in , and we describe below how to do this unitarily. Similarly, when coupled with a strong generating set of small adapted diameter as discussed in Section 3.2, the adapted basis allows us to carry (ii) out efficiently by writing () as a product of a small number of (), each of which has the block-diagonal structure given by Lemma 3.1. For (iii), summing over the cosets, for now we simply take the time to sum over all the cosets serially, paying a cost of [Gi : Gi-1 ] per level as reflected in Theorem 1.1. This makes sense for subgroup towers where the index of each subgroup in the one above it is polynomial, such as the tower for Sn above, and we focus on that case in Section 4.1. However, in Section 4.2 we will see that even when the index of some level of the tower is exponentially large, in some cases we can use the parallelism of quantum mechanics to sum over all the cosets simultaneously, and still achieve an efficient QFT.

783


We adopt the following notation. Given a path s in the Bratteli diagram of length m - 1 or m, denote the representation in which it ends by [s] or [s] respectively. Given a path s = s1 · · · sm-1 , denote its extension s1 · · · sm-1 e as se. We will index the edges of each vertex {1, . . . , k} where k is its out-degree. It will be convenient to carry out this embedding only if the register containing the coset representative is zero, and leave other basis ^ vectors in (T  {0})  H fixed. Then (4.7) becomes | |0 |s,t  |0 e A[s],[se] |se, t e (4.8) U:  |s,t  | |s,t for all   T where the sum is over all outgoing edges e of [s] = [t ]. Note that we have not defined U on the entire space; ^ in particular, since we are moving probability from H to ^ , basis vectors |0 |se,te  (T  {0})  G cannot stay ^ G fixed. As we will see below, it does not matter precisely how U behaves on the rest of the state space, as long ^ as its behavior on H is as described in (4.8). This can be accomplished simply by putting the mth registers of s and t in the superposition e A[s],[se] |e  |e , and for a large class of groups we define below we can prepare this superposition efficiently. We shall focus on group towers for which the Bratteli diagram data can be effectively computed: D E FI N I T I O N 2 . For a group G and a tower of subgroups Gi , let B be the corresponding Bratteli diagram, let Ti be a set of coset representatives at each level, and let S be a strong set of generators for G. Then we say that G is polynomially uniform (with respect to {Gi }, B, {Ti }, and S) if the following functions are computable by a classical algorithm in polylog(|G|) time: (i.) Given two paths s, t in B, whether [s] = [t ]; (ii.) Given a path s in B, the dimension and the out-degree of [s]; (iii.) Given a coset representative   Ti , a factorization of  as a word of polylog(|G|) length in (S  Gi ) .

4.1 Extensions of small index We begin by focusing on groups and towers which are fairly refined, i.e., with roof of Theorem 1.1. This follows by induction since the polynomial indexes at each level. depth of the Bratteli diagram is at most log |G|. F L E M M A 4 . 1 . If G is polynomially uniform with respect to a tower of subgroups where G = Gm and H = Gm-1 and a strong generating set S with adapted diameter D and maximum multiplicity M , then the Fourier transform of G can obtained from the state (4.4) using poly([G : H ] × D × M × log |G|) elementary quantum operations. Proof. First, to carry out the embedding transformation U , we use the classical algorithm to compute the list of edges e and d[se] conditional on s, and thus compute the A, (say, to n digits in poly(n) time). Note that 
7

appears in at most [G : H ] many . We then carry out a series of [G : H ] conditional rotations, each of which rotates the appropriate amplitude A[s],[se] from |0 |s,t to |0 |se,te . Thus U , and therefore U -1 , can be carried out in O([G : H ]) quantum operations. To apply the twiddle factor and sum over the cosets as in (4.6), we use a technique of Beals [3] and carry out the following for-loop. For each   T , we do the following three things: left multiply f^() by ()-1 ; add f^ () to f^(); and left multiply f^() by (). This loop clearly produces T () · f^(), so we just need to show that each of these three steps can be carried out efficiently. Recall that f^() is given in the |s, t basis, where s and t index the row and column of  respectively. To left multiply f^() by (), we apply () to the s register and leave the t register unchanged. Since G is polynomially uniform, a classical algorithm can factor  as the product of D generators i  S, and provide a factorization of each (i ) as the product of poly(M ) many elementary quantum operations, in polylog(|G|) time. This implements () and ()-1 in D × poly(M ) + polylog(|G|) operations. The step "add f^ () to f^()" is slightly more mysterious, and indeed it does not even sound unitary at first. However, as Beals points out, at each point in the loop we add f^ (), which is the Fourier transform of a function with support only on H , to < (-1 ) f^ (), which is the Fourier transform of a function with support only outside H . Thus these two states are orthogonal, and adding two orthogonal vectors can be done unitarily by rotating one vector into the other while fixing the subspace perpendicular to both. Let V be the operation that exchanges | |s,t with |0 |s,t and leaves | |s,t fixed for all   , 0; then Beals showed that this step can be written U -1VU where U is the embedding operator defined in (4.8). We showed earlier that U can be carried out in O([G : H ]) quantum operations, and V is a simply a Boolean operation on the  register. Finally, the for-loop runs |T | = [G : H ] times, and we are done. P

or many families of groups, the maximum index I = maxi [Gi : Gi-1 ], the adapted diameter D, and the maximum multiplicity M are all polylog(|G|). In this case, Theorem 1.1 gives circuits for the QFT of polylog(|G|) size. This includes the following three families of groups: The symmetric groups Sn . As stated above, we take the tower Sn > Sn-1 > · · · > {1}, so I = n = o(log |Sn |). The generators are the adjacent transpositions, so D = O(n2 ) and M = 2. The adapted basis is precisely the Young orthogonal basis.

784


Wreath products G = H Sn for H of size poly(n). These 4.2 Extensions of large index; Coppersmith-type cirgroups arise naturally as automorphism groups of graphs cuits The reader familiar with Coppersmith's circuit [7] obtained by composition [12]. As in [23] the tower is for the QFT over G = Z2n , where H = Z2n-1 , will recall ^ that each Hadamard gate embeds a character   H in two H Sn > H × (H Sn-1 ) > H Sn-1 > · · · > {1} . ^ , applies part of the twiddle factor, and characters   G sums over both cosets of H , all in one operation. This is Then I = max(n, |H |), and taking the adjacent transposiin contrast to the technique of the previous section, which tions and an arbitrary set of log |H | generators for each sums over the cosets serially--and which takes exponenfactor of H gives D = O(n2 log |H |) and M = O(|H |). tial time if, for instance, G is an extension of H by Z p Finally, note that |H | = polylog(|G|). See [17] for details where p is exponentially large. and [15] for discussion on wreath products. For a certain type of extension, we can construct The Clifford groups. The Clifford groups CLn are circuits analogous to Coppersmith's, which use quantum 2 generated by x1 , . . . , xn where xi = 1 and xi x j = -xi x j for parallelism to embed  in the , sum over all the cosets simultaneously, and apply the twiddle factors as well. all i = j [26]. We take the tower Recall that G is a split extension or semidirect product of CLn > CLn-1 > · · · > {1} H by T , written T H, if H G and there is a transverse subgroup T < G so that T  G/H . = for which I = 2, and the generators {x1 , x1 x2 , . . . , xn-1 xn }. Then D = O(n), and since each xi xi+1 centralizes CLi-1 D E FI N I T I O N 3 . Suppose G is a split extension of H by T , we have M = 4. let S be a set of at most log |T | generators for T , and
2

In addition to giving polylog(|G|)-size circuits for suppose that G is polynomially uniform with respect to a these groups, our techniques give the first subexponential- tower of subgroups where G = Gm and H = Gm-1 and a Bratteli diagram B. Then G is a homothetic extension size circuits for the following classical groups: ^ of H by T if (i.) Given   H and   S, define  (h) = The linear groups GLn (q), SLn (q), PGLn (q), and (-1 h); then for every   H , either  = , or the orbit ^ j PSLn (q); the finite groups of Lie type; the Chevalley of q distinct representations  , for 0  j < q where q and Weyl groups. The case of GLn (q) is emblematic of divides the order of , appears among the representations all these families. We have a natural tower: of H given by B; (ii.) For each   S, there is a classical algorithm which runs in polylog(|G|) time which, given a GLn (q) > Pn (q) > GLn-1 (q) × GL1 (q) > GLn-1 (q) > · · · path s in B indexing a row of [s] and an integer j, returns Here Pk (q) is the so-called maximal parabolic subgroup, the size q of 's orbit under conjugation by , and returns j j j consisting of elements of the form a path s that indexes the same row of [s ] =  . w A v T H E O R E M 4 . 1 . If G is a homothetic extension of H by an 0···0 c Abelian group, then the Fourier transform of G can be obtained from the state (4.4) using polylog(|G|) elementary here A  GLk-1 (q), v  Fk-1 , and c  F× , so I = qn-1 . quantum operations . q q Our generators are the block-diagonal matrices with an arbitrary element of GL2 (q) in the i, i - 1 block and all Proof. It is easy to show that a homothetic extension of other diagonal elements equal to 1. Then D = O(n2 ) and H by A × B consists of a homothetic extension of H by M = qO(n) . Analogous factorizations arise for the finite A, followed by a homothetic extension by B. Therefore groups of Lie type as well as the finite unitary groups [18]. it suffices to prove the lemma for homothetic extensions Theorem 1.1 then implies a quantum circuit of size by cyclic groups of prime power order, so without loss of 2 qO(n) for the QFT over these groups. SinceOG| = O(qn ) generality we let T be generated by  of order pz . | l i O(1/n) , which is exp we can write this as |G| ( og |G|) We recall some representation theory from [6, 22]. ^ f q is fixed. On the other hand, the best-known classical Given   H , the stabilizer of  is K = {x  T : x  }, = FFT for these groups [17] has complexity |G| q(n) = and for a homothetic extension we can replace x   = |G|1+(1/n) . Note that for the group families above for with x = . Then K is the subgroup of T of order which we obtain circuits of size polylog(|G|), there are p generated by q where q = pz- , and 's orbit under classical algorithms of complexity |G| polylog(|G|). In conjugation by  is of size q. both cases, it seems that the natural quantum speedup is The representations  in which  appears can be a factor of |G|, modulo polylogarithmic terms; of course, obtained in two steps. First, we extend  to K H by we would like to know if this is the best possible. multiplying  by one of the p characters of K . This yields

785


b  K H where b (q j h) = b ( j) 1h) and b (q j ) = ( bj  p . Since db = d we have A,b = / p , so we embed  in a uniform superposition over the b by appending a uniform superposition of edges 1  e  p where b = e - 1. Combining this with the twiddle factor b gives the unitary transformation (4.9)
a ere H


q j+k

1 | p s, t  k


p e=1

p

(e-1) j

|

se, t e .

Closure under homothetic extensions and metacyclic groups. Theorem 4.1 shows that the set of groups for which circuits of polylog(|G|) size exist is closed under homothetic extensions by Abelian groups. It also generalizes the efficient quantum Fourier transform of Høyer [13] for the metacyclic groups Zq Z p , since these are homothetic extensions of Z p by Zq . Note that the metacyclic groups include the dihedral groups (where q = 2) and the affine groups (where q = p - 1) as special cases. The quaternionic groups. The generalized quaternion group is an extension of H = Z2n by Z2 where 2 is the  element of order 2 in H . Then C() = (2 ) = 1 or i. ¨ ¨ Puschel, Rotteler and Beth [21] gave an efficient quantum Fourier transform for these groups in the case where n is a power of 2. Of course, these groups are extensions of Abelian groups with bounded index, so Lemma 4.1 already provides an efficient QFT for them. Metabelian groups. Even if an extension is neither homothetic nor of polynomial index, we can still construct an efficient QFT if we can apply arbitrary powers of C() in polynomial time--for instance, if C() is of polynomial size, which is true whenever all the representations of H are of polynomial size. This includes the metabelian groups, i.e., split extensions of Abelian groups by Abelian groups, since all the representations of H are one-dimensional. We discuss this further in the full paper.

we write the power of  in two registers 0  j < p nd 0  k < q. Then this operation Fourier transforms the first register over Z p and transfers the result to the mth register of s and t . This transform can be carried out with O(log p log log p ) = O(log |G| log log |G|) elementary quantum operations [10, 16]. Note that p takes at most log |G| different values, and can be obtained from the classical algorithm which computes q. ^ If K = T , then the   G containing  are simply the extensions b and we're done. If K < T , i.e., if q > 1, we carry out a second step as follows. Each b appears in a single induced representation b whose restriction to K H is the direct product of all the representations in 's i q-1 orbit, times b : that is, b |H = b i=0  . The twiddle factor b (k ) is then a permutation matrix which cycles these p blocks k times, with an additional phase change bk . This gives the unitary transformation pz (4.10) 
k

The general case. In general, Abelian extensions can be slightly more complicated; consider extensions by Z p . If  is isomorphic to , rather than equal to it,  induces an additional twiddle factor C() which changes k Since s can be calculated by the classical algorithm in p bk 's basis [22]. This occurs, for instance, if  is an polylog(|G|) time, and since it is easy to implement  pz element of H other than the identity, in which case the y with phase shifts 2 zb for 0 < y < log2 k conditioned on cyclic group generated by  is not transverse to H and the p the binary digits of bk, we can perform this transformation extension is not split; then C() is a pth root of ( p ). with polylog(|G|) elementary quantum operations. Composing (4.9) and (4.10) transforms the state (4.4) to the 5 Conclusion and open problems Fourier transform (4.5) over G. R We have shown that a general technique for constructing efficient classical fast Fourier transforms on groups-- elation to Coppersmith's circuit. Let  be a generator separation of variables using an adapted basis--can be of G = Z2n . Then G is an extension of H = Z2n-1 with carried over to the quantum context, producing circuits 2 transversal {1, }. Since  = 1,  induces an additional of polylog(|G|) size for a wide variety of groups, and of b 2 twiddle factor C() = b ( ) = 2n . Similarly, the subexponential size for the linear groups mod q. additional phase shift in (4.10) is due to the fact that While separation of variables is one of the most Z pz is not a split extension of Z p . In Coppersmith's general techniques for classical FFTs, it is not the only circuit, C() appears as a set of conditional phase shift one. It is possible to use the Bratteli diagram in a gates, conditioned on the low-order bit of j. Finally, the more precise fashion, looking for redundancy and sparsity Hadamard gate in Coppersmith's circuit is precisely the on the level of individual matrix elements. This finer operation (4.9) in the case p = 2, = 1 and q = 1, in a analysis is responsible for the fastest known classical compact encoding where we use the same qubit register FFTs for the groups SL2 (q), as well as Sn and its wreath for e (the high-order bit of the frequency) as for  (the products [19]. It would be interesting to explore adapting low-order bit of the time). these techniques to the quantum setting. | k (e-1)k |0 s e, t e se, t e   pz .
9

786


Acknowledgements. The authors gratefully ac- [13] knowledge the support of the National Science Foundation under grants CCR-0220264, EIA-0218443, and CCR-0093065, and the hospitality of the Mathemati- [14] cal Sciences Research Institute and Dartmouth College, where portions of this research were completed. We also thank Tracy Conrad and Sally Milius for their support. References
[1] ACM, editor. Proceedings of the 33rd Annual ACM Symposium on Theory of Computing: Hersonissos, Crete, Greece, July 6­8, 2001, New York, NY, USA, 2001. ACM Press. [2] A. Barenco, C. H. Bennett, R. Cleve, D. P. DiVincenzo, N. Margolus, P. Shor, T. Sleator, J. Smolin, and H. Weinfurter. Elementary gates for quantum computation. Physical Revew A, pages 3457­, 1995. [3] Robert Beals. Quantum computation of Fourier transforms over symmetric groups. In ACM, editor, Proceedings of the twenty-ninth annual ACM Symposium on the Theory of Computing: El Paso, Texas, May 4­6, 1997, pages 48­53, New York, NY, USA, 1997. ACM Press. [4] Thomas Beth. On the computational complexity of the general discrete Fourier transform. Theoret. Comput. Sci., 51(3):331­339, 1987. [5] Michael Clausen. Fast generalized Fourier transforms. Theoret. Comput. Sci., 67(1):55­63, 1989. [6] A. H. Clifford. Representations induced in an invariant subgroup. Annals of Mathematics, 38:533­550, 1937. [7] D. Coppersmith. An approximate Fourier transform useful in quantum factoring. Technical Report RC19642, IBM, 1994. Quantum Physics e-Print Archive, quantph/0201067. [8] Persi Diaconis and Daniel Rockmore. Efficient computation of the Fourier transform on finite groups. J. Amer. Math. Soc., 3(2):297­332, 1990. [9] Michelangelo Grigni, Leonard Schulman, Monica Vazirani, and Umesh Vazirani. Quantum mechanical algorithms for the nonabelian hidden subgroup problem. In ACM [1], pages 68­74. [10] L. Hales and S. Hallgren. An improved quantum Fourier transform algorithm and applications. In IEEE, editor, 41st Annual Symposium on Foundations of Computer Science: proceedings: 12­14 November, 2000, Redondo Beach, California, pages 515­525, 1109 Spring Street, Suite 300, Silver Spring, MD 20910, USA, 2000. IEEE Computer Society Press. IEEE Computer Society Order Number PR00850. [11] Sean Hallgren, Alexander Russell, and Amnon Ta-Shma. Normal subgroup reconstruction and quantum computation using group representations. In ACM, editor, Proceedings of the thirty second annual ACM Symposium on Theory of Computing: Portland, Oregon, May 21­23, [2000], pages 627­635, New York, NY, USA, 2000. ACM Press. [12] Frank Harary. Graph Theory. Addison-Wesley, 1969. [15]

[16]

[17]

[18]

[19]

[20]

[21]

[22]

[23]

[24] [25]

[26]

[27]

Peter Høyer. Efficient quantum transforms. Technical Report quant-ph/9702028, Quantum Physics e-Print Archive, 1997. ´ ´´ Gabor Ivanyos, Frederic Magniez, and Miklos Santha. Efficient quantum algorithms for some instances of the non-abelian hidden subgroup problem. In Proceedings of the Thirteenth Annual ACM Symposium on Parallel Algorithms and Architectures, pages 263­270, Heraklion, Crete Island, Greece, 4-6 July 2001. ACM. Adalbert Kerber. Representations of Permutation Groups I, II, volume 240 and 495 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 1971 and 1975. A. Yu. Kitaev. Quantum measurements and the abelian stabilizer problem. Technical Report quant-ph/9511026, Quantum Physics e-Print Archive, 1995. David K. Maslen and Daniel N. Rockmore. Adapted diameters and the efficient computation of Fourier transforms on finite groups. In Proceedings of the Sixth Annual ACMSIAM Symposium on Discrete Algorithms, pages 253­262, San Francisco, California, 22­24 January 1995. David K. Maslen and Daniel N. Rockmore. Separation of variables and the computation of Fourier transforms on finite groups, I. Journal of the American Math Society, 10(1):169­214, 1997. David K. Maslen and Daniel N. Rockmore. The CooleyTukey FFT and group theory. Notices Amer. Math. Soc., 48(10):1151­1160, 2001. Cristopher Moore, Daniel Rockmore, Alexander Russell, and Leonard Schulman. The value of basis selection in Fourier sampling: Hidden subgroup problems in affine groups. In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 2004. ¨ ¨ Markus Puschel, Martin Rotteler, and Thomas Beth. Fast quantum Fourier transforms for a class of non-abelian groups. In Proceedings of Applied Algebra, Algebraic Algorithms, and Error-Correcting Codes (AAECC-13), volume 1719 of Lecture Notes in Computer Science, pages 148­159. Springer-Verlag, 1999. Daniel Rockmore. Fast Fourier analysis for Abelian group extensions. Advances in Applied Mathematics, 11:164­ 204, 1990. Daniel Rockmore. Fast Fourier transforms for wreath products. J. Applied and Computational Harmonic Analysis, 2:279­292, 1995. Jean-Pierre Serre. Linear Representations of Finite Groups. Springer-Verlag, 1977. Peter W. Shor. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Journal on Computing, 26(5):1484­1509, October 1997. Barry Simon. Representations of Finite and Compact Groups, volume 10 of Graduate Studies in Mathematics. American Mathematical Society, 1996. John Watrous. Quantum algorithms for solvable groups. In ACM [1], pages 60­67.

787