Detecting short directed cycles using rectangular matrix multiplication and dynamic programming Raphael Yuster Uri Zwick Abstract We present several new algorithms for detecting short fixed length cycles in digraphs. The new algorithms utilize fast rectangular matrix multiplication algorithms together with a dynamic programming approach similar to the one used in the solution of the classical chain matrix product problem. The new algorithms are instantiations of a generic algorithm that we present for finding a directed Ck , i.e., a directed cycle of length k , in a digraph, for any fixed k 3. This algorithm partitions the prospective Ck 's in the input digraph G = (V , E ) into O(logk V ) classes, according to the degrees of their vertices. For each cycle class we determine, in O(E ck log V ) time, whether G contains a Ck from that class, where ck = ck ( ) is a constant that depends only on , the exponent of square matrix multiplication. The search for cycles from a given class is guided by the solution of a small dynamic programming problem. The total running time of the obtained deterministic algorithm is therefore O(E ck logk+1 V ). For C3 , we get c3 = 2 /( + 1) < 1.41 where < 2.376 is the exponent of square matrix multiplication. This coincides with an existing algorithm of [AYZ97]. For C4 we get c4 = (4 - 1)/(2 + 1) < 1.48. We can dispense, in this case, of the polylogarithmic factor and get an O(E (4-1)/(2+1) ) = o(E 1.48 ) time algorithm. This improves upon an O(E 3/2 ) time algorithm of [AYZ97]. For C5 we get c5 = 3 /( + 2) < 1.63. The obtained running time of O(E 3/(+2) log6 V ) = o(E 1.63 ) improves upon an O(E 5/3 ) time algorithm of [AYZ97]. Determining ck for k 6 is a difficult task. We conjecture that ck = (k + 1) /(2 + k - 1), for every odd k . The values of ck for even k 6 seem to exhibit a much more complicated dependence on . of Mathematics, University of Haifa at Oranim, Tivon 36006, Israel. E­mail: raphy@research.haifa.ac.il Department of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel. E­mail: zwick@p ost.tau.ac.il Department 1 Intro duction We present several improved algorithms for detecting fixed length (directed) cycles in digraphs. The algorithms utilize fast square and rectangular matrix multiplication algorithms due to Coppersmith and Winograd [CW90], Coppersmith [Cop97] and Huang an Pan [HP98], together with a dynamic programming approach. Although the algorithms are given explicitly, their complexity analysis is quite a difficult task. Determining this complexity requires, in general, the solution of a huge number of (constant size) linear programs. Although our algorithm use fast rectangular matrix multiplication algorithms, we express most of our time bounds, for simplicity reasons, in terms of , the exponent of fast square matrix multiplications. The best bound currently available on is < 2.376, obtained by Coppersmith and Winograd [CW90]. This is done by reducing each rectangular matrix product into a collection of smaller square matrix products. Slightly improved bounds can be obtained by using, directly, the best available rectangular matrix multiplication algorithms of Coppersmith [Cop97] and Huang an Pan [HP98]. As explained in the abstract, the new algorithms are instantiations of a generic algorithm that we present for finding a directed Ck , i.e., a directed cycle of length k , in a digraph, for any fixed k 3. This algorithm partitions the prospective Ck 's in the input digraph G = (V , E ) into a O(logk V ) classes 1 , according to the degrees of their vertices. For each cycle class we determine, in O(E ck log V ) time, whether G contains a Ck from that class, where ck = ck ( ) is a constant that depends only on . The search for cycles from a given class is guided by the solution of a small dynamic programming problem. For k = 3, we `rediscover' an O(E 2/(+1) ) algorithm for finding triangles obtained by Alon et al. [AYZ97]. For k = 4, 5 we obtain algorithms that improve upon the purely combinatorial algorithms for finding C4 's and C5 's presented in [AYZ97]. This was stated 1 Throughout this pap er we write V and E , instead of |V | and |E |, when these terms app ear inside a big-O notation. Copyright © 2004 by the Association for Computing Machinery, Inc. and the Society for industrial and Applied Mathematics. All Rights reserved. Printed in The United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Association for Computing Machinery, 1515 Broadway, New York, NY 10036 and the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688 254 there as an open problem. We conjecture that the algorithms that we obtain for k 6 also improve on the combinatorial algorithms of [AYZ97], at least when is close enough to 2. (Many researchers believe that = 2 + o(1).) Our concrete results for k = 4, 5 are: Theorem 1.1. There exists an algorithm that given a digraph G = (V , E ) determines whether G has a C4 , and outputs one if it does, in O(E (4-1)/(2+1) ) = o(E 1.48 ) time. Theorem 1.2. There exists an algorithm that given a digraph G = (V , E ) determines whether G has a C5 , and outputs one if it does, in O(E 3/(+2) log6 V ) = o(E 1.63 ) time. A summary of all available algorithms for finding C4 's and C5 's in digraphs is given in Table 1. Alon et al. [AYZ97] obtained two algorithms for finding C4 's. The first has a running time of O(E 3/2 ), and the second a running time of O(V ). The new O(E (4-1)/(2+1) ) = o(E 1.48 ) is clearly better than the old O(E 3/2 ) algorithm. It is better than the O(V ) algorithm when the graph is sufficiently sparse. A recent algorithm of Eisenbrand and Grandoni [EG03] detects C4 's in O(E 2-2/ V 1/ ) time. This algorithm is inferior to the algorithm of Theorem 1.1 for sparse graphs and inferior to the O(V ) algorithm for dense graphs, but is better than the two of them in some intermediate density interval. Using directly a fast rectangular matrix multiplication algorithm, the running time of the algorithm of Theorem 1.1 can be slightly improved to O(E 1.474 ). The algorithm of Theorem 1.2 is clearly faster than the O(E 5/3 ) algorithm Alon et al. [AYZ97]. It is faster than the O(V ) algorithm of [AYZ97] when |E | = o(V (+2)/3 ). Determining the exact value of ck for k 6 is a difficult task. For odd k , it is not difficult to show that ck (k + 1) /(2 + k - 1). We conjecture that for odd k we in fact have ck = (k + 1) /(2 + k - 1). The values of ck for even k 6 seem to exhibit a much more complicated dependence on . Numerical experiments suggest that c6 = (10 - )/(7 - ) < 1.6488, when is near 2.376. This, however, is not true for every . In this extended abstract we concentrate on obtaining algorithms whose running time can be expressed solely in terms of |E |. Our approach can be easily extended, however, to yield possibly improved algorithms whose running times are expressed in terms of both |V | and |E |, like the O(V 1/ E 2-2/ ) time C4 algorithm of Eisenbrand and Grandoni [EG03]. The details will appear in the full version of the paper. The rest of this paper is organized as follows. In Section 2 we present, as a warm-up, the O(E 2/(+1) ) time algorithm of [AYZ97] for detecting C3 's. In Section 3 we give a direct proof of Theorem 1.1. This section is quite technical and may be skipped in a first reading. The concrete algorithm for detecting C4 's presented in this section partitions the cycles into only a finite number of classes, thus saving a polylogarithmic factor in the running time. In Section 4 we present our general procedure for finding a Ck . In Section 5 we analyze the algorithm obtained for k = 5 and prove Theorem 1.2. In Section 6 we discuss our conjectures for k 6. The final section contains some concluding remarks. Throughout the rest of this paper we use the notation (a, b, c) to denote the exponent of the multiplication of an N a × N b matrix by an N b × N c matrix, where a, b, c are positive constants; namely, the number of arithmetic operations required to perform this multiplication is O(N (a,b,c) ). It is not difficult to see (see, e.g., Huang and Pan [HP98]), that (a, b, c) a + b + c - (3 - ) min{a, b, c}, where = (1, 1, 1) is the exponent of square matrix multiplication. 2 A warm up: Finding triangles (C3 's) To put the results obtained in this paper in context we use this short section to remind the reader of the very simple O(E 2/(+1) ) algorithm of [AYZ97] for finding triangles. Implementing our general procedure in the case k = 3 coincides with this algorithm. Let G = (V , E ) be the input graph. Set = |E |(-1)/(+1) . Let H = {v V | deg(v ) } be the set of high degree vertices of the graph. (For the rest of this paper the notations deg(v ) or degree refer to the sum of the indegree and the outdegree of a vertex.) Clearly |H | 2|E |/. Let L = V - H be the set of low degree vertices. A triangle consisting of three high degree vertices can be easily found, using matrix multiplication, in O(H ) = O((E /) ) = O(E 2/(+1) ) time. A triangle containing a low degree vertex can be easily found, without matrix multiplication, in O(E ) time, as this is a bound on the number of paths of length two that pass through a low degree vertex. As O(E ) = O(E 2/(+1) ), and as each triangle is of one of these two types, the result follows. 3 A more complicated case: Finding C4 's In this section we prove Theorem 1.1. Denote = 2( - 1)/(2 + 1) and let = |E | . We partition the vertex set V into three parts. Let H denote the vertices having degree greater than in G. Let M denote he t vertices having degree at most and greater than . Let L = V \ (H M ). Clearly, |H | < 2|E |/ and 255 Cycle C4 C5 New algorithm O(E (4-1)/(2+1) ) = o(E 1.48 ) ~ O(E 3/(+2) ) = o(E 1.63 ) Existing algorithms O(E 3/2 ) [AYZ97], O(E 2-2/ V 1/ ) [EG03], O(V ) [AYZ97] O(E 5/3 ) [AYZ97], O(V ) [AYZ97] Table 1: New and old algorithms for detecting C4 's and C5 's. |M | < 2|E |/ . We distinguish between five different it is straightforward to find a directed path of length 2 from z to x in O(E ) time. Hence, finding a C4 of type types of C4 . (iii) requires O(H + E 1+ ) = O(E (4-1)/(2+1) ) time. (i) The C4 whose vertices are all in H . In order to detect a C4 of type (iv) we proceed as (ii) The C4 with two opposite vertices in M L. follows. Let A1 denote the rectangular matrix with |M | (iii) The C4 with only one vertex in M L. rows and |H | columns where A1 (x, y ) is the number (iv) The C4 with only two consecutive vertices in M L, of directed paths of length 2 from x M to y H at least one of them being from M . and that pass through a vertex of H . Clearly, A1 is (v) The C4 with only two consecutive vertices in L. the result of multiplying an |M | × |H | matrix with an Clearly, any C4 in G is of one of these types. We show |H | × |H | matrix. Hence, using O(M /H ) square matrix -1 ) time. that the running time required to detect a C4 of a given multiplications we can generate A1 in O(M H (4 -1)/(2 +1) Similarly, we can generate the matrix A2 with |H | type is O(E ). We can detect a C4 of type (i) using the O(V ) rows and |M | columns where A2 (y , x) is the number of algorithm from [AYZ97] mentioned in the introduction directed paths of length 2 from y H to x M and that in O(H ) time. Since |H | < 2|E |/ = 2|E |1- pass through a vertex of H . Note that A2 is the result of we can detect a C4 of type (i) in O(E (1-) ) multiplying an |H |×|H | matrix with an |H |×|M | matrix and hence can be generated in O(M H -1 ) time. Now, O(E (4-1)/(2+1) ) time. 1+ directed paths of Any C4 of type (ii) is composed of two directed as in case (iii) there are at most |E | paths of length 2 with an intermediate vertex in M L. the form (w, u, x) where w H , u M L and x M 1+ ) time. For each As in [AYZ97] we can detect all such directed 2-paths in and they can be generated in O(E O(E ) time, and use radix sort to sort them according such path we only need to check whether A1 (x, w) > 0 to their start and end vertices in linear time (namely, and, as in the previous case, if A1 (x, w) > 0 we can find in O(E ) time). It is then straightforward to check a path of length 2 from x to w in O(E ) time. Similarly, 1+ directed paths of the form whether there are two such directed paths of the form there are at most |E | u - x - v and v - y - u where x = y in linear time. (x, u, w) where w H , u M L and x M and 1+ ) time. For each such Thus, we can detect cycles of type (ii) in O(E ) = they can be generated in O(E 1+ (4 -1)/(2 +1) path we only need to check whether A2 (w, x) > 0. The O(E ) = O(E ) time. In order to detect a C4 of type (iii) we use the overall running time needed to find a C4 of type (iv) is following idea from [EG03]. Let A denote the square therefore matrix of order |H |, with A(x, y ) equal to the number O(E 1+ + M H -1 ) = O(E 1+ + E 1-/2 E (1-)(-1) ) of directed paths of length two from x H to y H = O(E (4-1)/(2+1) ) . that pass through a vertex of H . Clearly, A is the result of multiplying two square matrices of order |H | In order to detect a C4 of type (v) we first throw and hence A can be computed in time O(H ). Notice away the vertices of M from the graph, as they are that there are at most |E | = |E |1+ directed paths of irrelevant. Thus, in the new graph the vertices of L length 2 that begin and end in vertices of H and their have maximum indegree or outdegree at most 1/2 = middle vertex is from M L. These paths can easily |E |/2 . Thus, there are at most |E |1+ directed paths be traversed in O(E 1+ ) time by simply traversing the of length 3 that begin and end in vertices of H and edges of G one by one, and noticing that for each edge their middle vertices are from L. These paths can (x, y ), if x M L and y H there are at most edges be generated in O(E 1+ ) time by traversing all edges of the form (z , x) where z H , and if x H and y of the graph, and for each edge (u, u ) where both M L there are at most edges of the form (y , z ) where u, u L we can generate all possible paths of length z H . For each such path, say, (x, y , z ) we only need to 3 of the form (w, u, u , w ) where w, w H . Notice check whether A(z , x) > 0. Furthermore, if A(z , x) > 0 that there are at most |E |/2 choices for w and at most 256 |E|/2 choices for w and hence there are at most |E | such paths for each (u, u ). Now, for each generated path (w, u, u , w ) we check in constant time whether (w , w) E (we can prepare the adjacency matrix of the subgraph induced by H in O(E + H 2 ) time in the beginning of the algorithm). The overall running time required for detecting C4 of type (v) is therefore O(E 1+ ) = O(E (4-1)/(2+1) ). Notice that in one of its bottleneck phases, our algorithm computes the product of an |M | × |H | matrix with an |H | × |H | matrix, and the product of an |H | × |H | matrix with an |H | × |M | matrix using O(M /H ) square matrix multiplications. This requires O(M H -1 ) time. Since |H | < 2|E |1- and |M | < 2|E |1-/2 we can use rectangular matrix multiplication and perform this phase in O(E (1-)(1,1,(2-)/(2-2) ) time which is slightly better than O(E 1+ ). In fact, we can choose which satisfies 2- (1 + ) = (1 - ) (1, 1, ) 2 - 2 and the overall running time would still be O(E 1+ ). Using a computer program that implements the formula from [HP98] we have found that = 0.474 suffices. 4 The general case: Finding Ck 's In this section we describe a generic algorithm for finding a Ck in a directed graph, for any fixed k 3. The C3 algorithm of Section 2 partitioned the vertices into two degree classes. The C4 algorithm of the previous section used three degree classes. It turns out that to obtain improved algorithms for finding larger Ck 's, we need an unbounded number of degree classes. For every 0 i < log |V |, let Vi = {v V | 2i deg(v ) < 2i+1 } Clearly |Vi | |E |/2i-1 . There are therefore log |V | vertex classes, and the Ck 's of the input graph can be classified, according to the classes of their ordered set of vertices, into O(logk V ) cycle classes. The algorithm will consider each such class separately. This will only increase the running time of the algorithm by a polylogarithmic factor. We are usually only interested in finding simple Ck 's, i.e., Ck 's that do not contain repeated vertices. We can, however, use the color coding technique of [AYZ95] to reduce the problem of finding a simple Ck in a given input graph into the problem of finding a non-necessarily simple Ck in a collection of O(log V ) graphs. For completeness we describe a simple randomized version of the color coding technique that is sufficient for our purposes. We simply choose a random coloring c : V {0, 1, . . . , k - 1} of the vertices of the graph and construct the graph Gc = (V , Ec ), where Ec = { (u, v ) E | c(v ) - c(u) 1 (mod k )}. Clearly, any Ck in Gc is simple. Also, if G contains a simple Ck , then this Ck is present in Gc with probability 1/k k-1 . As k is assumed to be fixed, this is enough for our purposes. It is possible to derandomize this technique. This is done by constructing, deterministically, a sequence c1 , c2 , . . . of O(log V ) colorings such that if G contains a simple Ck , then at least one of the subgraphs Gci also contains one. For the details the reader is referred to [AYZ95]. In the sequel we can therefore ignore the simplicity constraint. Let 0 f0 , f1 , . . . , fk-1 < log |V |. How do we find a cycle v0 , v1 , . . . , vk-1 in the graph for which vi Vfi , for 0 i < k ? Let A be the adjacency matrix of the input graph. For 0 i, j < log |V |, let Ai,j be the |Vi | × |Vj | submatrix of A representing the edges directed from Vi to Vj . It should be noted that we sometimes prefer to represent A or Ai,j as a sparse matrix, e.g., with adjacency lists. Clearly, the graph contains a cycle of the required form if and only if the matrix obtained as a result of the Boolean chain product Af0 ,f1 Af1 ,f2 · · · Afk-2 ,fk-1 Afk-1 ,f0 contains a 1 on its diagonal. Or, more efficiently, if there exist 0 i < j < k such that the matrix Afi ,fi+1 · · · Afj-1 ,fj (Afj ,fj+1 · · · Afi-1 ,fi )T , contains a 1 anywhere. (In the above expression indices are interpreted modulo k , A B is the matrix obtained by logically anding the corresponding elements of the matrices A and B , and AT is the transpose of the matrix A.) This immediately reminds us of the classical matrix chain product problem, one of the most famous examples used to explain the dynamic programming technique (see, e.g., [CLRS01]), with several twists. In the classical problem, the rectangular matrix products are performed using the naive algorithm, i.e., the cost of multiplying an a × b matrix by a b × c matrix is abc. Using a standard reduction of rectangular matrix products to square matrix products (see, e.g., Huang and Pan [HP98]), the cost of such a product is reduced to abc/ min{a, b, c}3- , where is the exponent of square matrix multiplication. More importantly, as noted earlier, the matrices that we are considering are sometimes sparse. This can be used, in certain cases, to speed up the computation. As we are only interested in the asymptotic complexity of the computation, it is convenient to switch to a `logarithmic scale' and consider all quantities as powers of |E |. In particular, let di = fi / log2 |E |, so that |E |di = 2fi , for 0 i < k . Notice that 0 di 1. Let Ck (d0 , d1 , . . . , dk-1 ) be such that 257 O(E Ck (d0 ,d1 ,...,dk-1 ) ) is a bound on the number of operations needed to find a Ck of the class we are considering. Let Pi,j (di , . . . , dj ) be such that O(E Pi,j (di ,...,dj ) ) is a bound on the complexity of computing the product Afi ,fi+1 . . . Afj-1 ,fj . The following two lemmas, the first of which using the dynamic programming spirit, establish suitable values for Pi,j (di , . . . , dj ) and Ck (d0 , d1 , . . . , dk-1 ). To keep the following formulae concise, we omit the arguments of expressions of the form Pi,j . Indices here, and in what follows, are considered modulo k . use radix sort to sort the first representation in linear time and the transpose of the second representation in linear time. We can then find whether Afi ,fi+1 · · · Afj-1 ,fj (Afj ,fj+1 · · · Afi-1 ,fi )T , contains a 1 anywhere in linear time. Thus, Ck (d0 , d1 , . . . , dk-1 ) min0i : Ck (d0 , d1 , . . . , dk-1 ) = Clearly, P4,0 = 1. On the other hand, P0,4 min{ min max{ Pi,j , Pj,i } , min 2 - di } max{P0,2 , P2,4 , M (1 - d0 , 1 - d2 , 1 - d4 )}. Now, P0,2 0i and d4 : As in the preA simple argument shows that the above six cases vious case, P0,3 (1 - ). On the other handle all possible cycle types. hand, P3,0 P3,4 + d4 1 + . Thus, C (d0 , d1 , d2 , d3 , d4 ) max{ (1 - ), 1 + } = 3 /(2 + ). 6 Conjectures for k 6 Cycles of typ e (d0 , d1 , d2 , d3 , d4 ) where d0 , d1 , d2 > Based on extensive numerical experimentations, we and d3 , d4 : As in the previous cases, P0,2 conjecture that (1 - ). On the other hand, P2,0 P2,4 + d4 Conjecture 6.1. P2,3 + d3 + d4 1 + 2 . Thus, C (d0 , d1 , d2 , d3 , d4 ) 13 max{ (1 - ), 1 + 2 } = 3 /(2 + ). 10-3 4+4 if 2 6 22-4 Cycles of typ e (d0 , d1 , d2 , d3 , d4 ) where d1 , d3 , d4 : 13 9 17-4 if 6 4 P0,2 P0,1 + d1 1 + . On the other hand, 11 -2 if 9 176 c6 = P2,0 P2,4 + d4 P2,3 + d3 + d4 1 + 2 . Thus, 4 4+5 10- C (d0 , d1 , d2 , d3 , d4 ) max{1 + , 1 + 2 } = 3 /(2 + ). 7- if 176 5 2 Cycles of typ e (d0 , d1 , d2 , d3 , d4 ) where d0 , d1 > 5 5 if 2 3 3 and d2 , d3 , d4 : Let = - d2 . We have three subcases. If d0 > + and d1 > + then Our numerical experiments lead us to conjecture consider P0,2 and P2,0 . Clearly, P2,0 P2,4 + d4 that for all odd k , the hardest case is that of regular P2,3 + d3 + d4 1 + 2 . On the other hand, P0,2 M (1 - d0 , 1 - d1 , 1 - d2 ) M (1 - - , 1 - - , 1 - graphs. It is an easy exercise to maximize C (, , . . . , ). + ) M (1 - , 1 - , 1 - ) = (1 - ). Thus, It is maximized when = ( - 1)/( + (k - 1)/2) and C (d0 , d1 , d2 , d3 , d4 ) max{1 + 2, (1 - )} = 3 /(2 + the value is then (k + 1) /(2 + k - 1). We therefore ). Otherwise, if d0 + then consider P1,4 and P4,1 . have: Clearly P1,4 P1,3 + d3 P1,2 + d2 + d3 1 + 2 . Conjecture 6.2. For al l odd k 3, On the other hand, P4,1 P4,0 + d0 1 + + . Thus, ck = (k + 1) /(2 + k - 1) . C (d0 , d1 , d2 , d3 , d4 ) max{1+2, 1+ + } = 3 /(2+ ). Otherwise, if d1 + then consider P0,2 and P2,0 . In particular, the algorithm from Section 4 finds a Ck , Clearly, P0,2 P0,1 + d1 1 + + . On the other ~ for an odd k , in O(E (k+1)/(2+k-1) ) time. hand, P2,0 P2,4 + d4 P2,3 + d3 + d4 1 + 2 . Thus, C (d0 , d1 , d2 , d3 , d4 ) max{1+2, 1+ + } = 3 /(2+ ). Sections 2 and 5 show that Conjecture 6.2 holds for k = 2-2/(k+1) ) time algorithm for Cycles of typ e (d0 , d1 , d2 , d3 , d4 ) where d0 , d1 , d3 > 3, 5. A combinatorial O(E and d2 , d4 : We may assume d4 < d2 since the detecting odd cycles of length k is presented in [AYZ97]. other case is symmetric. Let = - d2 . We have It is easy to see that if Conjecture 6.1 is true, then our four subcases. If d0 + then consider P1,3 and algorithm improves on this algorithm if < 2k /(k - 1). P3,1 . Clearly, P1,3 1 + d2 1 + , and P3,1 (Many believe that = 2 + o(1).) For even k , our numerical experiments show that 1 + d4 + d0 1 + ( - ) + ( + ) = 1 + 2 . Thus, C (d0 , d1 , d2 , d3 , d4 ) max{1 + , 1 + 2 } = 3 /(2 + ). regular graphs are not the worst case. This is provably Otherwise, if d1 + then consider P0,3 and P3,0 . so for k = 4, 6, 8. In fact, we believe, as expressed Clearly, P0,3 1 + d1 + d2 1 + ( + )+( - ) = 1 + 2 . in Conjecture 6.1, that ck is expressed as different On the other hand, P3,0 1 + d4 1 + . Thus, functions of for different values of for all even 2-2/k ) time algorithm C (d0 , d1 , d2 , d3 , d4 ) max{1 + 2, 1 + } = 3 /(2 + ). k 6. A combinatorial O(E Otherwise, if d3 + then consider P0,2 and P2,0 . for detecting even cycles of length k is presented in Clearly, P0,2 M (1 - d0 , 1 - d1 , 1 - d2 ) M (1 - - [AYZ97]. We conjecture that, for every even k , we have , 1 - - , 1 - + ) M (1 - , 1 - , 1 - ) (1 - ). On ck < 2 - 2/k , if is sufficiently close to 2. the other hand, P2,0 1 + d3 + d4 1 + ( + )+( - ) = 1 + 2 . Thus, C (d0 , d1 , d2 , d3 , d4 ) max{ (1 - ), 1 + 7 Concluding remarks 2 } = 3 /(2 + ). Otherwise, d0 , d1 , d3 are all greater Determining the values of ck , for k 6 is an interesting than + so consider P0,3 and P3,0 . Clearly, P0,3 open problem. As we said, we suspect the answer for max{P0,1 , P1,3 , M (1 - d0 , 1 - d1 , 1 - d3 )}. Now, P0,1 = 1, even values of k to be quite complicated. It would be P1,3 1 + d2 1 + and M (1 - d0 , 1 - d1 , 1 - d3 ) nice, at first, to obtain a proof of Conjecture 6.1. Fast rectangular matrix multiplication is used for M (1 - , 1 - , 1 - ) = (1 - ). Thus, P0,3 (1 - ). On the other hand, P3,0 1 + d4 1 + . Thus, speeding up graph algorithms also by Zwick [Zwi02], 259 Demetrescu and Italiano [DI00], and by Kratsch and Spinrad [KS03]. Finding other applications of rectangular matrix multiplications, or of the techniques developed here, is also an interesting problem. References [AYZ95] N. Alon, R. Yuster, and U. Zwick. Color-coding. Journal of the ACM, 42:844­856, 1995. [AYZ97] N. Alon, R. Yuster, and U. Zwick. Finding and counting given length cycles. Algorithmica, 17:209­ 223, 1997. [CLRS01] T.H. Cormen, C.E. Leiserson, R.L. Rivest, and C. Stein. Introduction to algorithms. The MIT Press, second edition, 2001. [Cop97] D. Coppersmith. Rectangular matrix multiplication revisited. Journal of Complexity, 13:42­49, 1997. [CW90] D. Coppersmith and S. Winograd. Matrix multiplication via arithmetic progressions. Journal of Symbolic Computation, 9:251­280, 1990. [DI00] C. Demetrescu and G. F. Italiano. Fully dynamic transitive closure: breaking through the O(n2 ) barrier. In Proc. of 41st FOCS, pages 381­389, 2000. [EG03] F. Eisenbrand and F. Grandoni. Detecting directed 4-cycles still faster. Information Processing Letters, 87(1):13­15, 2003. [HP98] X. Huang and V.Y. Pan. Fast rectangular matrix multiplications and applications. Journal of Complexity, 14:257­299, 1998. [KS03] D. Kratsch and J. Spinrad. Between O(nm) and O(n ). In Proc. of 14th SODA, pages 709­716, 2003. [Zwi02] U. Zwick. All-pairs shortest paths using bridging sets and rectangular matrix multiplication. Journal of the ACM, 49:289­317, 2002. 260