Dimension Reduction for Ultrametrics
Yair Bartal Manor Mendel

Abstract
We prove that an ultrametric on n p oints can b e emb edded in d with distortion at most 1 + , and d = O(-2 log n). p This b ound matches the b est known b ound for the sp ecial case of an equilateral space.

Proposition 1.1. For any 1  p < , and any 0 <  < 1, any n point ultrametric can be efficiently (1 + )-embedded in d , where d = p O( 1 log(2 min{p, p 
-1

})

log n).

1

Intro duction

The distortion of an embedding f : X  Y between the metric spaces (X, dX ) and (Y , dY ), is defined as
d X ( ,y )( dist(f ) = sup dY (f (xx,f )y)) ˇ sup dY (f (xx,f )y)) . d X ( ,y )( x=y x=y

When dist(f )  , we say that X -embeds in Y . The question of dimension reduction, i.e., given a set of n points in p what dimension is needed for embedding them in p with (1 + ) distortion, is a d basic problem in functional analysis. The JohnsonLindenstrauss lemma [9] states that d = O(-2 log n) for p = 2. By [8], it also follows from [9] that any set d of n points in 2 can be 1 +  embedded in p where d = O(-4 log(-1 ) log n), for 1  p  2. The p analog of the Johnson-Lindenstrauss lemma is impossible for p = 1 [6]. The case of p  {1, 2, } is open. / In this note we study dimension reduction in p of ultrametrics, which are special type of p metrics. An ultrametric X is a metric satisfying a strong form of the triangle inequality, d(x, z )  max{d(x, y ), d(y , z )}, x, y , z  X . Ultrametrics are important special metric spaces having many applications in biology, approximation and online algorithms. It is a folklore fact that n-point ultrametrics are isometrically embeddable (n) in O . p An equilateral set (which is a special case of ultrametric), can be (1 + )-embedded in d , with d = p O(-2 log n), by identifying it with a random subset of {0, 1}d. A lower bound on the dimension of d = ((2 log(-1 ))-1 log n) for (1 + )-embedding of n-point equilateral set in 2 is shown in [1]. We prove d

This improves upon [4], which provided O(1)(log n) embedding of ultrametrics into O . For p = , p (log n) ultrametrics are isometrically embeddable in O  [10]. As a consequence of Proposition 1.1, it follows from [3, 7] that any n point metric space can be O(log n) (log n) probabilistically embedded in O . Additionally, it p can be used to obtain metric Ramsey-type theorems for low dimensional p [5, 4]. 2 The emb edding

We will need the following refinement of ultrametrics. Definition 2.1. [3]1 For   1, a -exact hierarchically well-separated tree (-eHST) is a finite metric space H defined on the branches of a rooted infinite tree having a finite number of branches, where each branch is infinite. For each branch x, ai (x) is the vertex at depth i on x. Denote by lca(x, y ) the least common ancestor of x and y in T , i.e. the deepest vertex in x  y , and by dlca(x, y ) its depth. The distance is then defined to be dH (x, y ) = -dlca(x,y) . The following is a useful property of -eHST (cf. [5]): Lemma 2.1. For any  > 1, any finite ultrametric is  embeddable in some -eHST. We also need a Chernoff-type bound (cf. [2, A.1.16]).
-1 Lemma 2.2. Let (Xi )t=0 be a sequence of independent i randomi variables, E [Xi ]i = ľi and Xi  [0, M ]. Let S= Xi , and ľ = ľi . Then Pr[S - ľ   M ] < 2 2 e- /2t , and Pr[S - ľ  - M ] < e- /2t .

Science Department, The Hebrew University, Jerusalem. email: yair@cs.huji.ac.il, mendelma@yahoo.com. Supported in part by a grant from the Israeli Science Foundation (195/02), and by the Landau center.

 Computer

Proof of Proposition 1.1. Here we only consider the case p  1/4 (the case p > 1/4 is simpler). The construction as described below is not algorithmic, yet it can be made so with minor changes.
definition appearing here is slightly different. The infiniteness of the tree in the current definition is inessential.
1 The

Copyright Š 2004 by the Association for Computing Machinery, Inc. and the Society for industrial and Applied Mathematics. All Rights reserved. Printed in The United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Association for Computing Machinery, 1515 Broadway, New York, NY 10036 and the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688

664


Assuming d  (2p)2 , we obtain, by applying Let U be an n-point ultrametric. Using Lemma 2.1, we first -embed U in an n-point -eHST H , where  Lemma 2.2, Y  will be determined later. Denote by T the defining tree Pr k,s > (1 + )p [Yk,s ] of H .  Y as Denote by d the dimension of the host space, to be Mk Pr k,s - [Yk,s ] > determined later. Let (ei )i{0,...,d-1} be the standard 16(2p)2/ log(d ) Y < T - as orthonormal basis in Ęd , and let (ei )iĆ be its extension Pr k,s - [Yk,s ] > ex p a2 s/ 5 0 0 0 Mk to a periodic sequence, i.e., ei = ei mod d . 50 For each vertex u of T we attach a random bit he events {Yk,s < (1 - )p [Yk,s ]} are estimated d/s-1 bu  {0, 1}. The embedding f : H  Ęd is defined similarly. Since W = Ydlca(x,y)+1+j s,s , if all j =0 to be 1+ the bad events above do not happen, we obtain  1-  i embedding of U in p . d -i bai (x) ei . (2.1) f (x) = We are left to fix the parameters , d, and s, as =0 functions of n, , and p such that with a positive probFix x, y  H , and denote by Zj = |(f (x) - f (y ))j |. ability non of the events above will happen. Assume max{-1 , d}  n/2 (otherwise the isometric embedding Let ij = min{i : i > dlca(x, y ), i  j (mod d)}. Then suffices). Fix, p  l p p2 0 0  Zj  -ij -ld = Mij ,  = exp( 20000 log n ), s = 2p2000 log n, d = 2p log 2p s. 2
=0

where Mk =

 . (1- -d )p

-k p

Lemma 2.3. If d  2, then
p [Zj ] 

1 8(2p)2/ log(d )

ˇ

-ij p . (1 - -d )p

Indeed,  = 1 + O(), sp  e, d  (2p)2 . Since a  p, exp(-a2 s/5000d < n-4 . We use the union n) bound: There are 2 2 < n3 /2 "bad" events and therefore with overwhelming probability, this is 1 + O() g embedding in 2 , where d = O( lop (2p) log n). d 2 References
[1] N. Alon. Problems and results in extremal combinatorics, I. Discrete Math. In press. [2] N. Alon and J. H. Sp encer. The probabilistic method. John Wiley & Sons, New York, second edition, 2000. [3] Y. Bartal. Probabilistic approximation of metric spaces and its algorithmic applications. In Proc' of the 37th FOCS, pages 184­193, 1996. [4] Y. Bartal, N. Linial, M. Mendel, and A. Naor. Low dimensional emb eddings of ultrametrics. European J. Combin., 2003. In press. [5] Y. Bartal, N. Linial, M. Mendel, and A. Naor. On Metric Ramsey-typ e Phenomena. In Proc' of the 35th STOC, pages 463­472, 2003. [6] B. Brinkman and M. Charikar. On the imp ossibility of dimension reduction in 1. In the 44th FOCS, 2003. [7] J. Fakcharoenphol, S. Rao, and K. Talwar. A tight b ound on approximating arbitrary metrics by tree metrics. In the 35th STOC, pages 448­455, 2003. [8] T. Figiel, J. Lindenstrauss, and V. D. Milman. The dimension of almost spherical sections of convex b odies. Acta Math., 139(1-2):53­94, 1977. [9] W. B. Johnson and J. Lindenstrauss. Extensions of Lipschitz mappings into a Hilb ert space. In Conf ' in modern analysis and probability, pages 189­206. 1984. [10] N. Linial, E. London, and Y. Rabinovich. The geometry of graphs and some of its algorithmic applications. Combinatorica, 15(2):215­245, 1995.

Proof. For any l  Ć ,  Z -d l 4-(l+1) , Pr j  -ij 1- -d 1-
p so, [Zj ] 

-ij p (1 - -dl )p ˇ . 4(1 - -d )p 4l

Taking l =

log 2p log( d )

, we conclude the claim.

Ł

Let W = Wx,y = f (x) - f (y ) p . Note that p 1/p -dlca(x,y ) [Wx,y ] / is a constant independent of x, y , so we are left to prove that with high probability, W 1/p is close enough to [W ]1/p . We do so by partitioning W to terms as follows. Let s  Ć , s|d, k  [dlca(x, y ) + k +s p 1, dlca(x, y ) + d], and Yk,s = i=k -1 Zi mod d . Applying Lemma 2.3,
[Yk,s ] 
1 8p2/ log(d ) (1- -d )p k+s-1 i =k

-ip

1 - -sp =  -k p d) 8(2p)1/ log( (1 - -d )p (1 - -p ) s Mk ,  8(2p)2/ log(d ) 2 where the last inequality is valid when sp  e.

665