Multicommodity Facility Location
R. Ravi
Abstract Multicommodity facility location refers to the extension of facility location to allow for different clients having demand for different goods, from among a finite set of goods. This leads to several optimization problems, depending on the costs of opening facilities (now a function of the commodities it serves). In this paper, we introduce and study some variants of multicommodity facility location, and provide approximation algorithms and hardness results for them. 1 Intro duction Facility location refers to the class of problems involving the location of facilities to serve clients in a metric space. The classical uncapacitated facility location problem (UFL), for example, is defined in a metric space, a subset of whose points are clients and another subset consists of facilities. Each facility has an opening cost, and the ob jective is to compute a solution which minimizes the total cost of open facilities and the sum of the distances from each client to its nearest open facility. This problem and its variants, have been studied extensively (See Section 1.3). In this paper, we study a natural extension of the above model where there is a finite set of commodities, and each client demands a subset of those commodities. We call this Multicommodity facility location (MCFL). Facility costs are now a function of the location and the commodities served. This model has many practical applications. For example, a consumer goods manufacturer who produces five different goods might choose to locate three plants to manufacture them, to utilize the economies of scale arising from having one plant producing two commodities instead of a separate plant for each commodity. 1.1 Mo del Our model and notation extend the wellstudied UFL model. There is a set of clients D, and
 GSIA, Carnegie Mellon University. ravi@cmu.edu. Supp orted in part by NSF grant CCR-0105548 and ITR grant CCR-0122581 (the ALADDIN pro ject).  GSIA, Carnegie Mellon University. asinha@andrew.cmu.edu. Supported in part by a CBI graduate fellowship, NSF grant CCR0105548 and ITR grant CCR-0122581 (the ALADDIN pro ject).

A. Sinha
a set of facilities F , with a distance function c : D × F  I + which is a metric. There is also a set of R commodities S , and each client j has demand for one unit of commodity dj  S . The number of clients |D|, facilities |F | and commodities |S | are abbreviated n, m and k respectively. Facility i can be opened in any configuration  (i)  2S , specifying which commodities it is serving, at a cost fi ( )  I + . We assume that R the facilities have decreasing marginal costs, that is, for any facility i and any two configurations ,  , we have fi (   )  fi ( ) + fi ( ). We remark on the implications of relaxing this assumption later. In a version we dub the t-MCFL, every facility has a subset T  S of upto t commodities which may be involved in its allowable configurations reducing the input size from 2k to 2t per facility. Going one step further, we define a compact class of natural facility cost-functions that are linear. In this model, each facility i has an opening cost (fixed cost) fi0 , and for each commodity s, an incremental cost fis , so that the cost of s pening facility i in configuration  is fi ( ) = fi0 + o s  fi . Linear cost functions also satisfy decreasing marginal costs, and in many real situations are a close approximation to general cost functions. Since a linear cost function can be compactly represented by k + 1 numbers, there is hope of devising algorithms with running time polynomial in k , n, m instead of 2k , n, m. A feasible solution is specified by a set of facilities F , together with a configuration  (i) of each facility i  F , such that each commodity which has at least one client demanding it is served by at least one facility in F . Given F , each client j is assigned to (j )  F , the nearest open facility which includes commoidity dj in its configuration. The total facility cost is F fi ( (i)), the sum of the costs incurred in opening each facility in F j in its chosen configuration. The total service cost is D cj,(j ) , the sum of distances from each client to the facility assigned to it. The total cost is the sum of these two, and the ob jective is to minimize the total cost. While the general model allows for clients to specify a vector of demands for the different commodities, we will study the simplified version where each client demands a single unit of a single commodity. This is for ease of exposition; all our algorithms extend

Copyright © 2004 by the Association for Computing Machinery, Inc. and the Society for industrial and Applied Mathematics. All Rights reserved. Printed in The United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Association for Computing Machinery, 1515 Broadway, New York, NY 10036 and the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, PA 19104-2688

342


to the general case - for example, if a client requires several commodities, we can create several copies of the client each requiring a single commodity. In fact, we can even capture the model when service costs "scale" differently for different commodities, by simply weighting the demand at the clients appropriately. 1.2 Our results and pap er outline We begin with a brief survey of existing related work in the next section. In the next section, we address the main t-MCFL problem. In Section 2.1, we first show that the MCFL problem with linear facility costs is as hard as weighted set cover (Theorem 2.1). We match this with a logarithmic approximation algorithm (Theorem 2.5) in Section 2.3 for the problem with linear facility costs running in time O(poly(n, m, k )). Our hardness result also implies that the t-MCFL problem is MAX-SNP hard even for t = 3 via a reduction from weighted set cover with upto t elements per set. Note that the input size to this problem is (2t poly(n, m)). Moreover, the best-known performance ratio for the closely related t-set cover problem achievable in running time O(2t poly(n, m)) is O(log t) [6]. In Section 2.2, we give an O(log t)-approximation for the general t-MCFL problem running in time O(2t poly(n, m)) (Theorem 2.3) matching the best-known results for t-set cover. We study MCFL with additional restrictions in Section 3. First, we study the priority facility location (PFL) problem: Here, each client also has a level-ofservice requirement, indicated by an integer between 1 and k . Each facility has a cost function which is non-decreasing in its level-of-service, ie, fi (s)  fi (s ) whenever s  s . The ob jective is to construct a minimum cost solution where client j with priority dj is served by a facility with level-of-service at least dj . We may think of each priority level as a commodity, so that priority facility location (PFL) is a special case of MCFL. The priority model occurs very often in realworld scenarios; Charikar, Naor and Scheiber [4] studied priority Steiner tree to model multicast networks with level-of-service requirements. While we can model PFL as a special case of MCFL by associating a commodity with each priority level, this results in facility cost functions which are not linear. Hence, the logarithmic hardness result in Theorem 2.1 does not extend to PFL. We provide a 7-approximation algorithm (Theorem 3.1) for PFL in Section 3.1, by adapting techniques of filtering and rounding fractional LP solutions [8, 13]. Once again, the facility cost functions have compact representations, and our algorithm also has running time O(poly(n, m, k )). Our algorithm extends the ideas of the MCFL algorithms, by using the filtering technique

of Lin and Vitter twice, once for service costs as in [13], and once more for facility costs. Finally, we consider the restriction of MCFL where each commodity can be manufactured at only one facility, for example, due to licensing restrictions, or because it is easier to maintain quality control when all goods of a commodity are being manufactured at the same facility. Our hardness result in Theorem 2.1 extends to this case because in the hardness construction, we only have one client per commodity, and hence only the facility serving this client needs to manufacture its commodity. We can extend the results on general t-MCFL to this case to give an O(log t)-approximation for this extension (Theorem 3.2). In the very special case where each facility cost function fi ( ) = fi for all  = , we show this improves to a constant factor approximation (Theorem 3.3). We conclude with some open questions in Section 4. 1.3 Related work The uncapacitated (single commodity) facility location problem has received a lot of attention lately. The first constant factor approximation uses LP rounding and is due to Shmoys, Tardos and Aardal [13]; our algorithm builds on their results and those of Lin and Vitter [8]. Other work includes the current best approximation ratio of 1.52 due to Mahdian, Ye and Zhang [9], and an inapproximability threshold of 1.46 due to Guha and Khuller [5]. The facility location model has been extended to include capacities on facilities [11], online facility location [10], restrictions on the number of facilities opened (the k -median problem) [2], combining facility location with network design [12], etc. However, none of these models incorporate multiple commodities; to the best of our knowledge, our algorithms are the first non-trivial approximation algorithms for multicommodity facility location. Note that an O(log n) approximation can be obtained using a greedy algorithm, such as the following: Repeatedly choose to open a facility and configuration that minimizes the ratio of facility opening cost plus service costs to a subset of clients to the number of clients served by the chosen facility-configuration pair. Also, an O(k )-approximation can be obtained by considering each commodity separately and combining the resulting approximate solutions. Due to the generality and applicability of the multicommodity paradigm, it has been well studied in a lot of contexts, evidenced, for example, by the vast amount of work on multicommodity flow. Also, the multicommodity rent-or-buy problem [7] models a problem where we are required to construct a network which connects all clients of the same commodity, under a special concave cost function on the edges. The priority (or level-of-

343


service) model we study in Section 3.1 is inspired by a recent paper on priority Steiner trees by Charikar, Naor and Scheiber [4], who obtain an O(log n) approximation for their problem. However, in the context of MCFL, the priority model turns out to be a more tractable special case, yielding a constant-factor approximation. 2 Facility lo cation with multiple commo dities We study the general multicommodity facility location problem (MCFL) as defined in Section 1.1. After showing some hardness results, we provide an O(log t) approximation algorithm for the t-MCFL problem in Section 2.2; we then extend our algorithm to a faster algorithm when the facility cost function is linear in Section 2.3. 2.1 Hardness We prove the hardness of MCFL follows by reducing from set cover. An instance of (weighted) set cover is specified by a collection C of subsets of S , each with a weight wc . The ob jective is to find a minimum weight sub-collection C  C such that cC c = S . Arora and Sudan [1] showed that it is impossible to approximate set cover better than (log p) unless P = N P where p is the maximum size of a set.

when the maximum number of allowable commodities in any facility's configuration t equals the total number of commodities k , for simplicity. We then outline briefly the changes to adapt the result to the t-MCFL problem at the end of the subsection. 2.2.1 Overview Our algorithm begins by solving the LP relaxation of the Balinski integer program formulation extended to MCFL. We then filter the solution so that each client is fractionally assigned only to facilities which are close to it. Following this, we select a set of representative clients, and assign all other clients to these representatives. We then view the fractional solution consisting of the fractionally opened facilities and the representatives as a fractional solution to an instance of k -set cover. (The k -set cover problem is a special case of weighted set cover where each set has no more than k elements.) This k -set cover instance can be rounded to an integer solution within a factor of O(log k ) using standard techniques. Finally, we assign the remaining clients to the facilities who serve their representatives. The rounding process resembles the technique used by Shmoys, Tardos and Aardal [13] for UFL, but with suitable adaptations to incorporate the multiple commodities.

Theorem 2.1. Any -approximation algorithm for MCFL with linear facility costs yields a -approximation 2.2.2 IP formulation The IP formulation extends algorithm for weighted set cover. Balinski's formulation [3] of UFL to MCFL. We main Proof. Given an instance of weighted set cover specified tain an indicator variable yi for each configuration  of  by (S, C, w), we transform it into an instance of MCFL. each facility i. Variable xij is 1 iff client j is served by The set of commodities is the set of elements S . We also facility i in configuration  . have one client d  D for each commodity s  S , and i i  a facility i for every set c  C . The distance between min cij xj + fi ( )yi (I PM C F L ) i ,j, , any client and any facility is zero. The cost function of i facility i is fi ( ) = wc if   c and fi ( ) =  otherwise. xj  1 j  D i Any solution to this MCFL instance with cost :dj  less than  is a feasible solution to the set cover  xj  yi j  D, i  F,   2S i instance. Moreover, any minimal solution selects each x, y non-negative integers open facility in only one configuration (namely, its maximal configuration), so that the total cost of the Let LPM C F L denote the linear relaxation of MCFL solution is the same as the cost of the associated I PM C F L . The size of LPM C F L is polynomial in n, m set cover solution. Hence any  approximation for and 2k ; therefore LPM C F L can be solved to optimality MCFL yields a  approximation for weighted set cover. in running time polynomial in its size. We use the optiFinally observe that the facility costs are linear and mal solution of LP M C F L , denoted (x, y ) with ob jective satisfy the decreasing marginal costs property. function value OP TM C F L , as our lower bound. We also note that weighted set cover with upto t elements per set and a total of n elements to cover 2.2.3 Filtering The next step of the algorithm uses is MAX-SNP-hard even when we allow the algorithm the filtering technique of Lin and Vitter [8], as was also running time O(2t poly(n, m)) [6], implying the same for done in [13]. We fix a constant 0 <  < 1. For every t-MCFL via the above reduction. client j , we idefine its optimal fractional service cost to  be c = j , cij xij . Order the facilities which serve 2.2 Approximation algorithm for t-MCFL We client j according to non-decreasing distance from j . first describe in detail the version of the algorithm The  point gj () for client j is the smallest distance

344


  c such that j ,i:cij c xij  . j The following theorem, due to Shmoys, Tardos and Aardal [13], can be proved along similar lines because we can treat each facility-configuration pair as a distinct facility.

cost no more than

i

,

fi ( )y  . i

Proof. Since we select the set of representatives R in such a way that no two representatives of the same commodity I Pk-S C are served by the same facility, the cardinality of any set in I Pk-S C is no more thain k . Also,  since (x, y ) is feasible for LPM C F L , we have Theorem 2.2. [13] The fractional solution (x, y ) can , xij  1    be transformed in time polynomial in n, m, 2k to a for every j  R, and zi = y i  xij guarantees the fractional solution (x, y ) such that (i) xj > 0  cij  feasibility of z for the linear relaxation of I Pk-S C . This i 1 c for al l i  F,   2S , j  D; (ii) c  1- c ; (iii) also bounds the cost of the fractional solution. j j j Lemma 2.2. There exists an integer solution (x|R , y ) ^^ such that (i) xj  yi ; (ii) xj = 1 onliy if xj > 0; ^i ^ ^i i ^ ^ 2.2.4 Selection of representatives The existence (iii) yi = 1 only if y  > 0; (iv) i , fi ( )yi  i 1 1  of multiple commodities presents several difficulties if we Hk , fi ( )y i , where Hk = 1 + 2 + . . . + k  log k . attempt to round the fractional solution (x, y ). Hence we introduce a new step where we select a set of clients Proof. We know that the integrality gap of k -set cover as representatives such that no two representatives of is no more than log k (for example, see [14]). Hence ^ the same commodity are fractionally served by the same there exists an integer solution z of I Pk-S C with cost i no more than log k fi ( )y  . We therefore set i facility. We do this representative selection indepen, y = z , guaranteeing (iii) and (iv) above. Also, using ^ ^ dently for each commodity. ^ Fix a commodity s, and consider all clients which the feasibility of z and the construction of I Pk-S C , for  every client j  R we must have yi = 1 for at least ^ require commodity s in increasing order of cj . Let Ds =  {j1 , j2 , . . . , jns } be the clients in this order. Iteratively, one facility-configuration pair (i,  ) such that xij > 0.  ^ mark the smallest index (smallest c ) client j  Ds as a Define xij = 1 for this (i,  ) pair, and zero everywhere j  else. This guarantees (i) and (ii). representative. All clients j Ds such that there exists a facility-configuration pair (i,  ) such that xj > 0 and i In our final solution, we only open those facilityxj > 0 are removed from the list Ds (In this case, note configuration pairs with y  = 1. Theorem 2.2 and i ^ that dj = dj   ). We do not consider client j as a Lemma 2.2 bound the faciility opening cost of by Hk  candidate for being a representative, and instead mark of the facility cost of OP T M CF L. . client j as the representative for client j Let R denote the set of representatives over all commodities, and x|R 2.2.6 Bounding the service cost Clients in R are denote the set of xj variables for j  R. i assigned to facility-configuration pairs (i,  ) according 2.2.5 Interpretation as a fractional k -set cover solution The k -set cover problem is a special case of the set cover problem when each set has cardinality no more than k . We use our fractional solution to construct a k -set cover instance and show that y is a fractional solution of the k -set cover instance. We use the following IP formulation for our instance of k -set cover. We create a set (i,  ) with cost fi ( ) for every facility-configuration  pair (i,  ). Let zi be 1 if set (i,  ) is included in our solution. Our universe consists of all clients in R, and a client j  R is included in a set (i,  ) if and only if xj > 0. i i  min fi ( )zi (I Pk-S C )
, / to the variables xj . For a client j  R, we assign ^i it to the facility-configuration pair which serves the representative j of client j . Clearly, this results in a feasible solution of I PM C F L .

y   min{1, i

 yi }

for al l i  F,   2S .

Lemma 2.3. Each client is assigned to a facility at 3 distance no more than 1- c . j Proof. For each client j  R, Theorem 2.2, parts (i) and c j (ii) guarantee that xj > 0 only if cij  1- . Consider i / a client j  R, and let j  R be its representative. We 2 therefore have c  c , so that cj j  2c  1- c . j j j j Since client j is assigned to the facility (j ) which serves 3 2 j , we have c(j )j  c(j )j + 1- c  1- c . j j Using Lemmas 2.2 and 2.3, we can bound the total cost of our solution.

s.t.

(
i, ):x >0 ij

Theorem 2.3. Our algorithm produces a solution with total facility cost no more than Hk times the optimal  3 Lemma 2.1. I Pk-S C is an instance of k -set cover, and facility cost and service cost no more than 1- times z = y is a feasible solution for its linear relaxation of the optimal service cost for any 0 <  < 1.

 zi

 1 j  R

345


Suppose each facility i has an opening cost (fixed cost) fi0 , and for each commodity s, an incremental cost fis , such that the cost of opening facility i in s s configuration  is fi ( ) = fi0 + We call  fi . this class of facility cost functions linear. Linear cost functions also satisfy decreasing marginal costs, while 2.2.7 Consequences of decreasing marginal in many real situations linear cost functions are a close costs The algorithm above essentially treats each approximation to general cost functions. Since a linear facility-configuration pair as a distinct facility. Hence it cost function can be compactly represented by k + 1 may end up opening the same facility in more than one numbers, one would expect an algorithm with running configuration. However, if we have decreasing marginal time polynomial in k , n, m instead of 2k , n, m. We show costs (fi (   )  fi ( ) + fi ( )), we can open the fa- that a slight modification of our algorithm satisfies this. cility in exactly one configuration, which is the union of all the configurations chosen for the facility. Decreasing 2.3.1 IP formulation We use a slightly different 0 marginal costs guarantees that this only improves the IP formulation for linear MCFL. Variable yi indicates s cost while maintaining feasibility. whether or not facility i is opened, and variable yi If we did not have decreasing marginal costs, our indicates whether or not facility i is serving commodity algorithm might require us to open multiple copies of s. Recall that dj is the commodity demanded by client facilities, in different configurations. The model where j . each facility is restricted to be opened in a single i i sk configuration in the absence of decreasing marginal s min cij xij + fis yi (I PL-M C F L ) costs can be used to model facility location with hard ,j F =0 capacities [11] by setting the cost of opening a facility i xij  1 j  D of capacity u to be infinite for any set of more than u commodities. For this problem, the only known d (constant-factor) approximation algorithm uses local xij  yi j j  D, i  F search [11], and no IP formulation with a bounded s 0 yi  y i i  D, s  S integrality gap is known. x, y non-negative integers 2.2.8 Extension to t-MCFL The main modificaLet LPL-M C F L denote the linear relaxation of tion is that the IP formulation now only involves upto I PL-M C F L . The size of LPL-M C F L is poly(n, m, k ), 2t configutations per facility i which is the power set and hence we can solve it optimally in time of the set of commodities Ti  S that the facility can O(poly(n, m, k )). Let (x, y ) denote an optimal solution produce. this implies that the transformed fractional of LPL-M C F L ; as before, we use its value as our lower solution can be interpreted as a fractional t-set cover bound. problem. The rest of the modifications are straightforward and lead to the following result. 2.3.2 Filtering and selection of representatives Theorem 2.4. The above algorithm can be adapted to The first two steps of our algorithm are identical to the output a solution with total facility cost no more than algorithm for general MCFL. We fix our constant  and Ht  times the optimal facility cost and service cost no obtain an -close solution (x, y ), and we select a set of 3 more than 1- times the optimal service cost for any representatives R such that no two clients with the same commodity are served fractionally by the same facility. 0 <  < 1. 2.3 Linear cost functions The running time of the above algorithm is O(2k poly(n, m)) because if we have no other assumptions on facility costs, we are forced to tabulate the cost of each facility in every configuration. However, if the facility cost functions have compact representations, it is possible to improve the running time. In this section, we study a class of facility cost functions which are compact and yet useful, and show that our algorithm runs in time O(poly(n, m, k )) for these instances. 2.3.3 Interpretation as a fractional k -set cover solution Our k -set cover instance is defined as in I Pk-S C , with an element for every client j  R and a set for every facility-configuration pair. Client j is included in a facility-configuration pair (i,  ) only if xij > 0 and dj   . This by itself is an instance which is exponential in the number of commodities. However, the following lemma casts our fractional solution y as a fractional solution to the k -set cover instance so that only polynomially many facility-configuration pairs

Notice that in the above result, we might be able to trade off  with respect to k to get improved guarantees for specific instances depending on the relative contributions of facility costs and service costs in an LP relaxation.

346


have non-zero fractional variables. This allows us to Once again, I PP F L has size polynomial in n, m, k . restrict our attention to a k -set cover instance (and Let (x, y ) denote the optimal solution to the linear fractional solution) of size O(poly(n, m, k )). relaxation LPP F L of I PP F L . 3.1.2 Two-stage filtering The first step of our algorithm is unchanged; we obtain an -close solution (x, y ) satisfying the properties mentioned in Theorem 2.2. Our next step is the crucial second-stage filtering which enables us to take care of the priorities. The idea is to obtain a fractional solution where each client is Proof. Consider facility i, and order the commodities served by facilities which cost no more than 1 times the so that y 0  y 1  y 2  . . .  y k . Let [s] = average cost of the facilities serving it, for  ome fixed i i i i s {1, 2, . . . , s}. We open facility i in configuration [s] constant  . [s] to extent zi = y s - y s+1 for s = 1, 2, . . . k - 1, and i i [k ] zi = y k . The fractional solution z can be verified to Lemma 3.1. The fractional solution (x, y ) can be transi formed in time polynomial in (n, m, k ) to a fractional satisfy the properties stated in the lemma. solution (x, y ) such that (i) (x, y ) is feasible for LPP F L ; ~~ ~~ k i xsj  1 for al l clients j  D; (iii) ~i The rest of the algorithm proceeds as before, by first (ii) s=dj F i s rounding the fractional k -set cover solution and then xs > 0  f (s)  1 k ~ij i s=dj F xij fi (s) for al l j  D ;  assigning clients in D \ R to the facilities which serve 1 (iv) yi  min{1, 1- y s } for al l facilities i  F and al l ~s i their representatives. This yields the following analog s = 1, 2, . . . , k . of Theorem 2.3. Proof. For each client j , we do the following. Order Theorem 2.5. There is an algorithm running in time the variables xsj in non-decreasing order of fi (s), and i O(poly(n, m, k )) which produces a solution for linear  define the  -cost to be the smallest cost fj such that Hk ( MCFL with facility cost no more than  times the 1 s s s ~  3 i,s):fi (s)fj xij   . Define xij = min{1, 1- xij } if optimal facility cost and service cost no more than 1-  fi (s)  fj , and 0 otherwise. Finally, for every facilitytimes the optimal service cost. 1 priority pair (i, s), define yi = min{1, 1- y s }. It ~s i 3 Extensions can easily be verified that (x, y ) satisfies the properties ~~ 3.1 Priority facility lo cation In this section, we stated. provide a constant-factor approximation algorithm for The traditional (first) filtering according to service PFL. Recall that the facility cost functions have comcosts allows each representative client to be served by pact representations, and hence our algorithm also has any facility with xij > 0 and other non-representative running time O(poly(n, m, k )). Our algorithm extends clients by a nearby representative's facility. The new the filter-and-round ideas of the MCFL algorithms, by (second) filtering according to facility costs that we using it twice, with the second application aimed at satintroduce allows each representative client to open the isfying the priority constraints. highest priority facility with xij > 0 thus serving all other non representative clients depending on it at an 3.1.1 IP formulation We begin with the following adequate priority level. s IP formulation for PFL. Variable yi indicates whether or not facility i is operating at level-of-service s, and 3.1.3 Rounding We now round (x, y ) into an integer ~~ variable xsj is 1 if client j is served by facility i operating i solution (x, y ) which is feasible for I PP F L and doesn't ^^ at level s. cost much more than (x, y ). ~~ k k We i rocess clients j  D in increasing order of p is is s s min cij xsj + fi (s)yi (I PP F L ) c = i j ,s cij xij . Initially, all clients are "unserved". ,j =1 F =1 Pick the unserved client with least c , and let Fj be the j set of all facility-priority pairs (i, s) with xsj > 0. Let ~i sk i (i , s )  Fj maximize s among all facility-priority pairs xsj  1 j  D i i =dj F in Fj . Open facility i in priority s (ie, set y s = 1), and ^ s close all others in Fj (ie, set yi = 0). All clients j such ^s xsj  yi j  D, i s that xij > 0 for some (i, s)  Fj are assigned to (i , s ) ~ i  F, s = 1, 2, . . . , k i (ie, we set xs j = 1 and xsj = 0 for all other (i, s)). ^ ^i x, y non-negative integers Lemma 2.4. There is a fractional solution z to the k -set (  cover instance I Pk-S C such that (i) i, ):x >0 zi  1 ij for al l j  R; (ii) for every facility i, there are at most  k configurations for which zi > 0; (iii) the total cost of i  the fractional solution is no more than , fi ( )y i .

347


Theorem 3.2. There is a polynomial time algorithm Theorem 3.1. The procedure outlined above runs in for the restricted version of t-MCFL with approximation time O(poly(n, m, k )) and produces an integer solution ratio O(log t). (x,i y ) such that (i) (x, y ) i is feasible for I PP F L ; (ii) Proof. The service cost of our solution is bounded by ^^ ^^ 1 s fi (s)yi   (1- ) ^s ,s fi (s)yi ; (iii) every client a constant by Lemma 3.2 and Theorem 2.4, while the ,s 3 j is assigned to a facility i such that cij  1- c . facility cost of our solution is bounded within O(log t) j of the optimal since we only open facilities opened by 1 Choosing  = 4 and  = 2 yields a 7- the relaxation where we ignore the constraint that every 7 approximation. This can probably be improved by tech- commodity be served at a single facility. This yields the niques such as scaling the facility and service costs dif- theorem. ferently. When the facility cost is the same for all non-null configurations of a facility (fi ( ) = fi  =  for all 3.2 Unique facilities for commo dities In this section, we extend the algorithm in Section 2 to the variant facilities i), we can apply any existing constant-factor approximation for UFL to get the following result. where a single facility must serve each commodity. 3.2.1 Selection of representatives Let Ds be the set of clients which have demand for commodity s. Dejfine the star cost of a node j  Ds to be cj = ^ Let js be the node with minimum star D cj j . s cost among all nodes in Ds , and select js to be the representative for commodity s. Let VS be the set of all representatives. 3.2.2 Selection of facilities Consider the t-MCFL instance obtained by relaxing the requirement that a single facility serve each commodity. Compute an approximate solution to this (using method APX, say), and let (j ) be the facility which serves client j in this solution. Assign all clients in Ds to be served by (js ). Let  (j ) denote the facility which serves client j in some fixed optimal solution. j Lemma 3.2. D cj,(j ) is no more than S C · O P TS C j plus 4 · cj, (j ) , where OP TS C is the service cost D of an optimal solution, and S C is the approximation ratio of APX for the service cost. Theorem 3.3. There is a constant-factor approximation algorithm for the UFL problem when clients are partitioned into classes (commodities) and al l clients in the same class are required to be served by the same facility. 4 Conclusion We have introduced and provided approximation algorithms for various models of multicommodity facility location. All of these algorithms match the best known ratios for the appopriate set covering variants they generalize. Since the number of commodities t involved in a facility is likely to be much smaller than the number of clients n, the O(log t)-approximation ratio represents a significant improvement in the state of the art from the naive O(min(k , log n) result. An intriguing questions is whether the running time dependence on 2k can be removed for general MCFL. In particular, given a cost oracle which could answer questions of the form "Is the cost of serving configuration  no more than x?" can we use such an oracle to obtain an approximation algorithm with running time polynomial in n, m, k with only polynomially many calls to such an oracle? There is hope for such a solution under decreasing marginal facility costs.

This is feasible, since xsj > 0  di  s  di  s , ~i because s was the maximum priority in Fj . We mark all such clients "served", as well as client j . We repeat this with unserved clients till all clients are served. The following theorem can be proved along the same lines as Theorem 2.3.

by a single facility. By triangle inequality, we must have cjs , (j )  cj,js + cj, (j ) , implj ing that cj,(j )  y 2cj,js + cj, (j ) . Observing that ^ Ds cj,js  cjs we can sum the inequality bounding cj,(j ) over all clients j to get the bound claimed in the lemma.

Proof. Consider a commodity s, and let  (s) be the facility which serves it in the fixed optimal solution.  Let js  Ds be the client which minimizes cj, (s) among all clients in Ds . Summing the inequality  cjs , (s)  cj, (s) over all j  S implies that cjs  References ^ j 2 cj, (s) . Sinjce our representative js minimized Ds cj , we have cjs  ^ ^ Ds 2cj, (s) . [1] S. Arora and M. Sudan. Improved low degree testing Next, consider a client j  Ds , and let  (j ) be and its applications. In Proceedings of the 29th Annual the facility which serves it in the relaxation where we ACM Symposium on Theory of Computing, 485-495, ignored the constraint that each commodity be served 1997.

348


[2] V. Arya, N. Garg, R. Khandekar, V. Pandit, A. Meyerson and K. Munagala. Local search heuristic for k-median and facility location problems. In Proceedings of the 33rd Annual ACM Symposium on Theory of Computing, 21-29, 2001. [3] M.L. Balinski. On finding integer solutions to linear programs. In Proc. IBM Scientific Computing Symposium on Combinatorial Problems, 225-248, 1966. [4] M. Charikar, J. Naor and B. Scheiber. Resource optimization in QoS multicast routing of real-time multimedia. In Proceedings of IEEE INFOCOM, 1518-1527, 2000. [5] S. Guha and S. Khuller. Greedy strikes back: Improved facility location algorithms. In Proceedings of the 9th ACM-SIAM Symposium on Discrete Algorithms, 649657, 1998. [6] M. Halld´rsson. Approximating k-set cover and como plementary graph coloring. In Proceedings of the 5th International Conference on Integer Programming and Combinatorial Optimization, 118-131, 1996. [7] A. Kumar, A. Gupta and T. Roughgarden. A constantfactor approximation algorithm for the multicommodity rent-or-buy problem. In Proceedings of the 43rd Annual IEEE Symposium on Foundations of Computer Science, 333-342, 2002. [8] J-H. Lin and J. Vitter. -approximations with minimum packing constraint violation. In Proceedings of the 24th Annual ACM Symposium on Theory of Computing, 771-782, 1992. [9] M. Mahdian, Y. Ye and J. Zhang. A 1.52 approximation algorithm for the uncapacitated facility location problem. In Approximation Algorithms for Combinatorial Optimization, 229-242, 2002. [10] A. Meyerson. Online facility location. In Proceedings of the 42nd Annual IEEE Symposium on Foundations of Computer Science, 426-431, 2001. [11] M. Pal, E. Tardos and T. Wexler. Facility location with nonuniform hard capacities. In Proceedings of the 42nd Annual IEEE Symposium on Foundations of Computer Science, 329-338, 2001. [12] R. Ravi and A. Sinha. Integrated logistics: Approximation algorithms combining facility location and network design. In Proceedings of the 9th International Conference on Integer Programming and Combinatorial Optimization, 212-229, 2002. [13] D. Shmoys, E. Tardos and K. Aardal. Approximation algorithms for the facility location problem. In Proceedings of the 29th Annual ACM Symposium on Theory of Computing, 265-274, 1997. [14] V. Vazirani. Approximation Algorithms, SpringerVerlag, Berlin, 2001.

349