Online Prediction with Privacy

Jun Sakuma  jun@cs.tsukuba.ac.jp
University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577 Japan
Japan Science and Technology Agency, 4-1-8, Honcho, Kawaguchi, Saitama, 332-0012 Japan

Hiromi Arai  arai.hiromi.ga@u.tsukuba.ac.jp
University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577 Japan

Appearing in Proceedings of the 27th International Conference on Machine Learning, Haifa, Israel, 2010. Copyright 2010 by the author(s)/owner(s).

Abstract

In this paper, we consider online prediction from expert advice in a situation where each expert observes its own loss at each time step while the loss cannot be disclosed to others for reasons of privacy or confidentiality. Our secure exponential weighting scheme enables exploitation of such private loss values by making use of cryptographic tools. We prove that the regret bound of secure exponential weighting is the same as, or almost the same as, that of the well-known exponential weighting scheme in the full information model. In addition, we prove theoretically that secure exponential weighting is privacy-preserving in the sense of secure function evaluation.

1. Introduction

Online prediction with expert advice is a type of learning algorithm in which a learner predicts the classification of sequentially generated examples. At each time point, experts independently make a prediction for the next example. Then, the learner makes a prediction using the predictions provided by the experts. Typically in this setting, no statistical assumptions are made about the process generating the target sequence.
The performance of online prediction algorithms is often evaluated by the regret, which is the difference between the loss of the learner and the loss of the expert suffering the least loss. If the learner can observe the losses of all the experts, the setting is referred to as the full information model. If the learner can observe the loss of only one expert chosen by the learner, the setting is called the partial information model.

In this study, we consider a situation where each expert does not wish to disclose its predictions and losses for reasons of privacy. The following two problems give intuitive examples where privacy preservation is required in online prediction.

Stock price prediction. Let there be investors who independently predict the expected price of a certain stock once a day. Due to the proprietary nature of the prediction function used by each investor, they do not wish to disclose their own prediction function and expected price. However, they wish to improve the accuracy of the prediction by jointly aggregating the predictions of the other investors. How can investors make a better prediction without violating the confidentiality of the prediction functions?

Prediction of infection outbreak. Let there be hospitals which attempt to predict outbreaks of pandemic diseases each day by analyzing personal medical records stored in each hospital. Aggregation of predictions is expected to improve the prediction accuracy; however, such aggregation might cause privacy violations for patients. How can hospitals make a better prediction without sharing sensitive medical information?

These problems can be considered as online prediction problems under limited loss observation, in which predictions and losses cannot be shared among experts (hospitals or investors) at all, in order to preserve confidentiality. Thus, these problems can be formulated in neither the full nor the partial information model. In this study, we focus on the fact that, in some cases, each loss can be observed by at least one expert but cannot be disclosed to others due to its secrecy. If we could exploit such loss information in such a way that the loss is neither disclosed to nor estimated by others, online prediction under limited observation could be treated as an instance of the full information model.

Such a treatment might appear to be impossible. In the field of security and cryptography, however, protocols for private distributed computation have been studied extensively for decades. The key idea in this study is the use of two well-studied cryptographic tools, secure function evaluation and a public-key homomorphic cryptosystem. By using these tools, we can design privacy-preserving online prediction protocols that achieve regret minimization while neither the learner nor any expert is able to observe the private losses and predictions.

When a specified function is evaluated taking distributed private information as input, two types of privacy should be considered: privacy-preserving computation and output privacy. The former aims to evaluate the function without violating the privacy of the distributed inputs; the latter attempts to measure the amount of information leaked through the publication of the output. Our focus in this study is on privacy-preserving computation of online prediction. It is now well recognized that the differential privacy mechanism provides a theoretical treatment of output privacy (Dwork, 2006). A simple combination of our solution with the Laplace mechanism (Dwork, 2006) might provide a solution for both privacy-preserving computation and output privacy in online prediction. This is left for future work.

1.1. Related Work

If the learner is allowed to observe the losses of all experts (full information model), the regret of exponential weighting strategies is at most O(√(T ln N)), where N and T are the number of experts and time steps, respectively (Vovk, 1990; Littlestone & Warmuth, 1994; Cesa-Bianchi et al., 1997). A number of studies have been presented concerning online prediction under limited observation of loss. The multi-armed bandit problem, or the partial information model (Blum & Mansour, 2007), assumes that the learner cannot observe the losses suffered by experts other than the expert chosen by the learner. Auer et al. have shown that the regret of the exponential weighting scheme in this setting is at most O(√(NT ln N)) (Auer et al., 2003).

A comparison of these solutions in terms of the regret bound, computation time, and privacy loss is summarized in Table 1. These schemes can cope with limited loss observation. However, they are not designed with the intention to protect privacy. Therefore, private information could be partly disclosed or statistically estimated after iterations.

Table 1. Regret bound, computation time, and privacy loss for each information model.

  info. model | online procedure                    | regret bound                                  | comp. time | privacy loss
  full        | Exponential Weighting (Vovk, 1990)  | R_{EW,T} ≤ √(2T ln N)                         | small      | not cared
  partial     | Exp3 (Auer et al., 2003)            | R_{Exp3,T} ≤ 2√(e−1)·√(NT ln N)               | small      | partly disclosed
  private     | SEW with SFE (proposal)             | R_{SEW/DR,T} ≤ √(2T ln N)                     | large      | none
  private     | SEW with crypto (proposal)          | R_{SEW/OR,T} ≤ √(2T ln N) + O(1)              | medium     | none

1.2. Our Contribution

In this study, we formally introduce a novel information model, termed the private information model (Section 3).
Intuitively, in this model, the learner is not allowed to observe the experts' losses and predictions; each expert is not allowed to observe the learner's or the other experts' losses and predictions, either. Our contribution is as follows. First, we introduce a new problem termed oblivious roulette: roulette playing without seeing the roulette wheel. We then show that online prediction in the private information model is equivalent to this oblivious roulette problem (Section 3.3). Based on this, we present two protocols for oblivious roulette (Section 5). The first and second protocols make use of secure function evaluation (SFE) and a homomorphic cryptosystem, respectively, to guarantee security (Section 4). These protocols help to convert existing online prediction schemes in the full information model into schemes in the private information model. Then, as privacy-preserving online prediction, secure exponential weighting (SEW) is presented using these roulette protocols (Section 5.3). We prove that the regret bound of SEW is at most √(2T ln N) for the protocol using SFE (R_{SEW/DR,T}) and √(2T ln N) + O(1) for the protocol using the homomorphic cryptosystem (R_{SEW/OR,T}). Furthermore, we prove theoretically that SEW is privacy-preserving in the private information model. Comparisons between our solution and existing solutions are summarized in Table 1. In order to demonstrate the efficiency of our solution, the results of computational experiments are shown (Section 6).

2. Preliminaries

Let there be a learner and a set of experts E = {1, ..., N}. The goal of online prediction is to predict an unknown outcome sequence y_1, y_2, ... of elements of an outcome space Y. The outcome sequence is provided by the environment arbitrarily. In this paper, we consider the outcome space to be discrete; it can then be regarded as a set of integers Y = {1, 2, ..., Y} without loss of generality.

In the full information model, the learner attempts to predict y_t after observing the predictions of the experts (advice) y_t = (y_{1,t}, y_{2,t}, ..., y_{N,t}) at time t. The loss resulting from a prediction is evaluated by the loss function ℓ : Y × Y → [0, 1]. Let p_t = (p_{1,t}, ..., p_{N,t}) be a probability vector called the strategy. The learner chooses the i-th expert with probability p_{i,t} for its decision, and the learner's prediction is set to the chosen expert's prediction. Note that p_t is maintained by the learner to determine its prediction at each time step, but is not considered as the learner's output. Let j ~ M(i; p_t) denote the randomized choice of an expert following strategy p_t. Online prediction in the full information model proceeds as follows:

Online prediction in the full information model
1. The environment arbitrarily chooses the next outcome y_t.
2. Each expert makes a prediction and reveals the prediction (y_{1,t}, ..., y_{N,t}) to the learner.
3. The learner determines strategy p_t from the observed loss values and obtains the prediction ŷ_t = y_{j,t} where j ~ M(i; p_t).
4. The environment reveals the outcome y_t.
5. The learner suffers a loss ℓ(ŷ_t, y_t), t ← t + 1. Then return to Step 1.

Let ℓ_{i,t} = ℓ(y_{i,t}, y_t) be the i-th expert's loss at time t. Let H be an online algorithm with which the learner updates its strategy. The learner's loss at time t can then be taken as the expected loss of the randomized strategy of H, ℓ_{H,t} = Σ_{i=1}^{N} p_{i,t} ℓ_{i,t}. The total loss of H is defined by L_{H,T} = Σ_{t=1}^{T} ℓ_{H,t}. Let L_{i,T} = Σ_{t=1}^{T} ℓ_{i,t} be the total loss of the i-th expert during T time steps.

The regret of the online algorithm H is defined by R_{H,T} = L_{H,T} − min_{i∈{1,...,N}} L_{i,T}. The learner updates p_t so as to minimize the regret during T prescribed time steps. (For simplicity, we consider the case where the number of time steps T is known to the learner in advance. One can apply "guessing techniques" which guarantee a similar regret bound for unknown T (Auer et al., 2003).)

The exponential weighting scheme employs the following procedure to update the strategy. At t = 1, w_{i,1} is initialized as 1/N for all i. Then, for t ≥ 2,

  w_{i,t} = exp( −η Σ_{s=1}^{t−1} ℓ(y_{i,s}, y_s) ),          (1)
  W_t = Σ_{i=1}^{N} w_{i,t},   p_{i,t} = w_{i,t} / W_t,        (2)

where η > 0 is a user parameter. Then, for any outcome sequence, when the loss function is convex in the learner's prediction, the regret of exponential weighting (EW) is bounded (Vovk, 1990; Littlestone & Warmuth, 1994) as

  L_{EW,T} − min_{i∈{1,...,N}} L_{i,T} ≤ √(2T ln N).
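To make the update of eq. 1 and eq. 2 concrete, the following is a minimal Java sketch of the exponential weighting learner described above; the class and method names are ours and losses are assumed to lie in [0, 1]. Setting eta = Math.sqrt(2 * Math.log(N) / T) corresponds to the √(2T ln N) bound quoted above.

```java
import java.util.Random;

/** Minimal sketch of exponential weighting (EW) in the full information model. */
public class ExponentialWeighting {
    private final double[] w;      // unnormalized weights w_{i,t}
    private final double eta;      // learning rate eta > 0
    private final Random rng = new Random();

    public ExponentialWeighting(int numExperts, double eta) {
        this.w = new double[numExperts];
        java.util.Arrays.fill(this.w, 1.0 / numExperts);  // w_{i,1} = 1/N
        this.eta = eta;
    }

    /** Draws an expert index j ~ M(i; p_t), where p_{i,t} = w_{i,t} / W_t (eq. 2). */
    public int chooseExpert() {
        double total = 0.0;
        for (double wi : w) total += wi;            // W_t
        double r = rng.nextDouble() * total;
        for (int i = 0; i < w.length; i++) {
            r -= w[i];
            if (r <= 0.0) return i;
        }
        return w.length - 1;
    }

    /** Updates the weights with the losses l_{i,t} = l(y_{i,t}, y_t) in [0, 1] (eq. 1). */
    public void update(double[] losses) {
        for (int i = 0; i < w.length; i++) {
            w[i] *= Math.exp(-eta * losses[i]);
        }
    }
}
```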
3. Privacy in Online Prediction

As discussed in the introductory section, the main objective of this study is to introduce the notion of privacy into online prediction by means of the private information model. In this section, we define the private information model in comparison with the full and the partial information models.

3.1. Sequence Observation Models

Before presenting the definition, observation models of sequences are defined. Given a set of sequences, the model defines which parts of the sequences are observable by a party.

Definition 1. (public and private) Let there be a party and a sequence x = (x_1, x_2, ...). If the party can observe (x_1, x_2, ..., x_{t−1}) but not the other elements at time t, then x is public. If the party cannot observe any elements of x at any time, then x is private.

Definition 2. (d-private) Let there be a party and N sequences x_1, ..., x_N, where x_i = (x_{i,1}, x_{i,2}, ...). Let the party hold an index vector d = (d_1, ..., d_{t−1}) ∈ {1, 2, ..., N}^{t−1} at time t. If the party can observe nothing but (x_{d_1,1}, ..., x_{d_{t−1},t−1}) from the sequences at time t, then x_1, ..., x_N are d-private.

In the full information model, the loss sequences of the experts are public to the learner. In the partial information model, when d is a decision sequence of the learner, the sequences of loss values of the experts are d-private. Table 2 summarizes the two information models. Information models of the experts are not described in this table because an expert is not considered to be a computational party in these models.

Table 2. Information model and sequence observation.

  model    | The i-th expert's seq.          | Learner's seq.
  full     | public                          | --
  partial  | d-private                       | --
  private  | private from the other parties  | private from the other parties

3.2. Private Information Model

Based on these sequence observation models, our private information model can be stated as follows:

Definition 3. (Private information model) Let there be a learner and N experts. If each party's loss and prediction sequences are private from the other parties, then all parties satisfy the private information model.

Table 2 summarizes this model again. The privacy of the learner's sequence is also considered here.
If the learner's information is not private, the problem becomes slightly easier, but it remains non-trivial because the experts' sequences are still private among the parties. Considering cases where the learner is one of the experts, as in the examples described in the introduction, we assume both the learner's and the experts' sequences are private. Finally, the problem of online prediction in the private information model can be stated:

Statement 1. (Online prediction in the private information model) Let H be an online prediction algorithm. Let there be a learner and N experts satisfying the private information model at time t − 1. After execution of H at time t, the learner learns a prediction from H and information inferred from the prediction, but nothing else; all parties are still in the private information model.

3.3. Our Approach

In this subsection, we describe the outline of our solution. In principle, our solution is designed as a secure function evaluation of the exponential weighting scheme. Recall that the sequence of the j-th expert is private from the i-th expert (i ≠ j). Nevertheless, the i-th expert can independently evaluate eq. 1, because w_{i,t} is updated only with the loss of the i-th expert. On the other hand, the i-th expert cannot update eq. 2, because that update requires w_{i,t} for all i. In this setting, the learner needs to learn ŷ_t = y_j s.t. j ~ M(i; p_t) without knowing anything except ŷ_t.

This problem can be compared to an imaginary roulette game, called oblivious roulette. Let there be N dealers and a player. Then the game of oblivious roulette is stated as follows:

Statement 2. (Oblivious roulette) Let there be N dealers and a player. The i-th dealer arbitrarily chooses m_i ∈ [0, 1] and y_i ∈ Y as input for i = 1, ..., N. Let p_i = m_i / Σ_{i=1}^{N} m_i and p = (p_1, ..., p_N). After the protocol execution, the player learns ŷ = y_j where j ~ M(i; p) and information inferred from ŷ, but nothing else. The dealers learn nothing, either.

Note that the player is not allowed to know from which dealer ŷ came in this statement. As mentioned above, exponential weighting in the private information model is equivalent to the game of oblivious roulette, obtained by replacing the dealers and the player by the experts and the learner, respectively. In the next section, we introduce two cryptographic tools for solving this game of oblivious roulette securely. Then, two types of roulette protocols are introduced.

4. Cryptographic Tools

4.1. Secure Function Evaluation

Secure function evaluation (SFE) is a general and well-studied cryptographic primitive which allows two or more parties to evaluate a specified function of their inputs without revealing (anything else about) their inputs to each other (Goldreich, 2004; Yao, 1986). In principle, any private distributed computation, including online prediction, can be securely evaluated by means of SFE. However, although polynomially bounded, a naive implementation of exponential weighting using SFE can be too inefficient. Therefore, in order to achieve online prediction efficiently in the private information model, we make use of existing SFE solutions for small portions of our computation as part of a more efficient overall solution.

4.2. Homomorphic Public-key Cryptosystem

Given a corresponding pair of private and public keys (sk, pk) and a message m, c = Enc_pk(m; ρ) denotes a (random) encryption of m, and m = Dec_sk(c) denotes decryption. Messages are taken from Z_Q (= {0, ..., Q − 1}), and the encryption c is randomized by the random value ρ ∈_r Z_Q. An additively homomorphic cryptosystem allows addition computations on encrypted values without knowledge of the secret key. Specifically, there is some operation · (not requiring knowledge of sk) such that for any plaintexts m_1 and m_2,

  Enc_pk(m_1 + m_2 mod Q; ρ) = Enc_pk(m_1; ρ_1) · Enc_pk(m_2; ρ_2),

where ρ is uniformly random provided that at least one of ρ_1 and ρ_2 is. Based on this property, it also follows that given a constant k and Enc_pk(m_1; ρ), we can compute multiplication by k via repeated application of ·, denoted as Enc_pk(km mod Q; kρ) = Enc_pk(m; ρ)^k. In what follows, we omit the random number from our encryptions for simplicity. Our solution makes use of a semantically secure additively homomorphic encryption scheme, such as Paillier's cryptosystem (Paillier, 1999). (If the cryptosystem is semantically secure, no information regarding the message m is learned from the encryption Enc_pk(m), for any m ∈ Z_Q.)
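As a concrete illustration of the additive homomorphism used below, here is a minimal Java sketch of a textbook Paillier-style cryptosystem (the standard construction with g = n + 1). The class, field, and method names are ours; a production implementation would also need safe prime generation, parameter validation, and constant-time arithmetic.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

/** Minimal textbook Paillier sketch (g = n + 1) illustrating additive homomorphism. */
public class Paillier {
    private static final SecureRandom RNG = new SecureRandom();
    final BigInteger n, nSquared;          // public key
    private final BigInteger phi, phiInv;  // secret key: phi(n) and its inverse mod n

    Paillier(int bits) {
        BigInteger p = BigInteger.probablePrime(bits / 2, RNG);
        BigInteger q = BigInteger.probablePrime(bits / 2, RNG);
        n = p.multiply(q);
        nSquared = n.multiply(n);
        phi = p.subtract(BigInteger.ONE).multiply(q.subtract(BigInteger.ONE));
        phiInv = phi.modInverse(n);
    }

    /** Enc(m) = (1 + n)^m * r^n mod n^2 with fresh randomness r (valid for 0 <= m < n). */
    BigInteger encrypt(BigInteger m) {
        // r in [1, n-1]; coprime to n with overwhelming probability
        BigInteger r = new BigInteger(n.bitLength(), RNG).mod(n.subtract(BigInteger.ONE)).add(BigInteger.ONE);
        BigInteger gm = BigInteger.ONE.add(n.multiply(m)).mod(nSquared);   // (1+n)^m mod n^2
        return gm.multiply(r.modPow(n, nSquared)).mod(nSquared);
    }

    /** Dec(c) = L(c^phi mod n^2) * phi^{-1} mod n, where L(u) = (u - 1) / n. */
    BigInteger decrypt(BigInteger c) {
        BigInteger u = c.modPow(phi, nSquared);
        return u.subtract(BigInteger.ONE).divide(n).multiply(phiInv).mod(n);
    }

    /** Homomorphic addition: Enc(m1) * Enc(m2) mod n^2 decrypts to m1 + m2 mod n. */
    BigInteger add(BigInteger c1, BigInteger c2) { return c1.multiply(c2).mod(nSquared); }

    /** Homomorphic scalar multiplication: Enc(m)^k mod n^2 decrypts to k * m mod n. */
    BigInteger scale(BigInteger c, BigInteger k) { return c.modPow(k, nSquared); }

    public static void main(String[] args) {
        Paillier pk = new Paillier(512);  // short key for the demo; the paper's experiments use 1024-bit keys
        BigInteger sum = pk.add(pk.encrypt(BigInteger.valueOf(17)), pk.encrypt(BigInteger.valueOf(25)));
        System.out.println(pk.decrypt(sum));                                                            // 42
        System.out.println(pk.decrypt(pk.scale(pk.encrypt(BigInteger.valueOf(7)), BigInteger.valueOf(6)))); // 42
    }
}
```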
5. Online Prediction in the Private Information Model

As discussed, our problem can be viewed as the game of oblivious roulette. Here, we first consider the distributed roulette protocol shown in Figure 1. In the rest of this paper, a ∈_r A denotes that element a ∈ A is chosen from set A uniformly at random.

Distributed roulette
· The i-th dealer's input: m_i ∈ [0, 1] and y_i ∈ Y
· The player's output: ŷ
1. Initialization: round k = 1.
2. For i = 1, ..., N, the i-th dealer independently chooses
     a_{i,k} = y_i with probability m_i, and a_{i,k} = Y + 1 otherwise,
   and sends a_{i,k} to the player.
3. The player chooses j ∈_r {1, ..., N}.
   (a) If a_{j,k} ∈ Y, the player outputs a_{j,k} as ŷ.
   (b) Else, k ← k + 1 and go to Step 2.
Figure 1. Distributed roulette

The output of distributed roulette follows Lemma 1.

Lemma 1. Let ŷ be the output of distributed roulette. Then, ŷ = y_j where j ~ M(i; p).

Proof. In each round, the player finds a_{j,k} = y_j ∈ Y with probability m_j / N, and finds a_{j,k} = Y + 1 ∉ Y, for whichever j was chosen, with probability 1 − (1/N) Σ_{i=1}^{N} m_i. Thus, the probability that the protocol eventually terminates with a_{j,k} = y_j ∈ Y is

  Σ_{k=0}^{∞} (m_j / N) ( 1 − (1/N) Σ_{i=1}^{N} m_i )^k = m_j / Σ_{i=1}^{N} m_i = p_j.   (3)

Thus, j ~ M(i; p) holds; consequently, ŷ = y_j where j ~ M(i; p). ∎

Thus, the distributed roulette protocol allows the player to play the roulette without the dealers sharing m_i. However, the protocol is not secure in the sense of Statement 2, because the player might estimate m_i from the sequences of a_{i,k}. Additionally, the player learns j, that is, from which dealer ŷ came, at each iteration.
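The following is a small, illustrative Java sketch of the distributed roulette loop in the clear (with no privacy protection); it can be used to sanity-check Lemma 1 by Monte Carlo simulation. The class and method names are ours.

```java
import java.util.Random;

/** Plaintext simulation of distributed roulette (Figure 1); no privacy protection. */
public class DistributedRoulette {
    /** Runs one game and returns the index j of the dealer whose y_j the player outputs. */
    static int play(double[] m, Random rng) {
        int n = m.length;
        while (true) {                                   // one iteration = one round k
            int j = rng.nextInt(n);                      // Step 3: player picks a dealer uniformly
            boolean revealed = rng.nextDouble() < m[j];  // Step 2: dealer j sent a_{j,k} = y_j (not Y+1)
            if (revealed) return j;                      // Step 3(a): player outputs a_{j,k} = y_j
        }
    }

    public static void main(String[] args) {
        double[] m = {0.1, 0.2, 0.4};                    // dealers' private inputs m_i
        Random rng = new Random(0);
        int trials = 1_000_000;
        int[] counts = new int[m.length];
        for (int t = 0; t < trials; t++) counts[play(m, rng)]++;
        // Empirical frequencies should approach p_j = m_j / (m_1 + ... + m_N), per Lemma 1.
        for (int j = 0; j < m.length; j++) {
            System.out.printf("p_%d ~ %.3f%n", j + 1, counts[j] / (double) trials);
        }
    }
}
```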
5.1. Distributed Roulette using SFE

In this section, we show that the distributed roulette protocol can be made secure efficiently, in the sense of Statement 2, by means of SFE. Evaluation of a_{i,k} in Step 2 can be performed independently by each dealer, so this step does not reveal any private information to the others. Therefore, we apply SFE only to the message exchange in Step 2 and the computation in Step 3.

SFE is applied as follows. In the beginning, the i-th dealer inputs a_{i,k} and j_i ∈_r {1, ..., N}; the player inputs j ∈_r {1, ..., N} to SFE. SFE privately evaluates j ← j + Σ_{i=1}^{N} j_i mod N. By choosing j in this way, the player and the dealers can randomly choose a single dealer without learning which dealer is chosen. Then, SFE privately checks whether a_{j,k} ∈ Y holds. If it does, a_{j,k} is returned as ŷ. Otherwise, "Play again" is returned. Since SFE does not change the output of any computation, the behavior of the distributed roulette protocol still follows Lemma 1. Furthermore, since all messages are exchanged over SFE, the security of this protocol reduces to that of SFE (Yao, 1986); thus, the security proof is omitted here.

The protocol is expected to terminate after N / Σ_{i=1}^{N} m_i rounds on average. When Σ_{i=1}^{N} m_i is too small, many rounds might be required for termination. However, note that p_j is unchanged if all m_i are multiplied by a common constant. The m_i can therefore be rescaled so that Σ_{i=1}^{N} m_i is not too small; this adjustment is also performed by SFE in the experiments shown later. Let the protocol terminate after K rounds. Then, SFE performs modulo additions and equality tests K times for a single roulette play. Since these operations are all elementary, the computation is much more efficient than a naive SFE implementation of the roulette.

5.2. Oblivious Roulette using Crypto

Next, we present another protocol for oblivious roulette, which makes use of a homomorphic public-key cryptosystem. The protocol is shown in Figure 2. We prove that the oblivious roulette protocol approximately follows distributed roulette.

Oblivious Roulette
· Public input: number of dealers N, security parameter Q
· Player's input: key pair (pk, sk)
· i-th dealer's input: public key pk, m_i ∈ [0, 1], y_i ∈ Y
· Player's output: ŷ
· Dealers' output: none
1. Initialization: round k = 1.
2. The player chooses j_k ∈_r {1, 2, ..., N} and sends Enc_pk(−j_k) to all dealers.
3. For i = 1, ..., N, the i-th dealer independently chooses r_{i,k} ∈_r Z_Q and
     a_{i,k} = j̃_k with probability m_i, and a_{i,k} = Y + 1 otherwise,
   where j̃_k ∈_r {1, 2, ..., N}. The dealer then computes

     c_{i,k} ← ( Enc_pk(a_{i,k}) · Enc_pk(−j_k) )^{r_{i,k}} · Enc_pk(y_i)   (6)

   and sends c_{i,k} to the first dealer.
4. The first dealer randomly shuffles (c_{1,k}, ..., c_{N,k}) and sends them to the player.
5. The player decrypts u_i ← Dec_sk(c_{i,k}) for all i and computes Y' = {u_1, ..., u_N} ∩ Y.
   (a) If Y' ≠ ∅, return u ∈_r Y' as ŷ.
   (b) Else, k ← k + 1 and go to Step 2.
Figure 2. Oblivious Roulette

Lemma 2. Let ŷ be the output of oblivious roulette. Let μ = (1/N) Σ_{i=1}^{N} m_i and

  M̃(i; p, N, Q) = (1 − ε) M(i; p) + ε U(i; N),   (4)

where U(i; N) is the uniform distribution over {1, ..., N}, Q is a security parameter of the cryptosystem, and

  ε = (1 − μ)(N/Q) / ( 1 − (1 − μ)(1 − N/Q) ).

Then, ŷ = y_j where j ~ M̃(i; p, N, Q).

Proof. From the homomorphic property of the cryptosystem, eq. 6 in Step 3 can be reorganized as c_{i,k} = Enc_pk( (a_{i,k} − j_k) r_{i,k} + y_i mod Q ). Then, in Step 5, the player learns u_i from decryption:

  u_i = Dec_sk(c_{i,k}) = r_{i,k}(a_{i,k} − j_k) + y_i mod Q.   (5)

Note that the player cannot learn which dealer prepared u_i from the order of the received messages, because the first dealer randomly shuffles the order of the messages in Step 4. u_i can take two types of values depending on the i-th dealer's choice of a_{i,k} in Step 3.

Case 1 (a_{i,k} − j_k = 0): This case happens when the i-th dealer's choice was a_{i,k} = j̃_k and this coincides with the player's choice j_k. When this happens, the player always finds u_i = y_i ∈ Y in Step 5(a), since the first term of eq. 5 is 0 regardless of the random value r_{i,k}.

Case 2 (a_{i,k} − j_k ≠ 0): This case happens when (1) the i-th dealer's choice was a_{i,k} = Y + 1, or (2) the i-th dealer's choice j̃_k does not coincide with the player's choice j_k. In this case, u_i is distributed uniformly at random over Z_Q, since a_{i,k} − j_k is non-zero and r_{i,k} is random. Note that even when this case happens, the player finds u_i = y_j with probability 1/Q.

In each round, the player finds u_j = y_j ∈ Y with probability m_j/N + (1 − μ)/Q, and finds no element of Y (so that the protocol continues), for any choice of j_k, with probability (1 − μ)(1 − N/Q). Thus, the probability that the player eventually outputs y_j is

  p̃_j = Σ_{k=0}^{∞} ( m_j/N + (1 − μ)/Q ) ( (1 − μ)(1 − N/Q) )^k
      = ( m_j/N + (1 − μ)/Q ) / ( 1 − (1 − μ)(1 − N/Q) ).

Setting ε as in Lemma 2, p̃_j = (1 − ε) m_j / Σ_{i=1}^{N} m_i + ε/N holds. Thus, the output follows the probability distribution of eq. 4. ∎
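To illustrate the masking argument used in the proof of Lemma 2, the following Java sketch simulates the Step 3/Step 5 arithmetic of one round in the clear: it computes r(a − j_k) + y_i mod Q directly, whereas in the actual protocol this value is only ever computed under encryption via eq. 6, and Q would be a roughly 1024-bit modulus rather than a small long. All names and parameter values are ours.

```java
import java.util.Random;

/** Plaintext illustration of the masking in eq. 5/eq. 6 of the oblivious roulette protocol. */
public class ObliviousRouletteRound {
    public static void main(String[] args) {
        Random rng = new Random(0);
        long Q = 1_000_003L;   // toy plaintext modulus; the real protocol uses a ~2^1024 modulus
        int N = 4;             // number of dealers
        int Y = 10;            // outcome space {1, ..., Y}

        int jk = 1 + rng.nextInt(N);        // player's secret choice j_k
        double[] m = {0.3, 0.6, 0.9, 0.2};  // dealers' private weights m_i
        int[] y = {3, 7, 2, 9};             // dealers' private predictions y_i

        for (int i = 0; i < N; i++) {
            // Step 3: dealer i picks a_{i,k} = random guess in {1..N} with prob m_i, else Y + 1.
            long a = (rng.nextDouble() < m[i]) ? (1 + rng.nextInt(N)) : (Y + 1);
            long r = 1 + (long) (rng.nextDouble() * (Q - 1));   // blinding factor r_{i,k}
            // Step 5 (after decryption): u_i = r_{i,k} * (a_{i,k} - j_k) + y_i mod Q (eq. 5).
            long u = Math.floorMod(r * (a - jk) + y[i], Q);
            // Case 1 (a == j_k): u equals y_i exactly.  Case 2: u is essentially uniform on Z_Q.
            System.out.printf("dealer %d: a=%d  u=%d  inY=%b%n", i + 1, a, u, u >= 1 && u <= Y);
        }
    }
}
```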
Remark 1. Let M = Σ_{i=1}^{N} m_i. Then, ε = (N² − MN) / (N² + MQ − MN). The key size Q is usually set quite large, such as Q = 2^1024, for security reasons. Considering that M ≤ N holds and that N² ≪ Q typically holds, we can regard ε ≈ 0.

Next, we show the security of the oblivious roulette protocol. We assume the player and the dealers behave semi-honestly; that is, the parties follow the specified protocol properly, but might also use their records of intermediate computations in order to attempt to learn other parties' private information.

Lemma 3. Assuming the player and dealers behave semi-honestly, the oblivious roulette protocol is secure in the sense of Statement 2.

The proof is omitted here. Intuitively, the dealers do not hold the private key, and all messages observed by the dealers are encrypted; hence the dealers learn nothing. On the other hand, the player holds the private key. However, all the messages observed by the player are randomized, and their order is shuffled by the first dealer, except for the message that forms the final output. Thus, the player learns nothing except the final output.

5.3. Secure Exponential Weighting in the Private Information Model

Finally, our secure exponential weighting in the private information model is described. The protocol basically follows the exponential weighting in the full information model shown in Section 2. The differences are that (1) the i-th expert updates eq. 1 in Step 3 instead of the learner, and (2) the distributed or oblivious roulette protocol is used to obtain the prediction ŷ_t = y_{j,t} where j ~ M(i; p_t).

Theorem 1. Secure exponential weighting is secure in the sense of Statement 1.

The message exchange in secure exponential weighting occurs only in the roulette protocol. Since our roulette protocols are guaranteed to be secure, Theorem 1 obviously holds.

If distributed roulette with SFE is used, the computation exactly follows the exponential weighting in the full information model, so the regret bound is √(2T ln N). If the oblivious roulette protocol is used, the regret bound changes slightly, because the strategy of the learner deviates from M(i; p) as shown in Lemma 2.

Theorem 2. If the oblivious roulette protocol is used in secure exponential weighting, the regret bound is

  R_{SEW/OR,T} ≤ (1/(1−ε)) ( ln N / η + (η/2) L_{SEW/OR} ) + (ε/N) Σ_{i=1}^{N} L_i.

Assuming ε ≤ 1/T and setting η = √(2 ln N / T), R_{SEW/OR,T} ≤ √(2T ln N) + c, where c = O(1).

A sketch of the proof is given in the Appendix. Again, ε can be made quite small with large Q, so the regret bound can be almost the same as that of exponential weighting even when oblivious roulette is used.

Application to Examples. In the examples in Section 1, all experts (investors or hospitals) wish to share the prediction results; no specific learner exists. In such a case, a threshold cryptosystem can be used. In this cryptosystem, all parties share a common public key while each party holds a different share of the private key. Decryption cannot be performed by fewer than t parties and can be performed by any group of at least t parties. By means of the threshold cryptosystem, all experts can jointly perform secure exponential weighting and can share the prediction results in the private information model.
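As a concrete illustration of the flow described in Section 5.3, here is a schematic Java sketch of one time step of secure exponential weighting from the learner's side. The roulette call below simulates only the *functionality* of the protocols of Section 5 (in the real scheme it is realized cryptographically, so the learner never sees the weights or predictions); all class, field, and method names are ours, and the absolute loss on binary outcomes is an illustrative choice.

```java
import java.util.Random;

/**
 * Schematic sketch of secure exponential weighting (SEW): each expert keeps its weight
 * and loss locally (eq. 1); the learner only ever receives the sampled prediction.
 */
public class SecureExponentialWeighting {
    static final Random RNG = new Random(0);

    /** State held by one expert; never shared in the clear. */
    static class Expert {
        double w = 1.0;        // local weight w_{i,t} (a common rescaling does not change p_t)
        double prediction;     // y_{i,t}, kept private
        void updateWeight(double eta, double loss) { w *= Math.exp(-eta * loss); }
    }

    /** Ideal functionality of distributed/oblivious roulette: returns y_j with j ~ M(i; p_t). */
    static double roulette(Expert[] experts) {
        double total = 0.0;
        for (Expert e : experts) total += e.w;
        double r = RNG.nextDouble() * total;
        for (Expert e : experts) { r -= e.w; if (r <= 0.0) return e.prediction; }
        return experts[experts.length - 1].prediction;
    }

    /** One SEW time step from the learner's point of view (binary outcomes, absolute loss). */
    static double oneStep(Expert[] experts, double eta, double outcome) {
        double yHat = roulette(experts);                    // learner obtains only the prediction
        for (Expert e : experts) {
            double loss = Math.abs(e.prediction - outcome); // each expert observes only its own loss
            e.updateWeight(eta, loss);                      // local update of eq. 1
        }
        return yHat;
    }
}
```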
6. Experiments and Discussion

We performed experiments to examine the computational efficiency of our protocols. The programs were implemented in Java. As the cryptosystem, Paillier's cryptosystem (Paillier, 1999) with 1024-bit keys was used. For secure function evaluation, Fairplay (Malkhi et al., 2004) was used. The environment generates y_t ∈ {0, 1} uniformly at random at each time step. Each expert is assigned a prediction accuracy drawn uniformly from 0.5 (random) to 0.75 (the best accuracy), and makes a random prediction at each time step following its assigned accuracy. We compared the exponential weighting learner in the full information model (EW), the Exp3 learner in the partial information model (Exp3), and two secure exponential weighting learners, one using distributed roulette with SFE (SEW/DR) and the other using oblivious roulette (SEW/OR).

Figure 3 shows the regret of each learner. Since SEW/DR behaves exactly the same as EW, the two curves are not separated. The regret of SEW/OR is also almost the same as that of EW. It is remarkable that the SEW learners in the private information model suffer less regret than the Exp3 learner in the partial information model. In the partial information model, unobserved information is never exploited for prediction. In our protocols, the use of cryptographic tools allows the learner to exploit the hidden experts' predictions for the improvement of the learner's prediction without observing them.

Figure 3. Expectation of regret per time step (average of 100 iterations, N = 10, T = 5000). The plot shows R/t versus time t for the EW/SEW-DR, SEW/OR, and Exp3 learners, the upper bounds for EW and Exp3, and the best expert.

In exchange, the computation time of the SEW learners is much larger than that of the EW and Exp3 learners, due to the cryptographic operations involved. Table 3 shows the learner's computation time (CPU time) per step of online prediction.

Table 3. Computation time per step (seconds) and information model. The results are the average of 100 iterations.

  learner | model   | N = 2          | N = 4          | N = 8          | N = 16         | N = 32
  EW      | full    | 0.562 × 10^-8  | 1.09 × 10^-8   | 2.03 × 10^-8   | 3.78 × 10^-8   | 7.25 × 10^-8
  Exp3    | partial | 0.534 × 10^-8  | 1.02 × 10^-8   | 1.56 × 10^-8   | 2.75 × 10^-8   | 5.13 × 10^-8
  SEW/OR  | private | 13.8           | 27.9           | 56.4           | 113            | 233
  SEW/DR  | private | 65             | 164            | 306            | 608            | 1207

The computation time of SEW/OR per round is at most a few minutes in this setting. If the interval between decisions is not too short, SEW is sufficiently practical. The computation of SEW/OR is about 2-6 times faster than that of SEW/DR. This is because SFE remains costly even when the protocol is designed so that only primitive operations are processed by SFE.
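The following Java sketch re-creates the kind of simulation environment described above (random binary outcomes, expert accuracies drawn from [0.5, 0.75]) and measures the regret of a plain EW learner against the best expert. The exact loss function and parameter choices of the original experiments are not fully specified in the paper, so the 0-1 loss and all names here are illustrative assumptions, not the authors' code.

```java
import java.util.Random;

/** Illustrative re-creation of the simulation setup of Section 6 (assumed 0-1 loss). */
public class RegretSimulation {
    public static void main(String[] args) {
        Random rng = new Random(0);
        int N = 10, T = 5000;
        double eta = Math.sqrt(2.0 * Math.log(N) / T);

        double[] acc = new double[N];                  // expert accuracies in [0.5, 0.75]
        for (int i = 0; i < N; i++) acc[i] = 0.5 + 0.25 * rng.nextDouble();

        double[] w = new double[N];                    // EW weights (eq. 1)
        java.util.Arrays.fill(w, 1.0 / N);
        double learnerLoss = 0.0;
        double[] expertLoss = new double[N];

        for (int t = 0; t < T; t++) {
            int y = rng.nextInt(2);                    // environment: y_t uniform on {0, 1}
            int[] pred = new int[N];
            for (int i = 0; i < N; i++) pred[i] = (rng.nextDouble() < acc[i]) ? y : 1 - y;

            double W = 0.0;                            // normalizer W_t for the current weights
            for (double wi : w) W += wi;
            for (int i = 0; i < N; i++) {
                double loss = (pred[i] == y) ? 0.0 : 1.0;
                learnerLoss += (w[i] / W) * loss;      // expected loss of the randomized learner
                expertLoss[i] += loss;
                w[i] *= Math.exp(-eta * loss);
            }
        }
        double best = Double.MAX_VALUE;
        for (double l : expertLoss) best = Math.min(best, l);
        System.out.printf("regret per step: %.4f (EW bound: %.4f)%n",
                (learnerLoss - best) / T, Math.sqrt(2.0 * T * Math.log(N)) / T);
    }
}
```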
7. Conclusion

We presented online prediction algorithms in the private information model, in which each expert can observe its own loss while the loss cannot be disclosed to others due to privacy concerns. Our secure exponential weighting scheme enables exploitation of the private loss values possessed by the experts by making use of cryptographic tools. We proved that the regret bound of secure exponential weighting is the same as, or almost the same as, the regret of exponential weighting in the full information model. The security of the parties' loss values and predictions is theoretically guaranteed. Our future work is to apply secure exponential weighting to repeated games in which the payoff matrices and the players' strategies are to be kept private from each other.

Acknowledgments

We would like to thank the anonymous reviewers for providing thoughtful comments, and also Jia Yuan Yu for in-depth discussions. This work is supported by the Japan Science and Technology Agency, Sakigake (PRESTO) program.

References

Auer, P., Cesa-Bianchi, N., Freund, Y., and Schapire, R.E. The nonstochastic multiarmed bandit problem. SIAM Journal on Computing, 32(1):48-77, 2003.

Blum, A. and Mansour, Y. Learning, regret minimization, and equilibria. Algorithmic Game Theory, pp. 79-102, 2007.

Cesa-Bianchi, N. and Lugosi, G. Prediction, Learning, and Games. Cambridge University Press, 2006.

Cesa-Bianchi, N., Freund, Y., Haussler, D., Helmbold, D.P., Schapire, R.E., and Warmuth, M.K. How to use expert advice. Journal of the ACM, 44(3):427-485, 1997.

Dwork, C. Differential privacy. In 33rd International Colloquium on Automata, Languages and Programming, pp. 1-12, 2006.

Goldreich, O. Foundations of Cryptography: Volume 2, Basic Applications. Cambridge University Press, 2004.

Littlestone, N. and Warmuth, M.K. The weighted majority algorithm. Information and Computation, 108:212-261, 1994.

Malkhi, D., Nisan, N., Pinkas, B., and Sella, Y. Fairplay: a secure two-party computation system. In Proceedings of the 13th USENIX Security Symposium, pp. 287-302, 2004.

Paillier, P. Public-key cryptosystems based on composite degree residuosity classes. In EUROCRYPT'99, pp. 223-238. Springer, 1999.

Vovk, V.G. Aggregating strategies. In Proceedings of the Third Annual Workshop on Computational Learning Theory, 1990.

Yao, A. C.-C. How to generate and exchange secrets. In Proceedings of the 27th IEEE Symposium on Foundations of Computer Science (FOCS), pp. 162-167, 1986.

Appendix: Proof of Theorem 2 (Sketch)

The proof methodology used here is similar to (Cesa-Bianchi & Lugosi, 2006), so the details are omitted here. The lower bound of ln(W_{T+1}/W_1) is

  ln(W_{T+1}/W_1) ≥ −η min_i L_{i,T} − ln N.   (7)

Using w_{i,t}/W_t = (p_{i,t} − ε/N)/(1 − ε) and e^{−x} ≤ 1 − x + x²/2, we have

  ln(W_{t+1}/W_t) ≤ −(η/(1−ε)) Σ_{i=1}^{N} p_{i,t} ℓ(y_{i,t}, y_t) + ((ηε/N + η²/2)/(1−ε)) Σ_{i=1}^{N} ℓ(y_{i,t}, y_t).

Using ln(1 + x) ≤ x and summing over t, we get

  ln(W_{T+1}/W_1) ≤ −(η/(1−ε)) Σ_{t=1}^{T} Σ_{i=1}^{N} p_{i,t} ℓ(y_{i,t}, y_t) + ((ηε/N + η²/2)/(1−ε)) Σ_{t=1}^{T} Σ_{i=1}^{N} ℓ(y_{i,t}, y_t).

Combining eq. 7 with the above, we have

  R_{SEW/OR} ≤ (1/(1−ε)) ( ln N / η + (η/2) L_{SEW/OR} ) + (ε/N) Σ_{i=1}^{N} L_i.   (8)

From this and η = √(2 ln N / T), the following holds:

  R_{SEW/OR} ≤ (1/(1−ε)) ( √(T ln N / 2) + √(T ln N / 2) ) + εT.

Since ε < 1/T, R_{SEW/OR,T} = √(2T ln N) + O(1). ∎