SIGIR 2007 Proceedings Poster Probability Ranking Principle via Optimal Expected Rank H.C. Wu R.W.P. Luk K.F. Wong Dept. of Sys. Eng. & Eng. Mgt. Dept. of Computing Dept. of Computing The Hong Kong Polytechnic University The Hong Kong Polytechnic University The Chinese University of Hong Kong (852) 2609 8332 (852) 2766 4900 (852) 2766 5143 cshcwu@comp.polyu.edu.hk ABSTRACT csrluk@comp.polyu.edu.hk kfwong@se.cuhk.edu.hk This paper presents a new perspective of the probability ranking principle (PRP) by defining retrieval effectiveness in terms of our novel expected rank measure of a set of documents for a particular query. This perspective is based on preserving decision preferences, and it imposes weaker conditions on PRP than the utility-theoretic perspective of PRP. PRP can be applied to a wider choice of models and applications than [4] alone, where the optimality of PRP can be defined as in [4] or by the optimal expected rank as in here. Categories and Subject Descriptors H.3.0 [Information Storage and Retrieval]: General. General Terms Measurement, Performance, Experimentation. Keywords Probability ranking principle and optimization. 1. INTRODUCTION The probability ranking principle (PRP) [1] (formalized by Cooper) has been the basis for a number of probabilistic retrieval models (e.g., [2]). Its shortened form in [3] is: Probability Ranking Principle [3]: If retrieved documents are ordered by decreasing probability of relevance on the data available, then the system's effectiveness is the best to be gotten for the data. Gordon and Lenk [4] found that PRP optimizes utility under the following conditions: (a) well-calibrated predictive probabilities of relevance to documents are used; (b) these probabilities are reported with certainty; (c) the relevance of documents is assessed independently [1], so the order in which the documents are presented does not affect the relevance judgment; (d) the utility function is non-decreasing. Later, they [5] articulated in details the conditions when the PRP is suboptimal in a utility theoretic sense. We propose that PRP is optimal in terms of the expectation value of ranking (called the expected rank), and the conditions for achieving optimal utility can be relaxed or changed completely. We have no preference for or against defining effectiveness by utility, because this depends on the specific applications. The significance of this work is that Copyright is held by the author/owner(s). SIGIR '07, July 23­27, 2007, Amsterdam, The Netherlands. ACM 978-1-59593-597-7/07/0007. Let us define the event space [6] as = (D × Q × R )card (U ) , where D is the set of documents, Q is the set of queries, R is the set of binary relevance values, U is the set of inquirers, and card(.) is the cardinality of its argument. We hypothesize that the inquirers decide whether each document is relevant to query q. The probability, p(r| d, q), is the proportion of inquirers who judge document d to be relevant to q, where r signifies relevance. When card(U) = 1, p(r| d, q) is the subjective probability representing the inquirer's degree of belief that d is relevant to q. For presentation clarity, we assume that there is more than one inquirer in the rest of this paper. Following [7], the relevance judgment of an inquirer is expressed as a decision preference using the strict preference relation, f q , where d f q e means the inquirer prefers document d over document e on the basis of the degree of relevance for topic q. 2. NOTATION 3. OPTIMAL EXPECTED RANK We define the expected rank, E(D, q), of the set, D, of documents as a measure of the system's effectiveness as follows: E ( D, q) rank ( p(d | q, r )) p(d | q, r ) d D (1) where rank(.) returns a rank number based on some ranking strategy using the given probability. We assume that at least one document is relevant to the query, so p(d | q, r) is always well defined. This is usually satisfied in retrieval evaluation. Otherwise, recall, R-precision and MAP are undefined. The ranking strategy that has the best retrieval effectiveness is the one with the smallest expected rank value. The problem of finding the optimal ranking is solved by mapping it to an analogous problem called card(D)/1// F scheduling problem [8] that is solved by the shortest processing time rule, where: (i) d maps to a particular job, (ii) the probability of d maps to the processing time of a job, (iii) the rank number of d maps to the number of times the job is counted, and (iv) the expected rank maps to the total flow time of the schedule. Table 1 shows an example where d1 to d3 have processing time 0.2, 0.3 and 0.1, respectively. Fk is the total flow time of the k-th job, e.g. F3 = 0.1 + 0.2 + 0.3 = 0.6, and rankP(.) is a function that assigns a rank number by decreasing probability. The optimal scheduling strategy turns out to be a greedy algorithm that ranks documents by decreasing probability. 713 SIGIR 2007 Proceedings Poster This is the same strategy as prescribed by PRP, and the proof-bycontradiction of this well-known optimal strategy is in [8]. Table 1: Optimal scheduling/ranking of three jobs/documents. Scheduling Jobs Sequence 1 2 Job = Document d3 d1 F1 0.1 F2 0.1 0.2 F3 0.1 0.2 3×0.1+ 2×0.2+ 3 ×F = Ranking Documents rankP(pk) 3 2 pk 0.1 0.2 E(D,q) = 3×0.1+ 2×0.2+ Key: pk is the processing time of the k-th job, Fk is the k-th job, and F is the average flow time. 3 d2 Total =1 relevant to q. Since ranking is a weak order, (4) and (5) are equal when the following holds for all document pairs, d and e, in D: p ( r | d , q ) p ( r | e, q ) p s ( r | d , q ) p s ( r | e, q ) (6) Note that (6) is weaker than condition (a) and condition (b). We also do not need condition (d) since we do not optimize utility, although (d) is a more general condition than assigning constant utility value to all documents. 0.3 1×0.3 5. CONCLUSION This paper presents a new perspective of PRP in terms of optimizing the expected rank over a collection of documents. It does not require the probability to be reported with certainty nor well-calibrated, provided that (3), (6) and condition (c) hold, as well as the condition that every query for evaluation has at least one relevant document. Since (5) needs the assigned rank and not ps(.), ps(.) may not necessarily be defined over for ps(.) to be consistent with PRP. 1 Total 0.3 =1 1×0.3 the total flow time of Relating (1) to PRP, we need to rewrite the probability, p(d| q, r), using the conditional probability definition as follows: p(d | q, r ) = p(r | d , q) / p(q, r ) × p(d , q) (2) ACKNOWLEGEMENT This work is supported by CERG project # PolyU 5226/05E. Since ranking is not affected by p(q, r), it can be ignored in (2). Also, p(d, q) is the marginal probability over . In our case, the probability, p(d, q), is a constant: REFERENCES [1] Robertson, S.E. The probability ranking principle in IR. Journal of Documentation, 33, 4 (1977), 294-304. [2] Robertson, S.E., Sparck Jones, K. Relevance weighting of search terms. Journal of the American Society for Information Science, 27, 3, (1976), 129-146. [3] Sparck Jones, K., Walker, S., Robertson, S.E. A probabilistic model of information retrieval: development and comparative experiments. Information Processing and Management, 36, 6 (2000), Part 1 779-808. [4] Gordon, M., Lenk, P. A utility theoretic examination of the probability ranking principle in information retrieval. Journal of the American Society for Information Science, 42, 10 (1991), 703-714. [5] Gordon, M., Lenk, P. When is the probability ranking principle suboptimal? Journal of the American Society for Information Science, 43, 1 (1992), 1-14. [6] Robertson, S.E. On event spaces and probabilistic models in information retrieval. Information Retrieval, 8, 2 (2005), 319-329. [7] Yao, Y.Y., Wong, S.K.M. Preference structure, inference and set-oriented retrieval. In Proceedings of ACM SIGIR `91, pp. 211-218, 1991. [8] French, S. Sequencing and scheduling: an introduction to the mathematics of the job-shop. Chichester, England, Ellis Horwood Ltd, 1982. [9] Lafferty, J., Zhai, C. Document language models, query models and risk minimization for information retrieval. In Proceedings of ACM SIGIR `01, pp. 111-119, 2001. [10] Cooper, W.S. Expected search length: a single measure of retrieval effectiveness based on the weak ordering action of retrieval systems. American Documentation, 19, 1 (1968), 30-41. p(d , q) 1 1 × card ( D) card (Q) (3) because each inquirer judges the relevance of each document to each query. For ranking, the probability in (3) is ignored, and (2) is simplified to p(d | q, r) p(r | d, q) where is the rank equivalence relation [9]. Using (1), the optimal expected rank is: E * ( D, q) = rank P ( p(r | d , q)) p(d | q, r ) d D (4) This is the ranking prescribed by PRP that is optimal given that (3) holds. Taking the average requires condition (c) to hold. The strategy is optimal on average with respect to U, so the counterexample by Cooper (mentioned in [1]) is not applicable. The optimal expected rank measures the average amount of search effort of U because the rank number represents the amount of effort to read the ranked documents from the top. This average rank combines both recall and precision measures because it is defined for all documents in the collection. It is similar to the expected search length (ESL) [10] but it differs from ESL in the treatment of documents with tied ranks. 4. ESTIMATED PROBABILITIES In practice, we assume that we have a set of accurate probabilities {p(d| q, r)} and a set of probabilities {ps(d | q, r)} that are estimated by the system. Since (3) can be determined without inaccuracies, the expected rank of the system, s, is: E s ( D, q ) d D rank P ( p s ( r | d , q )) p (d | q, r ) (5) where the rank number is derived from the system's probability of relevance, and where the probability of document for the expected rank is based on the proportion of inquirers who assign d as 714