Salton Award Lecture Quantum Haystacks C.J. `Keith' van Rijsbergen University of Glasgow Computing Science Glasgow, G12 8QQ `What I cannot create, I do not understand' Feynman Categories and Subject Descriptors H.1.0: Models and Principles, General General Terms: Theory ABSTRACT This acceptance talk is a curious mixture of personal history and developing ideas in the context of the growing field of IR covering several decades. I want to concentrate on models and theories, interpreted loosely, and try and give an insight into where I have got to in my thinking, where the ideas came from, and where I believe we are going. In the last few years I have been working on the development of what might be coined as a design language for IR. It takes its inspiration from Quantum Mechanics, but by analogy only. The mathematical objects represent documents; these objects might be vectors (or density operators) in an n-dimensional vector space (usually a Hilbert space). A request for information, or a query, is taken as an observable and is represented as a linear operator on the space. Linear operators can be expressed as matrices. Such an operator, Hermitian, has a set of eigenvectors forming a basis for the space; which we interpret as a point of view or perspective from which to understand the space. Thus any document-vector can be located with respect to the basis, and we can calculate an inner product between such a vector and any basis vector, which may be interpreted as a probability of relevance. The probability of observing any given eigenvector is now given by the square of that inner product assuming all vectors are normalised. Hence we connect the probability of observation to the geometry of the space. Furthermore, the subspaces of the space make up a lattice structure which is equivalent to a logic. This makes up the entire mathematical structure, and the language for handling this structure is linear algebra: vectors, matrices, projections, inner-products,... neatly captured by the Dirac notation used in quantum mechanics. Our probability is slightly different from classical probability, the same for logic; we end up with quantum logic and quantum probability. A commitment to this kind of mathematical structure, with which to model objects and processes in IR, depends on two critical assumptions. 1. The distances in the space between objects are a source of important relationships with respect to relevance and aboutness. The observation of a property such as relevance or aboutness is user dependent in the sense that a potential interaction is specified by a user through an operator which when measured achieves outcomes with a probability determined by the geometry of the space. 2. The geometry of this mathematical structure and the probability defined on it are closely connected by the following theorem due to Gleason (1957). One may summarise this theorem by saying that the probability of a subspace is given by a simple algorithm derived from a projection onto the subspace and a special kind of operator, namely a statistical operator, or density matrix. And conversely, that given a probability measure on the subspaces then we can encode that measure uniquely through such an algorithm. This is a very powerful theorem and its consequences remain to be explored. So how did I get to this point and form of abstraction? Most of my research work can be divided into contributions to the following areas: a. b. c. d. e. Clustering Evaluation Probabilistic Models Logic Models Geometry. Copyright is held by the author/owner(s). SIGIR'06, August 6­11, 2006, Seattle, Washington, USA. ACM 1-59593-369-7/06/0008. In all these areas I have attempted to search for underlying mathematical structures that would lead to computations. These topics have in common that they depend on the construction of measures on a space which in some sense determines the usefulness or effectiveness of the structure. For clustering one considers mapping from metric spaces to ultrametic spaces and measure the closeness of fit. In the case of evaluation, one starts with a relational conjoint structure and imposes some constraints given by what is to be measured, one then constructs a numerical representation of this structure leading to such 1 measures as F (or E). For probabilistic models the main difficulty is concerned with deciding on an appropriate event space on which to define the `right' probability measures. For me the most significant example in this context was the attempt to construct a Logical Uncertainty Principle which formulated a measure of uncertainty on incomplete logical constructs. This attempt left unspecified the exact form of the measure. In the Geometry of IR I finally managed to formulate that measure as a projection-valued measure. This way of thinking did not appear out of nowhere. It was heavily influenced by the work of Fairthorne(1961) whose work on Brouwerian Logic (an Intuitionistic Logic) was picked up by Salton in his early book on IR. At an earlier stage MacKay (1950) wrote a paper that opened with, `This paper relates to the borderline linking experimental and theoretical physics with mathematical logic, and covers at several points ground which is common to the theory of communication.' He goes on to define an `information-operator' which is very similar in scope and intent to the Hermitian operator above. Maron, who collaborated with MacKay, stated in his 1965 paper, `Therefore, it can be argued that index descriptions should not be viewed as properties of documents: They function to relate documents and users.' One can see that the development of these early ideas was continued to the construction of the Geometry of IR. What does it leave to be done? An attempt should be made to use this design language to build an IR system. On the theoretical front it is worth considering whether it would be better to start with a transition probability space rather than a Hilbert space as Von Neumann did in 1937 (translated in 1981). The assumption that closed linear subspaces will be the elements of our logic can be challenged, as perhaps a construction with different elements is possible. It is not obvious what the best form of conditional probability might be in these spaces. Agreeing on a form of conditionalisation is intimately tied up with how to model contextuality. There is some evidence to suggest that contextuality plays a role in modelling the conjuncton of concepts (Widdows, 2004). Such contexts have been modelled in quantum theory almost from the beginning, for example, Gleason's theorem precludes noncontextual hidden variable theories. Acknowledgments I would like to pay tribute to all my graduate students. I was extremely fortunate to have such a creative collection of students. There is no doubt that they influenced my thinking considerably, and in many cases preserved me from extreme flights of fancy. I acknowledge my considerable intellectual debt to some of the pioneers of our field, Cyril Cleverdon, Gerard Salton, Bill Maron, Bill Cooper, and Karen Sparck Jones. At all times they set standards for me that I desperately tried to meet but rarely did. Finally, I must give credited to a set of colleagues who started on this exciting research path in IR at about the same time as me. In the early days we argued and discussed at length, and it sharpened and clarified my ideas about IR considerably; they are, Stephen Robertson, Nick Belkin, Bob Oddy, Nick Jardine, and Ken Moody. References Fairthorne, R.A. (1961). Towards Information Retrieval, Butterworths. Gleason, A.M. (1957). `Measures on the closed subspaces of a Hilbert space.' Journal of Mathematics and Mechanics, 6, 885893. MacKay, D.M. (1950). `Quantal aspects scientific information.' Philosophical Magazine, 41, 289-311. Maron, M.E. (1965). Mechanized documentation: The logic behind a probabilistics interpretation.' In Statistical Association Methods for Mechanized Documentation. M.E. Stevens et al., (eds) NBS Report 269, 9-13. Von Neumann, J. (1981). Continuous Geometries with a Transition Probability, American Mathematical Society. Widdows, D. (2004). Geometry and Meaning, The Centre for the Study of Language and Information, Stanford. 2