IRE
Information Retrieval Experiment
Retrieval effectiveness
chapter
Cornelis J. van Rijsbergen
Butterworth & Company
Karen Sparck Jones
All rights reserved. No part of this publication may be reproduced
or transmitted in any form or by any means, including photocopying
and recording, without the written permission of the copyright holder,
application for which should be addressed to the Publishers. Such
written permission must also be obtained before any part of this
publication is stored in a retrieval system of any nature.
Lin[OCRerr]it[OCRerr] to retrievd'l 4[OCRerr]
rilanner of representation that leads to the limitation on effectiveness. In one
[OCRerr]cnse this is true. However, consider the situation where we could characterize
md discriminate our documents perfectly, and let us assume that the system
is very large and that queries can be arbitrarily complicated. Of course I do
riot believe that for an almost infinite collection we can represent documents
perfectly with anything less than the complete semantic content (disregarding
I rguments as to how this might be done). To achieve perfect retrieval in such
I case, one would need to know already what one was looking for, and hence
the retrieval system would be irrelevant. The point of a retrieval system is to
tell you what you do not already know in response to an incomplete statement
of your information required.
The computational process of locating likely relevant documents is not
unlike the mechanical process of proving theorems within a formal system.
If we assume, not unreasonably, that a computational model for retrieval is
in power equivalent to arithmetic, then G6del's theorem by analogy also
Icads one to suspect that certain statements will be undecidable; or to put it
more precisely, there will be documents whose relevance or non-relevance is
`undecidable' within the formal system for retrieval, i.e. documents which
the system cannot guarantee to find. What I am arguing is that there may be
an essential incompleteness about retrieval systems, which although not
apparent in current implementations, may well become apparent with
increased sophistication of our systems.
At the other end of the scale we have performance trivially limited by 0 per
cent precision and 0 per cent recall. A less superficial bound is the one set by
random retrieval, which is represented most naturally by recall fallout. Any
retrieval strategy worth its salt will do better than random.
The limit set by random retrieval can be discussed at two levels: a global
and a local level. At the global level it simply indicates that we can expect to
do better overall by some retrieval strategy than retrieving randomly from
the collection. At a local level a limit can be used in two ways. First, to set a
natural cut-off when ranking with respect to the probability of relevance; at
some point down the ranking the estimated P(relevance/x) will equal the
prior probability P(relevance) which is the point at which we are retrieving
randomly, and beyond which point the strategy becomes useless. Of course
this `random' cut-off point will almost always exceed any cut-off set by the
user. But since in most experiments we are dealing with simulated users it is
as well to rank to this conservative cut-off point. Secondly, we can use
random retrieval at a local level to motivate a particular way of interpolating
between precision-recall points. It was pointed out earlier that in the
predictive method of evaluation we needed to interpolate. One possible
method is by step-function, the jumps occurring at actual changes in recall.
One way of motivating, and to some extent justifying, this method is to argue
in reverse, and say that given any recall-precision point which has been
calculated for the retrieved set then it is always possible to retrieve randomly
from this retrieved set. Therefore it is always possible in this way to generate
points at the same precision value increasing in recall up to the recall of the
given point. Thereby we can interpolate points between any two given points
by defining firstly:
G = {(R0, P[OCRerr])Y