CRANV2
Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2
Simulated ranking and document output cut-off
chapter
Cyril Cleverdon
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
Such recall and precision figures can be plotted on a conventional
graph as in Fig. 5.4P, which shows the results of index language I.1.a
(as in Fig. 5.3T) and also index language I.9.a. These curves can be
compared with Fig. 4.206P and show the same superiority of index
language I.1.a over index language I.9.a.
There is, however, an important difference. The positions of the
points in Fig. 4.206P were determined by coordination level cut-offs, and
were therefore random in relation to each other. With Fig. 5.4P, if
straight lines are drawn radiating from the point of origin, these will, as
can be seen, pass through the corresponding points in each curve. This
is due to the fact that the cut-off is based on document output, and
recall and precision ratios are now interdependent. It is known that
there are 198 documents relevant to the 42 questions, so, on average,
4.7 documents are relevant to each question. When only one document is
retrieved for each question, even if every such document were relevant,
the recall ratio could not possibly be higher than 42/198 = 21.2%,
although it would, of course, represent a precision ratio of 100%. If any
of the documents are not relevant, then the recall ratio will always fall
on some point along the line which goes from the point of origin to a
recall of 21.2% at 100% precision. Therefore at any given document output
cut-off, a drop in recall ratio with any one system as against any other
system must also involve a drop in the precision ratio. Similarly, when
two documents are retrieved in each search, the maximum recall ratio is
42.4% and with this particular document/question set, 100% recall cannot
possibly be reached until at least five documents are retrieved for each
question. This would, however, represent a total of 210 documents. Since
there are only 198 relevant documents in the collection, the theoretical
maximum precision ratio would then be 198/210 x 100 = 94.3%. As more
documents are retrieved, so the maximum possible precision ratio must
drop, and these document output cut-off performance lines can be calculated
as has been done in Fig. 5.4P.
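The arithmetic behind these cut-off lines can be checked with a short sketch. It uses only the two totals given in the text (42 questions, 198 relevant documents); the function names are illustrative and are not taken from the report. If h relevant documents are retrieved in total at a cut-off of n documents per question, recall is h/198 and precision is h/(42 x n), so the ratio of precision to recall is fixed at 198/(42 x n) whatever the value of h. This is why every point at a given cut-off falls on one straight line through the origin.

```python
from fractions import Fraction

N_QUESTIONS = 42   # questions in the test set (from the text)
N_RELEVANT = 198   # relevant documents over all questions (from the text)


def ceiling_at_cutoff(n, questions=N_QUESTIONS, relevant=N_RELEVANT):
    """Theoretical maximum (recall %, precision %) when exactly n
    documents are retrieved for every question."""
    retrieved = questions * n
    best_hits = min(retrieved, relevant)  # hits cannot exceed either total
    return 100 * best_hits / relevant, 100 * best_hits / retrieved


def line_slope(n, questions=N_QUESTIONS, relevant=N_RELEVANT):
    """Precision/recall ratio shared by every point at cut-off n."""
    return Fraction(relevant, questions * n)


def point(n, hits, questions=N_QUESTIONS, relevant=N_RELEVANT):
    """(recall, precision) as exact fractions for a given number of hits."""
    return Fraction(hits, relevant), Fraction(hits, questions * n)


if __name__ == "__main__":
    for n in range(1, 7):
        recall, precision = ceiling_at_cutoff(n)
        print(f"cut-off {n}: max recall {recall:.1f}%, "
              f"max precision {precision:.1f}%, slope {line_slope(n)}")
```

For cut-offs of one, two and five documents the sketch gives the ceilings quoted above: 21.2% recall at 100% precision, 42.4% recall at 100% precision, and 100% recall at 94.3% precision.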
Because Question 141 had only one relevant document,
it would not be possible in this collection to obtain the theoretical
maximum figures for recall and precision beyond the single document
cut-off level. Similarly, there are thirteen questions which have more than
five relevant documents, and 100% recall could not possibly be obtained
until twelve documents have been retrieved, this number representing the
highest figure for documents relevant to a single question. This does not
affect the position of the lines, which would be different, however, for
other situations where there are more or fewer relevant documents per
question.
As previously mentioned, it is not possible to obtain the theoretically
maximum performance beyond the single document output cut-off, since Q141
has only one relevant document. As ten questions have only two relevant
documents, there must be a further deviation from the theoretical maximum
beyond this stage. Fig. 5.5P shows the maximum performance that could
actually be obtained with this collection. Achieving this
performance would imply that for each question all the relevant documents
were retrieved before any non-relevant documents were retrieved.
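The achievable maximum described here depends on how the relevant documents are distributed over the questions. The following is a minimal sketch of that calculation, assuming a perfect ranking in which each question contributes min(r, n) relevant documents at a cut-off of n, where r is its number of relevant documents. The function name and the four-question list are hypothetical and serve only as illustration; the full distribution for the 42 questions is not given in this section.

```python
def best_possible(relevant_per_question, n):
    """Best achievable (recall %, precision %) at cut-off n, assuming every
    relevant document is ranked ahead of every non-relevant one."""
    total_relevant = sum(relevant_per_question)
    # A question with r relevant documents can supply at most min(r, n) hits.
    hits = sum(min(r, n) for r in relevant_per_question)
    retrieved = n * len(relevant_per_question)
    return 100 * hits / total_relevant, 100 * hits / retrieved


if __name__ == "__main__":
    # Hypothetical toy set of four questions, NOT the Cranfield data.
    toy_counts = [1, 2, 5, 12]
    for n in (1, 2, 5, 12):
        recall, precision = best_possible(toy_counts, n)
        print(f"cut-off {n}: recall {recall:.1f}%, precision {precision:.1f}%")
```

In the toy set, precision falls below 100% as soon as the cut-off exceeds the smallest number of relevant documents for any question, and recall reaches 100% only when the cut-off equals the largest such number. Both effects are those described above for Q141 and for the question with twelve relevant documents.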
In Fig. 5.4P the lines radiating from the point of origin have been
based on the document output cut-off for this particular test situation,
but the performance curves could be drawn on a polar coordinate graph
with the lines radiating at regular intervals as in Fig. 5.6P. The original
purpose of using this type of graph was to investigate the possibility that