CRANV2
Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2
Main test results
chapter
Cyril Cleverdon
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
It will be seen from Figure 4.001 that, at a coordination level of 6,
only 161 of the possible 164 questions were searched, and from this
stage there is an increasing disparity between the figures in columns
y and z.
Column x shows the number of questions that were able, at
any given coordination level, to retrieve any documents, whether
relevant or non-relevant. It will be noted, for instance, that
it was not until the coordination level had dropped to ten terms
that it was possible to retrieve a single document; at this level, as can
be seen from the figures in columns x and z, of the 52 questions having
ten or more starting terms, 6 questions retrieved documents.
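The counting described for columns x and z can be illustrated by a minimal sketch. It assumes that a document is retrieved at a coordination level k when it shares at least k terms with the question, and that column z counts the questions having at least k starting terms; both readings are inferred from the passage above rather than stated in it. The function name `level_counts` and the toy questions and documents are hypothetical.

```python
from typing import Dict, Set, Tuple


def level_counts(
    questions: Dict[str, Set[str]],
    documents: Dict[str, Set[str]],
    level: int,
) -> Tuple[int, int]:
    """Return (x, z) for one coordination level.

    z: questions with at least `level` starting terms (assumed reading of column z).
    x: of those, questions sharing at least `level` terms with some document,
       whether that document is relevant or not.
    """
    x = 0
    z = 0
    for terms in questions.values():
        if len(terms) < level:
            continue  # too few starting terms to be searched at this level
        z += 1
        if any(len(terms & doc_terms) >= level for doc_terms in documents.values()):
            x += 1
    return x, z


# Hypothetical toy data, not taken from the report.
questions = {
    "q1": {"a", "b", "c", "f"},
    "q2": {"a", "e"},
    "q3": {"x"},
}
documents = {
    "d1": {"a", "b", "c", "d"},
    "d2": {"a", "x"},
}

for level in range(4, 0, -1):
    x, z = level_counts(questions, documents, level)
    print(f"level {level}: x={x} z={z}")
```

In this toy collection no question retrieves anything at level 4, which mirrors the observation that retrieval only begins once the coordination level has dropped far enough.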
(11) In the columns of documents retrieved are shown the total
numbers of relevant and non-relevant documents retrieved at the
various coordination levels. These figures have been obtained by
summing the results for each individual question in the question
set.
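The summation over the question set may be sketched as follows. The layout of the per-question results (a mapping from coordination level to a pair of relevant and non-relevant counts) is an assumption made for illustration, as are the name `sum_over_questions` and the figures used.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Assumed layout: {question id: {coordination level: (relevant, non_relevant)}}
PerQuestion = Dict[str, Dict[int, Tuple[int, int]]]


def sum_over_questions(per_question: PerQuestion) -> Dict[int, Tuple[int, int]]:
    """Total the relevant and non-relevant documents retrieved at each level."""
    totals = defaultdict(lambda: [0, 0])
    for by_level in per_question.values():
        for level, (relevant, non_relevant) in by_level.items():
            totals[level][0] += relevant
            totals[level][1] += non_relevant
    # Highest coordination level first.
    return {level: tuple(totals[level]) for level in sorted(totals, reverse=True)}


# Hypothetical figures, not taken from the report.
per_question = {
    "q1": {3: (1, 0), 2: (2, 3)},
    "q2": {2: (1, 1)},
}
totals = sum_over_questions(per_question)
print(totals)
```

A question that was not searched at a given level simply contributes nothing to the total for that level.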
As mentioned in the previous section, in some cases the
searches were not completed at the lower coordination levels. The
result is that the figures for non-relevant documents retrieved are
estimated by a method described on page 28. All such estimated
figures are indicated by an asterisk. However, it should be noted
that the figures for the retrieval of relevant documents are always
correct.
(12) The actual performance measures presented in the tables are
recall ratio, precision ratio and fallout ratio. These are derived
from the following:-
a. Relevant documents retrieved
b. Non-relevant documents retrieved
c. Relevant documents not retrieved
d. Non-relevant documents not retrieved
Recall ratio is $100a / (a + c)$, that is, relevant documents retrieved over
the total relevant. All such figures in the tables are correct, but
for precision ratio, $100a / (a + b)$, and fallout ratio, $100b / (b + d)$, it was in some
cases necessary to use the estimated figures discussed in the previous
section. Where this has been done, an asterisk is placed against
the figure in the table.
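As an illustration of the three measures, the following sketch computes them from the four quantities a to d listed above. It assumes the usual forms 100a/(a + c) for recall, 100a/(a + b) for precision and 100b/(b + d) for fallout; the function names and the example figures are hypothetical, and a ratio whose denominator is zero is returned as None rather than given a value.

```python
from typing import Optional, Tuple


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Percentage ratio, or None when the denominator is zero (undefined)."""
    if denominator == 0:
        return None
    return 100 * numerator / denominator


def performance_ratios(
    a: int, b: int, c: int, d: int
) -> Tuple[Optional[float], Optional[float], Optional[float]]:
    """Return (recall, precision, fallout) as percentages.

    a: relevant documents retrieved
    b: non-relevant documents retrieved
    c: relevant documents not retrieved
    d: non-relevant documents not retrieved
    """
    recall = ratio(a, a + c)       # share of all relevant documents retrieved
    precision = ratio(a, a + b)    # share of retrieved documents that are relevant
    fallout = ratio(b, b + d)      # share of all non-relevant documents retrieved
    return recall, precision, fallout


# Hypothetical figures, not taken from the report.
recall, precision, fallout = performance_ratios(a=6, b=14, c=2, d=186)
print(recall, precision, fallout)
```

Only b enters the precision and fallout ratios as a count of non-relevant documents retrieved, which is why an estimated b affects those two measures and leaves recall untouched.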
TABLE OF RESULTS
The tables of results are presented in nine main sections. Details
are given before each section of tables, but the following is a brief
resume.