CRANV2
Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2
Methods for presentation of results
chapter
Cyril Cleverdon
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
$\frac{b}{a+b}$: complementary to precision ratio. Called by Perry 'Noise Factor'.
$\frac{b}{b+d}$: here called Fallout Ratio.
$\frac{d}{b+d}$: complementary to fallout ratio. Called by Western Reserve University 'Specificity'.
Use of any one of these single measures, whether it reflects the retrieval
of relevant items or the retrieval of non-relevant items, is inadequate
to describe the performance of a system. High recall can mean very low
precision, or vice versa, and the mere statement that the recall ratio
is 99% means little, for it might only be achieved by retrieving more than
half of the total collection.
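The single measures can be computed directly from a 2 x 2 retrieval table. The sketch below is illustrative only: it assumes the usual cell meanings (a = relevant retrieved, b = non-relevant retrieved, c = relevant not retrieved, d = non-relevant not retrieved), which are not restated on this page, and the class name, method names and collection figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """2 x 2 retrieval table; the cell meanings below are an assumption."""
    a: int  # relevant, retrieved
    b: int  # non-relevant, retrieved
    c: int  # relevant, not retrieved
    d: int  # non-relevant, not retrieved

    def recall(self) -> float:
        return self.a / (self.a + self.c)

    def precision(self) -> float:
        return self.a / (self.a + self.b)

    def noise_factor(self) -> float:
        # Complement of precision: b / (a + b)
        return self.b / (self.a + self.b)

    def fallout(self) -> float:
        return self.b / (self.b + self.d)

    def specificity(self) -> float:
        # Complement of fallout: d / (b + d)
        return self.d / (self.b + self.d)


# Hypothetical collection of 1,000 documents, 100 of them relevant.
# The search retrieves 600 documents and finds 99 of the relevant ones.
t = Contingency(a=99, b=501, c=1, d=399)
print(f"recall    = {t.recall():.3f}")
print(f"precision = {t.precision():.3f}")
print(f"fallout   = {t.fallout():.3f}")
```

With these invented figures the recall ratio is 99%, yet it is reached only by retrieving 600 of the 1,000 documents, so the precision ratio is 16.5%. This is the situation in which a recall figure quoted alone would mislead.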
While many different combinations of single measures have been
proposed, they fall into two groups: 'twin variable measures' and 'composite
measures'.
For the former, one of each of the single measures is taken and a
comparison made between them by observing the relative changes in the
two values, but retaining each value as a separate entity. The two major
pairs of single measures are recall with precision and recall with fallout.
Examples of recall/precision ratios are given in Figs. 3.3 and
3.4. Fig. 3.3T illustrates the situation for a set of 20 searches where
the variable being tested is the search coordination level, that is, the
number of search terms which must be matched with the index terms. At
each different level, a cut-off is applied and the number of documents retrieved,
relevant and non-relevant, is recorded. Since the total number of
relevant documents is known, the recall and precision ratios can be
calculated, as shown in the table. Alternatively, these ratios can be
plotted as on the graph (Fig. 3.3P) with the five performance points
connected to make a recall/precision curve. In Fig. 3.4T are given the
results of a series of searches with the same set of questions but with
different search requirements. The particular change is incidental to the
present discussion, but in fact whereas search X accepted any combination
of terms, search Y would not accept certain terms unless some other
given term was also present. (This matter of search strategy was
discussed in Chapter 2.) The result of this change was a different set
of performance figures at the five coordination levels. The contrast
between search X and search Y can be seen by comparing the tables or
from the graph (Fig. 3.4P), which shows clearly that the maximum recall
figure has fallen sharply in search Y, but on the other hand at any given
recall ratio of 65% or less, search Y will give a higher precision ratio
than search X.
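The cut-off procedure described above can be sketched as follows. This is a minimal illustration, not the project's own method of tabulation: the function name and the match data for the two searches are hypothetical and are not the figures of Figs. 3.3 and 3.4. It assumes that a document is retrieved at coordination level k when at least k search terms match its index terms.

```python
from typing import List, Tuple


def recall_precision_by_level(
    matches: List[Tuple[int, bool]],
    total_relevant: int,
    levels: List[int],
) -> List[Tuple[int, float, float]]:
    """Return (level, recall, precision) for each coordination level.

    matches holds one (terms_matched, is_relevant) pair per document.
    A document is retrieved at level k when terms_matched >= k.
    """
    points = []
    for k in levels:
        retrieved = [rel for n, rel in matches if n >= k]
        hits = sum(retrieved)  # relevant documents among those retrieved
        recall = hits / total_relevant
        # Precision is undefined when nothing is retrieved; 0.0 is used
        # here purely as a placeholder (an assumption of this sketch).
        precision = hits / len(retrieved) if retrieved else 0.0
        points.append((k, recall, precision))
    return points


LEVELS = [5, 4, 3, 2, 1]
TOTAL_RELEVANT = 5  # known in advance, as in the test collection

# Hypothetical search X: accepts any combination of terms.
matches_x = [(5, True), (4, True), (4, False), (3, True), (3, False),
             (2, True), (2, False), (2, False), (1, False), (1, False)]
# Hypothetical search Y: stricter requirements, fewer documents matched.
matches_y = [(5, True), (4, True), (3, True), (3, False),
             (2, False), (1, False)]

points_x = recall_precision_by_level(matches_x, TOTAL_RELEVANT, LEVELS)
points_y = recall_precision_by_level(matches_y, TOTAL_RELEVANT, LEVELS)

for name, points in (("X", points_x), ("Y", points_y)):
    for level, recall, precision in points:
        print(f"search {name} level {level}: "
              f"recall {recall:.2f}, precision {precision:.2f}")
```

In this invented example the stricter search Y reaches a lower maximum recall than search X, but gives higher precision at the recall levels it does reach. This is the same kind of contrast the text draws from Fig. 3.4P.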