CRANV2
Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2
Conclusions
chapter
Cyril Cleverdon
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
While this particular result has obviously been prepared to illustrate
the point, it would seem that this is an example of what has been consistently
happening in the test searches with the Single Term and Controlled Term
index languages. Whereas the broadening of the term classes has increased
the recall of relevant documents at higher coordination levels, the effect of
doing this has been more than offset by the increased number of non-relevant
documents. Only when the index terms being used are too precise, as in the
case of the Simple Concept Natural Language, can the formation of broad
classes of terms bring about an improvement.
Finally, it is necessary to consider the measures which have been
used in this test, and to ask whether it is possible that some other measures
would have brought about a change in the comparative results. Obviously
suspect is the normalised recall ratio, based on a simulated rank output.
While at first it might seem that such a measure is likely to weigh in favour
of systems having high recall ratios, it is in fact mainly influenced by the
first two ranked documents. At this stage, the recall ratios, as can be seen
from Figures 5.11T - 5.14T, are as follows:

Recall Ratio at Document
Output Cut-off of 2          Index Language
23%                          I.2, I.3
22%                          I.1
21%                          I.6, I.7
20%
19%                          I.5, I.8, II.9, III.2
18%                          II.3, II.12, IV.3, IV.4
17%                          II.10, III.1, IV.1, IV.2
16%                          I.9, II.11, III.3, III.4
15%                          II.5
14%                          II.13
13%                          II.2, II.8, III.5
12%                          II.1, II.4, II.6, III.6
11%                          II.7, II.15
10%
9%                           II.14

It will be seen that with the exception of Index Language II.3, which (at 18%)
rises from 28 to 10=, there is a strong correlation between this ordering and
the final ordering as given in Table 8.1. With the document output cut-off
method, recall and precision are, as we explained earlier, completely
interdependent, and therefore it would appear to be a measure that is quite
impartial as between recall and precision. It is known that others are
investigating different measures, and most of those that have been proposed
have already been considered in Chapter 3. Now that the results of this
test are available, it is to be hoped that proponents of" new measures will
be able to demonstrate any superiority over those used in this report.
Until such time, there appears to be no reason to suggest that the measures
have affected the comparative results.
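
The interdependence of recall and precision under a document output cut-off can be illustrated with a minimal sketch. It is not the procedure used in the test: the function name, the document identifiers and the relevance judgements below are all hypothetical, chosen only to show that once the cut-off is fixed, both ratios are determined by the same count of relevant documents retrieved.

```python
def recall_precision_at_cutoff(ranked_ids, relevant_ids, cutoff):
    """Return (recall, precision) for the first `cutoff` documents of a ranked output.

    Hypothetical helper for illustration; the denominator of precision is the
    cut-off itself, as in a fixed document output cut-off.
    """
    relevant = set(relevant_ids)
    retrieved = ranked_ids[:cutoff]
    # Number of relevant documents appearing within the cut-off.
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / cutoff if cutoff else 0.0
    return recall, precision


# Hypothetical ranked output and relevance judgements for one question.
ranked = ["d7", "d2", "d9", "d4", "d1"]
relevant = {"d2", "d4", "d5", "d8"}

recall, precision = recall_precision_at_cutoff(ranked, relevant, cutoff=2)
print(recall, precision)  # 0.25 0.5

# With the cut-off fixed, precision is recall scaled by a constant
# (number of relevant documents divided by the cut-off).
assert abs(precision - recall * len(relevant) / 2) < 1e-12
```

For a given question, any index language that raises recall at a cut-off of 2 raises precision in the same proportion, which is why the measure cannot favour one ratio over the other.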
With the possible doubtful exception of the subject field, there appears
to be nothing in the test environment which could be held responsible for
serious distortion of the results as between one system and another. Therefore
it is necessary to proceed on the assumption that the results are