CRANV2
Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2
Supplementary tests and results
chapter
Cyril Cleverdon
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
CHAPTER 6
Supplementary tests and results
Any social agency has a duty to study and evaluate its effectiveness
and to seek continuously to improve the methods it
employs to achieve its objectives. It is not enough to believe,
however sincerely, that we are doing good. It is not enough
to invoke 'experience', or to collect meaningless and misleading
information... It is not enough to rely upon the support
of colleagues and those in the same professional group and
to accept their endorsement of our work as proof of its
effectiveness. Professional in-group support does not measure
effectiveness and does not absolve us from accountability for
our decisions. The effectiveness of social agencies, it is
claimed, is a question to be determined empirically by
methods which can be repeated and verified by others.
L.T. Wilkins: Social Deviance, pages 5 and 6
Whereas in the preceding chapter, the main test results were considered
on the basis of the document output cut-off method, with normalised recall
ratios, we now return to the basic method used in Chapter 4, and present a
series of mainly disconnected notes on various supplementary matters. In
some cases, new data are presented; in other cases data which have already
been given in Chapter 4 are brought together in different ways in order to
illustrate more effectively certain points.
Comparative Results
It is difficult to make direct comparison between the main index
languages, because of the inevitable variations created by different numbers
of starting terms. However, Fig. 6.1P shows the performance curves for
Single Term Natural Language (I.1.a), Simple Concept Natural Language
(II.1.a) and Controlled Term, Basic Terms (III.1.a). These might be
considered to be comparable since they are all concerned with the basic
terms in the particular vocabulary, but the inability of the Simple Concept
Index Language to obtain a higher recall figure than 36.9% is due to the
severe restrictions which interfixing imposes. That the Controlled Term
Index Language also suffers a drop, as compared to Single Term Index
Languages, of 7.6% in maximum recall is for the same reason, but the
effect is not so severe in this case, since fewer single terms are interfixed.
In general the Single Term Natural Language appears to give the best
performance.
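
The notions of a performance curve and of maximum recall used in this comparison can be illustrated with a minimal sketch. It assumes the usual definitions (recall ratio as the percentage of relevant documents retrieved, precision ratio as the percentage of retrieved documents that are relevant) and assumes that each point on a curve comes from searching at one coordination level, that is, a minimum number of matching terms. The function names, document identifiers and match counts are hypothetical and are not taken from the test data.

```python
from typing import Dict, List, Set, Tuple


def recall_precision(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Return (recall ratio, precision ratio) as percentages."""
    hits = len(retrieved & relevant)
    recall = 100.0 * hits / len(relevant) if relevant else 0.0
    precision = 100.0 * hits / len(retrieved) if retrieved else 0.0
    return recall, precision


def performance_curve(
    match_counts: Dict[str, int], relevant: Set[str], max_level: int
) -> List[Tuple[int, float, float]]:
    """One (level, recall, precision) point per coordination level,
    from the strictest level down to a single matching term."""
    curve = []
    for level in range(max_level, 0, -1):
        # A document is retrieved if it matches at least `level` terms.
        retrieved = {doc for doc, n in match_counts.items() if n >= level}
        recall, precision = recall_precision(retrieved, relevant)
        curve.append((level, recall, precision))
    return curve


def maximum_recall(curve: List[Tuple[int, float, float]]) -> float:
    """Highest recall ratio reached anywhere on the curve."""
    return max(recall for _, recall, _ in curve)


# Hypothetical data: number of question terms matched by each document.
# Document d4 is relevant but matches no term, so it can never be retrieved.
match_counts = {"d1": 3, "d2": 2, "d3": 1, "d4": 0, "d5": 2}
relevant = {"d1", "d2", "d4"}

curve = performance_curve(match_counts, relevant, max_level=3)
for level, recall, precision in curve:
    print(f"level {level}: recall {recall:.1f}%, precision {precision:.1f}%")
print(f"maximum recall: {maximum_recall(curve):.1f}%")
```

In this toy case the relevant document that matches no term caps the maximum recall below 100% however far the search is broadened, which is the kind of ceiling the text attributes to interfixing.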
A more reasonable comparison is between the index languages
which have the highest normalised recall ratios in each of the three main
groups. These would appear to be Index Languages I.3.a (Single term. Word
forms), II.10.a (Simple Concept. Second alphabetical collateral selected),
and III.2.a (Controlled term. Narrower terms). The results are given in
Fig. 6.2P, and show that the Simple Concept index language has made a
large increase in maximum recall, but again the Single Term index language
appears to give the best performance over the whole curve, thus bearing out
the figures presented in Chapter 5.