CRANV2
Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2
Methods for presentation of results
chapter
Cyril Cleverdon
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
collection, B2, is calculated to be 1027 documents, which means that
a further 116 documents must be deleted from collection B1. As will
be seen in Fig. 3.34T, the collections are now equated with the fallout
ratio being, in both cases, 9.278%.
Collection   No. of Documents   Generality Number   Fallout Ratio
A                   200               23.6              9.278%
A1                  271               17.4              6.798%
B                  1400                3.4              6.798%
B1                 1143                4.1              8.333%
B2                 1027                4.6              9.278%

FIGURE 3.34T
CORRECTED COLLECTION SIZES TO FIT GENERALITY NUMBERS.
If, instead of correcting collection B to collection A, the reverse step
had been taken, it would have meant adding 71 documents to collection A,
making A1, which would then have a fallout of 6.798%, the same as the
original collection B.
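
The calculation behind these collection sizes is not shown on this page. One
reading that reproduces the sizes in Fig. 3.34T, offered here as an assumption
and not as the authors' stated method, is that the number of non-relevant
documents retrieved is held fixed while the collection size is changed, and
that the generality number is the number of relevant documents per 1,000
documents in the collection. The function names below are illustrative only.

```python
def nonrelevant_retrieved(size, generality_per_1000, fallout):
    """Recover the count of non-relevant documents retrieved.

    Assumes fallout = (non-relevant retrieved) / (non-relevant in collection)
    and generality number = relevant documents per 1,000 documents.
    Returns (non-relevant retrieved, relevant documents).
    """
    relevant = size * generality_per_1000 / 1000.0
    return fallout * (size - relevant), relevant


def size_for_target_fallout(size, generality_per_1000, fallout, target_fallout):
    """Collection size that gives target_fallout, assuming the same
    non-relevant documents are retrieved and the relevant set is unchanged."""
    retrieved, relevant = nonrelevant_retrieved(size, generality_per_1000, fallout)
    return relevant + retrieved / target_fallout


# Collection B (1400 documents, fallout 6.798%) shrunk to match A's 9.278%.
b2 = size_for_target_fallout(1400, 3.4, 0.06798, 0.09278)
# Collection A (200 documents, fallout 9.278%) enlarged to match B's 6.798%.
a1 = size_for_target_fallout(200, 23.6, 0.09278, 0.06798)
print(round(b2), round(a1))
```

Under these assumptions the sketch gives 1027 documents for B2 and 271 for A1,
in agreement with the figure; the intermediate size of 1143 for B1 follows in
the same way from a target fallout of 8.333%.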
As a result of doing this, the precision ratio of collection A can
now be converted by the equation given earlier and, since recall, fallout
and generality are equal, the adjusted precision ratio must be 3.2%, as
for collection B.
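
The conversion equation itself is not reproduced on this page. As a minimal
sketch, the standard identity that expresses precision in terms of recall,
fallout and generality can be written as below; whether it is exactly the form
the authors use is an assumption. Generality is taken here as a fraction
(relevant documents divided by collection size), not as a number per 1,000,
and the function name is illustrative.

```python
def precision_from(recall, fallout, generality):
    """Precision implied by recall, fallout and generality.

    Per document in the collection, recall * generality is the share that is
    relevant and retrieved, and fallout * (1 - generality) is the share that
    is non-relevant and retrieved. Precision is the first share divided by
    the sum of both.
    """
    relevant_hits = recall * generality
    false_hits = fallout * (1.0 - generality)
    return relevant_hits / (relevant_hits + false_hits)


# Hypothetical counts, not taken from the report: 1400 documents, 5 relevant,
# 3 relevant retrieved and 95 non-relevant retrieved.
recall = 3 / 5
fallout = 95 / (1400 - 5)
generality = 5 / 1400
print(precision_from(recall, fallout, generality))
```

Because precision is fully determined by the other three values, two
collections with equal recall, fallout and generality necessarily have the
same precision, which is the step the paragraph above relies on.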
While the above may seem to be somewhat involved, it is, in fact,
a simplification of the real situation in that 42 questions have been
taken as a block. A more detailed analysis would require that each
question should be treated separately. Then, again, the analysis has
been done in a single fixed situation, namely a certain index language
at a certain level of coordination, and clearly it could be repeated over
many hundreds of situations of a similar type. However, the implications
of such analysis are far-reaching, going beyond the scope of this chapter,
so they will be considered later in this report.
In addition to explaining the performance measures adopted in this
report, this chapter has also attempted to cover, albeit in a non-exhaustive
manner, the main considerations regarding their use and effect. For
ourselves, we feel that it is foolish, at the present stage of development,
to be dogmatic on this subject. Wherever it has been necessary to make
a choice between different methods, in most cases the decision has been
taken for reasons which could be considered peculiar to this project. Other