Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2

Chapter: Methods for presentation of results
Cyril Cleverdon, Michael Keen
Cranfield

An investigation supported by a grant to Aslib by the National Science Foundation. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

collection, B2, is calculated to be 1027 documents, which means that a further 116 documents must be deleted from collection B1. As will be seen in Fig. 3.34T, the collections are now equated, with the fallout ratio being, in both cases, 9.278%.

Collection   No. of Documents   Generality Number   Fallout Ratio
A                   200               23.6              9.278%
A1                  271               17.4              6.798%
B                  1400                3.4              6.798%
B1                 1143                4.1              8.333%
B2                 1027                4.6              9.278%

FIGURE 3.34T  CORRECTED COLLECTION SIZES TO FIT GENERALITY NUMBERS.

If, instead of correcting collection B to collection A, the reverse step had been taken, then it can be seen that it would have meant adding 71 documents to collection A, making A1, which would then have a fallout of 6.798%, the same as the original collection B. As a result of doing this, the precision ratio of collection A can now be converted by the equation given earlier, and, since recall, fallout and generality are equal, the adjusted precision ratio must be 3.2%, as for collection B.

While the above may seem to be somewhat involved, it is, in fact, a simplification of the real situation, in that 42 questions have been taken as a block. A more detailed analysis would require that each question should be treated separately. Then, again, the analysis has been done in a single fixed situation, namely a certain index language at a certain level of coordination, and clearly it could be repeated over many hundreds of situations of a similar type.
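The collection-size correction summarised in Fig. 3.34T can be checked with a short calculation. The sketch below is illustrative only and rests on assumptions not stated in this passage: that the generality number is the number of relevant documents per 1000 documents in the collection, that the fallout ratio is the number of non-relevant documents retrieved divided by the number of non-relevant documents in the collection, and that the documents deleted from collection B are non-relevant and not retrieved, so that the retrieved counts stay fixed. The function and variable names are hypothetical, and the per-question averages are inferred from the figure rather than quoted from the report.

```python
# Illustrative sketch only; definitions and deletion rule are assumptions
# (see the lead-in), not a statement of the report's own procedure.

def generality_number(relevant, size):
    """Relevant documents per 1000 documents in the collection (assumed definition)."""
    return 1000.0 * relevant / size

def fallout(nonrel_retrieved, relevant, size):
    """Non-relevant retrieved / non-relevant in collection (assumed definition)."""
    return nonrel_retrieved / (size - relevant)

# Average number of relevant documents per question, inferred from
# collection A: generality number 23.6 in a collection of 200 documents.
relevant = 23.6 * 200 / 1000

# Average number of non-relevant documents retrieved per question in
# collection B, inferred from its fallout ratio of 6.798% at 1400 documents.
nonrel_retrieved_b = 0.06798 * (1400 - relevant)

# Shrink collection B while holding the retrieved counts fixed (assumption:
# only non-relevant, non-retrieved documents are deleted).
for name, size in [("B", 1400), ("B1", 1143), ("B2", 1027)]:
    g = generality_number(relevant, size)
    f = fallout(nonrel_retrieved_b, relevant, size)
    print(f"{name:3s} {size:5d} {g:5.1f} {100 * f:7.3f}%")
```

Under these assumptions the B1 and B2 rows agree with the figure to the precision shown (fallout of about 8.333% and 9.278%, generality numbers of about 4.1 and 4.6), which suggests that the correction amounts to removing non-retrieved, non-relevant documents until the fallout ratios match.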
However, the implications of such analysis are far-reaching, going beyond the scope of this chapter, so they will be considered later in this report. In addition to explaining the performance measures adopted in this report, this chapter has also attempted to cover, albeit in a non-exhaustive manner, the main considerations regarding their use and effect. For ourselves, we feel that it is foolish, at the present stage of development, to be dogmatic on this subject. Wherever it has been necessary to make a choice between different methods, in most cases the decision has been taken for reasons which could be considered peculiar to this project. Other