CRANV2 Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2 Methods for presentation of results chapter Cyril Cleverdon Michael Keen Cranfield An investigation supported by a grant to Aslib by the National Science Foundation. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

presentation of performance. It is therefore necessary to make adjustments to the precision ratios in certain situations (which have been considered in Chapter 2) where sets of varying generality have to be compared. This is reasonably straightforward, and the adjusted precision ratio is obtained from the following equation:

$$P_A = \frac{R_1 \times G_2}{(R_1 \times G_2) + F_1 (1000 - G_2)}$$

where

$P_A$ = Adjusted precision ratio

$R_1$ = Recall ratio obtained for a given system, in a situation of a known generality number

$F_1$ = Fallout ratio obtained for the given system, in a situation of a known generality number

$G_2$ = Generality number to which it is desired to alter the results, to obtain the adjusted precision

Thus two sets of performance figures obtained with systems of differing generality can be compared by adjusting the precision ratio of one case, so that it is based on the generality number of the other. If the example in Fig. 3.32T were to be corrected, and if it were decided to alter the result of Collection A to fit the generality of Collection B, then, from the equation given above,

$$P_A = \frac{.50 \times 1}{(.50 \times 1) + .01 (1000 - 1)} = \frac{.50}{.50 + 9.99} = .048$$

The answer, expressed as a percentage, is 4.8%, and this result is clearly correct, with both cases now having an identical recall ratio, fallout ratio and precision ratio. This, however, is a simplified example, and in practice the matter is complicated by what at present seems to be the most difficult problem in performance comparison, namely the determination of the correct N (the size of the collection).
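The adjustment can be checked numerically. The following is a minimal Python sketch of the adjusted precision equation, not part of the original report. The function name `adjusted_precision` and its argument names are hypothetical. It assumes that recall and fallout are given as fractions (.50 rather than 50%) and that the generality number counts relevant documents per 1000 documents in the collection, as the term (1000 - G2) in the equation implies.

```python
def adjusted_precision(recall, fallout, generality):
    """Adjusted precision ratio: R1*G2 / (R1*G2 + F1*(1000 - G2)).

    recall, fallout: fractions between 0 and 1 (assumed convention).
    generality: target generality number G2, assumed to mean relevant
    documents per 1000 documents in the collection.
    """
    if not 0 <= generality <= 1000:
        raise ValueError("generality number must lie between 0 and 1000")
    # Relevant documents retrieved, per 1000 documents in the collection.
    relevant_retrieved = recall * generality
    # Non-relevant documents retrieved, per 1000 documents in the collection.
    nonrelevant_retrieved = fallout * (1000 - generality)
    total_retrieved = relevant_retrieved + nonrelevant_retrieved
    if total_retrieved == 0:
        raise ValueError("nothing retrieved, so precision is undefined")
    return relevant_retrieved / total_retrieved


# Worked example from the text: recall .50, fallout .01, adjusted to G2 = 1.
p_a = adjusted_precision(recall=0.50, fallout=0.01, generality=1)
print(f"{p_a:.3f}")  # prints 0.048
```

With the figures from the worked example the sketch gives .048, or 4.8% as a percentage. Because only recall, fallout and the target generality number enter the calculation, the collection size N does not appear in it.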
To consider this problem of collection size, an actual result is taken from a particular set of 42 questions that were searched on collections A and B, where N equals 200 and 1400 documents respectively, the documents in collection A being a subset of the documents in collection B. The details are given in Fig. 3.33T, with the two sets of performance figures obtained in exactly the same conditions. While the precision ratio for collection A has increased with the increased generality number, there is also a significant difference in the fallout ratio.