CRANV2
Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2
Methods for presentation of results (chapter)
Cyril Cleverdon, Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

SYSTEMS DATA
                                   Collection A    Collection B
No. of documents                        200            1400
No. of questions                         42              42
Total No. of relevant documents         198             198
Generality Number                      23.6             3.4

PERFORMANCE AT COORDINATION LEVEL OF 3
                                   Collection A    Collection B
Relevant retrieved                      132             132
Non-relevant retrieved                  761           3,984
Recall Ratio                          66.7%           66.7%
Precision Ratio                       14.8%            3.2%
Fallout Ratio                        9.278%          6.798%

FIGURE 3.33T  SYSTEMS AND PERFORMANCE DATA FOR COMPARISON OF GENERALITY NUMBERS.

If the fallout in both collections were exactly the same, this would mean that the ratio of the change of the number of non-relevant retrieved (b) would be the same as the ratio of the change of the total non-relevant (b + d), i.e.

$$\frac{b(\text{Collection B})}{(b + d)(\text{Collection B})} = \frac{b(\text{Collection A})}{(b + d)(\text{Collection A})}$$

$$\frac{b(\text{Collection B}) / b(\text{Collection A})}{(b + d)(\text{Collection B}) / (b + d)(\text{Collection A})} = 1$$

Bearing in mind that these figures represent the sum of a series of searches for 42 questions having 198 relevant documents, the result from Fig. 3.33T is, in fact,

$$\frac{3984 / 761}{\bigl[(42 \times 1400) - 198\bigr] / \bigl[(42 \times 200) - 198\bigr]} = \frac{5.2352}{7.1448} = 0.7327$$

It is therefore shown that b (non-relevant retrieved) has increased by a factor of 5.2352 while the total number of non-relevant documents (b + d) has increased by a factor of 7.1448. Proof of the accuracy of this can be shown by assuming that collection B had retrieved 7.1448 times as many non-relevant documents as collection A, in which case it would have retrieved 761 x 7.1448 ≈ 5437 documents, as against the actual total of 3,984 documents.
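The arithmetic behind Fig. 3.33T and the fallout comparison can be checked with a short script. What follows is only a sketch: the class and function names (`Collection`, `fallout_change_ratio`, and so on) are illustrative and do not come from the report, and it assumes, as the figures suggest, that the generality number is the number of relevant documents per 1000 question-document pairs and that the total of non-relevant documents (b + d) is (questions x documents) minus the total relevant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Collection:
    """Totals summed over all questions (names are illustrative, not from the report)."""
    documents: int              # size of the collection
    questions: int              # number of search questions
    relevant_total: int         # total relevant documents over all questions (a + c)
    relevant_retrieved: int     # a
    nonrelevant_retrieved: int  # b

    @property
    def total_nonrelevant(self) -> int:
        # b + d: every question-document pair that is not relevant
        return self.questions * self.documents - self.relevant_total

    @property
    def generality(self) -> float:
        # Assumed definition: relevant documents per 1000 question-document pairs
        return 1000 * self.relevant_total / (self.questions * self.documents)

    @property
    def recall(self) -> float:
        return self.relevant_retrieved / self.relevant_total

    @property
    def precision(self) -> float:
        return self.relevant_retrieved / (
            self.relevant_retrieved + self.nonrelevant_retrieved
        )

    @property
    def fallout(self) -> float:
        return self.nonrelevant_retrieved / self.total_nonrelevant


def fallout_change_ratio(small: Collection, large: Collection) -> float:
    """Growth in b divided by growth in (b + d); equals 1 when fallout is unchanged."""
    b_growth = large.nonrelevant_retrieved / small.nonrelevant_retrieved
    total_growth = large.total_nonrelevant / small.total_nonrelevant
    return b_growth / total_growth


# Figures taken from Fig. 3.33T (coordination level 3)
collection_a = Collection(200, 42, 198, 132, 761)
collection_b = Collection(1400, 42, 198, 132, 3984)

ratio = fallout_change_ratio(collection_a, collection_b)

# Non-relevant documents collection B would have retrieved at collection A's fallout
expected_b = collection_a.fallout * collection_b.total_nonrelevant

for name, c in (("A", collection_a), ("B", collection_b)):
    print(
        f"Collection {name}: generality={c.generality:.1f} "
        f"recall={c.recall:.1%} precision={c.precision:.1%} "
        f"fallout={c.fallout:.3%}"
    )
print(f"fallout change ratio = {ratio:.4f}")
print(f"non-relevant retrieved in B at A's fallout = {expected_b:.0f}")
```

Note that the ratio computed this way is simply the fallout of collection B divided by the fallout of collection A, which is why a value of 1 corresponds to identical fallout in both collections.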