CRANV2
Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2
Methods for presentation of results
chapter
Cyril Cleverdon
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
SYSTEMS DATA
                                   Collection A   Collection B
No. of documents                        200           1400
No. of questions                         42             42
Total No. of relevant documents         198            198
Generality Number                      23.6            3.4

PERFORMANCE AT COORDINATION LEVEL OF 3
                                   Collection A   Collection B
Relevant retrieved                      132            132
Non-relevant retrieved                  761          3,984
Recall Ratio                          66.7%          66.7%
Precision Ratio                       14.8%           3.2%
Fallout Ratio                        9.278%         6.798%

FIGURE 3.33T
SYSTEMS AND PERFORMANCE DATA FOR COMPARISON
OF GENERALITY NUMBERS.
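The ratios in Fig. 3.33T can be recomputed from the raw counts. The following is a minimal sketch, not part of the original report: the class name `Collection` and its field names are illustrative, and the scaling of the generality number (relevant documents per 1,000 document-question pairs) is an assumption inferred from the tabulated values 23.6 and 3.4.

```python
from dataclasses import dataclass


@dataclass
class Collection:
    """Counts summed over all questions; names are illustrative only."""
    documents: int              # size of the collection
    questions: int              # number of search questions
    relevant_total: int         # relevant documents over all questions (a + c)
    relevant_retrieved: int     # a
    nonrelevant_retrieved: int  # b

    @property
    def total_pairs(self) -> int:
        # Every document is a candidate for every question.
        return self.documents * self.questions

    @property
    def nonrelevant_total(self) -> int:
        # b + d: everything that is not relevant.
        return self.total_pairs - self.relevant_total

    def recall(self) -> float:
        return 100 * self.relevant_retrieved / self.relevant_total

    def precision(self) -> float:
        retrieved = self.relevant_retrieved + self.nonrelevant_retrieved
        return 100 * self.relevant_retrieved / retrieved

    def fallout(self) -> float:
        return 100 * self.nonrelevant_retrieved / self.nonrelevant_total

    def generality(self) -> float:
        # Assumed scaling: relevant documents per 1,000 pairs.
        return 1000 * self.relevant_total / self.total_pairs


collection_a = Collection(200, 42, 198, 132, 761)
collection_b = Collection(1400, 42, 198, 132, 3984)

for name, c in (("A", collection_a), ("B", collection_b)):
    print(f"Collection {name}: recall {c.recall():.1f}%, "
          f"precision {c.precision():.1f}%, fallout {c.fallout():.3f}%, "
          f"generality {c.generality():.1f}")
```

Under these assumptions the printed values agree with the figures tabulated above.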
If the fallout in both collections were exactly the same, this would mean
that the ratio of the change of the number of non-relevant retrieved
(b) would be the same as the ratio of the change of the total non-relevant
(b + d), i.e.

\[
\frac{b(\text{Collection B})}{b(\text{Collection A})}
= \frac{(b + d)(\text{Collection B})}{(b + d)(\text{Collection A})}
\]

or

\[
\frac{b(\text{Collection B}) / b(\text{Collection A})}
     {(b + d)(\text{Collection B}) / (b + d)(\text{Collection A})} = 1
\]
Bearing in mind that these figures represent the sum of a series of searches
for 42 questions having 198 relevant documents, the result from Fig. 3.33T
is, in fact,
\[
\frac{3984 / 761}
     {\bigl((42 \times 1400) - 198\bigr) / \bigl((42 \times 200) - 198\bigr)}
= \frac{5.2352}{7.1448} = 0.7327
\]
It is therefore shown that b (non-relevant retrieved) has increased by a
factor of 5.2352 while the total number of non-relevant documents (b + d)
has increased by a factor of 7.1448. Proof of the accuracy of this can be shown
by assuming that Collection B had retrieved 7.1448 times as many non-relevant
documents as Collection A, in which case it would have retrieved 761 x 7.1448
≈ 5437 documents, as against the actual total of 3,984 documents.
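The comparison of growth factors above can be checked with a few lines of arithmetic. This is a sketch only; the variable and function names are illustrative and do not come from the report.

```python
def nonrelevant_total(documents: int, questions: int, relevant_total: int) -> int:
    """b + d: all document-question pairs minus the relevant ones."""
    return documents * questions - relevant_total


# Non-relevant retrieved (b) at coordination level 3, from Fig. 3.33T.
b_a, b_b = 761, 3984

# Total non-relevant (b + d) for each collection.
bd_a = nonrelevant_total(200, 42, 198)    # 8202
bd_b = nonrelevant_total(1400, 42, 198)   # 58602

retrieved_growth = b_b / b_a      # growth of b from Collection A to B
pool_growth = bd_b / bd_a         # growth of (b + d) from A to B
growth_ratio = retrieved_growth / pool_growth

# Non-relevant retrieved in Collection B if fallout had stayed the same.
expected_b_b = b_a * pool_growth

print(f"b grew by {retrieved_growth:.4f}, (b + d) grew by {pool_growth:.4f}")
print(f"ratio of the two growth factors: {growth_ratio:.4f}")
print(f"expected non-relevant retrieved at equal fallout: {expected_b_b:.0f}")
```

The ratio of the two growth factors is the same quantity as the fallout of Collection B divided by the fallout of Collection A, which is why a value below 1 indicates that fallout fell as the collection grew.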