CRANV2
Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2
Simulated ranking and document output cut-off
chapter
Cyril Cleverdon
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
- 220 -
Index       Normalised      Normalised
Language    Recall Ratio    Recall Ratio
            (basic)         (weighted)
I.1.a       65.00           67.12
I.7.a       64.05           65.94
III.1.a     61.76           63.64
III.6.a     59.17           61.06
II.9.a      57.11           58.94
II.5.a      55.05           57.11
FIGURE 5.25T
COMPARISON OF NORMALISED RECALL
RATIOS BY BASIC SCORING METHOD
(as Fig. 5.15T) AND BY WEIGHTED
SCORING METHOD FOR SIX INDEX
LANGUAGES.
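The comparison in Figure 5.25T can be checked with a few lines of arithmetic. The sketch below is illustrative only: the figures are transcribed from the table, the basic ratio for II.9.a is read as 57.11 from a damaged entry and should be treated as an assumption, and the variable names are not part of the original report.

```python
# (basic, weighted) normalised recall ratios, transcribed from Figure 5.25T.
# The basic value for II.9.a (57.11) is an assumed reading of a damaged entry.
ratios = {
    "I.1.a": (65.00, 67.12),
    "I.7.a": (64.05, 65.94),
    "III.1.a": (61.76, 63.64),
    "III.6.a": (59.17, 61.06),
    "II.9.a": (57.11, 58.94),
    "II.5.a": (55.05, 57.11),
}

# Gain of the weighted scoring method over the basic method, in percentage points.
gains = {lang: round(weighted - basic, 2) for lang, (basic, weighted) in ratios.items()}

# Rank order of the six index languages under each scoring method.
order_basic = sorted(ratios, key=lambda lang: ratios[lang][0], reverse=True)
order_weighted = sorted(ratios, key=lambda lang: ratios[lang][1], reverse=True)

for lang in order_basic:
    print(f"{lang:8s} gain = {gains[lang]:.2f}")
print("same rank order:", order_basic == order_weighted)
```

On these figures the weighted method raises each ratio by roughly two percentage points and leaves the rank order of the six index languages unchanged.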
retrieval would be 26%. On the other hand, as was discussed earlier in this
chapter, the theoretical maximum performance cannot be achieved due to the
different numbers of relevant documents for each question, so the highest
possible normalised recall ratio would be 86.70%.
It should also be emphasised that the normalised recall ratio only has
meaning within the context of the manner in which it has been calculated.
In this particular case it was by averaging the results of seventeen cut-off
groups as given on page 198. Assume that the number of groups had been
reduced to thirteen by combining the first six groups into two larger groups
covering documents ranked 51 - 100 and documents ranked 101 - 200. The
effect of doing this would be to reduce the normalised recall ratio for index
language I.1.a from 65% to 55.7%. On the other hand, if the original groups
were broken down so that no groups had more than ten rankings, the
normalised recall ratio based upon the resulting twenty-seven groups would
be 75.1%. At the same time, the effect of either of these actions would be
to change, as considered in the previous paragraph, the minimum figure
based on random retrieval and the maximum possible figure.
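The dependence on the choice of cut-off groups can be illustrated with a minimal sketch. It assumes that the normalised recall ratio is the arithmetic mean of the recall ratios at each cut-off, with relevant documents pooled over all questions. The ranks and cut-off sets below are hypothetical and are not the seventeen groups given on page 198; the function name `normalised_recall` is likewise introduced only for illustration.

```python
def normalised_recall(questions, cutoffs):
    """Mean, over the cut-offs, of the pooled recall ratio in per cent.

    questions: list of lists, each holding the ranks at which the relevant
               documents for one question appear in the ranked output.
    cutoffs:   rank positions closing each cut-off group.
    """
    total_relevant = sum(len(ranks) for ranks in questions)
    recall_ratios = []
    for cutoff in cutoffs:
        retrieved = sum(1 for ranks in questions for r in ranks if r <= cutoff)
        recall_ratios.append(100.0 * retrieved / total_relevant)
    return sum(recall_ratios) / len(recall_ratios)


# Hypothetical ranks of relevant documents for three questions.
questions = [[1, 4, 12], [2, 30, 75, 150], [3, 8]]

# Three hypothetical groupings of the same 200 ranked documents.
base = [1, 2, 3, 5, 10, 20, 50, 75, 100, 150, 200]
merged = [1, 2, 3, 5, 10, 20, 50, 100, 200]         # fewer groups at the lower ranks
split = [1, 2, 3, 5, 10] + list(range(20, 201, 10))  # no group wider than ten ranks

for name, cutoffs in [("merged", merged), ("base", base), ("split", split)]:
    print(f"{name:6s} {len(cutoffs):2d} groups  {normalised_recall(questions, cutoffs):.2f}")
```

Under these assumptions the same ranked output gives a lower figure when the later groups are merged and a higher figure when they are subdivided, since recall is already high at those cut-offs and each additional group adds another high value to the average.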