where n   = number of relevant documents
      N   = number of documents in collection
      r_i = rank of ith relevant document
      w_i = weight score derived from relevance grade of ith relevant document
This equation therefore uses the sum of the products of the ranks and the
weight scores of the relevant documents, rather than the sum of the ranks
alone as in conventional normalized recall. Some examples given in Fig. 27
will clarify the use of this measure. Fig. 27(a) illustrates a perfect case,
where the four relevant documents are given relevance grade weights of 4
(most highly relevant), 3, 2, and 1 (least relevant). Performance in rank
position is perfect, as is the order in which the relevant documents are
ranked, so a weighted normalized recall of 1.0 results. Fig. 27(b) and (c)
show cases of less than optimum relevance grades and ranks, respectively,
although both have equal merit in weighted normalized recall. This illus-
trates the fact that a different range of weights assigned to the relevance
grades could be used to adjust the relative effect of ranking and relevance
grade ordering. An actual result is given in Fig. 27(d).
6. Measures for Varying Generality Comparisons
The generality number defined in part 2 reflects the concentration
of relevant documents in a given collection. From a user viewpoint, the
greater the number of relevant documents in a system, the higher the probability
of finding relevant documents at a given cut-off point. Comparing
the ADI and Cran-1 collections, for example, although the average request has