ISR10
Scientific Report No. ISR-10 Information Storage and Retrieval
Evaluation of Document Retrieval Systems
chapter
Joseph John Rocchio
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
$$D_1 > D_2 > \dots > D_n$$
where $D_i \subset D_q$ and $>$ implies "more relevant than." In this case the
objective of a retrieval operation may be defined as follows: a
retrieval operation with respect to a query $q$ and a partially ordered
set of relevant documents $D_q$ is expected to produce an ordering on the
reference collection $D$ such that every member of the set $D_i$ is ranked
before all members of the sets $D_k$ for which $i < k$, and such that all
members of $D_q$ are ranked before the members of $D$ outside $D_q$. Corresponding to this
definition, expressions for $r^*(x)$, $p^*(x)$, $r_q(x)$, and $p_q(x)$ may be
defined in a manner analogous to those previously used. The development
of the performance indices for this case is more cumbersome than
for the case presented above. As the situation to which these extended
indices are applicable is not normally considered to be of general
interest, their derivation is omitted.
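
Although the extended indices are not derived here, the ordering objective itself is easy to state operationally. The following is a minimal sketch, not part of the original report, of a check that a ranking meets the objective. The function name `satisfies_graded_objective` and its arguments are hypothetical, chosen for illustration. It assumes the relevance classes are disjoint, are listed from most to least relevant, and that any document in no class is non-relevant.

```python
from typing import Hashable, List, Sequence, Set


def satisfies_graded_objective(
    ranking: Sequence[Hashable],
    relevance_classes: List[Set[Hashable]],
) -> bool:
    """Check the ordering objective for a graded set of relevant documents.

    ranking           -- every document of the collection, first-ranked first
    relevance_classes -- hypothetical input: disjoint sets, most relevant first
    """
    # Map each relevant document to the index of its class.
    level = {}
    for i, cls in enumerate(relevance_classes):
        for doc in cls:
            level[doc] = i

    # Non-relevant documents get an index after every relevant class.
    non_relevant = len(relevance_classes)
    levels = [level.get(doc, non_relevant) for doc in ranking]

    # The objective holds exactly when class indices never decrease
    # going down the ranking; order inside one class is unconstrained.
    return all(a <= b for a, b in zip(levels, levels[1:]))


classes = [{"a", "b"}, {"c"}]
print(satisfies_graded_objective(["b", "a", "c", "e", "d"], classes))
print(satisfies_graded_objective(["a", "c", "b", "d", "e"], classes))
```

The first call satisfies the objective, since documents within one class may appear in either order. The second does not, because a member of the second class is ranked ahead of a member of the first.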
Experimental Use
The performance measures developed in this section have been
used to evaluate the results of a variety of experiments conducted with
the SMART system [11, 12]. As one might expect from the formulation, the
range of the normalized recall index is rather limited; i.e., a random
retrieval yields an expected recall index of .5, hence one would
expect results observed in practice to be close to 1.0. In fact, the
observed range of this index from a variety of SMART system experiments
is from about .[OCRerr] to 1.0, with an average near .[OCRerr]7. The normalized
precision index, however, being more dependent on the initial part of the