ISR10
Scientific Report No. ISR-10 Information Storage and Retrieval
Evaluation of Document Retrieval Systems
chapter
Joseph John Rocchio
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
The tradeoff, then, between precision and recall is a
necessary statistical consequence of using a meaningful matching
function. The nature of this tradeoff is fundamentally related to the
joint probability distribution of the user/system decisions from which
the conditional probabilities, recall and precision, are defined.
Improvements in retrieval systems which increase the joint probability
of relevance and retrieval will increase both recall and precision for
a given level of query-document association. For a given user, the
inverse relation of recall and precision influences the number of
output (retrieved) documents which it is useful for him to examine.
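
The relation described above can be illustrated numerically. The following is a minimal sketch, assuming recall is read as P(retrieved | relevant) and precision as P(relevant | retrieved), both derived from the joint probability of relevance and retrieval. The function name and the probability figures are hypothetical, chosen only for illustration.

```python
def recall_precision(p_rel_and_ret, p_rel, p_ret):
    """Return (recall, precision) from a joint probability and its marginals.

    p_rel_and_ret: joint probability that a document is relevant and retrieved
    p_rel:         marginal probability that a document is relevant
    p_ret:         marginal probability that a document is retrieved
    """
    if not 0 < p_rel <= 1 or not 0 < p_ret <= 1:
        raise ValueError("marginal probabilities must lie in (0, 1]")
    if p_rel_and_ret > min(p_rel, p_ret):
        raise ValueError("joint probability cannot exceed either marginal")
    recall = p_rel_and_ret / p_rel      # P(retrieved | relevant)
    precision = p_rel_and_ret / p_ret   # P(relevant | retrieved)
    return recall, precision


# Hypothetical collection: 10% of documents relevant, 20% retrieved.
# Only the joint probability differs between the two systems.
baseline = recall_precision(0.05, 0.10, 0.20)
improved = recall_precision(0.08, 0.10, 0.20)

# With the marginals held fixed, a larger joint probability raises both measures.
both_increase = improved[0] > baseline[0] and improved[1] > baseline[1]
```

Holding the marginals fixed corresponds to comparing systems at the same output size for the same user. Changing the retrieval cutoff instead alters p_ret, which is where the inverse relation between the two measures appears.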
The Use of Optimal Queries in Test Design
In Chapter [OCRerr] the notion of an optimal search request was
introduced and developed from the point of view of query modification
in a system environment allowing iterative searches and real-time
system-user interaction. It was noted there that the concept of an
optimal query offered the potential of allowing an explicit evaluation
of the power of the index language, independent of the performance
variations which can be expected from the query formulation process.
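
The definition of the optimal query is given in the earlier chapter and is not reproduced in this section. As a rough sketch only, the code below assumes the commonly cited form: the centroid of the length-normalized relevant document vectors minus the centroid of the length-normalized non-relevant ones, with cosine correlation as the matching function. All function names and the toy term vectors are hypothetical.

```python
import math


def normalize(vec):
    """Scale a term vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def optimal_query(relevant, nonrelevant):
    """Assumed form: centroid of relevant documents minus centroid of the rest."""
    c_rel = centroid([normalize(d) for d in relevant])
    c_non = centroid([normalize(d) for d in nonrelevant])
    return [r - n for r, n in zip(c_rel, c_non)]


def cosine(a, b):
    """Cosine correlation between two vectors (0.0 if either has zero length)."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


# Toy collection over three index terms (illustrative values only).
relevant_docs = [[2, 1, 0], [1, 1, 0]]
nonrelevant_docs = [[0, 1, 2], [0, 0, 1]]

query = optimal_query(relevant_docs, nonrelevant_docs)
rel_scores = [cosine(query, d) for d in relevant_docs]
non_scores = [cosine(query, d) for d in nonrelevant_docs]

# In this toy case the query separates the two sets completely.
separated = min(rel_scores) > max(non_scores)
```

Because the query is built from the user's relevance judgments rather than from a user-written request, how well it separates the two sets depends only on the document representations, which is what makes it a candidate for evaluating the index language.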
In essence, any evaluation measure based on a retrieval
operation with an optimal query is a measure of the relative
association between the members of a subset of relevant documents
(specified by the user) compared to the association of these documents
with the entire collection. Viewed in this manner, the definition of
an optimal query offers a positive alternative to the design of