IRS13
Scientific Report No. IRS-13 Information Storage and Retrieval
Evaluation Parameters
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
by the search results (or in cases b) and c)) is extrapolated linearly
to the 1.0 recall points, using the precision gained in the full search.
Since the full search curve is drawn by the Quasi-Cranfield cut-off method,
this means that cluster results are extrapolated to the precision achieved
by the last relevant document in the full search. Figure 25 a) shows what
happens to a cluster result in which no relevant documents at all are found:
using the left-end extrapolation method recommended in part 14C, the whole cluster
curve is an extrapolation from the chosen point at 1.0 recall in the full
search curve.
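The linear extrapolation described above can be illustrated with a short sketch. It assumes that the cluster search found at least one relevant document, so that the cluster curve has a last point, and that the precision of the full search at 1.0 recall is already known. The function name and its parameters are hypothetical and are introduced here only for illustration; the case of a cluster result with no relevant documents is not covered.

```python
def extrapolate_linearly(last_recall, last_precision, full_precision_at_1, recall):
    """Precision at `recall` on the straight line joining the last cluster
    point (last_recall, last_precision) to the full-search point
    (1.0, full_precision_at_1).

    All names are illustrative assumptions, not taken from the report.
    """
    if not (0.0 <= last_recall <= recall <= 1.0):
        raise ValueError("recall must lie between last_recall and 1.0")
    if last_recall == 1.0:
        # The cluster search already reached full recall; nothing to extend.
        return last_precision
    # Fraction of the way from the last cluster point to the 1.0 recall point.
    t = (recall - last_recall) / (1.0 - last_recall)
    return last_precision + t * (full_precision_at_1 - last_precision)


if __name__ == "__main__":
    # Cluster curve ends at recall 0.5, precision 0.4; the full search
    # reaches recall 1.0 at precision 0.2 (invented figures).
    for r in (0.5, 0.75, 1.0):
        print(r, round(extrapolate_linearly(0.5, 0.4, 0.2, r), 4))
```

With these invented figures the extrapolated precision falls steadily from the last cluster point to the full-search value at 1.0 recall.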
Extrapolation could also be done by assigning to those relevant
documents not found in the cluster search a random rank position, bounded by
the rank of the last document recovered by the cluster search and the total
collection size. It would be feasible also to extrapolate by use of the
precision achieved if the relevant documents not found were ranked in the
worst possible positions, that is, assuming that recall 1.0 is obtained only
as the last document in the collection is examined. A further suggestion is
to make use of the full search curve before it reaches 1.0 recall, and use
some method of joining the end of the cluster curve to some point along the
full search curve.
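Two of the alternatives just mentioned, random rank assignment and worst-case rank assignment, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the parameter names, and the sample figures are hypothetical, and the recall and precision values are computed in the usual way after each relevant document.

```python
import random


def fill_missing_ranks(found_ranks, n_missing, last_rank, collection_size,
                       method="worst", rng=None):
    """Assign rank positions to relevant documents the cluster search missed.

    found_ranks: ranks of the relevant documents the cluster search found.
    last_rank: rank of the last document recovered by the cluster search.
    method "random": distinct random ranks in (last_rank, collection_size].
    method "worst": the final n_missing positions of the collection.
    """
    if n_missing > collection_size - last_rank:
        raise ValueError("not enough unexamined positions for missing documents")
    if method == "worst":
        extra = range(collection_size - n_missing + 1, collection_size + 1)
    elif method == "random":
        rng = rng or random.Random()
        extra = rng.sample(range(last_rank + 1, collection_size + 1), n_missing)
    else:
        raise ValueError("method must be 'worst' or 'random'")
    return sorted(list(found_ranks) + list(extra))


def recall_precision(ranks):
    """(recall, precision) after each relevant document, given sorted ranks."""
    total = len(ranks)
    return [(i / total, i / rank) for i, rank in enumerate(ranks, start=1)]


if __name__ == "__main__":
    # Invented example: 200 documents, cluster search stopped at rank 10,
    # found relevant documents at ranks 1 and 4, and missed two others.
    worst = fill_missing_ranks([1, 4], 2, 10, 200, method="worst")
    print(worst, recall_precision(worst))
    rand = fill_missing_ranks([1, 4], 2, 10, 200, method="random",
                              rng=random.Random(0))
    print(rand, recall_precision(rand))
```

In the worst-case variant, recall 1.0 is reached only at the last document of the collection, so the final precision equals the number of relevant documents divided by the collection size.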
No comparison of these methods has yet been made, since the technique
in use is conceptually as satisfactory as any of the other suggestions.
5. Measures for Varying Relevance Evaluation
Although the rendering of relevance decisions is a task quite separate
from the considerations which go into the construction of performance measures
reflecting system effectiveness, it may be desirable to use performance measures
based on grades of relevance rather than on a binary decision of "relevant"
or "non-relevant" alone. The performance characteristic curve suggested by
Giuliano and Jones [8] is designed to use spectra of relevance, since in