Scientific Report No. IRS-13 Information Storage and Retrieval
Evaluation Parameters
chapter
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
request set into equal or nearly equal groups according to the numbers of
documents in the collection that are relevant. The comparison of the specific
and general request sets then involves a very large change in average gener-
ality, although the collection size is unaltered. To illustrate further the
problems caused by this type of comparison the set of 21 specific requests
will be compared with the 21 general requests in the Cran-1 aerodynamics
collection, using the stem dictionary results.
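
The division into specific and general request sets can be sketched as
follows. This is only an illustration under stated assumptions: the function
names, the tie-breaking by request identifier, the sample counts, and the
collection size of 200 are hypothetical and are not taken from the report.
Requests are ordered by their number of relevant documents and cut at the
midpoint, and the average generality (relevant documents divided by
collection size) is then computed for each half.

```python
def split_by_relevant_count(relevant_counts):
    """Split request ids into (specific, general) halves.

    relevant_counts maps a request id to its number of relevant documents.
    Ties are broken by request id, which is an assumption of this sketch.
    With an odd number of requests the general group gets the extra one.
    """
    ordered = sorted(relevant_counts, key=lambda q: (relevant_counts[q], q))
    half = len(ordered) // 2
    return ordered[:half], ordered[half:]


def mean_generality(group, relevant_counts, collection_size):
    """Average generality of a group: mean relevant count / collection size."""
    total_relevant = sum(relevant_counts[q] for q in group)
    return total_relevant / (len(group) * collection_size)


# Hypothetical data, for illustration only.
counts = {"q1": 3, "q2": 9, "q3": 5, "q4": 12}
specific, general = split_by_relevant_count(counts)
g_specific = mean_generality(specific, counts, collection_size=200)
g_general = mean_generality(general, counts, collection_size=200)
```

Even with the collection size held fixed, the two halves differ markedly in
average generality, which is the situation the comparison above has to
contend with.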
Since the generality change suggests that fallout should be used in
place of precision, a fallout versus recall plot is given in Fig. 29(a).
Apart from a slight crossing of the curves between .8 and .9 recall, the
specific requests are seen to have a superior performance, from the point of
view of system efficiency. The precision versus recall plot, however, will
reflect a direct performance comparison ignoring the generality change, so
a plot of this type is given in Fig. 29(b) where it is now seen that except
between .25 and .4 recall, the general requests have a superior performance.
It should be noted that a Pseudo-Cranfield type of cut-off is used here for
comparison of specific and general requests, since a plot of the Quasi-
Cranfield type as used in [A] gives a large bias in favor of the specific
requests. This occurs because the specific requests all require greater
lengths of left-end extrapolation and the technique used for extrapolating
to 1.0 precision at 0.0 recall (method 3, part 4c, Fig. 20) gives the
specific requests falsely high precision values at low recall.
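
The reason the fallout and precision plots can rank the two request sets
differently may be seen from the standard definitions of the measures. The
sketch below is illustrative only: the function names and all numbers are
hypothetical, not data from Fig. 29. It computes recall, precision, fallout
and generality from retrieval counts, and then uses the identity
P = RG / (RG + F(1 - G)) to show that two requests with identical recall and
fallout obtain different precision when their generality differs.

```python
def measures(relevant_retrieved, retrieved, relevant, collection_size):
    """Return recall, precision, fallout and generality at one cutoff."""
    nonrelevant = collection_size - relevant
    nonrelevant_retrieved = retrieved - relevant_retrieved
    return {
        "recall": relevant_retrieved / relevant,
        "precision": relevant_retrieved / retrieved,
        "fallout": nonrelevant_retrieved / nonrelevant,
        "generality": relevant / collection_size,
    }


def precision_from(recall, fallout, generality):
    """Precision implied by recall R, fallout F and generality G."""
    hits = recall * generality                # relevant retrieved / collection
    false_drops = fallout * (1 - generality)  # nonrelevant retrieved / collection
    return hits / (hits + false_drops)


# Hypothetical requests in a 200-document collection: same recall, same fallout.
p_specific = precision_from(recall=0.5, fallout=0.05, generality=4 / 200)
p_general = precision_from(recall=0.5, fallout=0.05, generality=12 / 200)

# Consistency check of the identity against direct counts (also hypothetical).
m = measures(relevant_retrieved=2, retrieved=12, relevant=4, collection_size=200)
p_check = precision_from(m["recall"], m["fallout"], m["generality"])
```

Here the general request attains the higher precision although the system
rejects nonrelevant documents equally well in both cases, so a precision
versus recall plot favors high-generality requests in a way that a fallout
versus recall plot does not.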
A partial explanation for the facts reflected in Fig. 29 is shown
by the data in Fig. 30. At each of the cutoff points shown, the general
requests produce a greater number of relevant and a smaller number of