IRS13
Scientific Report No. IRS-13 Information Storage and Retrieval
Evaluation Parameters
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
This comparison of specific and general requests on behalf of
both high precision and high recall users requires more experimental work,
since it is expected that there may also be some correlation between user
needs and request generality, with high precision users tending to pose
general requests, and high recall users tending to pose specific requests.
Further work is needed to develop methods of constructing plots of the precision
versus recall type to represent high precision and high recall runs, both
separately and in one combined plot. A suggestion for two individual plots is
made in Fig. 34, where the Cran-1 results are again used. Evaluation for the
high precision user is made in Fig. 34(a) by use of standard precision versus
"relative recall" (defined in part 7), with relative recall here based on
retrieval of just two relevant documents for each request. This plot assumes
that the searcher will make a final cutoff after the second relevant document
is reached; for several reasons the totalling procedure is basically the
Pseudo-Cranfield one. Fig. 34(b) reflects the interests of the high recall user,
and standard recall is measured on the plot only from a recall of 0.7 upward. This
plot is less satisfactory than the other, because the search strategy and cutoff
that would be adopted by a user wanting high recall are not easy to simulate.
Thus a cutoff is established at the last relevant document, using the
Pseudo-Cranfield totalling method, in order to show the maximum difference in
precision that could occur at 1.0 recall between the specific and general
requests, assuming that an optimum cutoff were chosen.
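
The two cutoffs described above can be illustrated with a minimal sketch. It
assumes each request is represented only by the ranks at which its relevant
documents were retrieved, and it assumes that the totalling step sums the
retrieved and relevant-retrieved counts over all requests before the ratio is
taken; the exact Pseudo-Cranfield procedure and the definition of relative
recall are given elsewhere in the report and are not reproduced here. All
function names and the sample ranks are hypothetical.

```python
from typing import List, Sequence, Tuple

# (relevant documents retrieved, total documents retrieved) at a cutoff
Counts = Tuple[int, int]


def cutoff_after_kth_relevant(relevant_ranks: Sequence[int], k: int) -> Counts:
    """Counts at a cutoff placed immediately after the k-th relevant document."""
    ranks = sorted(relevant_ranks)
    if not 1 <= k <= len(ranks):
        raise ValueError("k must lie between 1 and the number of relevant documents")
    # Everything down to the rank of the k-th relevant document is retrieved.
    return k, ranks[k - 1]


def cutoff_at_last_relevant(relevant_ranks: Sequence[int]) -> Counts:
    """Counts at the last relevant document, i.e. at a recall of 1.0."""
    return cutoff_after_kth_relevant(relevant_ranks, len(relevant_ranks))


def pooled_precision(counts: Sequence[Counts]) -> float:
    """Assumed totalling: sum the counts over requests, then take one ratio."""
    relevant_retrieved = sum(rel for rel, _ in counts)
    total_retrieved = sum(ret for _, ret in counts)
    return relevant_retrieved / total_retrieved


# Hypothetical ranks of the relevant documents for two requests.
requests: List[List[int]] = [[1, 4, 9, 20], [2, 3, 15]]

# High precision user: cutoff after the second relevant document.
high_precision = pooled_precision(
    [cutoff_after_kth_relevant(ranks, 2) for ranks in requests]
)

# High recall user: cutoff at the last relevant document.
high_recall = pooled_precision(
    [cutoff_at_last_relevant(ranks) for ranks in requests]
)
```

With these sample ranks the high precision cutoff retrieves 4 relevant
documents among 7, and the high recall cutoff retrieves 7 relevant documents
among 35. Running the same calculation separately for the specific and the
general request groups would give the two precision values whose difference
is plotted at 1.0 recall.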
The suitability of the normalized measures for comparison of specific
and general requests needs to be investigated further. Since the equations
used both contain "N", the collection size, some allowance is made for gener-
ality, and in seven out of eight cases observed so far, both the normalized