Scientific Report No. IRS-13 Information Storage and Retrieval

Evaluation Parameters (chapter), E. M. Keen. Harvard University, Gerard Salton. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

This comparison of specific and general requests on behalf of both high precision and high recall users requires more experimental work, since some correlation is also expected between user needs and request generality, with high precision users tending to pose general requests and high recall users tending to pose specific requests.

Further work is needed to develop methods of constructing plots of the precision versus recall type to represent high precision and high recall runs, both separately and in one combined plot. A suggestion for two individual plots is made in Fig. 34, where the Cran-1 results are again used. Evaluation for the high precision user is made in Fig. 34(a) by use of standard precision versus "relative recall" (defined in part 7), with relative recall here based on retrieval of just two relevant documents for each request. This plot assumes that the searcher will make a final cutoff after the second relevant document is reached; for several reasons the totalling procedure is basically the pseudo-Cranfield one.

Fig. 34(b) reflects the interests of the high recall user, and the standard recall axis of the plot begins at a recall of 0.7. This plot is less satisfactory than the other, because a search strategy and cutoff that would be adopted by a user wanting high recall is not easy to simulate. Thus a cutoff is established at the last relevant document, using the pseudo-Cranfield totalling method, in order to show the maximum difference in precision that could occur at 1.0 recall between the specific and general requests, assuming that an optimum cutoff is chosen.
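The two cutoffs described for Fig. 34 can be illustrated for a single request with a minimal sketch. It rests on assumptions that are not taken from the report: relative recall is computed here as the number of relevant documents retrieved divided by the target of two (the report's own definition is in part 7, which is not reproduced here), the function names and document identifiers are hypothetical, and the pseudo-Cranfield totalling over many requests is not attempted.

```python
from typing import List, Optional, Set, Tuple


def high_precision_points(ranking: List[str], relevant: Set[str],
                          target: int = 2) -> List[Tuple[float, float]]:
    """(relative recall, precision) at each relevant document found,
    stopping once `target` relevant documents have been retrieved.

    Assumption: relative recall = relevant retrieved / target.
    """
    points = []
    found = 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            found += 1
            points.append((found / target, found / rank))
            if found == target:
                break  # final cutoff after the target-th relevant document
    return points


def precision_at_full_recall(ranking: List[str],
                             relevant: Set[str]) -> Optional[float]:
    """Precision when the cutoff falls at the last relevant document,
    i.e. at a recall of 1.0. Returns None if the ranking never
    reaches every relevant document."""
    found = 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            found += 1
            if found == len(relevant):
                return found / rank
    return None


# Hypothetical ranked output for one request with three relevant documents.
ranking = ["d7", "d3", "d9", "d1", "d4", "d2"]
relevant = {"d7", "d9", "d2"}

print(high_precision_points(ranking, relevant))
print(precision_at_full_recall(ranking, relevant))
```

In this example the high precision view stops at rank 3, where the second relevant document is found, while the high recall view must scan to rank 6 to reach the last relevant document, so precision at 1.0 recall is lower.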
The suitability of the normalized measures for comparison of specific and general requests needs to be investigated further. Since the equations used both contain "N", the collection size, some allowance is made for generality, and in seven out of eight cases observed so far, both the normalized