Scientific Report No. IRS-13
Information Storage and Retrieval
Evaluation Parameters (chapter)
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

A final consideration for the evaluation of operational tests pertains to the appropriate measures to be used. Experimental tests of the SMART system have so far measured the recall ratio on the basis of the total relevant items in the collection. Although this accurately simulates users with a high recall requirement, users with a high precision requirement are probably not well served by the high precision end of the same curve. The reason is that at least some users wanting high precision are not at all concerned about getting high recall. Since they wish to see only, say, one, two, or three relevant items, they are clearly satisfied on the recall side long before 100% recall of the total relevant items in the collection is achieved. It is suggested that in future semi-operational tests of SMART, a "Relative Recall" be computed:

$$\text{Relative Recall} = \frac{\text{Total Relevant Examined}}{\text{Total Relevant User Would Like to Examine}}$$

This ratio is relative to user satisfaction rather than to total system resources. Several adjustments might be made for actual tests, since some users would perhaps examine more relevant items than they intended (a relative recall of 1.5 would not be very useful for evaluation purposes), and other users might wish to see more relevant items than were available in the system at all (an acquisitions failure rather than a retrieval failure).

8. The Comparison of Specific and General Requests and the Viewpoints of the "High Precision" and "High Recall" User

The comparison of a set of `specific' requests with a set of `general' requests provides an environment of acute change in request generality. Isolation of specific from general requests is carried out by dividing a given