IRS13
Scientific Report No. IRS-13 Information Storage and Retrieval
Evaluation Parameters
chapter
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
A final consideration for evaluation of operational tests pertains
to the appropriate measures to be used. Experimental tests of the SMART
system have so far measured the recall ratio on the basis of the total rele-
vant items in the collection. Although this accurately simulates users with
a high recall requirement, those users with a high precision requirement are
probably not too well served by the high precision end of the same curve.
The reason is that at least some users wanting high precision are not at
all concerned about getting high recall, and since they wish only to see,
say one, two, or three relevant items, they are clearly satisfied on the
recall side long before 100% recall of the total relevant items in the
collection is achieved. It is suggested that in the semi-operational tests
to be made with SMART in the future, a "Relative Recall" be computed:
    Relative Recall = (Total Relevant Examined) / (Total Relevant User Would Like to Examine)
This ratio is relative to user satisfaction rather than to total system
resources. Several adjustments might be made for actual tests, since some
users would perhaps examine more relevant items than they intended (a recall
of 1.5 would not be very useful for evaluation purposes), and other users
might wish to see more relevant items than were available in the system at
all (an acquisitions failure rather than a retrieval failure).
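
The difference between the two recall measures can be illustrated with a small Python sketch. It is not part of the SMART system; the function names, the optional capping of the ratio at 1.0, and the acquisitions check are assumptions made here to illustrate one possible form of the adjustments mentioned above.

```python
def recall(relevant_retrieved: int, total_relevant: int) -> float:
    """Conventional recall: relevant items retrieved over all relevant
    items in the collection."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    return relevant_retrieved / total_relevant


def relative_recall(relevant_examined: int, relevant_wanted: int,
                    cap: bool = True) -> float:
    """Relative recall: relevant items examined over the number of
    relevant items the user would like to examine.

    Assumption: when cap is True the ratio is limited to 1.0, so that a
    user who examines more than intended does not score above 100%.
    """
    if relevant_wanted <= 0:
        raise ValueError("relevant_wanted must be positive")
    ratio = relevant_examined / relevant_wanted
    return min(ratio, 1.0) if cap else ratio


def is_acquisitions_shortfall(relevant_wanted: int, total_relevant: int) -> bool:
    """Hypothetical check: the user wants more relevant items than the
    collection holds, which is not a fault of retrieval."""
    return relevant_wanted > total_relevant


# A high precision user wants 2 relevant items; the collection holds 20.
print(recall(2, 20))                       # conventional recall is low
print(relative_recall(2, 2))               # the user is fully satisfied
print(relative_recall(3, 2, cap=False))    # uncapped ratio exceeds 1
print(is_acquisitions_shortfall(25, 20))   # collection cannot satisfy this user
```

With these figures the conventional measure reports that only a small fraction of the relevant items was found, while the relative measure reports that the user's stated need was met in full.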
8. The Comparison of Specific and General Requests and the Viewpoints
of the "high precision" and "high recall" user.
The comparison of a set of `specific' requests with a set of `general'
requests provides an environment of acute change in request generality. Iso-
lation of specific from general requests is carried out by dividing a given