IRS13 Scientific Report No. IRS-13 Information Storage and Retrieval Evaluation Parameters chapter E. M. Keen Harvard University Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

and no account is taken for cut-off purposes of the correlation, although one study using correlation magnitudes has been made [10].

Using the precision-recall pairs that can be computed as each document in the output list is examined (Figure [illegible]), three cut-off methods seem feasible. The first method is to obtain average curves from all requests just as drawn in Figure 14, by computing mean precision-recall pairs for each document cut-off level. If done by hand, the cut-off points may be recorded on the curve as in Figure 14 a), or a computer-produced average may be used which gives precision at ten recall levels for plotting convenience, as in Figure 14 b). This technique is referred to as the "pseudo-Cranfield" method, and although it is available for many runs it is not generally used for SMART evaluations.

One advantage of this method is that it seems to be fully user-oriented, since the plot of Figure 14 a) shows how many documents a typical user must examine to get X% recall. Another advantage is that computation does not depend on the interpolation and extrapolation techniques that are required for the other methods to be described. A disadvantage stems from the fact that the requests vary in their number of relevant items, so that if one of the requests has only a single relevant document, any cut-off made at 2 or more documents will not give an average precision of 1.0 even if every request performs perfectly. One simple solution is to give the theoretical best possible curve for a given set of requests, as is done in Figure 14 a). It is a simple matter to use this cut-off method with macro evaluation, as the macro curve in Figure U was obtained this way.
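The first cut-off method can be illustrated with a minimal sketch. It assumes each request is given as a ranked list of relevance judgments plus its total number of relevant documents, and that the average is the plain mean of the per-request ratios at each cut-off. The function names and the data layout are hypothetical, chosen for illustration only, and are not taken from the SMART system.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[float, float]  # (precision, recall)


def precision_recall_at_cutoffs(
    ranked_relevance: Sequence[bool], n_relevant: int, max_cutoff: int
) -> List[Pair]:
    """Precision and recall after examining 1..max_cutoff ranked documents.

    Ranks beyond the end of the list are treated as non-relevant.
    """
    pairs: List[Pair] = []
    hits = 0
    for k in range(1, max_cutoff + 1):
        if k <= len(ranked_relevance) and ranked_relevance[k - 1]:
            hits += 1
        pairs.append((hits / k, hits / n_relevant))
    return pairs


def average_by_document_cutoff(
    requests: Sequence[Tuple[Sequence[bool], int]], max_cutoff: int
) -> List[Pair]:
    """Mean (precision, recall) over all requests at each document cut-off."""
    per_request = [
        precision_recall_at_cutoffs(ranking, n_rel, max_cutoff)
        for ranking, n_rel in requests
    ]
    n = len(per_request)
    return [
        (
            sum(pairs[k][0] for pairs in per_request) / n,
            sum(pairs[k][1] for pairs in per_request) / n,
        )
        for k in range(max_cutoff)
    ]


def best_possible_curve(
    relevant_counts: Sequence[int], max_cutoff: int
) -> List[Pair]:
    """Theoretical best curve: every request ranks all its relevant items first."""
    ideal = [([True] * n_rel, n_rel) for n_rel in relevant_counts]
    return average_by_document_cutoff(ideal, max_cutoff)
```

With two perfectly ranked requests, one having a single relevant document and the other two, the averaged precision at a cut-off of 2 documents is 0.75 rather than 1.0, which is the disadvantage described above. `best_possible_curve([1, 2], 3)` yields the same values, so plotting it beside the observed curve shows how much of the shortfall is unavoidable.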
The second and third cut-off techniques use, respectively, precision and recall ratios to determine the cut-off points at which averages will be computed. A set of precision or recall values is picked in advance, and requests are averaged essentially at the cut-off points at which the required