IRS13 Scientific Report No. IRS-13 Information Storage and Retrieval Evaluation Parameters chapter E. M. Keen Harvard University Gerard Salton Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

their view the usual precision and recall can only be used in situations where relevance decisions are black or white. An example of a performance characteristic curve using relevance grades is given in Figure 26(a). The Cran-1 collection is used because grades of relevance on a scale of four are available for these relevance decisions; thus a "point score" is assigned to those requests, giving a score of four to the most relevant documents, three to the next, and two and one to the final two grades. Figure 26(a) then uses these cumulated relevance points on the y axis as indicating a type of recall, and uses rank positions (cut-off ratio) on the x axis. Two dictionaries are compared, and the best possible performance curve is displayed. However, as has been demonstrated in [2], it is not correct to assume that precision and recall are incapable of handling relevance grades. Figure 26(b) uses the same data and displays two precision recall graphs, where recall is based on the relevance points score rather than on the more usual document score. In fact, the relative merit of the two options compared is identical in both figures, as it must be mathematically, so the curves cross at the same point; furthermore, the rank position value can be indicated on the precision recall graph as shown. The performance characteristic curve does not give any directly visible information about the amount of non-relevant material being retrieved; the conclusion is then that precision is of value here. Additional precision recall graphs based on relevance grades are given in Section I of this report.

It is also quite a simple matter to modify the single number measures to incorporate grades of relevance.
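The recall based on relevance points that is used for Figure 26(b) can be illustrated with a minimal sketch. It is not the report's own procedure: the function name, the input layout (one grade per rank position, with 0 standing for a non-relevant document), and the choice to leave precision document-based are all assumptions made for illustration, and the grades in the example are invented.

```python
from typing import List, Tuple


def graded_precision_recall(
    grades: List[int],
) -> List[Tuple[int, float, float, float]]:
    """Per-rank recall and precision for one ranked output list.

    grades[i] is the relevance grade of the document at rank i + 1:
    0 for non-relevant, 1 to 4 for the four relevance grades
    (hypothetical encoding, assumed for this sketch).

    Returns one tuple per rank:
    (rank, points_recall, document_recall, precision).
    """
    total_points = sum(grades)
    total_relevant = sum(1 for g in grades if g > 0)
    if total_relevant == 0:
        raise ValueError("the ranking contains no relevant documents")

    rows = []
    points_so_far = 0      # cumulated relevance points (y axis of Fig. 26(a))
    relevant_so_far = 0    # relevant documents retrieved so far
    for rank, grade in enumerate(grades, start=1):
        points_so_far += grade
        if grade > 0:
            relevant_so_far += 1
        rows.append(
            (
                rank,
                points_so_far / total_points,      # recall from point scores
                relevant_so_far / total_relevant,  # usual document recall
                relevant_so_far / rank,            # usual precision (assumed)
            )
        )
    return rows


if __name__ == "__main__":
    # Invented ranking: grade 4 at rank 1, non-relevant at rank 2, and so on.
    for row in graded_precision_recall([4, 0, 2, 1, 0, 3]):
        print("rank %d  points recall %.2f  document recall %.2f  precision %.2f" % row)
```

Plotting precision against the points-based recall column gives a graph of the kind described for Figure 26(b), while plotting the cumulated points against rank gives the performance characteristic form of Figure 26(a).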
For example, using the normalized recall measure, a "Weighted Normalized Recall" may be defined:

$$\text{Weighted Normalized Recall} = 1 - \frac{\sum_{i=1}^{n} \cdots}{n(N-n)}$$