IRS13 Scientific Report No. IRS-13, Information Storage and Retrieval
Suffix Dictionaries (chapter), E. M. Keen
Harvard University, Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

3. Retrieval Performance Results

Comparisons of the suffix 's' and stem dictionaries are presented for the three document collections, using the normalized measures, precision versus recall graphs, and data from individual requests.

Figure 1 gives ten results using the normalized recall and precision measures. The ADI results include text, abstract and title results, and some results are displayed both for the ADI and IRE-3 collections with overlap correlation and logical vectors. All IRE-3 results and four of the six ADI results show the stem dictionary to have higher normalized values, although by quite small amounts. The single Cranfield result and the ADI text cosine and overlap logical runs show suffix 's' to be the superior dictionary.

Four results are given using precision versus recall graphs: IRE-3 Figure 2(a), Cran-1 Figure 2(b), ADI Abstracts Figure 3(a) and ADI Text Figure 3(b). These results confirm those in Figure 1, and the Cran-1 result is seen to favor suffix 's' over the whole range of the curve. To complete all the runs given in Figure 1 in terms of precision and recall, a table is given in Figure 4 that summarizes six more precision/recall plots not presented in detail, by recording the precision merit at three levels of recall. Some disagreement between these results and the normalized measures may be noted, and the reasons for this are discussed in section II. The cases of disagreement all consist of very small differences in merit between suffix 's' and stem, and all the more valuable comparisons which use the cosine correlation and numeric vectors display consistent results.
The average performance measures show, therefore, that stem is superior to suffix 's' on the IRE-3 and ADI collections, and that suffix 's' is the better dictionary on the Cran-1 collection.
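
The normalized measures used in these comparisons are not defined in this section. As a minimal sketch, assuming the usual rank-based definitions of normalized recall and normalized precision (n relevant documents in a collection of N, scored from the ranks at which the relevant documents are retrieved), the computation for a single request might look as follows. The function names and the ranks in the usage lines are hypothetical and are not taken from the report's data.

```python
import math


def _check(ranks, collection_size):
    """Validate the ranks of the relevant documents (1 = retrieved first)."""
    n = len(ranks)
    if not 0 < n < collection_size:
        raise ValueError("need at least one relevant and one non-relevant document")
    if len(set(ranks)) != n or min(ranks) < 1 or max(ranks) > collection_size:
        raise ValueError("ranks must be distinct integers in 1..collection_size")


def normalized_recall(relevant_ranks, collection_size):
    """Assumed definition: 1 - (sum r_i - sum i) / (n * (N - n))."""
    ranks = sorted(relevant_ranks)
    _check(ranks, collection_size)
    n = len(ranks)
    ideal = sum(range(1, n + 1))  # rank sum of a perfect ordering
    return 1.0 - (sum(ranks) - ideal) / (n * (collection_size - n))


def normalized_precision(relevant_ranks, collection_size):
    """Assumed definition: 1 - (sum ln r_i - sum ln i) / ln C(N, n)."""
    ranks = sorted(relevant_ranks)
    _check(ranks, collection_size)
    n = len(ranks)
    log_gap = sum(math.log(r) for r in ranks) - sum(
        math.log(i) for i in range(1, n + 1)
    )
    # ln of the binomial coefficient C(N, n), computed via log-gamma
    log_binom = (
        math.lgamma(collection_size + 1)
        - math.lgamma(n + 1)
        - math.lgamma(collection_size - n + 1)
    )
    return 1.0 - log_gap / log_binom


# Hypothetical request: 3 relevant documents in a 10-document collection,
# ranked differently by two dictionaries.
stem_ranks = [1, 3, 6]
suffix_s_ranks = [2, 3, 5]
print(normalized_recall(stem_ranks, 10), normalized_recall(suffix_s_ranks, 10))
print(normalized_precision(stem_ranks, 10), normalized_precision(suffix_s_ranks, 10))
```

Both measures equal 1 for a perfect ranking and 0 for the worst possible ranking. Averaging them over all requests of a collection gives one figure per dictionary, which makes the small differences reported above easy to compare.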