For example, the Cranfield Project results are based on the 1,400 collection; with the 200 collection in use, the optimum dictionary contains approximately 1,300 concepts.

5. Relevance Grade Test Results

It was noted in part 3 that in the case of the Cran-1 collection, relevance decisions are available that reflect degrees of relevance as judged by the persons supplying the requests. Since none of the SMART tests made so far have used these different relevance grades, a brief examination of the relevance grades is made here.

It seems reasonable to postulate that the four grades of relevance produce different types of difficulties in achieving a good retrieval performance. Specifically, the documents graded most highly relevant probably achieve high rank positions on the output list, and those documents graded as of very minor relevance may have low rank positions in the search output.

One method of analysis that may show whether this does occur is illustrated for a single request in Fig. 10. The ranks of the seven relevant documents are given for the actual search result, using the Cran-1 collection and the suffix 's' dictionary. For each relevant document, a relevance grade score is given, with the most highly relevant documents scoring 4, the next most relevant 3, then 2, and finally 1. If the expected result is achieved, the relevant documents with a grade score of 4 will be ranked higher than those with a score of 3, and so on. To test this, two other theoretical results are recorded in Fig. 10, including one for which the relevance grade scores follow the postulated pattern (described as
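The grade-versus-rank check described above can be made concrete with a short sketch (in Python, not part of the original report): it counts, over all pairs of relevant documents for one request, how often the document with the higher relevance grade also holds the better rank in the search output. The seven (rank, grade) pairs used here are hypothetical stand-ins, since the actual values tabulated in Fig. 10 are not reproduced in this text.

```python
def grade_rank_agreement(docs):
    """Count concordant and discordant pairs among relevant documents.

    A pair is concordant when the document with the higher relevance
    grade also holds the better (numerically smaller) rank in the
    ranked search output; discordant when the ordering is reversed.
    """
    concordant = discordant = 0
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            (rank_a, grade_a), (rank_b, grade_b) = docs[i], docs[j]
            if grade_a == grade_b:
                continue  # equal grades carry no ordering information
            if (grade_a > grade_b) == (rank_a < rank_b):
                concordant += 1
            else:
                discordant += 1
    return concordant, discordant

# Hypothetical data: seven relevant documents for one request, given as
# (rank in the search output, relevance grade 1-4, with 4 most relevant).
retrieved = [(1, 4), (3, 4), (7, 3), (12, 3), (20, 2), (34, 2), (51, 1)]

c, d = grade_rank_agreement(retrieved)
print(f"concordant pairs: {c}, discordant pairs: {d}")
```

If the postulated pattern holds, concordant pairs dominate: the most highly graded documents sit near the top of the output list, and the grade-1 documents near the bottom. This pairwise count is essentially the ingredient of a Kendall-style rank correlation between grade and output position.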