For example, the Cranfield Project results are based on the 1,400 collection; with the 200 collection in use, the optimum dictionary contains approximately 1,300 concepts.
5. Relevance Grade Test Results
It was noted in part 3 that, in the case of the Cran-1 collection, relevance decisions are available that reflect degrees of relevance as judged by the persons supplying the requests. Since none of the SMART tests made so far has used these different relevance grades, a brief examination of the relevance grades is made here.
It seems reasonable to postulate that the four grades of relevance produce different types of difficulty in achieving good retrieval performance. Specifically, the documents graded most highly relevant probably achieve high rank positions on the output list, and those documents graded as of very minor relevance may have low rank positions in the search output. One method of analysis that may show whether this does occur is illustrated for a single request in Fig. 10. The ranks of the seven relevant documents are given for the actual search result, using the Cran-1 collection and the suffix 's' dictionary. For each relevant document, a relevance grade score is given, with the most highly relevant documents scoring 4, the next most relevant 3, then 2, and finally 1. If the expected result is achieved, the relevant documents with a grade score of 4 will be ranked higher than those of 3, and so on. To test this, two other theoretical results are recorded in Fig. 10, including one for which the relevance grade scores follow the postulated pattern (described as