IRS13
Scientific Report No. IRS-13 Information Storage and Retrieval
Test Environment
chapter
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
1-9
in Fig. 5. In virtually every case, the entire collections have been
examined for relevance in relation to every search request. The only
exceptions to this are four requests used in IRE-3 that were based on the
classification headings. For these requests those documents in the IRE-1
part of the collection originally classified under the given headings were
taken to be relevant, and no other documents in the collection were examined.
In every case the request preparer made the relevance decision; for a
few of the seventeen staff-prepared IRE-3 requests, a consensus of opinion
was used to resolve cases of doubt. Doubt in relevance decisions was usually
settled by accepting the document as relevant. Only dichotomous decisions
were made for the IRE-3 and ADI requests: a document was regarded as either
relevant or non-relevant, with no grades of relevance allowed.
In the Cran-1 case, a scale of four degrees of relevance was used for the
relevance judgments. In the experiments conducted so far with the SMART
system, all four degrees of relevance were regarded as equally relevant. A
small hand-calculated set of results taking into account these available
relevance grades is presented in part 5 of this section.
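
The treatment of graded judgments described above can be illustrated with a minimal sketch. It is not taken from the SMART system; the numeric coding of the grades (1 to 4 for the four degrees of relevance, 0 for a non-relevant document), the document identifiers, and the function name are all assumptions made for illustration. The sketch shows only the reduction of a four-degree scale to a dichotomous decision, in which every degree counts as equally relevant.

```python
from typing import Dict, Set

# Assumed coding (hypothetical, not from the report):
#   0      -> document judged non-relevant
#   1 to 4 -> one of the four degrees of relevance
def to_dichotomous(graded: Dict[str, int]) -> Dict[str, bool]:
    """Collapse graded judgments so that any degree of relevance counts as relevant."""
    return {doc_id: grade > 0 for doc_id, grade in graded.items()}


def relevant_documents(graded: Dict[str, int]) -> Set[str]:
    """Return the set of documents treated as relevant after collapsing the grades."""
    return {doc_id for doc_id, is_rel in to_dichotomous(graded).items() if is_rel}


# Made-up judgments for a single request, for illustration only.
graded_judgments = {"doc_a": 1, "doc_b": 4, "doc_c": 0, "doc_d": 2}
relevant_set = relevant_documents(graded_judgments)
```

Under this coding a document with the lowest degree of relevance and one with the highest both enter the relevant set on equal terms, which is the simplification stated for the experiments conducted so far.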
Relevance decisions in the IRE-3 collection were always made by
examining the document abstracts and never the full texts. This may be
regarded as a weakness of this environment. A detailed examination of the
relevance decisions for the ADI set is made in Section X, part 4. Whether the
prepared requests and relevance decisions of the IRE-3 and ADI collections and
even the author-supplied data in the Cran-1 collection are typical of real-life
situations is a disputed question. So far, no evidence has been produced to
invalidate the methods used. Examination of relevance decisions on the three
collections leaves the impression that the Cran-1 requests, which come closest