IRS13
Scientific Report No. IRS-13 Information Storage and Retrieval
Test Environment
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
of any major SMART experiment so far, but an attempt to make such a comparison
is contained in part 6 of this section. The degree to which the documents
and requests used in this laboratory environment may be regarded as typical
of larger real-life situations is not known. It is almost certain, however,
that all the document collections are contained, in part if not in whole,
within actual collections in use, and nothing in the stated requests
suggests that they could not be posed in real-life situations.
Further collections and requests have been obtained for the purpose
of making additional tests, as outlined in [2]. Fig. 2 supplies some tentative
data on four new test environments that are currently under investigation.
3. Relevance Decisions
Data on the number of documents assessed as relevant is given in Fig. 3.
The division into specific and general requests is made by dividing each request
set into two equal or nearly equal sets according to the number of documents
assessed as relevant. This method is therefore highly dependent on the
characteristics of the test environment, but it permits a comparison of requests
of differing generality (see part 6B).
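
The division described above can be illustrated with a minimal sketch. The request identifiers and counts are invented for illustration and are not taken from Fig. 3; the handling of ties and of an odd number of requests (the extra request is placed in the general set) is an assumption, since the text states only that the two sets are equal or nearly equal.

```python
def split_by_generality(relevant_counts):
    """Split requests into (specific, general) halves.

    relevant_counts maps a request identifier to the number of documents
    assessed as relevant to that request. Requests are ranked by that
    number; the lower half is called specific, the upper half general.
    Assumption: ties are broken by identifier, and with an odd number of
    requests the extra one falls in the general set.
    """
    ordered = sorted(relevant_counts, key=lambda r: (relevant_counts[r], r))
    half = len(ordered) // 2
    return ordered[:half], ordered[half:]


# Hypothetical request set with invented relevance counts.
counts = {"Q1": 3, "Q2": 12, "Q3": 5, "Q4": 8, "Q5": 2}
specific, general = split_by_generality(counts)
```

Because the cut point is the middle of each request set, a request counted as general in one collection could be counted as specific in another, which is the dependence on the test environment noted above.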
The data in Fig. 4 shows the extent to which the requests cover the
topic areas of the total collection. Between 55% and 88% of the documents in
the collections are relevant to one or more of the requests; it may thus be
assumed that most of the major collection topic areas are covered by one or
more requests.
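
As a rough illustration of the coverage figure, the sketch below computes the fraction of a collection that is relevant to one or more requests. The judgments and collection size are invented and the function name is hypothetical; this is only one plausible reading of how such a percentage could be obtained, not the procedure used for Fig. 4.

```python
def collection_coverage(relevance, collection_size):
    """Fraction of the collection relevant to at least one request.

    relevance maps a request identifier to the set of document numbers
    assessed as relevant to it. A document relevant to several requests
    is counted once.
    """
    covered = set()
    for docs in relevance.values():
        covered.update(docs)
    return len(covered) / collection_size


# Hypothetical judgments over a 10-document collection.
judgments = {"Q1": {1, 2, 3}, "Q2": {3, 4}, "Q3": {7}}
coverage = collection_coverage(judgments, 10)
```

Here five distinct documents (1, 2, 3, 4 and 7) are relevant to some request, so the coverage is 0.5, or 50%.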
The techniques used for obtaining the relevance decisions are given