Scientific Report No. ISR-13: Information Storage and Retrieval
Test Environment
E. M. Keen
Harvard University
Gerard Salton
experimental tests. A more detailed examination of the documentation
requests appears in Section X, parts 3 and 4.
One further performance comparison appears in Fig. 14. Here the
17 staff-prepared requests are searched on the two different collections,
the only variation being that the relevance decisions for the IRE-2
collection were made much later in time than those for the IRE-1 collection.
The mean number of relevant documents per request is 10.9 in the IRE-1
collection and 10.6 in the IRE-2 collection, implying only a small change
in generality (27.0 on IRE-1, and 28.3 on IRE-2). The small difference in
performance observed must be due in part to the fact that relevance
decisions by the same individual are not entirely consistent over periods
of time, and in part to the possibility that the IRE-2 collection is more
hostile to good retrieval (there may be more marginally relevant or
falsely matched documents).
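The quoted generality figures are consistent with generality being expressed as the number of relevant documents per 1,000 collection documents, a convention used elsewhere in this test series. On that assumption, the collection sizes can be back-calculated from the figures above; the sketch below is illustrative only, and the resulting sizes are inferred rather than stated in the text.

```python
# Hedged sketch: assuming generality = 1000 * mean_relevant_per_request
# / collection_size, recover the collection sizes implied by the quoted
# figures. The sizes printed are inferences, not values from the report.

def implied_collection_size(mean_relevant: float, generality: float) -> float:
    """Back-calculate collection size from mean relevant documents and generality."""
    return 1000.0 * mean_relevant / generality

print(round(implied_collection_size(10.9, 27.0)))  # IRE-1: ~404 documents (inferred)
print(round(implied_collection_size(10.6, 28.3)))  # IRE-2: ~375 documents (inferred)
```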
B) Specific and General Requests
The data given in Fig. 3 divide the request sets into specific and
general according to the number of relevant documents in the collection.
A performance comparison of specific with general requests raises some quite
complex evaluation problems, which are discussed in Section II. Because
no complete solution to these problems has yet been found, a reasonably simple
presentation is given here.
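To make the specific/general split concrete, the following sketch partitions a request set by the number of relevant documents each request has in the collection. The threshold and the request counts are illustrative assumptions; the actual cut-off is given by Fig. 3 rather than by this text.

```python
# Hedged sketch: split requests into "specific" (few relevant documents)
# and "general" (many relevant documents). The threshold of 6 is an
# illustrative assumption, not the cut-off used in Fig. 3.

def split_requests(relevant_counts, threshold=6):
    """Return (specific, general) request ids from a mapping of
    request id -> number of relevant documents in the collection."""
    specific = [q for q, n in relevant_counts.items() if n <= threshold]
    general = [q for q, n in relevant_counts.items() if n > threshold]
    return specific, general

# Example with hypothetical relevance counts:
counts = {"Q1": 3, "Q2": 12, "Q3": 5, "Q4": 20}
print(split_requests(counts))  # (['Q1', 'Q3'], ['Q2', 'Q4'])
```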
Fig. 15 is a simplified representation of nine comparisons made
between four sets of specific and general requests. The first request set is
from the IRE-1 collection using the 17 staff-prepared requests, since
this result has appeared previously [15]; the three other request sets are
the IRE-3, Cran-1, and ADI sets which are now used for test purposes.