… experimental tests. A more detailed examination of the documentation requests appears in Section X, parts 3 and 4.

One further performance comparison appears in Fig. 14. Here the 17 staff-prepared requests are searched on the two different collections, the only variation being that the relevance decisions for the IRE-2 collection were made much later in time than those for the IRE-1 collection. The mean number of relevant documents per request is 10.9 in the IRE-1 collection and 10.6 in the IRE-2 collection, implying only a small change in generality (27.0 on IRE-1, and 28.3 on IRE-2). The small difference in performance observed must be due in part to the fact that relevance decisions by the same individual are not entirely consistent over periods of time, and in part to the possibility that the IRE-2 collection is more hostile to good retrieval (there may be more marginally relevant or falsely matched documents).

B) Specific and General Requests

The data given in Fig. 3 divide the request sets into specific and general requests according to the number of relevant documents in the collection. A performance comparison of specific with general requests raises some quite complex evaluation problems, which are discussed in Section II. Because no complete solution to these problems has yet been found, a reasonably simple presentation will be given. Fig. 15 is a simplified representation of nine comparisons made between four sets of specific and general requests. The first request set is from the IRE-1 collection, using the 17 staff-prepared requests, since this result has appeared previously [15]; the three other request sets are the IRE-3, Cran-1, and ADI sets which are now used for test purposes.
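As a check on the generality figures quoted above: the generality number customarily used in these tests is the mean number of relevant documents per request divided by the collection size, scaled by 1000. The collection sizes used below (about 404 documents for IRE-1 and 375 for IRE-2) are inferred from the quoted figures rather than stated in this section:

\[
G = \frac{\bar{R}}{N}\times 1000, \qquad
\frac{10.9}{404}\times 1000 \approx 27.0 \ \text{(IRE-1)}, \qquad
\frac{10.6}{375}\times 1000 \approx 28.3 \ \text{(IRE-2)}.
\]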
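To make the specific/general division concrete, the following minimal Python sketch partitions a request set by relevant-document count and computes the generality number defined above. The threshold of 10 relevant documents, the record layout, and the sample request data are illustrative assumptions; the actual division used in the tests is the one given in Fig. 3.

# Sketch of the specific/general request split described above.
# The threshold and the request records are illustrative assumptions.

def split_requests(requests, threshold=10):
    # "Specific" requests have few relevant documents in the collection;
    # "general" requests have many.
    specific = [r for r in requests if r["num_relevant"] <= threshold]
    general = [r for r in requests if r["num_relevant"] > threshold]
    return specific, general

def generality(requests, collection_size):
    # Generality number: mean relevant documents per request,
    # divided by collection size, scaled by 1000.
    mean_relevant = sum(r["num_relevant"] for r in requests) / len(requests)
    return 1000.0 * mean_relevant / collection_size

# Hypothetical example: three requests on a 404-document collection.
reqs = [{"id": "Q1", "num_relevant": 4},
        {"id": "Q2", "num_relevant": 12},
        {"id": "Q3", "num_relevant": 17}]
spec, gen = split_requests(reqs)
print([r["id"] for r in spec], [r["id"] for r in gen])  # ['Q1'] ['Q2', 'Q3']
print(round(generality(reqs, 404), 1))                  # 27.2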