documentation collection does follow the expected pattern if precision in
subject terminology is of importance. Comparison with Fig. 12 reveals that
request preparation is a large variable, and use of the 17 non-staff-prepared
requests would be expected to result in a curve lower than that for the
Cran-1 aerodynamics results.
Another technique of collection comparison is the average rank measure
used in part GB to compare the specific and general requests.
Fig. 23 gives results based on the stem dictionaries, and Fig. 24 gives
results based on the thesaurus dictionaries. The average rank positions
of the first and second relevant documents reflect the viewpoint of a user
needing high precision. Ignoring differences in collection size, the Cran-1
aerodynamics collection gives a good result using the thesaurus, and on ADI
the first relevant document receives the best average rank. Use of the
percentage figure to take changes in collection size into account restores
the expected order of merit. Figs. 23 and 24 also record the average rank positions of
the last relevant document to reflect the viewpoint of the high recall user.
The average rank is directly affected and ordered by collection size, but
the percentage figure shows that IRE-3 and Cran-1 perform a little better
than ADI.
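
The average rank and percentage figure lend themselves to a simple
computation. The following Python sketch is not part of the original report;
the function names, request data, and collection sizes are illustrative
assumptions. It computes the average rank positions of the first and last
relevant documents over a set of requests, then expresses each as a
percentage of collection size so that collections of different sizes can be
compared directly:

    # Hypothetical ranked output: for each request, the ascending list of
    # rank positions at which its relevant documents were retrieved
    # (1 = top of the ranked output). Data here is invented for illustration.

    def average_rank_of_first(ranked_relevant):
        """Mean rank of the first relevant document (high-precision view)."""
        return sum(ranks[0] for ranks in ranked_relevant) / len(ranked_relevant)

    def average_rank_of_last(ranked_relevant):
        """Mean rank of the last relevant document (high-recall view)."""
        return sum(ranks[-1] for ranks in ranked_relevant) / len(ranked_relevant)

    def percentage_figure(average_rank, collection_size):
        """Average rank as a percentage of collection size, removing the
        direct effect of collection size on the raw rank values."""
        return 100.0 * average_rank / collection_size

    # Two invented collections of different sizes (sizes are assumptions,
    # not the actual ADI or Cran-1 collection sizes).
    small_runs = [[1, 4, 9], [2, 5], [3, 7, 20]]
    large_runs = [[2, 8, 50], [1, 30], [5, 60, 150]]

    for name, runs, size in [("small", small_runs, 80),
                             ("large", large_runs, 200)]:
        first = average_rank_of_first(runs)
        last = average_rank_of_last(runs)
        print(name,
              first, percentage_figure(first, size),
              last, percentage_figure(last, size))

Dividing by collection size is what the percentage figure contributes: a raw
average rank in a large collection is necessarily pushed downward by the
sheer number of documents, whereas the percentage form allows, for example,
a rank of 10 in a 200-document collection to be judged better than a rank of
8 in an 80-document one.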