Scientific Report No. ISR-10
Information Storage and Retrieval

Synopsis

Joseph John Rocchio
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

the IBM 7094 producing several classifications of a collection of 405 document index images (incorporated into the SMART system). Some experimental results illustrating the use of the classifications for improving search efficiency are presented. Document classification of the type described is novel, in that it is proposed as a direct adjunct to the query-document matching operation. Normally, automatic classification algorithms have been considered as replacements for manual classification or indexing.

The statistical basis for the evaluation of document retrieval systems is discussed in Chapter 5. Several of the topics considered are based on previous work, which is cited by bibliographic reference. The organization and presentation, as well as some of the conclusions drawn, are original. In addition, some novel performance statistics are derived which are particularly applicable to query-document matching operations possessing a high degree of discrimination, such as the correlation measure of the assumed model. Each of the statistics derived is capable of describing overall system performance with a single parameter, in contrast with several of the evaluation measures in current use.