Scientific Report No. ISR-11
Information Storage and Retrieval
Design Criteria for Automatic Information Systems
M. E. Lesk, G. Salton
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

are identified. Specifically, a correlation coefficient is computed to indicate the degree of similarity between each document and each search request, and the documents are then ranked in decreasing order of the correlation coefficient.[3,[OCRerr],5] A typical search request processed by the system is shown in Fig. 1. Three analyzed forms of this request, produced respectively by a word stem identification process (null thesaurus), a synonym dictionary look-up (regular thesaurus), and a phrase identification method (statistical phrases), are shown in Fig. 2. Finally, a typical output product listing documents in decreasing correlation order with the request is shown in Fig. 3.

The system may be controlled by the user, in that a search request can first be processed in a standard mode. The user can then analyze the output obtained and, depending on the information returned to the system as a result of previous search operations, the request can be reprocessed under altered conditions. The new output can again be examined, and the search can be iterated until the right kind and amount of information are obtained.[6,7]

The organization of the SMART system makes it possible to evaluate the effectiveness of the various processing methods by comparing the output obtained from a variety of different runs. This is achieved by processing the same search requests against the same document collections several times, while making selected changes in the analysis procedures between runs. By comparing the performance of the search requests under different processing conditions, it is then possible to determine the relative effectiveness of the various analysis methods.
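The ranking and run-comparison procedure described above can be illustrated with a minimal sketch. It is not the SMART implementation: the cosine measure is assumed here as one possible correlation coefficient, and the names `TOY_THESAURUS`, `plain_words`, `thesaurus_words`, `cosine`, and `rank`, as well as the three sample documents, are hypothetical. The same request is processed against the same collection twice, with only the analysis procedure changed between runs.

```python
import math
from collections import Counter

# Hypothetical toy synonym dictionary: maps a word to a concept label.
TOY_THESAURUS = {
    "documents": "document",
    "texts": "document",
    "search": "retrieval",
    "computer": "automatic",
}


def plain_words(text):
    """Analysis without any dictionary: lower-cased words only."""
    return text.lower().split()


def thesaurus_words(text):
    """Analysis with the toy synonym dictionary applied to each word."""
    return [TOY_THESAURUS.get(word, word) for word in plain_words(text)]


def cosine(a, b):
    """Cosine correlation of two term-frequency vectors (an assumed measure)."""
    dot = sum(weight * b[term] for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(request, documents, analyze):
    """Return (document id, correlation) pairs in decreasing correlation order."""
    query = Counter(analyze(request))
    scored = [
        (doc_id, cosine(query, Counter(analyze(text))))
        for doc_id, text in documents.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented sample collection and request, for illustration only.
documents = {
    "D1": "automatic retrieval of documents",
    "D2": "computer search of texts",
    "D3": "growing tomatoes in summer",
}
request = "automatic document retrieval"

# Same request, same collection, two analysis procedures.
plain_run = rank(request, documents, plain_words)
thesaurus_run = rank(request, documents, thesaurus_words)

for name, run in (("plain", plain_run), ("thesaurus", thesaurus_run)):
    print(name, [(doc_id, round(score, 3)) for doc_id, score in run])
```

In this toy collection only D1 shares terms with the request under the plain analysis, whereas the toy dictionary also gives D2 a nonzero correlation. Comparing the two ranked lists for the same request is the kind of run-to-run comparison on which the evaluation relies.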
The actual evaluation calculations are based on the standard recall