Scientific Report No. ISR-11, Information Storage and Retrieval. Design Consideration for Time Shared Automatic Documentation Centers. M. E. Lesk, Harvard University. Gerard Salton. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

3. Methods

We can draw on the experience obtained by using the SMART project to select the processing methods which should be used in the planned system. The results of the SMART project on the relative values of various retrieval methods are developed elsewhere, [1] and only a brief summary of some of the relevant points is given here.

For input purposes, the best compromise between economy of space and quantity of information is probably the document abstract. Since most scientific journals require author abstracts, it should not be difficult to obtain a set of abstracts for the document collection being searched.

The search procedure should be based on the use of a thesaurus with phrases. In past experiments with the SMART system, this method has been found to offer the best performance of any method tested on most collections. This method exhibits the additional advantages of simplicity and flexibility. Specialized thesauri can be constructed for individual needs. Isolated errors are easily corrected. Extensions to different languages and adaptations to different subject areas are possible.

On the other hand, statistical procedures for automatic synonym detection are relatively fixed procedures for which adjustments are more difficult to make. It is not clear how such methods can be extended to different languages. Finally, automatic synonym detection is found experimentally to produce results inferior to those obtained by proper thesauri. Hierarchies also produce inferior results.
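As an illustration only, a thesaurus lookup with phrases of the kind recommended above might be sketched as follows. This is not the SMART implementation. The thesaurus entries, the phrase dictionary, the concept identifiers, and the function name `concept_vector` are all hypothetical, and a "phrase" is assumed here to mean that its component concepts occur in the same sentence, in any order and not necessarily adjacent.

```python
import re
from collections import Counter

# Hypothetical toy thesaurus: word -> concept identifier.
# Synonyms share one concept, which is the point of the lookup.
THESAURUS = {
    "retrieval": "C_RETRIEVE",
    "search": "C_RETRIEVE",
    "document": "C_DOCUMENT",
    "abstract": "C_DOCUMENT",
    "computer": "C_MACHINE",
    "machine": "C_MACHINE",
}

# Hypothetical phrase dictionary: set of component concepts -> phrase concept.
PHRASES = {
    frozenset({"C_RETRIEVE", "C_DOCUMENT"}): "P_DOC_RETRIEVAL",
}


def concept_vector(text):
    """Return a Counter of concept and phrase identifiers found in text."""
    vector = Counter()
    # Assumption: phrase components must fall within one sentence.
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = re.findall(r"[a-z]+", sentence)
        # Thesaurus lookup: words not in the thesaurus are ignored.
        concepts = [THESAURUS[w] for w in words if w in THESAURUS]
        vector.update(concepts)
        # Phrase lookup: order and adjacency of components are not checked.
        present = set(concepts)
        for components, phrase in PHRASES.items():
            if components <= present:
                vector[phrase] += 1
    return vector


if __name__ == "__main__":
    print(concept_vector("Search of the abstract. A machine was used."))
```

Because the thesaurus and the phrase dictionary are plain tables in this sketch, correcting an isolated error or specializing to a new subject area amounts to editing individual entries, which mirrors the flexibility argument made in the text.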
Based on past SMART experience, we accept as our basic content analysis procedure a thesaurus lookup, and a loose phrase lookup of the type studied there. The entire document collection is passed through this lookup at