Scientific Report No. ISR-11, Information Storage and Retrieval. Chapter: Information Analysis and Dictionary Construction. G. Salton and M. E. Lesk, Harvard University. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

dictionary. Such a committee-produced standard frequently ends by satisfying no one, despite the enormous effort which goes into its construction. Clearly, if it were necessary to follow this particular pattern in order to build a useful dictionary for retrieval purposes, then any saving which might result from automatic search and retrieval methodology would promptly be lost through the elaborate preparations required to build dictionaries. This situation has led to many efforts calculated to produce dictionaries either fully automatically, or in any case by more systematic procedures than a committee-controlled process. Any reasonably standardized method for dictionary construction not only saves time and decreases costs, but also permits a great deal more latitude in the type of retrieval procedures which can be implemented.
The following principal advantages are evident:

1) the retrieval procedures can be extended to collections in many different areas, since the dictionary problem no longer constitutes an impediment;

2) it becomes possible to investigate differences in vocabulary between different subject areas, notably the frequently heard assertion that the vocabulary in some subject areas is "soft" (that is, not well standardized and ambiguous), whereas in other areas it is "hard";

3) it removes any possible differences in retrieval effectiveness between different subject areas due to disturbances introduced by varying methods of thesaurus construction;

4) it becomes possible to investigate the retrieval effectiveness of a variety of thesauruses for a given collection, including variations in the thesaurus size, in the number of concept classes, and in the correspondents assigned to each class.
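
The kind of comparison described in the last advantage can be illustrated with a minimal sketch. It assumes a thesaurus is simply a table assigning each word to a numbered concept class, and it scores a query against a document by counting shared concept classes. The word lists, class numbers, and the functions `to_concepts` and `overlap` are hypothetical, chosen only for illustration; they are not taken from the report, and the matching function is a simple stand-in rather than the procedure used by any particular system.

```python
from collections import Counter


def to_concepts(words, thesaurus):
    """Map each word to its concept class; words not in the thesaurus are dropped."""
    return Counter(thesaurus[w] for w in words if w in thesaurus)


def overlap(query, document, thesaurus):
    """Count the concept classes shared by query and document (with multiplicity)."""
    q = to_concepts(query, thesaurus)
    d = to_concepts(document, thesaurus)
    return sum((q & d).values())


# Two hypothetical thesauruses over the same vocabulary: one with four
# narrow concept classes, one with two broad classes.
fine = {"retrieval": 1, "search": 2, "dictionary": 3, "thesaurus": 4}
coarse = {"retrieval": 1, "search": 1, "dictionary": 2, "thesaurus": 2}

query = ["search", "thesaurus"]
document = ["retrieval", "dictionary"]

fine_score = overlap(query, document, fine)      # no shared classes
coarse_score = overlap(query, document, coarse)  # both classes shared
print(fine_score, coarse_score)
```

With these assumed tables the narrow thesaurus finds no match between the query and the document, while the broad one matches both concept classes. Varying the number of classes in this way, over a fixed collection, is the sort of experiment that a standardized construction method makes practical.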