MONO91 NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report Other Potentially Related Research chapter Mary Elizabeth Stevens National Bureau of Standards

in identical terms and not in synonymous ones. If the existence of synonyms is avoided, by using a small number of exclusive descriptors, the description of a document in terms useful for retrieval is more difficult, also it is equally difficult to relate a request to the description of documents. A further difficulty is that descriptions only list the main terms, and take no account of their relations to one another. The C. L. R. U. experiments being carried out make use of a thesaurus, a procedure through which it is hoped that these difficulties will be avoided and that a request for a document although not using the same terms as those in the document will produce that document and others dealing with the same problem, but described in different, though synonymous, terms." 1/

In general, the use of a thesaurus to constrain variations in word or term usage (as in our first definition, a mechanized authority list), to reduce synonymity, to resolve homographic ambiguity, to provoke and suggest additional terms or ideas to indexer and to searcher alike, is related to the improvement of automatic indexing procedures in precisely the same sense that its use would be effective in any indexing system whatsoever. In another sense, however, the construction and use of the thesaurus are related to linguistic data processing by machine. Garvin suggests:

"... One may reasonably expect to arrive at a semantic classification of the content-bearing elements of a language which is inductively inferred from the study of text, rather than superimposed from some viewpoint external to the structure of the language.
Such a classification can be expected to yield more reliable answers to the problems of synonymy and content representation than the existing thesauri and synonym lists, which are based mainly on intuitively perceived similarities without adequate empirical controls." 2/

This other sense rests on the recognition that the machine itself can be used to compile and construct the thesaurus. While Luhn in some of his 1957-8 proposals still considered the compilation and organization of a thesaurus to be primarily a matter of human effort, he nevertheless pointed out that: "The statistical material that may be required in the manual compilation of dictionaries and thesauri may be derived from the original texts in any desired form and degree of detail." 3/ De Grolier makes the complementary statement that the Luhn techniques should "considerably facilitate" the preparation of thesauri. 4/ Even more importantly, the computer can be used for periodic updating and revision. The work on the FASEB index-term normalization procedures involved early recognition of the need to "educate the thesaurus" by examining print-outs when no matches occurred and providing a continuous process of amendment. 5/ Computer-maintained statistics of word and term usages are closely related to possibilities for

1/ Masterman, Needham, and Sparck-Jones, 1958 [405], p. 934-935; Needham and Joyce, 1958 [305].
2/ Garvin, 1961 [224], p. 138.
3/ Luhn, 1959 [354], p. 12.
4/ De Grolier, 1962 [152], p. 132.
5/ Shepherd, 1963 [545], p. 392.