Scientific Report No. IRS-13
Information Storage and Retrieval
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

VII. Thesaurus, Phrase and Hierarchy Dictionaries

E. M. Keen

1. Introduction

The suffix removal procedures described in Section VI provide synonym control only when identical word stems are involved; any comprehensive synonym and partial synonym recognition requires a procedure that groups words according to synonymy irrespective of word spelling. For this reason, the use of dictionaries of the thesaurus type is being investigated, as well as the use of phrases rather than single words, and also the use of word relations as specified by hierarchical arrangements. The construction characteristics of several dictionaries are discussed in the present section, before retrieval runs are presented, using retrieval results for three document collections.

2. Description of Thesaurus Dictionaries

Seven thesaurus dictionaries are currently available, and each is referred to as follows:

1. IRE-3 Thesaurus-2. Known also as the "Harris 2" thesaurus, this handmade dictionary was originally constructed for use specifically with the IRE-1 collection.

2. IRE-3 Thesaurus-3. Known also as the "Harris 3" thesaurus, this handmade dictionary was constructed for use with any collection of computer science documents, and was first tested on the IRE-2 collection.

3. CRAN-1 Thesaurus-1. Known also as the "Old Quasi-Synonym" dictionary, this is a modified manually-constructed version of