Scientific Report No. ISR-11
Information Storage and Retrieval
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

VIII. An Experimental Investigation of Automatic Hierarchy Generation

G. Blomgren, A. Goodman, and L. Kelly

Abstract

In automatic or semi-automatic document retrieval systems, a hierarchical arrangement of concepts or terms affords modification of a query in three ways: generalization, specialization, or expansion with synonyms. Hierarchies are usually constructed manually. A method for automatic generation of hierarchies is proposed, and experimental results are presented.

1. Introduction

An automatic or semi-automatic document retrieval system usually includes a thesaurus of concepts or terms which is used to expand queries. For a given query, thesaurus entries which are similar to terms in the query are added to its vector of terms. The search for relevant documents then continues with the expanded query [5,6,7].

Some systems, such as SMART at Harvard, employ a hierarchical arrangement of concepts or terms to modify queries [3]. Such an arrangement connects concepts by "parent-son" and "brother-brother" relationships. A parent concept is more general than its sons; brothers share an equivalent ranking. Thus a query may be generalized by adding to its vector of terms the parents of those terms; contrariwise, a query may be specialized by adding the sons of its terms. The addition of brothers represents inclusion of similar terms [5].
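
The three query modifications described above can be illustrated with a minimal sketch. It is not the SMART implementation. It assumes a hierarchy in which each term has at most one parent, and it treats a query as an unweighted list of terms. The names `Hierarchy`, `expand_query`, and `TOY`, as well as the sample terms, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class Hierarchy:
    """Toy concept hierarchy (assumption: each term has at most one parent)."""

    def __init__(self, parent_of):
        # parent_of maps each son term to its single parent term.
        self.parent_of = dict(parent_of)
        self.sons_of = defaultdict(list)
        for son, parent in self.parent_of.items():
            self.sons_of[parent].append(son)

    def parents(self, term):
        parent = self.parent_of.get(term)
        return [parent] if parent is not None else []

    def sons(self, term):
        return list(self.sons_of.get(term, []))

    def brothers(self, term):
        # Brothers are the other sons of the same parent.
        parent = self.parent_of.get(term)
        if parent is None:
            return []
        return [t for t in self.sons_of[parent] if t != term]


def expand_query(query, hierarchy, mode):
    """Return the query terms plus related terms, without duplicates.

    mode is "generalize" (add parents), "specialize" (add sons),
    or "brothers" (add similar terms of equivalent ranking).
    """
    related = {
        "generalize": hierarchy.parents,
        "specialize": hierarchy.sons,
        "brothers": hierarchy.brothers,
    }[mode]
    expanded = list(query)
    for term in query:
        for extra in related(term):
            if extra not in expanded:
                expanded.append(extra)
    return expanded


# Hypothetical sample hierarchy, not taken from the report.
TOY = Hierarchy({
    "compiler": "software",
    "assembler": "software",
    "software": "computing",
    "hardware": "computing",
})

generalized = expand_query(["compiler"], TOY, "generalize")
# ['compiler', 'software']
specialized = expand_query(["software"], TOY, "specialize")
# ['software', 'compiler', 'assembler']
with_brothers = expand_query(["compiler"], TOY, "brothers")
# ['compiler', 'assembler']
```

A weighted term vector could replace the plain list, for instance by giving added terms a lower weight than the original query terms; that choice is left open here because the text above does not specify one.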