ISR11
Scientific Report No. ISR-11 Information Storage and Retrieval
An Experimental Investigation of Automatic Hierarchy Generation
G. Blomgren
A. Goodman
L. Kelly
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
VIII An Experimental Investigation of
Automatic Hierarchy Generation
G. Blomgren, A. Goodman, and L. Kelly
Abstract
In automatic or semi-automatic document retrieval systems, a hierarchical
arrangement of concepts or terms affords modification of a query in three
ways: generalization, specialization, or expansion with synonyms.
Hierarchies are usually constructed manually. A method for automatic
generation of hierarchies is proposed, and experimental results are presented.
1. Introduction
An automatic or semi-automatic document retrieval system usually
includes a thesaurus of concepts or terms which is used to expand queries.
For a given query, thesaurus entries which are similar to terms in the query
are added to its vector of terms. The search for relevant documents then
continues with the expanded query [5,6,7].
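
The expansion step described above can be sketched as follows. This is only an illustration under stated assumptions: the query is represented as a dictionary mapping terms to weights, the thesaurus as a mapping from each term to its similar entries, and added entries receive a reduced weight. The names `THESAURUS` and `expand_query`, the sample entries, and the factor 0.5 are hypothetical and are not taken from the systems cited.

```python
# Hypothetical toy thesaurus: term -> similar thesaurus entries.
THESAURUS = {
    "computer": ["machine", "processor"],
    "retrieval": ["search"],
}


def expand_query(query, thesaurus, weight=0.5):
    """Return a copy of the query vector with similar thesaurus entries added.

    `query` maps terms to weights. Each added entry gets `weight` times the
    weight of the query term it is similar to (an assumed weighting scheme).
    """
    expanded = dict(query)
    for term, term_weight in query.items():
        for similar in thesaurus.get(term, []):
            # Do not overwrite a term that is already present in the query.
            expanded.setdefault(similar, weight * term_weight)
    return expanded
```

The search would then be repeated with the dictionary returned by `expand_query` in place of the original query vector.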
Some systems, such as SMART at Harvard, employ a hierarchical arrangement
of concepts or terms to modify queries [3]. Such an arrangement connects
concepts by "parent-son" and "brother-brother" relationships. A parent
concept is more general than its sons; brothers share an equivalent ranking.
Thus a query may be generalized by adding to its vector of terms the
parents of those terms; contrariwise, a query may be specialized by adding
the sons of its terms. The addition of brothers represents inclusion of
similar terms. [5]
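
The three modifications can be illustrated with a small sketch. It assumes that the hierarchy is stored as a mapping from each concept to its parent and that a query is a set of terms; the names `PARENT`, `sons`, `brothers`, and `modify_query`, as well as the sample concepts, are hypothetical and do not describe the SMART implementation. Root concepts are treated here as having no brothers, which is also an assumption.

```python
# Hypothetical toy hierarchy: concept -> parent (None marks a root).
PARENT = {
    "vehicle": None,
    "car": "vehicle",
    "truck": "vehicle",
    "sedan": "car",
}


def sons(term, parent=PARENT):
    """Concepts whose parent is `term`."""
    return sorted(t for t, p in parent.items() if p == term)


def brothers(term, parent=PARENT):
    """Other concepts sharing the parent of `term` (none for a root)."""
    p = parent.get(term)
    if p is None:
        return []
    return [t for t in sons(p, parent) if t != term]


def modify_query(terms, mode, parent=PARENT):
    """Return the query terms plus parents, sons, or brothers of each term."""
    added = set()
    for term in terms:
        if mode == "generalize":
            p = parent.get(term)
            if p is not None:
                added.add(p)
        elif mode == "specialize":
            added.update(sons(term, parent))
        elif mode == "expand":
            added.update(brothers(term, parent))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return set(terms) | added
```

With this toy hierarchy, generalizing the query {"car"} adds "vehicle", specializing it adds "sedan", and expanding it with brothers adds "truck".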