ISR11
Scientific Report No. ISR-11 Information Storage and Retrieval
Design Criteria for Automatic Information Systems
chapter
M. E. Lesk
G. Salton
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
E) Hierarchical Subject Expansion
Hierarchical arrangements of information identifiers, similar in
construction to library classification schedules, make it possible,
given an entry, to find more general terms by going "up" in the
hierarchy (expansion by parents), and more specific ones by going "down"
(expansion by sons). The hierarchies provided for the SMART system include,
in addition, expansions by "brothers" on the same level as the original
terms, and expansions by adding certain "cross-references". Dozens of
different hierarchy options can be used, of which two are shown in Fig.
13.
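
The four kinds of expansion described above can be illustrated with a
minimal sketch. It is not the SMART implementation: the toy hierarchy, the
cross-reference table, and all function names (parents, sons, brothers,
cross_refs, expand) are hypothetical and chosen only to make the idea
concrete. Each option maps one term to a set of related terms, and an
expansion adds those terms to the original identifiers.

```python
from collections import defaultdict

# Hypothetical toy hierarchy (child concept -> parent concept).
# These entries are illustrative assumptions, not taken from the report.
PARENT = {
    "digital computer": "computer",
    "analog computer": "computer",
    "computer": "machine",
    "compiler": "software",
}

# Hypothetical cross-references between otherwise unrelated concepts.
CROSS_REFS = {"compiler": {"digital computer"}}

# Invert the parent map once so that the sons of a concept can be looked up.
SONS = defaultdict(set)
for child, parent in PARENT.items():
    SONS[parent].add(child)


def parents(term):
    """Going "up": the more general term, if the hierarchy has one."""
    return {PARENT[term]} if term in PARENT else set()


def sons(term):
    """Going "down": the more specific terms directly below this one."""
    return set(SONS.get(term, ()))


def brothers(term):
    """Terms on the same level, i.e. other sons of the same parent."""
    parent = PARENT.get(term)
    if parent is None:
        return set()
    return SONS[parent] - {term}


def cross_refs(term):
    """Terms linked to this one by an explicit cross-reference."""
    return set(CROSS_REFS.get(term, ()))


def expand(terms, option):
    """Return the original terms plus those added by one hierarchy option."""
    expanded = set(terms)
    for term in terms:
        expanded |= option(term)
    return expanded
```

Under these assumptions, expand({"computer"}, parents) adds "machine" to
the original term, while expand({"computer"}, sons) adds the two more
specific computer concepts. Applying the same option to requests only, or
to both documents and requests, is then a matter of which identifier sets
are passed through expand before matching.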
Fig. 13(a) shows an expansion by adding for each original term its
parent in the hierarchy, the expansion being applied to both documents
and requests. Clearly, this option does not on the average provide an
improvement over the standard "Harris Three" thesaurus process. On the
other hand, an expansion by "sons" applied to requests only (and not to
the documents) seems to offer some improvement in performance for the
middle ranges of recall and precision.
In general, hierarchical subject expansions result in large-scale
disturbances in the information identifiers attached to documents and
search requests. Occasionally, such a disturbance can serve to crystallize
the meaning of a poorly stated request, particularly if the request is far
removed from the principal subjects covered by the document collection.
More often, the change in direction specified by the hierarchy option is
too violent, and the average performance of most hierarchy procedures does
not appear to be sufficiently promising to advocate their incorporation in
an analysis system for automatic document retrieval.