ISR11
Scientific Report No. ISR-11 Information Storage and Retrieval
Information Analysis and Dictionary Construction
chapter
G. Salton
M. E. Lesk
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
IV-30
below their parents and further to the right.
The hierarchy of Fig. 3 also provides for the inclusion of cross
references from one concept to another, which are connected to the original
concept by broken lines. Such cross references represent general, unspecified
types of relations between the corresponding concepts, and receive in
general a different interpretation than the generic inclusion relations
normally represented by the hierarchy.
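The two kinds of links described above can be pictured with a minimal
sketch. The class and field names (Concept, parent, cross_refs) are
hypothetical, as is the "graph" cross reference, and the sketch assumes each
concept has at most one parent. Keeping the cross references in a separate,
untyped set is one way to let them be interpreted differently from the
generic inclusion links.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Concept:
    """Hypothetical node type; not taken from the report."""
    name: str
    # Generic inclusion relation (a solid line in the hierarchy).
    parent: Optional["Concept"] = None
    # General, unspecified relations (broken lines), stored by concept name.
    cross_refs: Set[str] = field(default_factory=set)

    def ancestors(self) -> List[str]:
        """Follow only the generic inclusion links up to the root."""
        chain = []
        node = self.parent
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return chain


# Illustrative concepts; the "graph" cross reference is invented.
language = Concept("language")
syntax = Concept("syntax", parent=language)
tree = Concept("syntactic dependency tree", parent=syntax, cross_refs={"graph"})
```

Here tree.ancestors() walks the inclusion links only, so a cross reference
never changes where a concept sits in the hierarchy.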
It would be nice if it were possible to give some generally applicable
algorithm for constructing hierarchical subject arrangements. This is, in
fact, a topic which has preoccupied many people including mathematicians,
philosophers, and librarians for many years. In general, one can say that
broad concepts should be near the top of the tree, whereas specific concepts
should be near the bottom; furthermore, there appears to be some relationship
between the frequency of occurrence of a given concept in a document
collection and its place in the hierarchy. More specifically, those concepts
which exhibit the highest frequency of occurrence in a given document
collection, and which by this very fact appear to be reasonably common,
should be placed on a higher level than other concepts whose frequency of
occurrence is lower.
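The frequency heuristic above can be illustrated with a minimal sketch. The
report gives no concrete procedure, only the general tendency, so everything
here is an assumption: the function name assign_levels, the parameter
num_levels, the rule of cutting the frequency ranking into bands of equal
size, and the occurrence counts in the example.

```python
from typing import Dict


def assign_levels(frequencies: Dict[str, int], num_levels: int = 3) -> Dict[str, int]:
    """Map each concept to a level, where 0 is the top of the hierarchy.

    Assumption for illustration: concepts are ranked by descending
    frequency and cut into `num_levels` bands of roughly equal size.
    """
    if num_levels < 1:
        raise ValueError("num_levels must be at least 1")
    # Rank by descending frequency; break ties alphabetically for determinism.
    ranked = sorted(frequencies, key=lambda c: (-frequencies[c], c))
    # Ceiling division, with a floor of 1 so an empty input does not divide by zero.
    band_size = max(1, -(-len(ranked) // num_levels))
    return {
        concept: min(i // band_size, num_levels - 1)
        for i, concept in enumerate(ranked)
    }


# Hypothetical occurrence counts, invented purely for illustration.
counts = {"language": 120, "syntax": 45, "syntactic dependency tree": 7}
levels = assign_levels(counts)
```

Equal-sized bands are only one possible choice. Concepts with the same
frequency can land on different levels under this rule, and the sketch
ignores the user population, which also influences placement.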
Concerning the specific place of a given concept within the hierarchy,
this should be made to depend on the user population and on the type of
expansion which is most often requested. Thus, a concept corresponding to
"syntactic dependency tree" would most reasonably appear under the broader
category of "syntax", which in turn could appear under the general class
of "language", assuming that the user population consists of linguists
or grammarians; on the other hand, if the users were to be mathematicians
or algebraists, then the "syntactic dependency trees" should probably appear