Scientific Report No. ISR-11, Information Storage and Retrieval. Chapter: Information Analysis and Dictionary Construction. G. Salton, M. E. Lesk. Harvard University. Gerard Salton. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

more general ones, and to formulate a search request by starting with a general formulation, and progressively narrowing the specification down to those areas which appear to be of principal interest. Thus, one can start with a topic area such as "mathematics", and from there proceed to [illegible], which is a subdivision of mathematics, from where in turn one can go to "graph theory", which then leads to "tree structures", from where finally one can obtain the syntactic dependency trees previously illustrated in Fig. 7.

In a content analysis system, a hierarchical arrangement of words or word stems can be used both for information identification and for retrieval purposes. Thus, if a given search request is formulated in terms of "syntactic dependency trees", and it is found that not enough useful material is actually obtained, it is possible to "expand" this request to include all tree structures or indeed all abstract graphs, by using a hierarchical subject classification.

A hierarchy of concept numbers is included in the SMART system, and it is assumed that a thesaurus look-up operation precedes any hierarchical expansion operation. A typical example from the SMART concept hierarchy is shown in Fig. 8. The broad, more general concepts appear on the left side of the figure, corresponding to the "roots" of the hierarchical tree; and the more specific concepts appear further to the right. For example, concept 270 is the root of a sub-tree; this concept has four sons on the next lower level, namely concepts 224, 471, 472, and 488.
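The expansion of a request along the hierarchy can be illustrated with a minimal sketch. It is not the SMART implementation: the names `SONS`, `build_parent_map`, `expand_request`, and `narrow_request` are hypothetical, each concept is assumed to have at most one parent, and the request is assumed to consist of concept numbers already produced by the thesaurus look-up. Only the four sons of concept 270 are taken from the text; the first of them is printed as 224 in one place and 221 in another, and 224 is assumed here.

```python
# Hypothetical fragment of a concept hierarchy: parent -> list of sons.
# Only the sons of concept 270 come from the text (first son assumed to be 224).
SONS = {
    270: [224, 471, 472, 488],
}


def build_parent_map(sons):
    """Invert a parent -> sons mapping into a son -> parent mapping."""
    return {son: parent for parent, kids in sons.items() for son in kids}


def expand_request(concepts, sons, levels=1):
    """Return the request concepts plus their ancestors, up to `levels` levels.

    This corresponds to broadening a request toward more general concepts.
    """
    parents = build_parent_map(sons)
    expanded = set(concepts)
    frontier = set(concepts)
    for _ in range(levels):
        # Move every concept in the frontier one level toward the root.
        frontier = {parents[c] for c in frontier if c in parents}
        if not frontier:
            break
        expanded |= frontier
    return expanded


def narrow_request(concepts, sons):
    """Return the request concepts plus their immediate sons.

    This corresponds to adding the more specific concepts one level down.
    """
    narrowed = set(concepts)
    for concept in concepts:
        narrowed.update(sons.get(concept, []))
    return narrowed
```

Under these assumptions, `expand_request({471}, SONS)` adds the parent concept 270 to the request, while `narrow_request({270}, SONS)` adds the four sons listed above.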
Concept 221 in turn has two sons, labelled 261 and 331; similarly, concept 471 has four sons, namely 338, 371, 458 and 470. It may be seen from Fig. 8 that the sons of a concept, representing more specific terms, are shown