Scientific Report No. ISR-10, Information Storage and Retrieval. The Indexing Function (chapter). Joseph John Rocchio. Harvard University. Gerard Salton. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

value, or degree, to which each attribute pertains to the document by associating a scalar with each attribute. In this case the index images can be encoded as numeric rather than binary description vectors in the attribute space. Table 2.1 illustrates a typical keyword description derived by statistical analysis (from Booth [16]), in which the relative frequencies of the 15 most frequent non-common word stem types from a sample document are shown. This analysis can be used to establish a property set index image (by employing a frequency-sensitive selection procedure), a binary description vector, or a numeric description vector incorporating relative frequency information. Symbolic examples of each of these are illustrated in Figure 2.2.

A property list description does not allow for a direct representation of any relations among the various attributes, unless these are specifically identified in the attribute space. Since information in natural language is conveyed by semantic referents (words, phrases) and by the relations indicated among the referents (syntax and context), index languages capable of explicitly representing relations among attributes have been investigated [17]. A variety of such structures have been studied, including tree and graph representations. A syntactic dependency tree, for example, can represent a natural language sentence by associating its nodes with the semantic values of the words they represent, and its branches with direct syntactic dependency. An example (from Sussenguth [18]) is illustrated in Figure 2.3.
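The three property-list forms and the dependency-tree form described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stem counts, the attribute space, the 0.05 selection threshold, the sample sentence, and all function and class names are hypothetical and are not taken from Table 2.1, Figure 2.2, or Figure 2.3.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List

# Hypothetical stem counts for one document (NOT the data of Table 2.1).
stem_counts = Counter(
    {"index": 16, "document": 12, "vector": 8, "retriev": 3, "syntax": 1}
)

# Hypothetical attribute space: a fixed, ordered list of attributes.
attribute_space = ["document", "graph", "index", "retriev", "syntax", "vector"]


def property_set(counts, min_rel_freq):
    """Select attributes whose relative frequency reaches the threshold."""
    total = sum(counts.values())
    return {stem for stem, n in counts.items() if n / total >= min_rel_freq}


def binary_vector(props, space):
    """1 where the attribute is in the property set, 0 elsewhere."""
    return [1 if attr in props else 0 for attr in space]


def numeric_vector(counts, space):
    """Relative frequency of each attribute; 0.0 if it does not occur."""
    total = sum(counts.values())
    return [counts.get(attr, 0) / total for attr in space]


@dataclass
class Node:
    """A word in a dependency tree together with its direct dependents."""
    word: str
    dependents: List["Node"] = field(default_factory=list)


def dependency_edges(node):
    """List (governor, dependent) pairs in pre-order."""
    edges = []
    for child in node.dependents:
        edges.append((node.word, child.word))
        edges.extend(dependency_edges(child))
    return edges


# Hypothetical sentence: "index languages represent relations".
tree = Node("represent", [Node("languages", [Node("index")]), Node("relations")])

props = property_set(stem_counts, min_rel_freq=0.05)
print(sorted(props))
print(binary_vector(props, attribute_space))
print(numeric_vector(stem_counts, attribute_space))
print(dependency_edges(tree))
```

The vectors carry no information about how the attributes relate to one another, whereas the edge list of the tree records each direct dependency explicitly; this is the distinction the text draws between property lists and relational index languages.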
While such index structures are capable of more precise modeling of the information-carrying elements of the