Scientific Report No. ISR-10
Information Storage and Retrieval
Chapter: The Indexing Function
Joseph John Rocchio
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

There exist obvious problems with such extensions on the practical level. An important defect in such an index transformation lies in the fact that the structure of the index image provides no facility for representing the semantic associations which exist between distinct word types in the natural language. One proposal for dealing with such associations on a statistical basis consists in assuming that they can be derived a posteriori from a set of index images characterizing a document collection in some given subject area. Thus one can assume, for example, that terms which co-occur in the sentences of a given document, or in the documents of a given collection, more frequently than the average are, in fact, semantically as well as statistically related [10,11,12]. In the formal associative model, it is possible to account for keyword associations of higher order than the first and, in addition, to use these associations to influence query-document matching procedures. In such a system, a document is represented by its keyword set and additionally by the statistical properties of the representations of all other documents in the collection.

B. Semantic Techniques

An important alternative to the statistical associative process consists in incorporating a specific semantic model directly into the index transformation. The indexing function may then be implemented by a thesaurus mapping containing a pre-defined set of semantic associations. A thesaurus transformation may be defined as a many-to-many mapping from recognizable word types or phrases to thesaurus