ISR10 Scientific Report No. ISR-10 Information Storage and Retrieval The Indexing Function chapter Joseph John Rocchio Harvard University Gerard Salton Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

categories or concept codes. Thus a set of semantically associated natural language terms comprised of synonyms, for example, can be mapped into a single element in the index language; or a single natural language term which has several connotations can be identified with a set of elements in the index language (homonyms might be treated in this manner). Figure 2.1 provides an illustration by means of an excerpt from the SMART system thesaurus.

The notion of a semantically based transformation on a set of recognizable (by machine) linguistic features (word or stem types, phrases, etc.) can be generalized to include a variety of the associations [13] which such elements possess. The index transformation may be described in this case by considering a multi-stage mapping. The first step consists in mapping the document into the set of basic elements which describe it, e.g. into the set of word types it contains. The second step is a transformation from these elements into a space of synonymous term groups, i.e. into thesaurus categories. (The thesaurus mapping described above consists in applying these two basic transformations.)

Additional transformation stages may also be defined. Thus generic (inclusion) relations exist among semantic elements and these may be used to define a set of hierarchies. A number of transformations can be defined based on a set of such relations; thus a term which includes or which is included by a given term may be added to or may replace the related term in the document image.
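
The multi-stage mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the thesaurus entries, the numeric concept codes, and the parent table are invented for the example and are not taken from the SMART thesaurus, and the function names (`word_types`, `to_concepts`, `expand_generic`, `index_image`) are hypothetical.

```python
import re

# Hypothetical thesaurus: word type -> set of concept codes.
# Synonyms share one code; a homonym maps to several codes.
THESAURUS = {
    "car": {101},
    "automobile": {101},   # synonym of "car"
    "truck": {102},
    "bank": {201, 202},    # homonym: two distinct concepts
}

# Hypothetical generic (inclusion) hierarchy: concept code -> broader concept.
PARENT = {101: 100, 102: 100}


def word_types(text):
    """Stage 1: map a document into the set of word types it contains."""
    return set(re.findall(r"[a-z]+", text.lower()))


def to_concepts(types):
    """Stage 2: map word types into thesaurus categories (concept codes)."""
    concepts = set()
    for word in types:
        concepts |= THESAURUS.get(word, set())  # unknown words are dropped
    return concepts


def expand_generic(concepts, replace=False):
    """Stage 3: add broader concepts, or substitute them for the originals."""
    parents = {PARENT[c] for c in concepts if c in PARENT}
    if replace:
        # Concepts with no broader term are kept unchanged.
        return {c for c in concepts if c not in PARENT} | parents
    return concepts | parents


def index_image(text, replace=False):
    """Compose the three stages into one index transformation."""
    return expand_generic(to_concepts(word_types(text)), replace)
```

With these assumed tables, `index_image("The automobile and the car")` yields the set `{100, 101}`: both synonyms collapse to code 101, and the broader code 100 is added although no word in the text maps to it directly. Passing `replace=True` yields `{100}` instead, which corresponds to replacing the related term rather than adding to it.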
The index image of a document, therefore, can be modified to contain terms which are generically related to those detected, but not explicitly present in the input text. Relations among index terms other than