Scientific Report No. ISR-10
Information Storage and Retrieval
The Indexing Function
Joseph John Rocchio
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

applicable to both statistical and semantic processing since the meaning of a word is generally invariant over its morphological variants.

Much more ambitious syntactic processing procedures are also under investigation, including the use of fully automatic syntactic analysis. A full sentence by sentence syntactic analysis could, for example, provide explicit dependency relations among the various semantic elements of a sentence, and could be used for phrase recognition or for the recognition of structurally constrained associations among semantic terms. At the present time it is not clear whether the complexity required for the recognition of complex structural constraints is justified in terms of the additional information extracted thereby.

4. The Structure of Index Representations

The index transformation represents a mapping from the natural language of the source text to the target or index language. The index image of a source document is thus a representation of the content of the document in this target language. The most commonly used index language structure is the description list, or property vector, in which the index image consists of a list of those properties of a finite set which characterize the document. Index images of this type are used, for example, in Uniterm systems where the document representation is an unordered set of keywords (descriptors, uniterms, etc.). If the property set is ordered, for example, by a 1 to 1 mapping to the set of integers, the index image may be encoded as a binary vector. A more general
representation of the same type allows for a quantization of the