Scientific Report No. ISR-10, Information Storage and Retrieval
Introduction chapter
Joseph John Rocchio, Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

[...] thereof, a matching function is desired which is independent of the vector magnitudes involved. Under these circumstances it is natural to assume that the information carried by the index vector is contained in its angular position (i.e., its orientation in the property space). The matching function assumed, therefore, is the angular distance, or a monotonic function of this distance, between the search request vector and the source document vector, wherein decreasing distance is assumed to indicate increasing probability of relevance.

D. Terminology

In dealing with the foregoing model, the following definitions are required:

1) Let $D = \{D_i\}$ represent the set of source documents in the natural language comprising the reference collection.

2) Let $Q = \{Q_j\}$ represent a set of sample search requests in the natural language comprising a test set of retrieval queries.

3) Let $T$ represent the index transformation from the natural language to the index language. The index image of a document $D_i \in D$ is $d_i = T(D_i)$, and the index image of a search request $Q_j \in Q$ is $q_j = T(Q_j)$. Further let [...] $= \{d_1, d_2, \ldots$ [...]

[...] where alternative models are considered, e.g. index images represented by sets rather than vectors, the required notation will be introduced following the framework defined here.
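The magnitude-independent matching function described above (angular distance between the request vector and the document vector) can be illustrated with a minimal sketch. The function names `angular_distance` and `rank_documents`, the list-of-floats vector representation, and the sample vectors are assumptions made for illustration only; the report does not prescribe an implementation.

```python
import math


def angular_distance(q, d):
    """Angle in radians between request vector q and document vector d.

    Hypothetical helper: vectors are plain sequences of numbers over the
    same property space. The result depends only on orientation, not on
    the magnitudes of q and d.
    """
    if len(q) != len(d):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(q, d))
    norm_q = math.sqrt(sum(a * a for a in q))
    norm_d = math.sqrt(sum(b * b for b in d))
    if norm_q == 0 or norm_d == 0:
        raise ValueError("a zero vector has no orientation")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_q * norm_d)))
    return math.acos(cosine)


def rank_documents(q, docs):
    """Return document indices ordered by increasing angular distance,
    i.e. by decreasing assumed probability of relevance."""
    return sorted(range(len(docs)), key=lambda i: angular_distance(q, docs[i]))


# Illustrative index images (made-up values).
request = [1, 0, 1]
documents = [[2, 0, 2], [0, 3, 0], [1, 1, 0]]
ranking = rank_documents(request, documents)
```

In this sketch the first document has the same orientation as the request but twice its magnitude, so it ranks first. Any monotonic function of the angle, such as the cosine with the sort order reversed, would yield the same ranking.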