Scientific Report No. ISR-10, Information Storage and Retrieval. The Query-Document Matching Function. Joseph John Rocchio. Harvard University. Gerard Salton. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

index images as for query images, document-document distance is defined and possesses the same properties as query-document distance. By virtue of metric property (iii), the triangle inequality, a search request which is close (related) to a given document $d$ must necessarily be close to all documents which are themselves close to $d$.

Let a set of documents $D_c$, grouped as a classification category, be confined to a region of the index space such that

$$\delta(d_i, c) < \varepsilon \quad \text{for all } d_i \in D_c,$$

where $c$ is an arbitrary vector in this region. Let the distance from a query $q$ to the vector $c$ be $\delta(q, c) = \delta_0$. The metric properties of the distance function allow the distance between $q$ and the members of $D_c$ to be bounded as follows:

$$\delta(q, d_i) \le \delta(q, c) + \delta(c, d_i) < \delta_0 + \varepsilon \quad \text{for all } d_i \in D_c.$$

Thus the single distance $\delta(q, c)$ provides a bound on the set of distances from $q$ to the members of the document set $D_c$.

The following discussion is limited to the vector indexing model and angular distance matching, with the understanding that it is generally applicable to any system employing a metric similarity measure. In the vector model, document or query index images are treated