ISR10 Scientific Report No. ISR-10 Information Storage and Retrieval The Query-Document Matching Function chapter Joseph John Rocchio Harvard University Gerard Salton Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

employed in the index space. Under these circumstances there always exists some query q0 such that (q0, d1) = [illegible] and (q0, d2) [illegible], so that regardless of how close two document images are, they do not belong to an equivalence class with respect to retrieval unless they are in fact identical. It is therefore clear that, in order to reduce the number of comparisons required in a retrieval operation, it will be necessary to introduce some finite probability of error. Since the classification categories cannot be identified with equivalence classes under the matching functions of interest, a limited search strategy may fail to retrieve some documents which would be retrieved by a full search over the entire collection. The design of a classification system, then, must involve a tradeoff between the total number of comparisons (search efficiency) and the probability of loss of relevant documents (relative to retrieval by a full search).

4. Classification and Metric Searching

The two previously considered metric query-document matching functions did not lead to an equivalence class partition of the reference collection. Metric comparison measures do, however, have a