ISR10
Scientific Report No. ISR-10 Information Storage and Retrieval
The Query-Document Matching Function
chapter
Joseph John Rocchio
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
4-14
stated in assumption 2, a classification category should be an e[OCRerr]uival-
ence class with respect to the retrieval function.
Let be a relation on the set of document images such that
if and only if every query which retrieves d. also retrieves d..
1
Assume then that the nature of the query-document matching function is
such that the relation [OCRerr] defined above is an equivalence relation
(reflexive, symmetric, and transitive) Further, let the equivalence
classes it induces on the document set be identified as classification
categories. Under these circumstances it is necessary only to match
an input query against a single member of each equivalence class, and
to retrieve all the members of those classes which satisfy the matching
function. In this manner, a reduced search retrieves exactly the ssme
document subset as a full seaz[OCRerr]ch.
The[OCRerr] only matching functions considered here, which result in
[OCRerr] being an equivalence relation are set equality for set represented
index images and vector equality for vector index images. The equival-
ence classes for these comparison operations, however,.c'resi[OCRerr]lt.[OCRerr]in;a
trival partition of the reference collection since each class is a
singleton. Thus the search on equivalence classes in this case is
identical to a full search over all reference documents.
Consider, for example, the set inclusion matching function for
which the retrieved set is defined by:
`[OCRerr] .`.[OCRerr]