ISR10
Scientific Report No. ISR-10 Information Storage and Retrieval
Search Request Formulation
chapter
Joseph John Rocchio
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
of R and the mean of its correlations with the members of S. Second,
it can easily be shown from the definition of the vector dot product
that C is maximized by the vector q' = q'_0 subject to the condition that
the components of q' be nonnegative. The components of q'_0 are given by:

$$
q'_{0_i} =
\begin{cases}
q_{0_i} & \text{if } q_{0_i} > 0 \\
0 & \text{if } q_{0_i} \le 0
\end{cases}
\qquad (i = 1, \ldots, N)
\tag{3.7}
$$
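The component rule of equation 3.7 can be sketched in a few lines of Python. This is only an illustration: the function name `clip_nonnegative` and the sample vector are hypothetical and do not come from the report.

```python
def clip_nonnegative(q0):
    """Return q'_0 as in eq. 3.7: keep each positive component of q0,
    and replace every zero or negative component with 0."""
    return [x if x > 0 else 0.0 for x in q0]


# Hypothetical unconstrained optimal query with one negative component.
q0 = [0.4, -0.2, 0.0, 1.5]
q0_prime = clip_nonnegative(q0)
print(q0_prime)  # [0.4, 0.0, 0.0, 1.5]
```

Only the negative component changes; the positive components pass through unaltered, so the clipped query still points toward the relevant documents in every dimension where q0 did.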
Hence, under the assumptions made, an unambiguous optimal
(for the criteria stated) query image exists corresponding to any non-
empty subset of D. Further, equation 3.5 provides an effective
means of generating such a query from knowledge of the relevant subset.
In the evaluation of information retrieval systems, and in
particular in the evaluation of the indexing function of such systems,
this formulation of an optimal search request provides the ability to
isolate the effects of indexing from variances due to request
formulation. An optimal search request measures the ability of the
index transformation to differentiate a particular set of documents from
all the others of a collection. In an evaluation situation, where one
assumes prior knowledge of the document subset relevant to each test
query, the retrieval performance of the optimal query corresponding to
the relevant subset provides a direct measure of the ability of the
system to extract from the index representations of documents the same
kind of information the user can extract from the natural language.
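The evaluation use described above can be sketched as follows. Equation 3.5 is not reproduced in this section, so the sketch assumes it has the form suggested by the criterion C: the mean of the length-normalized relevant document vectors minus the mean of the length-normalized remaining vectors, followed by the clipping of equation 3.7. The function names, the toy document vectors, and the choice of relevant subset are all hypothetical.

```python
import math


def normalize(v):
    """Scale v to unit length (a zero vector is returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def cosine(a, b):
    """Cosine correlation between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def optimal_query(docs, relevant):
    """ASSUMED form of eq. 3.5: centroid of normalized relevant documents
    minus centroid of normalized non-relevant documents, with negative
    components then set to zero as in eq. 3.7."""
    rel = [normalize(d) for i, d in enumerate(docs) if i in relevant]
    non = [normalize(d) for i, d in enumerate(docs) if i not in relevant]
    if not rel or not non:
        raise ValueError("need at least one relevant and one non-relevant document")
    dim = len(docs[0])
    q0 = [
        sum(d[k] for d in rel) / len(rel) - sum(d[k] for d in non) / len(non)
        for k in range(dim)
    ]
    return [x if x > 0 else 0.0 for x in q0]


# Toy index: four documents over four index terms (hypothetical data).
docs = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
]
relevant = {0, 1}  # assumed prior knowledge of the relevant subset

q = optimal_query(docs, relevant)
# Rank all documents by their correlation with the optimal query.
ranking = sorted(range(len(docs)), key=lambda i: cosine(q, docs[i]), reverse=True)
print(ranking)
```

If the relevant documents occupy the top of `ranking`, the index representation separates that subset from the rest of the collection; if they do not, the shortfall is attributable to indexing rather than to how the request was worded.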