Scientific Report No. ISR-10 Information Storage and Retrieval
Search Request Formulation
Joseph John Rocchio
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
operation be a list in correlation order of the documents whose images
are most closely related to q0. The user examines this list and
specifies which of the documents in it are relevant and which are not.
Since the modification is to be based only on a sample of the relevant
documents (assuming that some are missing from the retrieved list
associated with q0), the modified request will be formed by adding to
the original query, q0, an optimal query vector based on the feedback
information. The resultant vector (the new query) should thus be a
better approximation to the optimal query than q0, and should,
therefore, produce better retrieval when resubmitted.
Hence we seek a relation of the form:
q1 = f(q0, R, S)
where q0 is the original query, R is the subset of the retrieved set
which the user deems relevant, and S is the subset of the retrieved
set (based on q0) which the user deems nonrelevant. The form suggested
immediately by the above is:
q_1 = \alpha_1 q_0 + \alpha_2 \left[ \frac{1}{n_1} \sum_{i=1}^{n_1} r_i - \frac{1}{n_2} \sum_{i=1}^{n_2} s_i \right]    (3.8)
where n1 = n(R), n2 = n(S), R = {r_1, r_2, ..., r_n1}, S = {s_1, s_2, ..., s_n2},
all vectors have been normalized to unit length, and α1 and α2
are arbitrary weighting coefficients.
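
The query modification of equation (3.8) can be illustrated with a minimal sketch, assuming the grouping in which α2 multiplies the difference between the mean relevant vector and the mean nonrelevant vector. The function names (normalize, mean_vector, modify_query), the default values α1 = α2 = 1, and the treatment of an empty feedback set as a zero contribution are assumptions made for this illustration only; they are not specified in the report.

```python
import math

def normalize(v):
    """Scale a vector to unit Euclidean length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def mean_vector(vectors, dim):
    """Component-wise mean of a list of vectors.

    Assumption: an empty list contributes the zero vector.
    """
    if not vectors:
        return [0.0] * dim
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]

def modify_query(q0, relevant, nonrelevant, alpha1=1.0, alpha2=1.0):
    """q1 = alpha1*q0 + alpha2*(mean of relevant - mean of nonrelevant).

    All vectors are normalized to unit length first, as the text requires.
    """
    dim = len(q0)
    q0 = normalize(q0)
    r_mean = mean_vector([normalize(r) for r in relevant], dim)
    s_mean = mean_vector([normalize(s) for s in nonrelevant], dim)
    return [alpha1 * q0[k] + alpha2 * (r_mean[k] - s_mean[k]) for k in range(dim)]

# Toy three-term vocabulary: one document judged relevant, one nonrelevant.
q0 = [1.0, 0.0, 0.0]
relevant = [[0.0, 1.0, 0.0]]
nonrelevant = [[0.0, 0.0, 1.0]]
q1 = modify_query(q0, relevant, nonrelevant)
print(q1)  # [1.0, 1.0, -1.0]
```

In this toy case the modified query keeps its original term, gains weight on the term found in the relevant document, and takes a negative weight on the term found in the nonrelevant one.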