ISR10
Scientific Report No. ISR-10 Information Storage and Retrieval
Search Request Formulation
Joseph John Rocchio
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
query vector q1. The table compares the correlations of q0 and q1
with the document vectors.
The modifications to an initial query vector which are
produced by the relevance feedback algorithm may receive the following
interpretation: concepts, i.e. components of the initial query which
are more significant in the document images of the relevant subset
than in the nonrelevant subset will be emphasized (i.e. increased in
weight), and vice versa. Thus the weighting of the original query
terms, derived from frequency counting, will be adjusted on the basis
of the statistical evidence derived from the sample output for which
the user provides relevance feedback. In addition, concepts not
included in the original query but which are also useful in
differentiating the relevant from the nonrelevant documents will be
added to the modified query image. Such concepts (components of the
index space) can be expected to be useful in retrieving other
relevant documents not explicitly identified by the original query,
since all relevant documents (which can be successfully retrieved)
must be sufficiently related to be localized in some region of the
index space.
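The interpretation above can be illustrated with a small numerical sketch. Equation (5.8) is not reproduced in this passage, so the code below assumes a commonly cited centroid-difference form of the update: the modified query is the original query plus the mean of the relevant document vectors minus the mean of the nonrelevant ones. The function name `rocchio_update`, the toy vectors, and the clipping of negative weights to zero are all assumptions made for illustration, not details taken from the report.

```python
def rocchio_update(q0, relevant, nonrelevant):
    """Hypothetical sketch of relevance feedback (assumed form, not eq. 5.8 verbatim):
    q1 = q0 + mean(relevant) - mean(nonrelevant), with negative weights set to 0.
    """
    dims = len(q0)

    def centroid(vectors):
        # Mean vector of a set of document vectors; zero vector if the set is empty.
        if not vectors:
            return [0.0] * dims
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

    rel_c = centroid(relevant)
    non_c = centroid(nonrelevant)
    # Assumption: concept weights are not allowed to go negative.
    return [max(0.0, q0[i] + rel_c[i] - non_c[i]) for i in range(dims)]


# Toy index space with four concepts (made-up weights).
q0 = [1.0, 1.0, 0.0, 0.0]                    # original query uses concepts 0 and 1
relevant = [[2.0, 0.0, 1.0, 0.0],            # relevant documents stress concepts 0 and 2
            [2.0, 0.0, 3.0, 0.0]]
nonrelevant = [[0.0, 2.0, 0.0, 1.0]]         # nonrelevant document stresses concepts 1 and 3

q1 = rocchio_update(q0, relevant, nonrelevant)
print(q1)  # [3.0, 0.0, 2.0, 0.0]
```

In this toy case concept 0, which is stronger in the relevant subset, is increased in weight; concept 1, stronger in the nonrelevant subset, is reduced; and concept 2, absent from the original query, is added to the modified query image.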
The basic relation for request modification using relevance
feedback (equation (5.8)) can be modified in various ways by
imposing additional constraints. For example, the weighting of the
original query could be a function of the amount of feedback such that
with large amounts of feedback, the original query has less effect on
the resultant than with small amounts of feedback. Another constraint,