ISR10
Scientific Report No. ISR-10 Information Storage and Retrieval
Search Request Formulation
chapter
Joseph John Rocchio
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
Since operationally it is expected that only a few iterations
would ever be used, the difference between these alternative
formulations is not of major significance. If the user is satisfied
with the relevant documents identified by the previous iterations and
would, in effect, like to find others which are closely related to
these documents, queries produced by equation (3.15) would be more
suitable. If, on the other hand, he is interested in maintaining a
broader search, the iterations produced by equation (3.14) will not be
as dependent on the relevant documents previously identified (members
of [OCRerr]^T).
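
The two alternatives can be sketched in a few lines. Equations (3.14) and (3.15) are not reproduced in this passage, so the sketch below assumes the usual form of the feedback update (query plus the mean of the relevant documents minus the mean of the non-relevant documents, with negative weights set to zero). The names `centroid` and `feedback_step` and all vectors are hypothetical and serve only as illustration. The only difference between the two alternatives is the starting vector: the original query (taken here to correspond to equation (3.14)) or the query of the previous iteration (taken here to correspond to equation (3.15)).

```python
from typing import List, Sequence

Vector = List[float]


def centroid(docs: Sequence[Sequence[float]], dim: int) -> Vector:
    """Mean vector of a document set; the zero vector if the set is empty."""
    if not docs:
        return [0.0] * dim
    return [sum(d[k] for d in docs) / len(docs) for k in range(dim)]


def feedback_step(base: Sequence[float],
                  relevant: Sequence[Sequence[float]],
                  nonrelevant: Sequence[Sequence[float]]) -> Vector:
    """One feedback update applied to `base` (assumed form, see lead-in).

    Negative weights are clipped to zero, which is an assumption of
    this sketch rather than something stated in the passage.
    """
    dim = len(base)
    rel = centroid(relevant, dim)
    non = centroid(nonrelevant, dim)
    return [max(0.0, base[k] + rel[k] - non[k]) for k in range(dim)]


# Toy three-term index space (made-up vectors, not data from the report).
q0 = [1.0, 0.0, 0.0]                       # original query
first_relevant = [[1.0, 1.0, 0.0]]         # judged relevant after search 1
first_nonrelevant = [[0.0, 0.0, 1.0]]      # judged non-relevant after search 1
second_relevant = [[0.0, 1.0, 1.0]]        # judged relevant after search 2
second_nonrelevant = [[1.0, 0.0, 0.0]]     # judged non-relevant after search 2

# First iteration: the same under both alternatives.
q1 = feedback_step(q0, first_relevant, first_nonrelevant)

# Second iteration, alternative A: modify the ORIGINAL query again.
q2_from_original = feedback_step(q0, second_relevant, second_nonrelevant)

# Second iteration, alternative B: modify the PREVIOUS query q1.
q2_from_previous = feedback_step(q1, second_relevant, second_nonrelevant)

print(q1)                # [2.0, 1.0, 0.0]
print(q2_from_original)  # [0.0, 1.0, 1.0]
print(q2_from_previous)  # [1.0, 2.0, 1.0]
```

In this toy case the query built from the previous iteration still carries the weight contributed by the first set of relevant documents, while the query built from the original query reflects only the second set of judgments.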
Average performance results for a second iteration of relevance
feedback produced by each of these alternatives are shown in Figure 3.12.
The results obtained with the original and first iteration queries are
included for comparison. As can be seen from these graphs, the results
obtained using the iteration formula of equation (3.15) are
somewhat better than when the second iteration starts from the
original query. However, in comparing the behavior of these
alternatives on individual queries, there are some cases in which the
reverse is true. Figure 3.18 illustrates an example of this. In this
case it is clear that documents 315 and 264 are not clustered in the
index space with the other relevant documents; and therefore, these
documents suffer more drastically from successive iterations (equation
(3.15)) than from successive modifications to the original query
(equation (3.14)).
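
The mechanism behind such a case can be illustrated with a small sketch. It assumes a cosine correlation between query and document and the same assumed update form (query plus mean relevant minus mean non-relevant, negative weights clipped to zero). The vectors are invented for illustration and do not represent documents 315 and 264; `step` and `cosine` are hypothetical helper names.

```python
import math
from typing import List, Sequence


def step(base: Sequence[float],
         relevant: Sequence[Sequence[float]],
         nonrelevant: Sequence[Sequence[float]]) -> List[float]:
    """Assumed feedback update: base + mean(relevant) - mean(nonrelevant)."""
    def mean(docs: Sequence[Sequence[float]], k: int) -> float:
        return sum(d[k] for d in docs) / len(docs) if docs else 0.0
    return [max(0.0, base[k] + mean(relevant, k) - mean(nonrelevant, k))
            for k in range(len(base))]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine correlation of two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


# Toy data: the relevant documents found so far cluster on term 0,
# while one further relevant document (the outlier) lies on term 1.
q0 = [1.0, 1.0, 0.0]
cluster_relevant = [[1.0, 0.0, 0.0]]
nonrelevant = [[0.0, 0.0, 1.0]]
outlier = [0.0, 1.0, 0.0]

q1 = step(q0, cluster_relevant, nonrelevant)                # first iteration
q2_from_original = step(q0, cluster_relevant, nonrelevant)  # restart from q0
q2_from_previous = step(q1, cluster_relevant, nonrelevant)  # continue from q1

cos_original = cosine(outlier, q2_from_original)
cos_previous = cosine(outlier, q2_from_previous)
print(round(cos_original, 3), round(cos_previous, 3))
```

Under these assumptions each successive iteration adds more weight along the cluster direction, so the outlier's correlation with the query falls further than it does when the original query is modified again.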