ISR10
Scientific Report No. ISR-10 Information Storage and Retrieval
Search Request Formulation
chapter
Joseph John Rocchio
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
C. Convergence
The performance improvement which results from a query
optimization produced by relevance feedback modification is a function
of the quality of the initial query, the degree of association of the
index images of the relevant documents, and the amount of feedback.
To investigate the influence of the latter two parameters, some additional
experiments were conducted. Figure 5.14 shows the retrieval results as
a function of the amount of feedback for the query "[illegible]-Indexing", and
Figure 5.15 for the query "[illegible] [illegible]atlang1". The document-document
correlation matrix for those documents relevant to the query
"[illegible]-Indexing" is shown in Figure 5.16. The rapid improvement
obtained even with a small amount of feedback can thus be attributed to the
fact that the members of the relevant set are all closely associated.
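
As an illustration of what is meant by the degree of association within a relevant set, the following minimal sketch computes a document-document correlation matrix and its mean off-diagonal value. It assumes that the correlation is the cosine of the term-weight vectors; the function names and the two small document sets are hypothetical and are not taken from the experiments reported here.

```python
import math


def cosine(u, v):
    """Cosine correlation of two equal-length term-weight vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def correlation_matrix(docs):
    """Symmetric matrix of pairwise correlations among the given index images."""
    return [[cosine(u, v) for v in docs] for u in docs]


def mean_pairwise(matrix):
    """Average of the entries above the diagonal: a crude measure of association."""
    n = len(matrix)
    pairs = [matrix[i][j] for i in range(n) for j in range(i + 1, n)]
    return sum(pairs) / len(pairs) if pairs else 0.0


# Hypothetical index images over four terms (made-up data for illustration only).
tight = [[3, 1, 0, 0], [2, 1, 0, 0], [3, 2, 0, 0]]  # closely associated set
loose = [[3, 1, 0, 0], [0, 0, 2, 1], [0, 1, 0, 3]]  # weakly associated set

tight_score = mean_pairwise(correlation_matrix(tight))
loose_score = mean_pairwise(correlation_matrix(loose))
print(f"tight set: {tight_score:.3f}, loose set: {loose_score:.3f}")
```

Under these assumptions, a relevant set with a high mean correlation behaves like the closely associated case described above, while a low value suggests that the relevant documents are scattered in the index space.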
In the case of the query "[illegible] [illegible]atlang1", this is not true, and the
document-document correlation of the first five relevant documents
retrieved by the original query indicates this. The relevance judgments
for this query were made assuming a very general point of view. In
this case it might be of use to produce multiple modified queries by
seeking clusters in the relevant set. A possible partition based on
the document-document correlations is shown in Figure 5.16, and this
partition was used to generate two modified queries following equations
(5.11) and (5.12). The retrieval results for these two modifications
are shown in Figure 5.17. This figure illustrates how each of these
queries is useful in retrieving some relevant documents. The fact that
some of the relevant documents have low correlations with both of the