Scientific Report No. ISR-11 Information Storage and Retrieval
Relevance Feedback in an Information Retrieval System
W. Riddle
T. Horwitz
R. Dietz
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
queries. The results of these sample runs, presented in the following
paragraphs, are used to develop strategies which are then applied to the
entire set of queries.
A) Determination of the Number of Documents Retrieved
The number of documents, n, that are returned to the user is set at
fifteen. The sample runs show that if this number is reduced to eight,
the effectiveness of the updating process is diminished. In the case cited
in Figure 1, returning fifteen documents leads to the retrieval of four
relevant documents after three modifications are made, while returning only
eight documents leads to the final retrieval of only two relevant documents
after the same number of modifications. This implies the need to return
initially as many relevant documents as possible so that more information
can be used in the updating procedure. (The number of relevant documents
initially retrieved also depends on the correlation function, as is discussed
later.) Further, in determining the number of documents to be retrieved, a
compromise must be made between the desirability of retrieving a large number
of documents and the desirability of not imposing a large reading task on the
user.
B) The Effect of the Correlation Function
The result of an iteration is a list of n documents ranked by their
correlations with the query. These correlations are determined by one of
the following correlation functions:
Cosine correlation function: [7]

$$
\frac{\sum_{i=1}^{m} q_i d_i}{\sqrt{\left(\sum_{i=1}^{m} q_i q_i\right) \times \left(\sum_{i=1}^{m} d_i d_i\right)}}
$$
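
As an illustration only, the cosine correlation and the ranking of the n best-correlated documents might be sketched in Python as follows. The sketch assumes that q and d are query and document vectors with m components each. The function names `cosine_correlation` and `rank_documents`, the dictionary of document vectors, and the treatment of an all-zero vector are hypothetical choices made for this example and are not taken from the report.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_correlation(q: Sequence[float], d: Sequence[float]) -> float:
    """Cosine correlation between a query vector q and a document vector d."""
    if len(q) != len(d):
        raise ValueError("query and document vectors must both have m components")
    # Numerator: sum over i of q_i * d_i.
    numerator = sum(qi * di for qi, di in zip(q, d))
    # Denominator: square root of (sum of q_i * q_i) times (sum of d_i * d_i).
    denominator = math.sqrt(sum(qi * qi for qi in q) * sum(di * di for di in d))
    # Assumption: an all-zero vector is given a correlation of 0.
    return numerator / denominator if denominator else 0.0


def rank_documents(
    query: Sequence[float],
    documents: Dict[str, Sequence[float]],
    n: int = 15,
) -> List[Tuple[str, float]]:
    """Return the n documents with the highest correlation, best first."""
    scored = [(doc_id, cosine_correlation(query, vec))
              for doc_id, vec in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]
```

Calling `rank_documents(query, documents, n=8)` instead of the default n = 15 would correspond to the shorter list discussed in part A.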