Scientific Report No. ISR-11
Information Storage and Retrieval
Relevance Feedback in an Information Retrieval System
W. Riddle, T. Horwitz, R. Dietz
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

a, the query q' is obtained and all the relevant documents are retrieved.

The final values of recall and precision depend on the number of relevant documents retrieved on the successive searches, since more information will obviously perturb the query to a greater extent. In particular, there is a dependence on the number of relevant documents retrieved initially, which is, in turn, dependent on the correlation function used. (In this investigation, the dependence is actually on the denominator of the correlation formula, since all of the functions tested possess the same numerator.) If only a few of the relevant documents are retrieved initially, then convergence is slow. In other words, given a query having three relevant documents, the probability of retrieving all three is higher if two of the documents are retrieved initially rather than only one.

As shown in Figure ii[OCRerr], for query [OCRerr]Al5 the cosine correlation function initially retrieves three relevant documents, while the co-occurrence and simple vector matching correlation functions retrieve two and four respectively. Since the simple vector matching case now includes more information concerning the concepts in the relevant documents, the final values of recall and precision achieved by the modification process are higher when simple vector matching is used as the correlation function than when either of the other two functions is used. These results suggest that it is unwise to restrict the proposed retrieval system to the use of a single correlation function.
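The effect of the denominator on the initial retrieval can be illustrated with a minimal sketch. It assumes that simple vector matching is the bare shared numerator (an inner product of concept weights) and that the cosine function divides that numerator by the product of the vector lengths; the co-occurrence denominator is not given in this passage and is left out. The function names, the toy vectors, and the cutoff `k` are hypothetical and do not reproduce the report's data.

```python
import math

def inner_product(q, d):
    """Shared numerator: sum of products of matching concept weights."""
    return sum(w * d.get(concept, 0.0) for concept, w in q.items())

def simple_matching(q, d):
    # Assumed form: the numerator alone (denominator of 1).
    return inner_product(q, d)

def cosine(q, d):
    # Standard cosine: numerator divided by the product of vector lengths.
    q_len = math.sqrt(sum(w * w for w in q.values()))
    d_len = math.sqrt(sum(w * w for w in d.values()))
    norm = q_len * d_len
    return inner_product(q, d) / norm if norm else 0.0

def relevant_in_top_k(query, docs, relevant, score, k):
    """Rank documents by score and return the relevant ones among the top k."""
    ranked = sorted(docs, key=lambda name: score(query, docs[name]), reverse=True)
    return [name for name in ranked[:k] if name in relevant]

# Hypothetical toy collection: d1 and d2 are relevant, d3 is not.
query = {"a": 1.0, "b": 1.0}
docs = {
    "d1": {"a": 2.0, "b": 2.0},                      # short relevant document
    "d2": {"a": 3.0, "b": 3.0, "c": 4.0, "d": 4.0},  # long relevant document
    "d3": {"a": 3.0, "e": 1.0},                      # nonrelevant document
}
relevant = {"d1", "d2"}

by_matching = relevant_in_top_k(query, docs, relevant, simple_matching, k=2)
by_cosine = relevant_in_top_k(query, docs, relevant, cosine, k=2)
print("simple matching:", by_matching)
print("cosine:", by_cosine)
```

In this toy case the unnormalized function places both relevant documents among the first two retrieved, while the length normalization of the cosine function pushes the long relevant document below the nonrelevant one. The feedback step would therefore start from two relevant documents in one case and from one in the other.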
4. Conclusions

The implicit assumption underlying this investigation is that relevance feedback is a necessary part of the overall retrieval process. As the