Scientific Report No. ISR-11, Information Storage and Retrieval
Relevance Feedback in an Information Retrieval System
W. Riddle, T. Horwitz, R. Dietz
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

Co-occurrence correlation function [6]:

[formula illegible in this copy; the legible fragments are sums over i = 1, ..., m and the terms d_i d_i and - q_i d_i]

Simple vector matching correlation function:*

$$\sum_{i=1}^{m} q_i d_i$$

where
m = the number of indexing concepts
q_i = the ith concept weight of the query vector
d_i = the ith concept weight of the document vector

The effect of these different correlation functions on the relevance feedback process is not known, so the correlation function is included as another parameter in this investigation.

a) Determination of the Relevance Weighting Factors

In determining the relevance weighting factors, the assumption is made that no information concerning the relative ranking of the relevant documents is available. That is, there is no way of knowing whether one relevant document is more relevant than another. This is consistent with the proposed information retrieval system, in which the user returns only "relevant" or "non-relevant" judgments, without indicating the degree of relevance of each document retrieved. This implies that the numerical interpretation of the relevance information should be binary; therefore a weight of 1 is used as the relevance weighting factor of a relevant document

* The simple vector matching correlation function, as stated, is strictly suitable for use only with binary document and query vectors. Its use with other than binary vectors, without the addition of a normalization factor, does, however, preserve the relative rankings of the documents.
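
The following is a minimal sketch of the two ideas above, not the authors' program. It assumes that the simple vector matching function is the unnormalized inner product of the query and document concept weights (an assumption, since the formula is damaged in this copy). The names `simple_match`, `rank_documents`, and `relevance_weights` and the sample vectors are hypothetical. Only the weight of 1 for a relevant document comes from the text; the weight given to non-relevant documents is not stated in this excerpt, so those documents are simply left out.

```python
from typing import Dict, List, Sequence


def simple_match(query: Sequence[float], doc: Sequence[float]) -> float:
    """Unnormalized inner product of two concept-weight vectors (assumed form)."""
    if len(query) != len(doc):
        raise ValueError("query and document need the same number of concepts")
    return sum(q * d for q, d in zip(query, doc))


def rank_documents(query: Sequence[float], docs: List[Sequence[float]]) -> List[int]:
    """Return document indices ordered by decreasing correlation with the query."""
    scores = [simple_match(query, d) for d in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


def relevance_weights(judgments: Dict[str, bool]) -> Dict[str, int]:
    """Binary interpretation of user feedback: each relevant document gets weight 1.

    Documents judged non-relevant are omitted because this excerpt does not
    state their weighting factor.
    """
    return {doc_id: 1 for doc_id, relevant in judgments.items() if relevant}


# Hypothetical binary vectors over m = 5 indexing concepts.
query = [1, 0, 1, 1, 0]
docs = [
    [1, 0, 1, 0, 0],  # shares two concepts with the query
    [0, 1, 0, 0, 1],  # shares none
    [1, 0, 1, 1, 1],  # shares three
]

scores = [simple_match(query, d) for d in docs]
ranking = rank_documents(query, docs)
weights = relevance_weights({"doc0": True, "doc1": False, "doc2": True})
```

With binary vectors, each score is simply the number of concepts shared by the query and the document, which is why the function is best suited to the binary case.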