Scientific Report No. ISR-10
Information Storage and Retrieval
Chapter: Evaluation of Document Retrieval Systems
Joseph John Rocchio
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

an inadequate description of the true situation. Users do not necessarily make binary relevance decisions, nor are such decisions necessarily independent when examining a sequence of documents. In addition, query-document matching functions do not always lead to binary acceptance-rejection decisions; instead, they often result in the assignment of a coefficient of relevance or association between a query-document pair, as has been discussed in Chapter 4. Further, in many respects it may be more realistic to assume that the system's assessment of relevance should be interpreted on a relative rather than an absolute basis. Thus, a user is likely to examine at least a few of the highest assessed documents resulting from his search operation, independently of the absolute retrieval coefficients which are assigned to them. In this sense, there is a degree of difficulty in establishing a uniform criterion for what constitutes a positive relevance assessment by the retrieval system over a sample set of search requests.

Evaluation Statistics

The contingency table description of a retrieval operation, shown in Figure 5.1(b), provides frequency ratio estimates of the joint probability distribution of the user/system decisions for the given query. One may then assume, other variables remaining constant, that these frequency ratios converge to probabilities as the number of documents searched (N) increases. Alternatively, one may assume that the probability estimates obtained by a search over N documents predict the behaviour of the system with respect to the input query