Scientific Report No. ISR-10
Information Storage and Retrieval
Evaluation of Document Retrieval Systems
Joseph John Rocchio
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

$$ p_C = \frac{\sum_{i=1}^{m} n_{11}^{(i)}}{\sum_{i=1}^{m} \left( n_{11}^{(i)} + n_{12}^{(i)} \right)} \tag{5.13} $$

$$ r_C = \frac{\sum_{i=1}^{m} n_{11}^{(i)}}{\sum_{i=1}^{m} \left( n_{11}^{(i)} + n_{21}^{(i)} \right)} \tag{5.14} $$

These estimates may be interpreted as resulting from a composite contingency table description of the results of $m$ retrieval operations, in which the entries of the composite table are cumulations of the corresponding entries of the $m$ individual tables. As such, these are valid estimates for the conditional population ratios, but not for the means of the associated conditional probabilities over the query sample space.

Without justification it can be assumed that a valid measure of the performance of a retrieval system is the average value received by the system's users. Assuming that the precision and recall conditional probabilities which characterize a given retrieval operation are in fact indicative of the value of that operation, the estimators defined by equations (5.11) and (5.12) are clearly the appropriate ones. More precisely, if it is assumed that the value of the $i$th of a set of $m$ retrieval operations can be expressed as:

$$ v_i = h(p_i, r_i) $$

a random variable $V$ is defined which is a function of the random variables $P$ and $R$. An estimate for the expectation of $V$, $E(V)$, is given by: