Scientific Report No. ISR-10, Information Storage and Retrieval
Chapter: Evaluation of Document Retrieval Systems
Joseph John Rocchio, Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

$$\Pr\{\rho_1 = p_{1i},\ \rho_2 = p_{2j},\ \rho_3 = p_{3k},\ \rho_4 = p_{4l}\} = f(p_{1i}, p_{2j}, p_{3k}, p_{4l}), \qquad (i, j, k, l = 1, 2, \ldots)$$

The set of m 4-tuples that results from the test set of retrieval operations then provides an estimate of this joint distribution. The fact that the random variables assume values which are probabilities (or, more precisely, estimates of probabilities) represents only a notational difficulty.

Statistically, then, the objective of an evaluation experiment is to estimate this joint probability distribution or some parameters which characterize it. Clearly, any evaluation which ignores the essential fact that the system performance is a random variable defined over the query sample space can produce misleading results.

Consider, for example, the evaluation data produced by the Cranfield studies.6,7,8 System evaluations in these reports were presented primarily in terms of the two conditional probabilities, precision and recall, rather than in terms of the joint probabilities. This in itself introduces no problem (other than the fact that it does not represent all the information available in the experimental data); the method used to compute estimates for the mean values of the precision and recall probabilities, however, was in error.

The precision and recall conditional probabilities, being functions of the random variables $\rho_i$, are themselves random variables defined on the query sample space. The results of m retrieval operations may be summarized in terms of these conditional probabilities by m couples: