Scientific Report No. IRS-13
Information Storage and Retrieval
Chapter: Correlation Measures
K. Reitsma, J. Sagalyn
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

and all the request weights are adjusted so that the request space approximately equals the document space (i.e., the request weights are adjusted so that the average weight among the requests equals the average weight among the documents).

Method of Evaluation

The power of the various correlation coefficients is determined by the use of recall-precision plots. Recall is defined as the proportion of relevant documents that are retrieved, while precision is defined as the proportion of retrieved documents that are actually relevant.

    Recall = (number of documents retrieved and relevant) / (total number of relevant documents)

    Precision = (number of documents retrieved and relevant) / (total number of documents retrieved)

For each of the queries, a recall-precision graph is produced. These graphs are then averaged over all the queries. The method of averaging is as follows:

1) the peaks of each recall-precision graph are connected, and the first peak is extrapolated horizontally to the y-axis (the precision axis, where recall equals zero);

2) the value of precision along this constructed line is determined at twenty different points along the recall axis, i.e., at recall equal to 0.05, 0.10, 0.15, ..., 0.95, 1.00;

3) for each of these points, the precision is averaged over all the queries; a final graph is plotted through these twenty average precision values.

An averaged recall-precision graph is obtained in the above