Scientific Report No. IRS-13 Information Storage and Retrieval
Correlation Measures
K. Reitsma
J. Sagalyn
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
and all the request weights are adjusted so that the request space
approximately equals the document space (i.e., the request weights are
adjusted so that the average weight among the requests equals the average
weight among the documents).
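The weight adjustment described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes that a single multiplicative factor is applied to every request weight and that the "average weight" is taken over all vector entries. The function name `scale_request_weights` and the sample vectors are hypothetical.

```python
def scale_request_weights(requests, documents):
    """Rescale request weights so their mean equals the document mean.

    requests, documents: lists of weight vectors (lists of numbers).
    Assumption (not from the report): one multiplicative factor is applied
    to all requests, and the mean is taken over every vector entry.
    """
    def mean_weight(vectors):
        weights = [w for vector in vectors for w in vector]
        return sum(weights) / len(weights)

    factor = mean_weight(documents) / mean_weight(requests)
    return [[w * factor for w in vector] for vector in requests]


# Hypothetical data: request mean is 1, document mean is 3.
requests = [[1, 0], [1, 2]]
documents = [[2, 4], [6, 0]]
scaled = scale_request_weights(requests, documents)
```

After scaling, the mean request weight equals the mean document weight, so the two sets of vectors are on a comparable scale.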
Method of Evaluation
The power of the various correlation coefficients is determined
by the use of recall-precision plots. Recall is defined as the proportion
of relevant documents that are retrieved, while precision is defined as the
proportion of retrieved documents that are actually relevant.
\[
\text{Recall} = \frac{\text{number of documents retrieved and relevant}}{\text{total number of relevant documents}}
\]

\[
\text{Precision} = \frac{\text{number of documents retrieved and relevant}}{\text{total number of documents retrieved}}
\]
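As a small illustration of the two definitions, the sketch below computes recall and precision for a single query from sets of document identifiers. The function name and the sample identifiers are hypothetical, and returning 0.0 when a set is empty is a convention chosen here, not something the report specifies.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one query.

    retrieved: iterable of document ids returned by the system
    relevant:  iterable of document ids judged relevant to the query
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # documents both retrieved and relevant
    # Convention assumed here: an empty denominator yields 0.0.
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical query: 4 documents retrieved, 3 of the 5 relevant ones among them.
r, p = recall_precision(["d1", "d2", "d3", "d4"], ["d1", "d2", "d3", "d8", "d9"])
```

Here three of the five relevant documents are retrieved, and three of the four retrieved documents are relevant.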
For each of the queries, a recall-precision graph is produced.
These are then averaged over all the queries. The method of averaging is
as follows:
1) the peaks of each recall-precision graph are connected and
   the first peak is extrapolated horizontally to the y-axis
   (the precision axis, where recall equals zero);
2) the value of precision along this constructed line is then
   determined at twenty different points along the recall axis,
   i.e. at recall equal to .05, .10, .15, ..., .95, 1.00;
3) for each of these points, the precision is averaged over all
   the queries;
4) a final graph is plotted along these twenty average precision
   values.
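The averaging steps above can be sketched in code. This is an illustrative reading of the procedure, not the original program. It assumes that a peak occurs at each rank where a relevant document is retrieved, that the peaks are joined by straight line segments, and that each ranking contains every relevant document, so the last peak lies at recall 1.00. All function names and the sample rankings are hypothetical.

```python
RECALL_LEVELS = [i / 20 for i in range(1, 21)]  # 0.05, 0.10, ..., 1.00


def peaks(ranked_relevance):
    """Return the (recall, precision) peaks for one ranked result list.

    ranked_relevance: list of booleans in rank order, True where the
    document at that rank is relevant. Assumption: a peak occurs at each
    rank where a relevant document is retrieved, and the ranking contains
    every relevant document, so the last peak has recall 1.0.
    """
    total_relevant = sum(ranked_relevance)
    if total_relevant == 0:
        raise ValueError("query has no relevant documents")
    points, hits = [], 0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            points.append((hits / total_relevant, hits / rank))
    return points


def precision_at(points, level):
    """Steps 1 and 2: read precision off the line connecting the peaks."""
    first_recall, first_precision = points[0]
    if level <= first_recall:
        return first_precision  # horizontal extrapolation to the y-axis
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        if r0 < level <= r1:
            # Linear interpolation between two neighbouring peaks.
            return p0 + (p1 - p0) * (level - r0) / (r1 - r0)
    return points[-1][1]  # guard against floating-point edge cases


def averaged_curve(queries):
    """Steps 3 and 4: average precision over all queries at each level."""
    per_query = [peaks(q) for q in queries]
    return [
        sum(precision_at(pts, level) for pts in per_query) / len(per_query)
        for level in RECALL_LEVELS
    ]


# Hypothetical relevance judgments for two queries, in rank order.
curve = averaged_curve([[True, False, True], [False, True]])
```

The returned list holds the twenty averaged precision values, one per recall level, which are the points of the final graph in step 4.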
An averaged recall-precision graph is obtained in the above