Scientific Report No. ISR-13, Information Storage and Retrieval
Evaluation Parameters
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
ideal positions resulting from a perfect system. Results presented in other
sections of this report employ the two normalized measures, so the formulas
are repeated for convenience:
Normalized Recall

$$R_{norm} = 1 - \frac{\sum_{i=1}^{n} r_i - \sum_{i=1}^{n} i}{n(N - n)}$$

Normalized Precision

$$P_{norm} = 1 - \frac{\sum_{i=1}^{n} \log r_i - \sum_{i=1}^{n} \log i}{\log \dfrac{N!}{(N - n)!\, n!}}$$

where   n   = number of relevant documents
        N   = number of documents in the collection
        r_i = rank of the i-th relevant document
        i   = ideal rank position of the i-th relevant item
The result obtained for one individual search request is given in Figure 3,
where both of the normalized measures are computed. Normalized recall gives
equal `weight' to documents with high rank positions and to documents with
low rank positions, but normalized precision gives stronger weight to the
initial section of the retrieval list, that is, to the documents with high
rank positions.
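
As a concrete illustration of the two formulas, the following minimal Python
sketch (written for this discussion, not part of the original report) computes
both measures for a single request; the function names and the small example
collection are assumptions made here.

    import math

    def normalized_recall(ranks, N):
        # ranks: 1-based ranks of the n relevant documents in the output list
        # N: total number of documents in the collection
        n = len(ranks)
        ideal = sum(range(1, n + 1))            # sum of the ideal ranks 1..n
        return 1 - (sum(ranks) - ideal) / (n * (N - n))

    def normalized_precision(ranks, N):
        # the log sums weight the initial section of the list more heavily
        n = len(ranks)
        num = (sum(math.log(r) for r in ranks)
               - sum(math.log(i) for i in range(1, n + 1)))
        # log( N! / ((N - n)! n!) ), via lgamma to avoid large factorials
        den = math.lgamma(N + 1) - math.lgamma(N - n + 1) - math.lgamma(n + 1)
        return 1 - num / den

    # hypothetical request: N = 10 documents, relevant items at ranks 1, 3, 7
    print(normalized_recall([1, 3, 7], 10))     # 1 - (11 - 6)/(3 * 7) = 0.7619...
    print(normalized_precision([1, 3, 7], 10))  # approximately 0.738

The weighting difference is visible if the example is perturbed: moving the
first relevant item from rank 1 to rank 4, or the last from rank 7 to rank 10,
changes normalized recall by the same amount in either case, but normalized
precision falls considerably more in the first case.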
An attempt to derive a single-number measure of a quite different type
is reported by John Swets [3]. It differs from the measures used by SMART
in that it does not use the ranked output list directly, but is based in
the first instance on performance curves similar to those discussed in the
next sub-section; examination of this measure is therefore deferred. The
normalized `sliding ratio' measure proposed by Giuliano and Jones [8]
appears to be designed for use at one selected cut-off point, and so again
differs from the SMART measures.
B) Varying Cut-off Performance Curves
The most common measures of retrieval performance are the precision
and recall ratios derived from the retrieval table and given in Figure 1.
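
Under the usual definitions (recall = number of relevant items retrieved
divided by the total number of relevant items; precision = number of relevant
items retrieved divided by the total number retrieved), the variation of the
two ratios with the cut-off level can be sketched as follows; this
illustration, like the one above, is supplied here and is not the report's
own tabulation.

    def precision_recall_at_cutoffs(ranks, N, n):
        # ranks: 1-based ranks of the relevant documents; n: total relevant
        relevant, hits, curve = set(ranks), 0, []
        for cutoff in range(1, N + 1):
            if cutoff in relevant:              # one more relevant item retrieved
                hits += 1
            curve.append((cutoff, hits / n, hits / cutoff))
        return curve

    for cutoff, recall, precision in precision_recall_at_cutoffs([1, 3, 7], 10, 3):
        print(f"cut-off {cutoff:2d}:  recall {recall:.2f}  precision {precision:.2f}")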