Scientific Report No. ISR-11 Information Storage and Retrieval
A Modified Two-Level Search Algorithm Using Request Clustering
V. R. Lesser
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
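$$M_T = \frac{\sum_{c=0.2}^{0.6} \sum_{n=3}^{12} \sum_{j=1}^{K_{c,n}} M(q_{i_j}, c, n)}{\sum_{c=0.2}^{0.6} \sum_{n=3}^{12} K_{c,n}}$$

$$P_T = \frac{\sum_{c=0.2}^{0.6} \sum_{n=3}^{12} \sum_{j=1}^{K_{c,n}} P(q_{i_j}, c, n)}{\sum_{c=0.2}^{0.6} \sum_{n=3}^{12} K_{c,n}}$$

$$R_T = \frac{\sum_{c=0.2}^{0.6} \sum_{n=3}^{12} \sum_{j=1}^{K_{c,n}} R(q_{i_j}, c, n)}{\sum_{c=0.2}^{0.6} \sum_{n=3}^{12} K_{c,n}}$$

where c runs from 0.2 to 0.6 in steps of 0.1 and n runs from 3 to 12 in steps of 3.**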
The values of $M_T$, $P_T$ and $R_T$ represent average values for the criteria
over the entire range of user needs. This provides a measure of search
effectiveness for a given search scheme and a given set of categories,
based on a test collection of queries. The values of $\bar{M}(c_0, n_0)$,
$\bar{P}(c_0, n_0)$ and $\bar{R}(c_0, n_0)$ provide the same type of measure of
search efficiency, except that these measures are related to a particular
user need (e.g., high recall or high precision).
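
The distinction between the overall averages and the per-need averages can be illustrated with a minimal Python sketch. It assumes that the per-query criterion values are already available, grouped by their (c, n) pair; the names `pooled_average` and `cell_average` and the sample scores are hypothetical and do not come from the report.

```python
from typing import Dict, List, Tuple

# A "cell" is one (c, n) parameter pair; its list holds the criterion
# value (M, P or R) of each of the K_{c,n} queries falling in that cell.
Cell = Tuple[float, int]


def pooled_average(scores: Dict[Cell, List[float]]) -> float:
    """Overall average: sum over all queries in all cells,
    divided by the total number of queries (the sum of K_{c,n})."""
    total = sum(sum(values) for values in scores.values())
    count = sum(len(values) for values in scores.values())
    if count == 0:
        raise ValueError("no query scores supplied")
    return total / count


def cell_average(scores: Dict[Cell, List[float]], c0: float, n0: int) -> float:
    """Average for one particular user need (c0, n0)."""
    values = scores[(c0, n0)]
    if not values:
        raise ValueError("no queries for this (c, n) pair")
    return sum(values) / len(values)


# Hypothetical precision values, for illustration only.
precision_scores = {
    (0.2, 3): [0.5, 0.25, 0.75, 0.5],  # K = 4 queries
    (0.6, 12): [1.0],                  # K = 1 query
}

p_total = pooled_average(precision_scores)         # (2.0 + 1.0) / 5
p_cell = cell_average(precision_scores, 0.2, 3)    # 2.0 / 4
```

In this sketch every query contributes equally to the overall figure, so a cell holding few queries has correspondingly little influence on it.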
*
$M_T$ could be calculated in the following way:

$$M_T = \frac{\sum_{c=0.2}^{0.6} \sum_{n=3}^{12} \bar{M}(c, n)}{20}$$
However, for a limited set of queries this method of calculation is not
valid, since $K_{c,n}$ for large c and n will be very small (i.e., covering
few cases). The value of $\bar{M}(c, n)$ can therefore fluctuate arbitrarily
for such a small sample, so that it is not a good indicator of search
effectiveness and should not be given equal weight in the averaging
procedure.
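
The effect described in this note can be sketched numerically. The example below uses hypothetical scores and hypothetical function names, and assumes that the per-cell average is the plain mean of the query values in that cell; it compares the equal-weight mean of cell averages with the query-weighted average.

```python
from typing import Dict, List, Tuple

Cell = Tuple[float, int]  # one (c, n) pair


def mean_of_cell_means(scores: Dict[Cell, List[float]]) -> float:
    """Equal weight per (c, n) cell, regardless of how many queries it holds."""
    means = [sum(v) / len(v) for v in scores.values() if v]
    return sum(means) / len(means)


def query_weighted_mean(scores: Dict[Cell, List[float]]) -> float:
    """Each cell weighted by its number of queries K_{c,n}."""
    total = sum(sum(v) for v in scores.values())
    count = sum(len(v) for v in scores.values())
    return total / count


# Hypothetical values: a well-populated cell and a cell with one query.
scores = {
    (0.2, 3): [0.5, 0.25, 0.75, 0.5],  # cell mean 0.5, K = 4
    (0.6, 12): [1.0],                  # cell mean 1.0, K = 1
}

equal_weight = mean_of_cell_means(scores)     # (0.5 + 1.0) / 2
query_weight = query_weighted_mean(scores)    # 3.0 / 5
```

Here the single query in the sparse cell moves the equal-weight figure to 0.75, while the query-weighted figure stays at 0.6, closer to the well-sampled cell.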
**
The averaging technique used to calculate these criteria is similar
to the procedure used to calculate ranked recall. [6]