ISR11
Scientific Report No. ISR-11 Information Storage and Retrieval
On Some Clustering Techniques for Information Retrieval
chapter
J. D. Broffitt
H. L. Morgan
J. V. Soden
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
No.  Procedure             Cutoff  Simthres  No. of    Mean No. of  Mean No. of  Mean     Mean        Mean     Mean
                                             Clusters  Matches      Documents    Recall¹  Precision¹  Recall²  Precision²
                                                                    Retrieved
1    Bonner (2), Cosine    .40     .60       4?        52.6         7.6          .43      .26         .70      .70
2    Bonner (2), Tanimoto  .25     .60       51        59.2         8.2          .36      .22         .63      .70
3    Bonner (2), Tanimoto  .25     .40       49        57.4         8.4          .34      .20         .63      .63
4    Bonner (2), Tanimoto  .22     .60       40        52.0         11.4         .36      .19         .73      .56
5    Rocchio (1)           --      --        8         18.4         10.4         .28      .10         .49      .36
6    Rocchio (2)           --      --        8         29.7         20.8         .41      .06         .76      .30
7    Rocchio (1)           --      --        10        23.5         13.5         .43      .11         .59      .37
8    Rocchio (2)           --      --        10        35.8         25.0         .57      .08         .88      .27
¹ Measure based on manual relevance judgments.
² Measure based on "automatic" relevance judgments.
(1): Documents in highest correlated cluster only are retrieved.
(2): Documents in two highest correlating clusters are retrieved.
Summary of Results Using 82 Document ADI Collection with 20 Queries
Table 1
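
The retrieval rules in notes (1) and (2), and the recall and precision figures reported in the table, can be illustrated with a small sketch. It is not the programs used for the report: the toy document vectors, the cluster assignments, the use of a mean vector as cluster centroid, and every function name here are assumptions made only for illustration. The cosine and Tanimoto functions follow the standard definitions of those coefficients.

```python
import math

def cosine(x, y):
    """Cosine correlation between two term-weight vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def tanimoto(x, y):
    """Tanimoto coefficient: dot / (|x|^2 + |y|^2 - dot)."""
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    return dot / denom if denom else 0.0

def centroid(vectors):
    """Assumed cluster profile: the component-wise mean of the member vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cluster_search(query, docs, clusters, top_k=1, sim=cosine):
    """Return the documents in the top_k clusters that correlate best with the query.

    top_k=1 corresponds to note (1), top_k=2 to note (2).
    """
    ranked = sorted(
        clusters,
        key=lambda name: sim(query, centroid([docs[d] for d in clusters[name]])),
        reverse=True,
    )
    retrieved = set()  # a set, so overlapping clusters do not double-count documents
    for name in ranked[:top_k]:
        retrieved.update(clusters[name])
    return retrieved

def recall_precision(retrieved, relevant):
    """Recall = hits / relevant, precision = hits / retrieved."""
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision

# Hypothetical toy data: six documents over four terms, grouped into three clusters.
docs = {
    "d1": [1, 1, 0, 0], "d2": [1, 0, 1, 0], "d3": [0, 0, 1, 1],
    "d4": [0, 1, 0, 1], "d5": [0, 0, 0, 1], "d6": [0, 0, 1, 0],
}
clusters = {"A": ["d1", "d2"], "B": ["d3", "d4", "d5"], "C": ["d6"]}
query = [1, 1, 0, 0]
relevant = {"d1", "d4"}

for k in (1, 2):
    retrieved = cluster_search(query, docs, clusters, top_k=k)
    print(k, sorted(retrieved), recall_precision(retrieved, relevant))
```

On this toy data, searching the two best clusters instead of one raises recall from 0.5 to 1.0 while precision falls from 0.5 to 0.4, the same direction of change seen between the Rocchio (1) and Rocchio (2) rows. Passing `sim=tanimoto` swaps the matching coefficient without changing the rest of the procedure.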