Scientific Report No. ISR-11
Information Storage and Retrieval

On Some Clustering Techniques for Information Retrieval

J. D. Broffitt, H. L. Morgan, J. V. Soden
Harvard University

Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

No.  Procedure             Cutoff  Simthres  No. of    Mean No.    Mean No. of  Mean      Mean         Mean      Mean
                                             Clusters  of Matches  Documents    Recall^1  Precision^1  Recall^2  Precision^2
                                                                   Retrieved
1    Bonner (2), Cosine    .40     .60       4?        52.6        7.6          .43       .26          .70       .70
2    Bonner (2), Tanimoto  .25     .60       51        59.2        8.2          .36       .22          .63       .70
3    Bonner (2), Tanimoto  .25     .40       49        57.4        8.4          .34       .20          .63       .63
4    Bonner (2), Tanimoto  .22     .60       40        52.0        11.4         .36       .19          .73       .56
5    Rocchio (1)           --      --        8         18.4        10.4         .28       .10          .49       .36
6    Rocchio (2)           --      --        8         29.7        20.8         .41       .06          .76       .30
7    Rocchio (1)           --      --        10        23.5        13.5         .43       .11          .59       .37
8    Rocchio (2)           --      --        10        35.8        25.0         .57       .08          .88       .27

^1 Measure based on manual relevance judgments.
^2 Measure based on "automatic" relevance judgments.
(1): Documents in highest correlated cluster only are retrieved.
(2): Documents in two highest correlating clusters are retrieved.

Summary of Results Using 82 Document ADI Collection with 20 Queries
Table 1
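The retrieval rule stated in notes (1) and (2) of Table 1, and the recall and precision figures it leads to, can be illustrated with a minimal Python sketch. This is not the authors' program. The function names (cosine, tanimoto, cluster_search, recall_precision), the representation of each cluster by a centroid vector, and the toy data are all assumptions made for illustration. The Tanimoto form shown is the common textbook definition and may differ in detail from the one used in the report.

```python
import math


def cosine(x, y):
    """Cosine correlation between two term-weight vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def tanimoto(x, y):
    """Tanimoto coefficient (textbook form, assumed here)."""
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    return dot / denom if denom else 0.0


def cluster_search(query, centroids, members, top_k=1, sim=cosine):
    """Return the documents in the top_k clusters whose centroids
    correlate most highly with the query (top_k=1 for note (1),
    top_k=2 for note (2))."""
    ranked = sorted(range(len(centroids)),
                    key=lambda c: sim(query, centroids[c]),
                    reverse=True)
    retrieved = set()
    for c in ranked[:top_k]:
        retrieved.update(members[c])
    return retrieved


def recall_precision(retrieved, relevant):
    """Recall and precision of a retrieved set against relevance judgments."""
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical toy data: three clusters over a three-term vocabulary.
centroids = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
members = [{0, 1}, {2, 3, 4}, {5}]   # document ids in each cluster
query = [0.9, 0.4, 0.0]
relevant = {0, 2}                    # assumed relevance judgments

one_cluster = cluster_search(query, centroids, members, top_k=1)
two_clusters = cluster_search(query, centroids, members, top_k=2)
print(recall_precision(one_cluster, relevant))
print(recall_precision(two_clusters, relevant))
```

On this toy data, searching two clusters instead of one raises recall from 0.5 to 1.0 and lowers precision from 0.5 to 0.4, which is the same direction of change seen between the Rocchio (1) and Rocchio (2) rows of the table.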