Scientific Report No. ISR-11 Information Storage and Retrieval
A Modified Two-Level Search Algorithm Using Request Clustering
V. R. Lesser
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
Table 2: Average Correlations of First 25 Queries with Classification Vectors

                                                           Average Correlation with Query
                                                           Highest Correlating     Second Highest Correlating
Case  Set of Categories                                    Classification Vector   Classification Vector
1.    8 Categories, Based on Clustering Documents          [illegible]             .35
2.    8 Categories, Based on Clustering Random Queries     .71                     .36
3.    8 Categories, Based on Clustering 25 Queries         .59                     [illegible]
4.    10 Categories, Based on Clustering Documents         .45                     .34
5.    10 Categories, Based on Clustering Random Queries    .76                     .43
6.    10 Categories, Based on Clustering 25 Queries        .63                     .43
Table 3: Revised Search Scheme Evaluation for Test Collection of First 25 Queries
Based on Requests which Retrieved at Least 6[illegible] Documents

Case  Search Scheme       Set of Categories                                    MR      [illegible]T   RT
1.    Normal Two-Level    8 Categories, Based on Clustering Documents          23.19   .71            .66
2.    Modified Two-Level  8 Categories, Based on Clustering Random Queries     18.77   .74            .60
3.    Modified Two-Level  8 Categories, Based on Clustering 25 Queries         20.02   .73            .61
4.    Normal Two-Level    10 Categories, Based on Clustering Documents         24.68   .68            .63
5.    Modified Two-Level  10 Categories, Based on Clustering Random Queries    20.58   .76            .65
6.    Modified Two-Level  10 Categories, Based on Clustering 25 Queries        21.56   .74            .63