Scientific Report No. ISR-11 Information Storage and Retrieval
A Modified Two-Level Search Algorithm Using Request Clustering
V. R. Lesser
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
Design of an Experiment to Compare the Modified with the
Normal Two-Level Search Scheme
A) Problem Areas
In structuring this experiment, the following questions must be
answered:
1) What criteria can be used to judge the relative merits of
each procedure?
2) How should these alternative search procedures be implemented,
and what parameters must be adjusted in using these procedures?
3) What type of document and query collection will serve as an
adequate data base in order to obtain valid conclusions?
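
As a concrete reference point for question 2, the following is a minimal sketch of a normal two-level search, not the implementation used in the experiment. It assumes cosine correlation between term vectors and an arithmetic-mean centroid for each document subset; the names two_level_search, n_clusters, and min_corr are hypothetical. The two parameters correspond to the two practical ways of choosing how many subsets to search completely: a fixed number for all queries, or a cutoff on the correlation between the query and each centroid. The function also returns the number of documents scanned, which is the quantity the effectiveness criterion is based on.

```python
import math


def cosine(u, v):
    """Cosine correlation between two equal-length term vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Arithmetic-mean centroid of a list of term vectors (assumed definition)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def two_level_search(query, clusters, n_clusters=None, min_corr=None):
    """Search the clusters whose centroids correlate best with the query.

    clusters maps a cluster id to a dict of {document id: term vector}.
    Exactly one of n_clusters (fixed number of subsets) or min_corr
    (centroid correlation cutoff) must be given.
    Returns (ranked list of (document id, correlation), documents scanned).
    """
    if (n_clusters is None) == (min_corr is None):
        raise ValueError("give exactly one of n_clusters or min_corr")

    # Level 1: correlate the query with every cluster centroid.
    scored = sorted(
        ((cosine(query, centroid(list(docs.values()))), cid)
         for cid, docs in clusters.items()),
        reverse=True,
    )
    if n_clusters is not None:
        chosen = [cid for _, cid in scored[:n_clusters]]
    else:
        chosen = [cid for corr, cid in scored if corr >= min_corr]

    # Level 2: scan every document in the chosen clusters.
    results = []
    for cid in chosen:
        for doc_id, vector in clusters[cid].items():
            results.append((doc_id, cosine(query, vector)))
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results, len(results)


# Toy data, invented purely for illustration.
clusters = {
    "c1": {"d1": [1, 0, 0], "d2": [1, 1, 0]},
    "c2": {"d3": [0, 0, 1], "d4": [0, 1, 1], "d5": [0, 0, 2]},
}
query = [1, 0, 0]

ranked, scanned = two_level_search(query, clusters, n_clusters=1)
print([doc_id for doc_id, _ in ranked], scanned)
```

Because both selection rules are tied to a particular clustering, a value of n_clusters or min_corr chosen for one set of clusters has no direct counterpart for another set with a different number or size of clusters.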
B) Tests to Compare the Effectiveness of Each Search Procedure
The main criterion for effectiveness is based on the number of
documents which must be scanned in each procedure in order to obtain most
of the relevant documents for each query. In a practical implementation
of a normal two-level search scheme, the number of subsets completely
searched will be either a fixed number for all queries, or will depend on
the correlations with the query of the centroid vectors of the document
subsets. Neither of these procedures for determining the number of
categories to be searched completely can be used to compare the
effectiveness of the modified two-level search scheme with the normal two-
level search, since neither the number of clusters nor the size of each
cluster is the same for both search schemes. These differences in the
number of clusters and the size of each cluster make it impossible to
use the same parameters for determining the number of subsets of documents
to be completely searched for each of the search procedures. Further,