Scientific Report No. ISR-11
Information Storage and Retrieval
A Modified Two-Level Search Algorithm Using Request Clustering
V. R. Lesser
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

3) the set of non-associated documents is partitioned using the standard clustering algorithm, and all loose documents are associated with the nearest partition; this guarantees that every document is included in at least one category; the clusters of documents should be constructed in a manner similar to the clusters of documents used for the two-level search scheme.

In the experimental program, emphasis has been placed on the various parameters which need to be adjusted. To compare the alternative search procedures validly, it is necessary either to choose the set of parameters associated with each search scheme so as to maximize the effectiveness of that search scheme for the test data base, or to define rules by which the value of each parameter can be calculated for any data base.

D) Test Data Base

The following requirements must be met for the document and query collection to be used to evaluate the effectiveness of the modified versus the normal two-level search:

1) the collection of queries should be real user requests obtained from an actual document retrieval system;

2) the collection of queries should be large enough so that information-dense subsets can exist among the queries;

3) relevance judgments should exist for at least a part of the query collection (this provides a control sample of queries which allows the testing of the modified versus normal two-level search scheme for retrieval of relevant documents);

4) the document collection should contain dense areas of information; otherwise, the normal two-level search scheme cannot be efficiently implemented.
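
As an illustration of step 3 of the clustering procedure above (associating every loose document with the nearest partition), the following is a minimal sketch and not the report's implementation. The text does not define "nearest", so the sketch assumes it means the partition whose centroid has the highest cosine correlation with the document vector. The names cosine, centroid and assign_loose_documents, and the representation of documents as equal-length lists of term weights, are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine correlation between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def assign_loose_documents(clusters, loose_docs, doc_vectors):
    """Attach each loose document to the cluster with the closest centroid.

    clusters    : dict mapping cluster id -> list of document ids
    loose_docs  : list of document ids not yet in any cluster
    doc_vectors : dict mapping document id -> term-weight vector

    Assumption: centroids are computed once from the original members
    and are not updated as loose documents are attached.
    """
    centroids = {
        cid: centroid([doc_vectors[d] for d in members])
        for cid, members in clusters.items()
    }
    result = {cid: list(members) for cid, members in clusters.items()}
    for doc in loose_docs:
        nearest = max(
            centroids,
            key=lambda cid: cosine(doc_vectors[doc], centroids[cid]),
        )
        result[nearest].append(doc)
    return result


# Hypothetical three-term vectors, for illustration only.
doc_vectors = {
    "d1": [1.0, 0.0, 0.0],
    "d2": [0.9, 0.1, 0.0],
    "d3": [0.0, 1.0, 0.0],
    "d4": [0.0, 0.8, 0.2],
    "d5": [0.7, 0.2, 0.0],  # loose document
    "d6": [0.0, 0.1, 1.0],  # loose document
}
clusters = {"A": ["d1", "d2"], "B": ["d3", "d4"]}
final_clusters = assign_loose_documents(clusters, ["d5", "d6"], doc_vectors)
```

With these sample vectors, d5 is attached to cluster A and d6 to cluster B, so every document ends up in at least one category, which is the guarantee that step 3 is meant to provide.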