Scientific Report No. ISR-11
Information Storage and Retrieval

A Modified Two-Level Search Algorithm Using Request Clustering

V. R. Lesser
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

It is felt that the partitioning of the document collection by grouping documents containing similar information identifiers does not always maximize the efficiency of the multi-level search. This technique of partitioning is effective when the set of queries introduced into the system can be divided into groups of queries which roughly correspond in information content to the subsets of documents previously created by the clustering algorithm. If this is not the case, the set of relevant documents for a query will be spread over many document subsets, and the multi-level search will not prove effective.

In practice, it is believed that the distribution of the information content of the queries may often differ significantly from that of the document collection. Furthermore, if this contention is correct, a more efficient classification scheme can be constructed by considering the information content of queries previously introduced into the system.

In the next few paragraphs, new techniques are described for partitioning the document collection, and for carrying out the multi-level search, in accordance with the query set previously introduced into the system, as well as a possible modification of this technique of partitioning based on relevance judgments provided by the user.

2. A Modified Clustering Algorithm and a Corresponding Two-Level Search Strategy

It is desired to construct clusters of documents as a function of both the collection of documents and the collection of previous queries introduced into the system. The procedure for clustering is divided into three stages: