Scientific Report No. ISR-11, Information Storage and Retrieval
A Modified Two-Level Search Algorithm Using Request Clustering
V. R. Lesser
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

1) the collection of previous queries introduced into the system is partitioned into subsets of queries using a standard clustering algorithm [1,6];
2) an associated subset of documents is formed for each subset of queries constructed in step 1); the associated subset of documents consists of all documents which are highly correlated with at least one query contained in the subset of queries;
3) all documents which are not associated with any query cluster by step 2) are divided into subsets using a standard clustering algorithm.

The multi-level search previously described is then modified to take into account this new request clustering procedure. The modified two-level search algorithm proceeds as follows: the new query is correlated against the centroid vectors of the clusters of previous queries; if the new query correlates highly with at least one of the query centroid vectors, the query is matched against each document contained in the associated subsets of documents corresponding to each highly correlated query centroid vector; otherwise, the new query is matched against the centroid vectors of the subsets of non-associated documents constructed in step 3), and, for those subsets whose centroid vector correlates highly with the query, the query is matched against every document contained in the subset.

This new clustering algorithm can be further modified by incorporating user relevance judgments for each previous query introduced into the system.
In step 2), instead of associating all those documents which were identified by their high correlation, it is possible to associate only those documents considered relevant to the query by the user.
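
The construction in steps 1) to 3) and the modified two-level search can be illustrated with a minimal sketch. It is not the report's implementation, and several details are assumptions made only for illustration: queries and documents are represented as sparse term-weight dictionaries, "correlation" is taken to be the cosine measure, a simple single-pass procedure stands in for the "standard clustering algorithm" of [1,6], and a single cutoff `threshold` decides what counts as "highly correlated". All function names (`cosine`, `centroid`, `single_pass_cluster`, `build_index`, `search`) are hypothetical. The optional `relevant` argument sketches the relevance-judgment variant of step 2).

```python
import math


def cosine(u, v):
    """Cosine correlation between two sparse term vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def centroid(vectors):
    """Mean of a non-empty list of sparse vectors."""
    total = {}
    for vec in vectors:
        for t, w in vec.items():
            total[t] = total.get(t, 0.0) + w
    return {t: w / len(vectors) for t, w in total.items()}


def single_pass_cluster(ids, vectors, threshold):
    """Assumed stand-in for a standard clustering algorithm: each item joins
    the first cluster whose centroid it correlates with, else starts a new one."""
    clusters = []
    for i in ids:
        for members in clusters:
            if cosine(vectors[i], centroid([vectors[j] for j in members])) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def build_index(queries, docs, threshold, relevant=None):
    """Steps 1) to 3). If `relevant` (query id -> relevant doc ids) is given,
    step 2) associates only user-judged relevant documents."""
    # Step 1: partition the previous queries into clusters.
    query_clusters = single_pass_cluster(list(queries), queries, threshold)
    # Step 2: form the associated document subset of each query cluster.
    query_entries, associated = [], set()
    for members in query_clusters:
        doc_ids = set()
        for q in members:
            if relevant is not None:
                doc_ids |= set(relevant.get(q, ()))
            else:
                doc_ids |= {d for d, vec in docs.items()
                            if cosine(queries[q], vec) >= threshold}
        query_entries.append((centroid([queries[q] for q in members]), doc_ids))
        associated |= doc_ids
    # Step 3: cluster the documents not associated with any query cluster.
    rest = [d for d in docs if d not in associated]
    doc_entries = [(centroid([docs[d] for d in members]), set(members))
                   for members in single_pass_cluster(rest, docs, threshold)]
    return query_entries, doc_entries


def search(query, docs, index, threshold):
    """Modified two-level search: try the query centroids first and fall back
    to the centroids of the non-associated document clusters."""
    query_entries, doc_entries = index
    hits = [ds for c, ds in query_entries if cosine(query, c) >= threshold]
    if not hits:
        hits = [ds for c, ds in doc_entries if cosine(query, c) >= threshold]
    candidates = set().union(*hits)
    # Second level: match the query against every candidate document.
    return sorted(candidates, key=lambda d: cosine(query, docs[d]), reverse=True)
```

In this sketch, a new query that resembles earlier queries is compared only with the documents associated with the matching query clusters, while a query unlike any earlier one falls through to the clusters of non-associated documents. Passing `relevant` narrows the associated subsets to user-judged documents, so more documents are left for step 3).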