ISR11 Scientific Report No. ISR-11 Information Storage and Retrieval A Modified Two-Level Search Algorithm Using Request Clustering chapter V. R. Lesser Harvard University Gerard Salton Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government. vi'-6 Consider as an example, a new query (0) introduced into the system, and assume that the new query lies between two document clusters. In the normal two level search scheme both document clusters will therefore have to be searched completely. However since the given query lies in a cluster of previous queries the centroid vector of only one query cluster will correlate highly with it. Therefore in the modified two-level search scheme, only the subset of documents which is associated with the one highly correlated query cluster needs to be searched completely. The question now arises as to why a query ([OCRerr] which lies between two query clusters could not occur with the same frequency as the query (0) used in the ex[OCRerr]rnple. Two basic assumptions underlie the modified two level search: 1) The set of previous query vectors to the system form dense subsets in the n-dimensional space so that the set of previous queries can be clustered into subsets based on the information content of each query; 2) A new query is on a statistical basis likely to be similar to a subset of [OCRerr]revious queries so that the new query can be assigned to a subset of previous queries created by the clustering algorithm. If these two assumptions are true, it is much less likely for a new query to lie between two query clusters than between two document clusters.