Scientific Report No. ISR-11
Information Storage and Retrieval
A Modified Two-Level Search Algorithm Using Request Clustering
V. R. Lesser
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

is set; therefore, a collection of 200 queries was constructed to simulate the first assumption. The idea motivating this technique was to produce a query vector that was similar to the initial query vector but possibly had different concepts and weights. It was felt that this perturbation of the initial query would simulate a set of different users phrasing the same type of query.

2) the second collection of queries used to simulate the first assumption consisted of the first set of 25 queries.

The data for the modified two-level search was constructed by considering the two collections of queries described above as collections of previous queries introduced into the system. The following procedures were carried out for both collections of queries:

1) the standard clustering algorithm was used to partition the set of previous queries into sets of 6 and 8 clusters;

2) the subset of associated documents for each query cluster was constructed by associating all those documents which correlated highly with the given query centroid vector; the size of the associated subset of documents depended on the number of queries contained in the given query cluster. Making the size dependent on the magnitude of the document correlations with the centroid vector was also tried, but for the document collection used the associated subsets of documents turned out to have the same size for either procedure. This procedure was then repeated for the 6 and 8 clusters of queries.

3) the non-associated documents resulting from step 2) were clustered into two categories; this was done so that the document collection was partitioned into sets of 8 and 10 clusters. Therefore, the number of categories for the modified and normal two-level search schemes was equal.
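
Steps 2) and 3) above can be illustrated with a minimal sketch. It is not the report's implementation and rests on several assumptions: the query clusters from step 1) are taken as given, since the standard clustering algorithm is not reproduced here; correlation is taken to be the cosine measure; the subset size is a hypothetical fixed multiple (docs_per_query) of the number of queries in the cluster; and the split of the non-associated documents into two categories uses a simple two-seed assignment as a stand-in for the unspecified clustering step. All function names and the toy vectors are illustrative only.

```python
import math

def cosine(u, v):
    """Cosine correlation between two concept vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def associate_documents(query_clusters, docs, docs_per_query=1):
    """Step 2: for each query cluster, keep the documents that correlate
    most highly with the cluster centroid. The subset size grows with the
    number of queries in the cluster (docs_per_query is hypothetical)."""
    subsets = []
    for cluster in query_clusters:
        c = centroid(cluster)
        size = min(len(docs), docs_per_query * len(cluster))
        ranked = sorted(range(len(docs)),
                        key=lambda i: cosine(docs[i], c), reverse=True)
        subsets.append(ranked[:size])
    return subsets

def non_associated(subsets, n_docs):
    """Indices of documents that fall in no associated subset."""
    used = {i for subset in subsets for i in subset}
    return [i for i in range(n_docs) if i not in used]

def split_in_two(indices, docs):
    """Step 3 stand-in: split the leftover documents into two categories
    by assigning each one to the nearer of two seed documents."""
    if len(indices) < 2:
        return list(indices), []
    seed_a = indices[0]
    # Second seed: the leftover document least similar to the first seed.
    seed_b = min(indices, key=lambda i: cosine(docs[i], docs[seed_a]))
    group_a, group_b = [], []
    for i in indices:
        if cosine(docs[i], docs[seed_a]) >= cosine(docs[i], docs[seed_b]):
            group_a.append(i)
        else:
            group_b.append(i)
    return group_a, group_b

# Toy data: six documents and two query clusters over three concepts.
docs = [[1, 0, 0], [0.9, 0.1, 0], [0, 1, 0],
        [0.1, 0.9, 0], [0, 0, 1], [0, 0.1, 1]]
query_clusters = [[[1, 0, 0], [0.8, 0.2, 0]], [[0, 1, 0]]]

subsets = associate_documents(query_clusters, docs)
leftover = non_associated(subsets, len(docs))
groups = split_in_two(leftover, docs)
total_categories = len(subsets) + 2
print(subsets, leftover, groups, total_categories)
```

In this sketch the total number of categories is the number of query clusters plus two, which mirrors how 6 and 8 query clusters lead to partitions of 8 and 10 clusters in the procedure above.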