Scientific Report No. ISR-11
Information Storage and Retrieval
A Modified Two-Level Search Algorithm Using Request Clustering
V. R. Lesser
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

these parameters are arbitrary, so that in order to compare alternative search procedures validly, the parameters would have to be adjusted to maximize the effectiveness of each search procedure. Therefore, a different algorithm, which is a function of neither the number of clusters nor the size of a cluster, is used to calculate the number of documents to be completely searched.

In order to generate the criterion for search effectiveness, the normal procedure for querying a document collection is altered. Instead of a user request consisting only of a query together with a cut-off value for the correlation coefficient (only documents which correlate with the query above the cut-off value are retrieved), an additional parameter is included, which specifies the number of documents to be retrieved. In this modified querying system, each search procedure is altered so that it terminates as soon as the specified number of documents has been retrieved. This modification permits comparison of the minimum number of documents each search procedure must scan in order to satisfy the modified user request.

There must also be available some measure of the relevance of the documents retrieved by the alternative search procedures in relation to the documents retrieved by a full search of the document collection. Rocchio [6], in comparing the effectiveness of a two-level search algorithm based on his clustering algorithm against the effectiveness of a full search of the document collection, uses the following criteria: 1) the "consistency of retrieval with respect to all documents," i.e.
the extent to which the reduced search leads to the retrieval of the same documents as the full search;