Scientific Report No. ISR-11
Information Storage and Retrieval
A Modified Two-Level Search Algorithm Using Request Clustering
V. R. Lesser
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

2) the [illegible] of retrieval with respect to relevant documents, i.e. the extent to which the retrieval of the relevant documents is altered by the reduced search.

The above criteria are based on the amount of information lost when the documents are retrieved by a partial search of the document collection instead of by a full search. It is believed that, taken together, the two criteria for effectiveness provide adequate data for an appraisal of the modified two-level search scheme compared with the normal two-level search scheme. In the modified querying system proposed for testing, Rocchio's two criteria take the following form:

1) the overlap percentage between the set of documents retrieved by the partial search and the first n* documents retrieved by the full search;

2) the normal recall, i.e. the number of relevant documents retrieved by the partial search, expressed as a percentage of the number of relevant documents contained in the first n* documents retrieved by the full search.

* n = the number of documents to be retrieved, as originally specified by the user for the partial search of the document collection.

C) Implementation of the Normal and Modified Two-Level Search Schemes

Each search procedure relies heavily on the particular clustering algorithm used and on the parameters that the clustering algorithm uses to determine how the document collection is to be partitioned. It was decided, based on a search of the literature, that Rocchio's clustering algorithm [6] would be the most suitable. The parameters that are used