ISR11
Scientific Report No. ISR-11 Information Storage and Retrieval
The SMART System -- Retrieval Results and Future Plans
G. Salton
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
manual keyword search process shows that many automatic procedures are
fully as effective in retrieving useful materials and in rejecting useless
ones as are the better known manual procedures. [1]
Since an information system, whether manual or automatic, may be
expected to service a large variety of customers, each of whom may have
different needs and different background, it is unreasonable to suggest
that a single search of some part of the collection would prove equally
useful for all customers at all times. Accordingly, more emphasis has
been placed in the recent past on search experiments using storage organi-
zations and search strategies which make it possible for the user to
influence the search results by submitting to the system appropriate
feedback information. A given search is then undertaken iteratively by
processing the same search request several times, while altering the search
conditions for each iteration. Such iterative retrieval techniques are
particularly well adapted to automatic time-sharing equipment where customers
can communicate directly with the system by means of suitable input-output
equipment. [2,3]
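
The iterative procedure described above can be illustrated with a minimal
sketch. It is not the SMART implementation: the sparse term-weight
dictionaries, the cosine matching function, the function names, and the
Rocchio-style query update with weights beta and gamma are all assumptions
chosen for illustration, and the toy documents are invented. The sketch
runs the same request twice, altering the query between iterations
according to the user's relevance judgments.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query, docs, k=2):
    """Return the ids of the k documents most similar to the query."""
    ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
    return ranked[:k]

def revise(query, docs, relevant, nonrelevant, beta=0.75, gamma=0.25):
    """Shift the query toward documents judged relevant and away from
    those judged nonrelevant (a Rocchio-style update, assumed here)."""
    new = dict(query)
    for ids, sign, weight in ((relevant, 1, beta), (nonrelevant, -1, gamma)):
        for d in ids:
            for t, w in docs[d].items():
                new[t] = new.get(t, 0.0) + sign * weight * w / len(ids)
    # Terms whose weight is not positive are dropped from the revised query.
    return {t: w for t, w in new.items() if w > 0}

# Hypothetical toy collection and request.
docs = {
    "d1": {"retrieval": 1.0, "feedback": 1.0},
    "d2": {"retrieval": 1.0, "hardware": 3.0},
    "d3": {"feedback": 1.0, "user": 1.0},
}
query = {"retrieval": 1.0}

first = search(query, docs)                       # first iteration
# Suppose the user marks d1 as useful and d2 as useless.
query2 = revise(query, docs, relevant=["d1"], nonrelevant=["d2"])
second = search(query2, docs)                     # second iteration
```

In this toy case the feedback moves the query toward the vocabulary of the
document judged useful, so the second iteration ranks a different document
in second place than the first one did.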
Many different user feedback strategies have been considered
experimentally [4], as well as a variety of search strategies. Some search
strategies, based on the construction of groups of related documents and
groups of related search requests, seem particularly promising, since they
make it possible to obtain effective retrieval performance by comparing a
given search request against only a small number of selected documents,
instead of performing a full search of the collection. [5,6]
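
A minimal sketch of a search restricted to document groups follows. It
assumes the groups are already given, represents each group by the average
(centroid) of its document vectors, and uses cosine matching; these
choices, the function names, and the toy data are illustrative assumptions
and not details taken from the report. The request is compared first
against the group centroids, and only the documents of the best-matching
group are then ranked.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    """Average a non-empty list of sparse term-weight dicts."""
    total = {}
    for v in vectors:
        for t, w in v.items():
            total[t] = total.get(t, 0.0) + w
    return {t: w / len(vectors) for t, w in total.items()}

def cluster_search(query, docs, clusters, k=2):
    """Pick the group whose centroid best matches the query, then rank
    only the documents in that group."""
    centroids = {c: centroid([docs[d] for d in ids])
                 for c, ids in clusters.items()}
    best = max(centroids, key=lambda c: cosine(query, centroids[c]))
    ranked = sorted(clusters[best],
                    key=lambda d: cosine(query, docs[d]), reverse=True)
    return best, ranked[:k]

# Hypothetical toy collection, already divided into two groups.
docs = {
    "d1": {"retrieval": 1.0, "feedback": 1.0},
    "d2": {"retrieval": 1.0, "cluster": 1.0},
    "d3": {"hardware": 1.0, "memory": 1.0},
    "d4": {"hardware": 1.0, "disk": 1.0},
}
clusters = {"A": ["d1", "d2"], "B": ["d3", "d4"]}

best, hits = cluster_search({"retrieval": 1.0, "cluster": 1.0},
                            docs, clusters)
```

Here the request is matched against two centroids and two documents rather
than against all four documents; the saving grows with the size of the
collection relative to the number of groups.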
The procedure making use of document groups, or clusters, is based on
the identification of certain document subsets similar in some sense to