ISR10
Scientific Report No. ISR-10 Information Storage and Retrieval
Appendix A: The Smart System
appendix
Joseph John Rocchio
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
A-3
Document index images generated by this process may be
subjected to a number of additional modifications. A variety of
transformations based on a pre-specified hierarchical structuring of
the elements of the index language is provided. Alternatively,
relations among index terms derived from statistical associations in
a given collection may be used for modifying index images. The
system, therefore, may provide a variety of representations for input
documents based on the initial dictionary lookup and subsequent
transformation rules defined on the index language. (Note that the
index images used experimentally for this thesis were generated by a
lookup using version 2 of the SMART thesaurus with no phrase
detection and no additional semantic transformations.)
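
As a minimal sketch of the idea described above, the following Python
fragment builds an index image by dictionary lookup and then applies one
possible hierarchical transformation. The thesaurus entries, concept
identifiers, hierarchy, and the parent weight of 0.5 are all hypothetical
values chosen for illustration; they are not taken from the SMART
thesaurus or from the system's actual transformation rules.

```python
from collections import Counter

# Hypothetical toy thesaurus: word -> concept identifier (NOT the SMART thesaurus).
THESAURUS = {
    "retrieval": "C1", "search": "C1",
    "document": "C2", "text": "C2",
    "index": "C3",
}

# Hypothetical hierarchy over the index language: concept -> parent concept.
PARENT = {"C1": "C10", "C2": "C20", "C3": "C20"}


def lookup(text):
    """Build an index image (concept -> weight) by dictionary lookup."""
    words = text.lower().split()
    return Counter(THESAURUS[w] for w in words if w in THESAURUS)


def expand_by_hierarchy(image, parent_weight=0.5):
    """Add each concept's parent at a reduced weight (an assumed rule)."""
    expanded = dict(image)
    for concept, weight in image.items():
        parent = PARENT.get(concept)
        if parent is not None:
            expanded[parent] = expanded.get(parent, 0) + parent_weight * weight
    return expanded


image = lookup("document retrieval and text search")
expanded = expand_by_hierarchy(image)
```

A transformation based on statistical term associations could be
sketched in the same way, with the fixed hierarchy replaced by a table
of associated concepts derived from the collection.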
B. Search Request Formulation
Search requests in the SMART system are introduced directly
in the natural language and may be treated exactly as are document
texts. Requests, therefore, may be subjected to all or any subset of
the content analysis procedures available for document processing.
In addition to varying the index image of a search request by the
sequence of analysis procedures to which it is subjected, a number of
additional query modification procedures [4] (including the relevance
feedback technique discussed in chapter [OCRerr]) are being considered for
inclusion into the system.
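
The point that a request may pass through all or any subset of the
document analysis procedures can be illustrated with a short Python
sketch. The three procedures below (case folding, common-word deletion,
and a crude suffix removal) and the common-word list are hypothetical
stand-ins, not the analysis procedures actually provided by the system.

```python
def lowercase(tokens):
    """Fold all tokens to lower case."""
    return [t.lower() for t in tokens]


def drop_common_words(tokens, common=frozenset({"the", "of", "in", "for", "a"})):
    """Delete tokens found in a (hypothetical) common-word list."""
    return [t for t in tokens if t not in common]


def strip_plural_s(tokens):
    """Crude suffix removal, for illustration only."""
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]


def analyze(text, procedures):
    """Apply the chosen sequence of procedures to a document or a request."""
    tokens = text.split()
    for procedure in procedures:
        tokens = procedure(tokens)
    return sorted(set(tokens))


full = [lowercase, drop_common_words, strip_plural_s]

# The same function serves documents and requests alike.
document_image = analyze("Storage of documents in the retrieval system", full)
request_image = analyze("Retrieval systems for documents", full)

# A request may also be given only a subset of the procedures.
request_partial = analyze("Retrieval systems for documents", [lowercase])
```

Under these assumptions, changing the list passed to `analyze` changes
the index image obtained for the same request text.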
C. Query-Document Matching
The flexibility provided by the computer allows the SMART