ISR11
Scientific Report No. ISR-11 Information Storage and Retrieval
Summary
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
The concordance program described in section III is presently being
distributed through the SHARE organization.
In section IV by M. Lesk and G. Salton the complete information
dissemination process is examined with emphasis on the use and construction
by automatic or semi-automatic techniques of synonym dictionaries and
hierarchical subject arrangements. One specific proposal for the fully
automatic construction of subject hierarchies is presented in section
VIII by G. Bloggren, A. Goodman, and L. Kelly. It is shown in particular
how the structure of the hierarchical arrangement changes as various
parameters are changed.
Section V of this report by M. Lesk and G. Salton contains in summary
form the systems evaluation output produced by the SMART system, based on
extensive operations with four document collections in three subject fields
(documentation, computer science, and aerodynamics). One document collection
used in the experiments consists of document abstracts manually indexed by
trained indexers, thus permitting a comparison between the effectiveness
of the standard keyword matching techniques and the automatic analysis
procedures incorporated into the SMART system. Another collection was
available in the form of abstracts as well as longer summaries, thus
permitting an evaluation of the effects of document length.
Three sections (VI, VII, and IX) are devoted to a study of iterative
search techniques and user feedback techniques. Section
VI by W. Riddle, T. Horwitz, and R. Dietz examines the effectiveness of a
variety of relevance feedback procedures in which the users supply to the
system relevance judgments about documents previously retrieved. These