ISR10
Scientific Report No. ISR-10 Information Storage and Retrieval
Synopsis
Joseph John Rocchio
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
The indexing component of the model is discussed in chapter 2,
which is primarily descriptive in nature. The work of a number of
researchers in this area is cited in describing the development and
current trends of automatic content analysis. Particular emphasis is
placed on concept vector indexing techniques which incorporate
thesaurus type semantic associations. The SMART system generates
document and query index images by such techniques and these were used
in obtaining the experimental results presented in chapters 3 and 4.
The original contributions in chapter 2 are related to the techniques
proposed for index image optimization.
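As a rough illustration of concept vector indexing with thesaurus type associations, the minimal sketch below maps words to concept classes, counts them to form an index image, and compares a document with a query. The thesaurus entries, the function names, and the choice of a cosine correlation as the matching function are assumptions made for illustration and are not taken from the SMART system.

```python
import math
from collections import Counter

# Hypothetical thesaurus: each word maps to a concept class number.
THESAURUS = {
    "retrieval": 1, "search": 1,
    "document": 2, "text": 2,
    "computer": 3, "machine": 3,
}

def index_image(text):
    """Return a concept vector (concept number -> count) for the text."""
    words = text.lower().split()
    return Counter(THESAURUS[w] for w in words if w in THESAURUS)

def cosine(u, v):
    """Cosine correlation between two sparse concept vectors."""
    dot = sum(u[c] * v[c] for c in u if c in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

doc = index_image("machine search of text")
query = index_image("computer retrieval")
score = cosine(doc, query)
```

Because "machine" and "computer" fall in the same hypothetical concept class, the document and query match on that concept even though they share no words.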
A search request optimization algorithm analytically derived
from an assumed optimality criterion is presented in chapter 3. The
optimization algorithm and the notion of request optimization by an
iterative sequence of retrieval operations are original with the author.
In addition, the notion of testing index language devices by the use of
optimal search requests is original. Experimental results illustrating
the optimization process are presented. These were derived by a
simulation which was coded and run on the IBM 7094 in conjunction with
the SMART retrieval system. A search request formulation based on this
optimization technique offers the promise of improving the performance
of document retrieval systems, given the current state of development
of computer time sharing and man-machine communication technology.
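The synopsis does not state the optimization algorithm itself. The sketch below is therefore only an assumed illustration of one iteration of request optimization: the request vector is moved toward the mean of the documents judged relevant and away from the mean of those judged non-relevant. The function name, the unit weighting of the three terms, and the dropping of non-positive weights are assumptions, not details confirmed by this text.

```python
def optimize_request(query, relevant, nonrelevant):
    """One assumed feedback step on sparse concept vectors (dicts)."""
    concepts = set(query)
    for d in relevant + nonrelevant:
        concepts.update(d)
    new_query = {}
    for c in concepts:
        # Mean weight of concept c over each judged set (0.0 if the set is empty).
        pos = sum(d.get(c, 0.0) for d in relevant) / len(relevant) if relevant else 0.0
        neg = sum(d.get(c, 0.0) for d in nonrelevant) / len(nonrelevant) if nonrelevant else 0.0
        w = query.get(c, 0.0) + pos - neg
        if w > 0:  # assumption: non-positive weights are dropped
            new_query[c] = w
    return new_query

query = {1: 1.0}
relevant = [{1: 1.0, 2: 2.0}, {2: 2.0}]
nonrelevant = [{3: 4.0}]
updated = optimize_request(query, relevant, nonrelevant)
```

In an iterative setting, the updated request would be submitted for another retrieval pass, the new output judged by the user, and the step repeated.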
Chapter 4 presents an original automatic document
classification algorithm, heuristically motivated by considerations of
search efficiency and by the functional nature of query-document
matching operations. This algorithm was coded in Fortran and run on