Scientific Report No. ISR-13 Information Storage and Retrieval
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
Summary
The present report is the thirteenth in a series covering research
in automatic storage and retrieval conducted by the Department of Computer
Science at Cornell University with the assistance of the Division of
Engineering and Applied Physics at Harvard University. The present report
contains a detailed analysis of the retrieval evaluation results obtained
with the automatic SMART document retrieval system over the last few years
for document collections in the fields of aerodynamics, computer science,
and documentation. The various components of fully automatic document
retrieval systems are covered in detail, including the form of input
(section V), automatic content analysis methods (sections VI, VII, VIII,
and IX), and the matching procedures used to compare documents and search
requests (sections III and IV). The complete test environment and the
parameters which enter into the evaluation process are also described
(sections I and II).
Unlike its predecessor (report ISR-12), the present report does
not cover the iterative search procedures based on user feedback, or the
partial cluster searches designed to speed up the search process, but
confines itself to the treatment of the standard fully-automatic analysis
and search procedures incorporated into the SMART system, and supplements
the summary titled "Computer Evaluation of Indexing and Text Processing",
previously published as section III of report ISR-12. Preliminary retrieval
results for relevance feedback and cluster searching are contained in
section V of report ISR-12; a more definitive treatment of user-controlled