work is needed. An analysis of every instance of failure for every request
in each experiment would be an impossibly large task; a judicious selection
must therefore be made. Most of the sections in this report set out first
to present the average results for a series of experiments, and then to
make a search-by-search performance analysis to uncover details and
explanations for the search results obtained.
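The selection procedure just described may be illustrated by a short
sketch, which is not part of the original report; the request identifiers
and precision figures below are invented for illustration. The average
result for a run is reported first, and the requests falling furthest
below it are then singled out for detailed failure analysis:

    from statistics import mean

    # Hypothetical per-request precision figures for one experimental
    # run (illustrative values only; not data from the report).
    scores = {"Q1": 0.82, "Q2": 0.45, "Q3": 0.90, "Q4": 0.12, "Q5": 0.67}

    # First, the average result for the run as a whole.
    average = mean(scores.values())
    print(f"average precision over {len(scores)} requests: {average:.2f}")

    # Then, a judicious selection: the worst-performing requests are
    # singled out for detailed failure analysis.
    worst = sorted(scores.items(), key=lambda item: item[1])[:2]
    for request, score in worst:
        print(f"examine {request} (precision {score:.2f})")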
Since real user populations and currently growing collections are
not available, it is correct to describe the experimental procedures used
as "simulated search methods," as does R. V. Katter in [5]. Katter criticizes
such experimental techniques on several grounds; in particular, he says that
mechanical type matching is unnecessary and cumbersome. Since the work
reported by Katter does not tackle any problem other than human judgment
reliability, his comments do not seem to apply to experiments that deal
with a total system and are designed to evaluate performance from a user
viewpoint. The search procedures used by SMART are not cumbersome, and
simulated searches are believed to be necessary in order to provide useful
relationships to reality.
B) Variables Tested
At the input stage, the use of natural language by SMART implies
that there are few input variables to be tested, since full text processing
of documents has not been attempted in many different subject areas. Different
lengths of documents are therefore used, such as titles only, or abstracts.
Some tests using variables of this type are covered in Section V.
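As an illustration of the document-length variable, the following sketch
correlates a request with a title-only and an abstract-length version of
the same document. It is not part of the original report and does not
reproduce SMART's actual matching routines; the texts and the simple
cosine correlation are assumptions made for the example:

    from collections import Counter
    import math

    def term_vector(text):
        # Simple term-frequency vector over whitespace-delimited words.
        return Counter(text.lower().split())

    def cosine(a, b):
        # Cosine correlation between two term-frequency vectors.
        dot = sum(a[t] * b[t] for t in set(a) & set(b))
        norm = math.sqrt(sum(v * v for v in a.values())) * \
               math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    query = term_vector("automatic evaluation of document retrieval")
    title_only = term_vector("Evaluation of Retrieval Systems")
    abstract = term_vector(
        "This paper describes experiments in the automatic evaluation "
        "of document retrieval systems using simulated search methods.")

    print("title-only correlation:", round(cosine(query, title_only), 3))
    print("abstract correlation:  ", round(cosine(query, abstract), 3))

A longer input text generally offers more terms on which a match can
occur, which is why document length is treated as an input variable.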
Content analysis procedures in SMART are performed by using a series
of dictionaries which differ in construction and effectiveness. The