ISR11
Scientific Report No. ISR-11 Information Storage and Retrieval
The SMART System -- Retrieval Results and Future Plans
G. Salton
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
respect to search requests were made variously by project members, by
university students drafted for the purpose, and in the case of the Cranfield
aeronautics collection by scientists and experts in the field. The same is
true for the original request formulations. The output results obtained
under all these different conditions were, however, substantially similar
between different methods. Unquestionably, some of the relevance judgments
used were incorrect, but if they were incorrect for one method, they were
similarly faulty for the others, and the bias, if any, seemed to operate in
the same direction in each case. Furthermore, the Cranfield relevance
judgments, made by scientists under carefully controlled conditions, are
subject to exactly the same challenges as those made by students and staff.
The hand-indexing available for the Cranfield aeronautics collection
was made by two or three trained indexers with some help from subject
experts. Since the collection size was small, an unusual degree of
consistency would seem to have been maintained. Furthermore, the degree of
indexing was unusually deep, consisting of an average of over thirty terms
for each document. If that indexing is not typical, then surely it is
because normal keyword indexing cannot proceed under the same controlled
conditions for large collections, and the search results for larger
collections may be expected to exhibit an even clearer advantage for the
automatic procedures.
Still, when all is said and done, it is clear that some of the aforementioned
objections can only be stilled by operating with larger than
token collections, and hopefully by tying the experimental system into a
real user environment. The following SMART experiments are therefore