Scientific Report No. IRS-13 Information Storage and Retrieval
An Analysis of the Documentation Requests
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
synonyms made on an individual request basis should work better than the
obligatory use of the pre-constructed set of synonyms contained in the
thesaurus. To keep the comparison fair, the hand searches use no feedback,
since the search keywords were chosen before any reference was made to the
KWIC concordance. The result of SMART using the hand-modified "important"
concept with increased weights is included in Figure 21(b); the curve now
lies much closer to the hand result. Naturally a hand system permitting
coordinate keyword searches would extend the hand curve to higher precision
values, and choices of more than five keywords per request would enable higher
recall ratios to be reached.
This result does not condemn the automatic indexing procedures,
because hand searches are less easy to conduct in the sort of situation
in which SMART would operate, such as a large file of individually long
document surrogates. It is clear also that in an operational use of SMART,
search strategies employing several dictionaries in a variety of possible
ways could be used, and for users willing to employ some intellect to
interact strongly with the system, quite large performance gains may be
expected. Ways in which a system might operate are: the use of several
dictionaries successively, until the required performance is reached; the use
of several dictionaries with "merged" output results [6]; the use of a manual
or automatic method, yet to be developed, for accurately choosing the best
dictionary before the search; the use of dictionary display methods to allow
users willing to interact strongly to delete or add synonyms; and the use of
relevance feedback methods to iterate searches and improve performance.
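
The first two modes of operation listed above can be illustrated with a minimal sketch. It is not the SMART implementation: the names `search_successively`, `merge_rankings`, `search`, and `good_enough` are hypothetical, a search is assumed to return (document, correlation) pairs ranked best first, and the merging rule shown (keep each document's highest correlation) is only one plausible reading of "merged" output, not necessarily the method of [6].

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Assumed output format: (document id, correlation score), best first.
Ranking = List[Tuple[str, float]]


def search_successively(
    request: str,
    dictionaries: Sequence[str],
    search: Callable[[str, str], Ranking],
    good_enough: Callable[[Ranking], bool],
) -> Tuple[str, Ranking]:
    """Try each dictionary in turn; stop at the first acceptable result."""
    if not dictionaries:
        raise ValueError("at least one dictionary is required")
    name = dictionaries[0]
    ranking: Ranking = []
    for name in dictionaries:
        ranking = search(request, name)  # hypothetical search routine
        if good_enough(ranking):         # hypothetical performance check
            break
    # If no dictionary satisfies the check, the last one tried is returned.
    return name, ranking


def merge_rankings(rankings: Sequence[Ranking]) -> Ranking:
    """Merge several ranked outputs, keeping each document's best score."""
    best: Dict[str, float] = {}
    for ranking in rankings:
        for doc_id, score in ranking:
            if score > best.get(doc_id, float("-inf")):
                best[doc_id] = score
    # Highest score first; ties broken by document id for a stable order.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))
```

In the successive mode the performance check must be supplied by the user or by some stopping rule; in the merged mode every dictionary is searched and no such decision is needed, at the cost of more searching.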
Of these suggestions, the idea of making a pre-search dictionary
choice has been explored, but with no success so far. If, for example, long
requests work better with the stem dictionary, and short requests need the