IRS13
Scientific Report No. IRS-13 Information Storage and Retrieval
An Analysis of the Documentation Requests
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
a better performance than the general ones, although normalized precision
shows only a small difference. There is no correlation at all between request
generality and dictionary in these results.
Request length results are given in Figure 5, with the requests
again divided into two sets. The long requests perform better than the
short ones, but do so by a much greater amount with the stem dictionary than
the thesaurus dictionary, so that for the long requests normalized recall
shows stem to be slightly superior to the thesaurus. This correlation suggests
that the generally inferior stem dictionary may be adequate for long requests.
Requests may also be characterized by the frequency of use in the
collection of the request concepts. Two methods of obtaining averages for the
35 requests are given in Figure 6, each method supplying an arithmetic mean
and a median value. The average frequency per average request concept has
been found to be the more satisfactory of the two, and requests are again
divided into two sets by this principle in Figure 7. Requests having low
frequencies per average concept are seen to perform best, with no real
differences between stem and thesaurus dictionaries.
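The frequency-based division of requests can be illustrated with a minimal sketch. It assumes that the average for one request is the arithmetic mean of the collection frequencies of its concepts, and that the two sets are separated at the median of those per-request averages; the report does not state either detail at this point, so both are assumptions. The request identifiers, the frequency values and the function names are hypothetical.

```python
from statistics import mean, median


def average_concept_frequency(concept_freqs):
    """Mean collection frequency of the concepts in one request (assumed definition)."""
    return mean(concept_freqs)


def split_by_frequency(requests):
    """Divide requests into low- and high-frequency sets at the median (assumed cut)."""
    per_request = {
        rid: average_concept_frequency(freqs) for rid, freqs in requests.items()
    }
    cut = median(per_request.values())
    low = sorted(rid for rid, value in per_request.items() if value <= cut)
    high = sorted(rid for rid, value in per_request.items() if value > cut)
    return per_request, cut, low, high


# Hypothetical data: each request maps to the collection frequencies of its concepts.
requests = {
    "Q1": [4, 10, 16],
    "Q2": [50, 70],
    "Q3": [2, 4],
    "Q4": [30, 30, 30],
}

per_request, cut, low, high = split_by_frequency(requests)
print("average frequency per request:", per_request)
print("median cut:", cut)
print("low-frequency set:", low)
print("high-frequency set:", high)
```

Retrieval performance would then be averaged separately over the two sets for each dictionary, in the manner of the comparisons reported for Figure 7.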
It is to be expected that these three characteristics of generality,
length and concept frequency are strongly inter-connected, since specific requests
are probably often long ones, and also probably have low average concept
frequencies. A visual representation of the correspondence between the
three characteristics is given in Figure 8; in Figure 9 it is shown that
19 of the 35 requests fall exactly into the two expected combinations of three
characteristics each. These characteristics seem to be the only available
objective means of stating whether requests are broad or narrow in a subject
field sense; although perfect correspondence is not obtained, there is