IRS13
Scientific Report No. IRS-13 Information Storage and Retrieval
An Analysis of the Documentation Requests
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
given to this concept compared with stem, producing increases of 1
to 18 1/2, 2 to 11 1/2, 3 to 25 and 4 to 11 1/2.
These three reasons for superiority of the thesaurus process are
thought to be typical of other requests as well. There is also a strong
correlation of reasons a) and b) between the "A" and "B" requests, since, as
Figure 15 shows, the concepts "computer" and "system" appear a total of 16
times in the "B" requests, and only 6 times in the "A" requests, thus giving
the "B" requests greater opportunity to benefit from the superior handling of
these concepts in the thesaurus.
This treatment of the different sets of requests only scratches the
surface of the problem and points mainly to some of the factors known to
be involved.
D) The Recognition of Important Request Words
The presence of quite specific and important single request words
and the problem of giving them a weight in proportion to their importance
were noted in part 3B. In order to discover whether increases in the weight
of such important words do improve retrieval performance, the 35 requests
were examined (without knowledge of search results, relevant documents or
concept frequency) to see whether such important concepts could easily be
identified. Seventeen requests were found to possess such important concepts,
and each of the concepts was tripled in weight. These decisions are recorded
in Appendix B. This simulates a quite feasible requestor rule which would
ask for any important concepts to be underlined in the request statement,
and which the system could recognize and correspondingly increase in weight
by some factor. Six requests in addition to the seventeen were also slightly
modified: request A1 was divided into two as suggested by part 3C; in request