Scientific Report No. IRS-13
Information Storage and Retrieval
An Analysis of the Documentation Requests (chapter)
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

a better performance than the general ones, although normalized precision shows only a small difference. There is no correlation at all between request generality and dictionary in these results.

Request length results are given in Figure 5, with the requests again divided into two sets. The long requests perform better than the short ones, but do so by a much greater amount with the stem dictionary than with the thesaurus dictionary, so that, for the long requests, normalized recall shows stem to be slightly superior to the thesaurus. This correlation suggests that the generally inferior stem dictionary may be adequate for long requests.

Requests may also be characterized by the frequency with which the request concepts are used in the collection. Two methods of obtaining averages for the 35 requests are given in Figure 6, each method supplying an arithmetic mean and a median value. The average frequency per average request concept has been found to be the more satisfactory of the two, and requests are again divided into two sets on this basis in Figure 7. Requests having low frequencies per average concept are seen to perform best, with no real differences between stem and thesaurus dictionaries.

It is to be expected that these three characteristics of generality, length and concept frequency are strongly interconnected, since specific requests are probably often long ones, and also probably have low average concept frequencies. A visual representation of the correspondence between the three characteristics is given in Figure 8; in Figure 9 it is shown that 19 of the 35 requests fall exactly into the two expected combinations of three characteristics each.
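The counting of requests that fall into the two expected combinations can be illustrated with a minimal sketch. It is not the procedure used in the report: the median is assumed here as the dividing point for each characteristic, generality is assumed to be expressed as a single number per request (such as a count of relevant documents), and the function names and all data values are hypothetical.

```python
from statistics import median


def split_at_median(values):
    """Label each value 'low' if it is at or below the median, else 'high'.

    Assumption: the median is the dividing point; the report only states
    that the requests are divided into two sets.
    """
    cut = median(values)
    return ["low" if v <= cut else "high" for v in values]


def count_expected_combinations(generality, length, frequency):
    """Count requests in the two expected combinations of characteristics.

    Expected combinations:
      specific (low generality), long, low concept frequency
      general (high generality), short, high concept frequency
    Returns a pair (specific_long_low, general_short_high).
    """
    gen = split_at_median(generality)
    leng = split_at_median(length)
    freq = split_at_median(frequency)

    specific_long_low = 0
    general_short_high = 0
    for g, l, f in zip(gen, leng, freq):
        if (g, l, f) == ("low", "high", "low"):
            specific_long_low += 1
        elif (g, l, f) == ("high", "low", "high"):
            general_short_high += 1
    return specific_long_low, general_short_high


# Made-up values for six hypothetical requests (not data from the report).
generality = [2, 3, 4, 10, 12, 15]
length = [9, 8, 3, 4, 2, 7]
frequency = [5, 6, 20, 30, 25, 4]

print(count_expected_combinations(generality, length, frequency))
```

With these made-up values, four of the six hypothetical requests fall into one of the two expected combinations and two do not, which mirrors the kind of partial correspondence described for Figure 9.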
These characteristics seem to be the only available objective means of stating whether requests are broad or narrow in a subject field sense; although perfect correspondence is not obtained, there is