IRS13
Scientific Report No. IRS-13 Information Storage and Retrieval
Thesaurus, Phrase and Hierarchy Dictionaries
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
the examples given show that the thesaurus and the hierarchy are often
successful because of the precision device effect achieved by the
weighting scheme. That this phenomenon is largely responsible for the
improvements gained by the thesaurus in the IRE-3 and ADI collections is
confirmed by the data of Fig. 32, where the performance results from
Figs. 5 and 6 are presented with, and without, the weighting process to
show that the thesaurus offers greater improvement over stem when the
weighting scheme is in use. The Cran-1 collection does not in this
instance show this result, probably because of the effect on the cosine
correlation of the change in weighting; alternatively, it may be
explained as yet another instance of the difference in behavior between
Cran-1 and the other collections.
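The sensitivity of the cosine correlation to the weighting process can be illustrated with a minimal sketch. The concept names, the weights, and the helper functions `cosine` and `binarize` are hypothetical and chosen only for illustration; they are not taken from the test collections or from the system described in this report. The sketch compares one request against two documents that contain the same concepts with different weights, first with weights retained and then with weights reduced to presence or absence.

```python
import math


def cosine(q, d):
    """Cosine correlation between two sparse concept vectors (dicts)."""
    dot = sum(w * d.get(k, 0.0) for k, w in q.items())
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    if norm_q == 0.0 or norm_d == 0.0:
        return 0.0
    return dot / (norm_q * norm_d)


def binarize(v):
    """Drop the weights: every concept present gets weight 1 (assumed
    here as the 'without weighting' condition)."""
    return {k: 1.0 for k, w in v.items() if w > 0}


# Hypothetical request and documents; both documents hold the same concepts.
query = {"retrieval": 2.0, "thesaurus": 1.0}
doc_a = {"retrieval": 3.0, "thesaurus": 1.0, "index": 1.0}
doc_b = {"retrieval": 1.0, "thesaurus": 1.0, "index": 3.0}

weighted_a = cosine(query, doc_a)
weighted_b = cosine(query, doc_b)
binary_a = cosine(binarize(query), binarize(doc_a))
binary_b = cosine(binarize(query), binarize(doc_b))

print("weighted:", round(weighted_a, 3), round(weighted_b, 3))
print("binary:  ", round(binary_a, 3), round(binary_b, 3))
```

With weights retained, the document that emphasizes the request concepts ranks well above the other; with weights removed, the two documents receive the same correlation. In this toy case, then, the weighting acts as a precision device of the kind discussed above.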
8. Further Studies Required
Since the conclusions of this section have already been stated
in part 6, this final part enumerates some topic areas for further
investigation that may be directly or indirectly suggested by the
preceding analysis. Eleven studies are listed:
1. The effectiveness of all five of the dictionary
construction rules must be established by building
a series of versions of a given dictionary, so that
the relative importance of rules about word frequency
versus rules about synonymy can be determined. As
a start in this direction, a second version of the
ADI semi-automatic thesaurus is under test.