IRS13
Scientific Report No. IRS-13 Information Storage and Retrieval
Thesaurus, Phrase and Hierarchy Dictionaries
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
2. For a high precision need, only IRE-3 produces some advantage
to phrases (Figs. 21 and 25); however, this is based on
a small superiority for one or two requests only, and is not
considered significant (Figs. 26 and 27).
3. For a high recall need, use of the average rank of the last
relevant shows the phrases to be useful on IRE-3 only (Fig. 25)
by a small margin on an individual request basis (Fig. 26).
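
The high recall measure named in item 3, the average rank of the last relevant document, can be illustrated with a minimal sketch. The function names, document identifiers, and rankings below are hypothetical and chosen only for illustration; they are not taken from the report, and the sketch assumes that each request's ranked output contains all of its relevant documents. A lower value would indicate that a user needing every relevant item has fewer documents to examine.

```python
from statistics import mean


def rank_of_last_relevant(ranking, relevant):
    """Return the 1-based rank at which the last relevant document appears.

    ranking  -- list of document ids, best match first (hypothetical ids)
    relevant -- set of document ids judged relevant to the request
    """
    relevant = set(relevant)
    # Assumption of this sketch: every relevant document is ranked somewhere.
    if not relevant.issubset(ranking):
        raise ValueError("ranking does not contain every relevant document")
    return max(i for i, doc in enumerate(ranking, start=1) if doc in relevant)


def average_last_relevant_rank(runs):
    """Average the rank of the last relevant document over all requests.

    runs -- iterable of (ranking, relevant) pairs, one pair per request
    """
    return mean(rank_of_last_relevant(ranking, rel) for ranking, rel in runs)


# Two made-up requests: the last relevant document is at rank 4 and rank 1.
runs = [
    (["d3", "d1", "d7", "d2"], {"d1", "d2"}),
    (["d5", "d4", "d9"], {"d5"}),
]
print(average_last_relevant_rank(runs))  # 2.5
```

Because a single request with a very late last relevant document can dominate this average, the comparison on an individual request basis remains a useful check on it.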
Results comparing the thesaurus with the addition of various
hierarchy relations on the IRE-3 collection produce the following
conclusions:
1. Thesaurus alone is always superior to hierarchy on three of
the relations tested, and on two others ("parents" and "all"
relations), the hierarchy gives a small advantage over
portions of the precision recall curve (Figs. 22, 23). On
an individual request basis (Fig. 24), the thesaurus is
equal to "parents", and superior to "all" relations; the
hierarchy is thus not to be preferred.
2. For a high precision need, Fig. 25 suggests that some
advantage accrues, but Fig. 27 shows that its success is
limited to one or two requests that do badly with the
thesaurus alone.
3. For a high recall need, Fig. 25 shows that the hierarchy
performs well, but Fig. 26 reveals again that it achieves
only a few dramatically good results with a poorer average
high recall performance for individual requests than thesaurus.
7. Performance Analyses
The first task of the analysis is to explain the mechanism which
causes an improvement in retrieval performance using the thesaurus and