IRS13
Scientific Report No. IRS-13 Information Storage and Retrieval
Thesaurus, Phrase and Hierarchy Dictionaries
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
Evaluation of the semi-automatic luHastiell thesaurus-SAl on ADI
must await the testing of a further version of this thesaurus. However,
the tentative conclusions are that this method is not workable in practice,
owing to the difficulty of generating suitable property questions, and
the need to re-sort the resulting groups using frequency information.
Some further developments may provide solutions to these problems.
Examination of individual requests using the phrases shows that
no dramatic performance changes take place, and in general, the phrases
do not give a significant advantage even for the IRE-3 collection. Part
of the reason for this is the small number of phrases included in the
dictionaries. Also, use of phrases to replace the occurrences of the
individual component concepts would probably alter the request and document
vectors by a greater amount than the present procedure of simply
adding phrase concepts; performance changes (either better or worse)
would then be more clearly seen.
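
The difference between the two procedures can be illustrated with a minimal sketch. The phrase dictionary entry, the concept names, the unit weights, and the use of the cosine coefficient to measure how far a vector moves are all assumptions made for illustration; they are not taken from the dictionaries or programs tested here.

```python
from collections import Counter
from math import sqrt

# Hypothetical phrase dictionary: phrase concept -> its component concepts.
PHRASES = {"PH_INFO_RETRIEVAL": ("information", "retrieval")}


def add_phrases(vec, phrases=PHRASES, weight=1):
    """Keep the component concepts and add the phrase concept alongside them."""
    out = Counter(vec)
    for phrase, parts in phrases.items():
        if all(out[p] > 0 for p in parts):
            out[phrase] += weight
    return out


def replace_with_phrases(vec, phrases=PHRASES, weight=1):
    """Remove the component concepts and put the phrase concept in their place."""
    out = Counter(vec)
    for phrase, parts in phrases.items():
        if all(out[p] > 0 for p in parts):
            for p in parts:
                del out[p]
            out[phrase] += weight
    return out


def cosine(a, b):
    """Cosine coefficient between two sparse concept vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical request vector with unit weights.
request = Counter({"information": 1, "retrieval": 1, "evaluation": 1})

added = add_phrases(request)
replaced = replace_with_phrases(request)

# Similarity of each modified vector to the original request vector.
sim_added = cosine(request, added)
sim_replaced = cosine(request, replaced)
```

In this toy case the vector produced by replacement is much less similar to the original request than the vector produced by addition, which is the sense in which replacement would alter the vectors by a greater amount.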
Results using the hierarchy show it to be very effective for only
a few individual requests. An examination of all requests immediately
shows that the 17 staff-prepared requests behave differently from the 17
non-staff-prepared ones, and Fig. 30 shows that there is a tendency for
the hierarchy to be more effective on the non-staff requests than on the staff
ones. It was seen in Section I that the staff requests have a much better
performance than the non-staff requests; there is therefore less room for
improvement with hierarchy for these requests, and the extra hierarchy
identifiers only serve to increase the match with non-relevant documents.
The non-staff requests have exhibited poor performance with the thesaurus,
and thus leave room for improvement by additional dictionary grouping
(which is what the hierarchy does).
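
The effect of the extra hierarchy identifiers on matching can be sketched as follows. This is an illustration under stated assumptions only: the two-entry hierarchy, the concept names, the choice to add parent concepts only at a weight of 0.5, the simple inner-product match, and the labelling of the two documents as relevant and non-relevant are all hypothetical and do not reproduce the hierarchy or the expansion options used in these tests.

```python
# Hypothetical hierarchy: concept -> parent concept.
PARENT = {
    "transistor": "semiconductor_device",
    "diode": "semiconductor_device",
}


def expand_with_parents(vec, parent=PARENT, weight=0.5):
    """Add the parent of each concept in the vector as an extra identifier."""
    out = dict(vec)
    for concept in vec:
        p = parent.get(concept)
        if p is not None:
            out[p] = out.get(p, 0.0) + weight
    return out


def overlap(a, b):
    """Inner-product match between two sparse concept vectors."""
    return sum(w * b[k] for k, w in a.items() if k in b)


# Hypothetical request and documents (the relevance labels are assumed).
request = {"transistor": 1.0, "amplifier": 1.0}
relevant_doc = {"diode": 1.0, "amplifier": 1.0}
nonrelevant_doc = {"diode": 1.0, "rectifier": 1.0}

# Match scores before hierarchy expansion.
before_rel = overlap(request, relevant_doc)
before_non = overlap(request, nonrelevant_doc)

# Match scores after expanding the request and both documents.
req_x = expand_with_parents(request)
after_rel = overlap(req_x, expand_with_parents(relevant_doc))
after_non = overlap(req_x, expand_with_parents(nonrelevant_doc))
```

Here the shared parent concept raises the match with both documents, including the one assumed to be non-relevant. A request that already matches its relevant documents well gains little from this, while a poorly matching request may gain more.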