<DOC> 
<DOCNO> IRS13 </DOCNO>         
<TITLE> Scientific Report No. IRS-13 Information Storage and Retrieval </TITLE>         
<SUBTITLE> Thesaurus, Phrase and Hierarchy Dictionaries </SUBTITLE>         
<TYPE> chapter </TYPE>         
<PAGE CHAPTER="7" NUMBER="43">                   
<AUTHOR1> E. M. Keen </AUTHOR1>  
<PUBLISHER> Harvard University </PUBLISHER> 
<EDITOR1> Gerard Salton </EDITOR1> 
<COPYRIGHT MTH="December" DAY="" YEAR="1967" BY="National Science Foundation">   
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government. 
</COPYRIGHT> 
<BODY> 
                                                                   VII-43


        2.  For a high precision need, only IRE-3 produces some advantage

            to phrases (Figs. 21 and 25); however, this is based on

            a small superiority for one or two requests only, and is not

            considered significant (Figs. 26 and 27).

        3.  For a high recall need, use of the average rank of the last

            relevant document shows the phrases to be useful on IRE-3 only (Fig. 25)

            by a small margin on an individual request basis (Fig. 26).
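
The high recall measure named in item 3 can be sketched as follows. This is a minimal illustration, not the report's own procedure: the function names, the document identifiers and the two sample runs are all hypothetical. It shows why an averaged figure and a request-by-request comparison can disagree.

```python
from statistics import mean


def rank_of_last_relevant(ranking, relevant):
    """Return the 1-based rank of the last relevant document in a ranked list."""
    if not relevant:
        raise ValueError("a request needs at least one relevant document")
    positions = [i for i, doc in enumerate(ranking, start=1) if doc in relevant]
    if len(positions) != len(relevant):
        raise ValueError("ranking does not contain every relevant document")
    return positions[-1]


def average_rank_of_last_relevant(runs):
    """Average the measure over requests; runs holds (ranking, relevant) pairs."""
    return mean(rank_of_last_relevant(ranking, rel) for ranking, rel in runs)


# Hypothetical data: two requests, searched with two dictionaries.
runs_a = [(["d1", "d2", "d3", "d4"], {"d1", "d3"}), (["d5", "d6", "d7"], {"d6"})]
runs_b = [(["d3", "d1", "d2", "d4"], {"d1", "d3"}), (["d5", "d7", "d6"], {"d6"})]

print(average_rank_of_last_relevant(runs_a))  # 2.5
print(average_rank_of_last_relevant(runs_b))  # 2.5
```

In this made-up case the two averages are equal although each request favours a different dictionary, which is why the averaged results are checked against the individual request comparisons.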


        Results comparing the thesaurus with the addition of various

hierarchy relations on the IRE-3 collection produce the following

conclusions:


        1.  Thesaurus alone is always superior to hierarchy on three of

            the relations tested, and on two others ("parents" and "all"

            relations), the hierarchy gives a small advantage over

            portions of the precision-recall curve (Figs. 22, 23). On

            an individual request basis (Fig. 24), the thesaurus is

            equal to "parents", and superior to "all" relations; the

            hierarchy is thus not to be preferred.

        2.  For a high precision need, Fig. 25 suggests that some

            advantage accrues, but Fig. 27 shows that its success is

            limited to one or two requests that do badly with the

            thesaurus alone.

        3.  For a high recall need, Fig. 25 shows that the hierarchy

            performs well, but Fig. 26 reveals again that it achieves

            only a few dramatically good results with a poorer average

            high recall performance for individual requests than the thesaurus.


7.  Performance Analyses

        The first task of the analysis is to explain the mechanism which

causes an improvement in retrieval performance using the thesaurus and

</BODY>                  
</PAGE>                  
</DOC>