Scientific Report No. ISR-11
Information Storage and Retrieval
The SMART System -- Retrieval Results and Future Plans
Gerard Salton
Harvard University

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

should syntactical relations between subject identifiers be preserved; does the user have an important role to fulfill in controlling the search procedure? These and many other questions are answered by the following rules derived from the evaluation results, and described in greater detail in the remainder of this report. [1, [illegible], 6] In each case the evaluation is made in terms of two measures, known as recall and precision, which reflect, respectively, the ability of the system to retrieve wanted material, and its ability to reject nonwanted items:

1) The use of document titles alone for purposes of information analysis results in poor retrieval performance compared with the use of abstracts or full text.
2) The use of information identifiers which are weighted in accordance with their presumed importance leads to large-scale improvements in retrieval effectiveness, compared with the use of unweighted terms.
3) Dictionaries providing synonym recognition are of considerable help in improving retrieval performance, particularly when they reflect the properties of the vocabulary under consideration.
4) Absolute accuracy in the analysis of every single item is not so important as the accumulation of a maximum number of correctly analyzed items. If a choice exists between a method which can produce one guaranteed correct content indication (syntactic analysis), and another which produces five indicators of which four are probably correct (statistical phrase process), the second is generally to be preferred.
5) Simple phrase generation methods lead to a definite improvement in recall at the expense of some initial loss in precision in the low recall region.
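
The two evaluation measures named above can be illustrated with a minimal sketch. It assumes the conventional set-based definitions: recall is the fraction of the wanted (relevant) items that were retrieved, and precision is the fraction of the retrieved items that were wanted. The function name `recall_precision` and the document identifiers are hypothetical, chosen only for illustration; this is not the evaluation code of the SMART system.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for a single query.

    Assumes set-based definitions: duplicates are ignored, and an
    empty denominator yields 0.0 rather than raising an error.
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # wanted items that were retrieved
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical document identifiers, for illustration only.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7"]

recall, precision = recall_precision(retrieved, relevant)
print(f"recall={recall:.2f} precision={precision:.2f}")
# prints: recall=0.67 precision=0.50
```

In this example two of the three wanted documents are retrieved (recall 2/3), while two of the four retrieved documents are nonwanted (precision 1/2). A method that retrieves more material, as in rule 5, will tend to raise the first figure while risking a drop in the second.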