Scientific Report No. ISR-11, Information Storage and Retrieval
Design Criteria for Automatic Information Systems
M. E. Lesk and Gerard Salton, Harvard University
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

and precision measures, where the recall is defined as the proportion of relevant matter retrieved, while precision is the proportion of retrieved material that is actually relevant. If a dual cut is made through the document collection to distinguish retrieved items from nonretrieved ones on the one hand, and relevant items from nonrelevant ones on the other, the two measures may be defined as shown in Fig. 4. The computation of these measures is straightforward only if exhaustive relevance judgments are available for each document with respect to each search request, and if the cut-off value distinguishing retrieved from nonretrieved material can be unambiguously determined. [8,9,10]

In the evaluation work carried out with the SMART system, manually derived, exhaustive relevance judgments could be used, since the document collections processed are all relatively small. Moreover, the choice of a unique cut-off could be avoided by computing the precision for various recall values and exhibiting a plot of recall against precision. Recall-precision graphs, such as those shown in the remainder of this study, have been criticized for a variety of reasons, [11] but they are very effective for summarizing the performance of retrieval methods averaged over many search requests, and they can be used to advantage in selecting analysis methods which fit certain specific operating ranges. Thus, if it is desired to pick a procedure which favors the retrieval of all relevant material, then one must concentrate on the high-recall region; similarly, if only relevant material is wanted, the high-precision region is of importance.
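The two measures, and the computation of precision at successive recall values, can be illustrated with a minimal Python sketch. It is not the SMART evaluation code: the function names, the document identifiers, and the assumption that the system returns a ranked list of documents are all hypothetical, and no averaging over requests is attempted.

```python
from typing import Hashable, List, Sequence, Set, Tuple


def recall_precision(retrieved: Set[Hashable],
                     relevant: Set[Hashable]) -> Tuple[float, float]:
    """Recall and precision for one request at one fixed cut-off."""
    hits = len(retrieved & relevant)  # items both retrieved and relevant
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


def recall_precision_points(ranking: Sequence[Hashable],
                            relevant: Set[Hashable]) -> List[Tuple[float, float]]:
    """(recall, precision) pairs, one per relevant item met in the ranking.

    Moving the cut-off down the ranked list one document at a time
    removes the need to choose a single cut-off value.
    """
    points: List[Tuple[float, float]] = []
    hits = 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    return points


# Hypothetical request: five ranked documents, three judged relevant.
ranking = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d3", "d1", "d4"}

# Cut-off after the first two documents: recall 1/3, precision 1/2.
print(recall_precision(set(ranking[:2]), relevant))

# Points for a recall-precision plot of this single request.
for recall, precision in recall_precision_points(ranking, relevant):
    print(f"recall={recall:.2f}  precision={precision:.2f}")
```

In this example precision falls from 1.0 to 0.6 as recall rises from 1/3 to 1, which is the usual shape of such a curve for a single request.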
In general, it is possible to obtain high recall only at a substantial cost in precision, and vice versa. [8,9,10]

The following document collections have been used in the experiments