ISR11
Scientific Report No. ISR-11 Information Storage and Retrieval
Design Criteria for Automatic Information Systems
M. E. Lesk
G. Salton
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
and precision measures, where the recall is defined as the proportion of
relevant matter retrieved, while precision is the proportion of retrieved
material actually relevant. If a dual cut is made through the document
collection to distinguish retrieved items from nonretrieved on the one
hand, and relevant items from nonrelevant ones on the other, the two
measures may be defined as shown in Fig. 4. The computation of these
measures is straightforward only if exhaustive relevance judgments are
available for each document with respect to each search request, and if
the cut-off value distinguishing retrieved from nonretrieved material can
be unambiguously determined. [8,9,10]
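The two definitions can be illustrated with a minimal sketch, assuming that the retrieved set and the relevant set for a single search request are both known exactly. The function name and the document identifiers below are hypothetical and are not taken from the SMART system.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one search request.

    retrieved: identifiers of the documents above the cut-off
    relevant:  identifiers of the documents judged relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents actually retrieved
    # Guard against empty sets; returning 0.0 here is an assumed convention.
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical request: 4 documents retrieved, 5 relevant in the collection,
# 2 documents common to both sets.
recall, precision = recall_precision(
    {"d1", "d2", "d3", "d4"},
    {"d2", "d4", "d5", "d6", "d7"},
)
print(recall, precision)  # 0.4 0.5
```

Both sets must be complete for the result to be meaningful, which is why exhaustive relevance judgments and an unambiguous cut-off are needed.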
In the evaluation work carried out with the SMART system, manually
derived, exhaustive relevance judgments could be used since the document
collections processed are all relatively small. Moreover, the choice of
a unique cut-off could be avoided by computing the precision for various
recall values, and exhibiting a plot showing recall against precision.
Recall-precision graphs, such as those shown in the remainder of this
study, have been criticized for a variety of reasons,[11] but they are
very effective for summarizing the performance of retrieval methods averaged
over many search requests, and they can be used advantageously to select
analysis methods which fit certain specific operating ranges. Thus, if it
is desired to pick a procedure which favors the retrieval of all relevant
material, then one must concentrate on the high recall region; similarly,
if only relevant material is wanted, the high precision region is of
importance. In general, it is possible to obtain high recall only at a
substantial cost in precision, and vice versa. [8,9,10]
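One way to obtain precision at various recall values without fixing a single cut-off can be sketched as follows, assuming that the system returns the documents in ranked order and that the cut-off is moved down the ranking one document at a time. The function name and the sample ranking are hypothetical, and the averaging over many requests used for the graphs in this study is not reproduced here.

```python
def recall_precision_points(ranking, relevant):
    """Return (recall, precision) pairs, one per relevant document found.

    ranking:  document identifiers in decreasing order of query similarity
    relevant: identifiers of the documents judged relevant
    """
    relevant = set(relevant)
    points = []
    hits = 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            # Cut-off placed just after this document: `rank` retrieved so far.
            points.append((hits / len(relevant), hits / rank))
    return points


# Hypothetical ranked output for one request with three relevant documents.
ranking = ["d3", "d7", "d1", "d9", "d4", "d2"]
relevant = {"d3", "d1", "d2"}
for recall, precision in recall_precision_points(ranking, relevant):
    print(f"recall={recall:.2f}  precision={precision:.2f}")
```

In this example the precision falls from 1.00 to 0.50 as the recall rises from 0.33 to 1.00, which is the trade-off described above.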
The following document collections have been used in the experiments