IRS13
Scientific Report No. IRS-13 Information Storage and Retrieval
Document Length
chapter
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
documents. Figure 9 gives the numbers of requests that favor abstracts and
the number that favor titles, using both normalized recall and normalized
precision, for six results. The data given reflect the fact that the
precision/recall curves in Figures 6 and 7 closely represent the actual
situation, namely, that abstracts are superior to titles, since between 57.9%
and 94.1% of the requests favor abstracts on the six runs, using normalized
recall (ties being ignored). The superiority of abstracts is again most evident
with the computer science collection, and least so in the aerodynamics collection.
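
The per-request comparison described above can be sketched in Python. This is a minimal sketch under stated assumptions: the two measures follow the standard definitions of normalized recall and normalized precision, which are assumed here and not quoted from this section, and the function names, the tally helper, and the sample scores are hypothetical.

```python
import math


def normalized_recall(ranks, collection_size):
    """Normalized recall for one request (assumed standard definition).

    ranks: 1-based rank positions of the n relevant documents.
    collection_size: total number N of documents ranked (requires 0 < n < N).
    R_norm = 1 - (sum(r_i) - sum(i)) / (n * (N - n))
    """
    n = len(ranks)
    ideal = sum(range(1, n + 1))  # relevant documents ranked 1..n
    return 1.0 - (sum(ranks) - ideal) / (n * (collection_size - n))


def normalized_precision(ranks, collection_size):
    """Normalized precision for one request (assumed standard definition).

    P_norm = 1 - (sum(ln r_i) - sum(ln i)) / ln(N! / ((N - n)! n!))
    """
    n = len(ranks)
    actual = sum(math.log(r) for r in ranks)
    ideal = sum(math.log(i) for i in range(1, n + 1))
    # ln(N! / ((N - n)! n!)) computed with log-gamma to avoid huge factorials.
    worst = (math.lgamma(collection_size + 1)
             - math.lgamma(collection_size - n + 1)
             - math.lgamma(n + 1))
    return 1.0 - (actual - ideal) / worst


def tally_preferences(abstract_scores, title_scores):
    """Count requests favoring each document length, ignoring ties.

    Returns (favor_abstracts, favor_titles, ties, percent_favoring_abstracts),
    where the percentage is taken over the non-tied requests only.
    """
    pairs = list(zip(abstract_scores, title_scores))
    favor_abstracts = sum(a > t for a, t in pairs)
    favor_titles = sum(t > a for a, t in pairs)
    ties = len(pairs) - favor_abstracts - favor_titles
    decided = favor_abstracts + favor_titles
    percent = 100.0 * favor_abstracts / decided if decided else float("nan")
    return favor_abstracts, favor_titles, ties, percent


if __name__ == "__main__":
    # Illustrative scores only; these are not values from the report.
    abstracts = [0.9, 0.5, 0.7, 0.6]
    titles = [0.8, 0.5, 0.75, 0.4]
    print(tally_preferences(abstracts, titles))
```

Taking the percentage over the non-tied requests only is one reading of "ties being ignored"; a different convention would change the denominator.
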
Since the aerodynamics result in Figure 6 produces a crossing curve, two
plots are given in Figure 10 of the normalized recall and normalized precision
values for each of the 42 requests, showing the magnitude of the differences,
comparing the 30 requests that favor abstracts and the 10 that favor titles.
For example, one request had a normalized recall difference of 0.34 between
abstracts and titles, while another request was better by 0.08 on titles
than abstracts. The requests are arranged in an order of decreasing dif-
ference, and it is seen, using both normalized recall and normalized pre-
cision, that although ten requests did perform better on titles, there are
ten requests that performed better on abstracts with a larger increase in
performance. This result does not explain the superiority of titles over
a small range in the middle of the precision/recall curve seen in Figure 6b,
so further data are given in Figures 11 and 12 to explain this fact.
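
The ordering of requests by decreasing difference, and the comparison of the requests that lose on abstracts against an equal number of the largest gains, might be sketched as follows. This is a hypothetical illustration: the function names and any scores passed in are assumptions, and only the procedure (difference taken as abstracts minus titles, sorted largest first) follows the description above.

```python
def ordered_differences(abstract_scores, title_scores):
    """Per-request differences (abstracts minus titles), largest first.

    A positive value means the request did better on abstracts,
    a negative value means it did better on titles.
    """
    diffs = [a - t for a, t in zip(abstract_scores, title_scores)]
    return sorted(diffs, reverse=True)


def gains_versus_losses(diffs):
    """Pair each loss (request better on titles) with one of the largest gains.

    Returns a list of (gain, loss_magnitude) tuples, both sorted largest
    first, with as many gains kept as there are losses.
    """
    losses = sorted((-d for d in diffs if d < 0), reverse=True)
    gains = sorted((d for d in diffs if d > 0), reverse=True)[:len(losses)]
    return list(zip(gains, losses))
```

If every gain in a returned pair exceeds its paired loss, the requests favoring titles are outweighed by an equal number of requests favoring abstracts, which is the pattern the text reports for the 0.34 and 0.08 examples.
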
In these tables, the individual relevant documents are examined, and the ranks
of the 198 documents concerned are compared on abstracts and titles. Figure
11 shows that 99 are superior on abstracts, and 84 on titles, a close
result that accurately describes the situation. Figure 12 further breaks
down these 99 and 84 documents, showing, by a series of 10 ranges, the difference
in rank positions for the 99 superior on abstracts, and the 84 superior