CRANV2
Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2
Conclusions
chapter
Cyril Cleverdon
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
...made up of four terms A, B, C and D, each of which has been used five
times in the indexing of a set of documents as follows (x represents any
other term or terms also used in indexing the documents):
Document Number    Index Terms
       1           ADx
       2           x
       3           ACx
       4           x
       5           BCDx
       6           x
       7           Bx
       8           x
       9           ABCDx
      10           x
      11           BDx
      12           x
      13           Ax
      14           x
      15           BCDx
      16           x
      17           Cx
      18           x
      19           Ax
      20           x
Searches for any combination of A, B, C and D would result in
retrieval at various coordination levels as follows:
Coordination Level    No. of Documents Retrieved
        4              1 (Document 9)
        3              3 (Documents 9, 5, 15)
        2              6 (Documents 9, 5, 15, 1, 3, 11)
        1             10 (Documents 9, 5, 15, 1, 3, 11, 7, 13, 17, 19)
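The retrieval counts at each coordination level can be checked mechanically. The following Python sketch transcribes the indexing of the 20 documents (omitting x, which never matches a search term) and counts, for each level k, the documents indexed by at least k of the four search terms. The names INDEX, SEARCH_TERMS and retrieved_at_levels are illustrative assumptions and do not come from the report; the cumulative reading of "coordination level" is likewise an assumption, chosen because it reproduces the counts given.

```python
# Index terms per document, transcribed from the 20-document example.
# "x" (any other term or terms) is omitted: it never matches A, B, C or D.
INDEX = {
    1: "AD", 2: "", 3: "AC", 4: "", 5: "BCD", 6: "", 7: "B", 8: "",
    9: "ABCD", 10: "", 11: "BD", 12: "", 13: "A", 14: "", 15: "BCD",
    16: "", 17: "C", 18: "", 19: "A", 20: "",
}
SEARCH_TERMS = set("ABCD")


def retrieved_at_levels(index, search_terms):
    """Map each coordination level k to the documents matching at least k terms."""
    matches = {doc: len(search_terms & set(terms)) for doc, terms in index.items()}
    return {
        level: sorted(doc for doc, n in matches.items() if n >= level)
        for level in range(len(search_terms), 0, -1)
    }


levels = retrieved_at_levels(INDEX, SEARCH_TERMS)
counts = {level: len(docs) for level, docs in levels.items()}
total_postings = sum(len(terms) for terms in INDEX.values())

for level, docs in levels.items():
    print(level, len(docs), docs)
print(sum(counts.values()), total_postings)
```

Under these assumptions, a document indexed by n of the search terms is retrieved at every level from 1 to n, so it contributes n to the summed retrievals, exactly as it contributes n postings.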
Thus the sum of the retrievals (1+3+6+10=20) is the same as the total
number of postings for the four terms.
The particular significance of this point is the effect on retrieval
performance of enlarging the classes. Assume that the search terms are
broadened by being grouped with a related term, A1, B1, C1 or D1, and
that these related terms have also each been used five times in the same
set of 20 documents, the indexing being as follows: