CRANV2
Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2
Test Environment
chapter
Cyril Cleverdon
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
were obtained, comparing first a set of questions when only relevance
1 documents were accepted as relevant, then with documents of relevance
1 or 2, next with documents of relevance 1 or 2 or 3, and finally with
documents of relevance 1 or 2 or 3 or 4. Apart from the particular
test to measure this variable, the broadest relevance decision, namely
1 - 4, was always used in all other tests.
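The effect of widening the relevance decision can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the document identifiers and the grades assigned to them are hypothetical and are not data from the test; the only thing taken from the text is that grade 1 is the highest grade and that the four decisions are 1, 1-2, 1-3 and 1-4.

```python
def relevant_at_threshold(grades, max_grade):
    """Return the documents whose relevance grade lies between 1 and max_grade."""
    return {doc for doc, grade in grades.items() if 1 <= grade <= max_grade}


def recall_precision(retrieved, relevant):
    """Return (recall, precision) as percentages; 0.0 when a denominator is empty."""
    hits = len(retrieved & relevant)
    recall = 100.0 * hits / len(relevant) if relevant else 0.0
    precision = 100.0 * hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical relevance grades for one question (1 = highest grade, 4 = lowest).
grades = {"d1": 1, "d2": 2, "d3": 3, "d4": 4, "d5": 2}
# Hypothetical set of documents retrieved by one search.
retrieved = {"d1", "d2", "d6"}

# Repeat the same search result against each of the four relevance decisions.
for max_grade in (1, 2, 3, 4):
    relevant = relevant_at_threshold(grades, max_grade)
    recall, precision = recall_precision(retrieved, relevant)
    print(f"relevance 1-{max_grade}: recall {recall:.0f}%, precision {precision:.0f}%")
```

In this made-up case the retrieved set is fixed, so broadening the decision can only enlarge the relevant set: precision cannot fall, while recall may drop as more documents count as relevant.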
The Composite Table
Some idea of the volume, variety and complexity of the tests
carried out can be seen from the composite table (Fig. 2.10), which
gives results for various combinations of six variables tested on the single
term index languages 1.1 to 1.6. The basic set of questions used is
subset 1, which has 35 questions, each having seven starting terms,
but some of the results are based on two selections of these, namely
19 questions of subset 4 and 20 questions of subset 6. Four of the
variables are listed at the head of the table, and the other two at the
left side; the table divisions consist of the following factors:-
1. The coordination level varies from 1 to 7, which would result
in seven main sections of the table. However, due to
problems of presentation in this report, the table is truncated
by the omission of the figures relating to the first three
levels, so that it only presents four main sections covering
the coordination levels of 4, 5, 6 and 7.
2. Four search rules (A, B, C and D) are next varied, and are
applied in order of increasing intelligence within each
coordination level.
3. The precision devices (a, b, c and d) are recorded next, with
most results using no linking devices, apart from the three
columns near the centre of each section.
4. The final factor at the head of the table is document relevance,
with the three higher grades listed first, followed by the
lowest grade used for all subsequent combinations (1, 1-2,
1-3, and 1-4).
5. The rows are first divided into five, representing the index
languages 1.1, 1.2, 1.3, 1.5 and 1.6.
6. The final variable is indexing exhaustivity, the three levels
being repeated as divisions of each index language in turn.
The meaning of the codes used in this table has already been
described earlier in this chapter.
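The six divisions listed above can be pictured as a key with four column components and two row components. The Python sketch below is illustrative only: the names CellKey, row_keys and describe are hypothetical, and the exhaustivity labels "E1" to "E3" are placeholders, since the three levels are not named in this passage. No attempt is made to enumerate the columns, because not every combination of the column variables appears in the table.

```python
from itertools import product
from typing import NamedTuple

# Column variables, in the order they appear at the head of the table.
COORDINATION_LEVELS = (4, 5, 6, 7)   # levels 1 to 3 are omitted from the printed table
SEARCH_RULES = ("A", "B", "C", "D")
PRECISION_DEVICES = ("a", "b", "c", "d")
RELEVANCE_DECISIONS = ("1", "1-2", "1-3", "1-4")

# Row variables, in the order they appear at the left side of the table.
INDEX_LANGUAGES = ("1.1", "1.2", "1.3", "1.5", "1.6")
EXHAUSTIVITY_LEVELS = ("E1", "E2", "E3")  # placeholder labels, not from the report


class CellKey(NamedTuple):
    """Hypothetical key identifying one pair of recall and precision figures."""
    coordination_level: int
    search_rule: str
    precision_device: str
    relevance: str
    index_language: str
    exhaustivity: str


def row_keys():
    """Each index language is subdivided by the three exhaustivity levels."""
    return list(product(INDEX_LANGUAGES, EXHAUSTIVITY_LEVELS))


def describe(key):
    """Spell out the combination of variables that a cell represents."""
    return (
        f"coordination level {key.coordination_level}, search rule {key.search_rule}, "
        f"precision device {key.precision_device}, relevance {key.relevance}, "
        f"index language {key.index_language}, exhaustivity {key.exhaustivity}"
    )


print(len(row_keys()))
print(describe(CellKey(4, "A", "a", "1-4", "1.1", "E1")))
```

Reading a cell then amounts to taking the first four components from the column headings and the last two from the row labels.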
The search results are shown as percentages for recall and precision.
Thus each set of recall and precision ratios can be understood by
examining the columns above, and the row to the left of a set of ratios,
and then reading off the particular combination of variables being tested.
For example, if the first section of the table as printed is examined