ASLIB Cranfield Research Project: Factors Determining the Performance of Indexing Systems: VOLUME 1. Design, Part 1. Text
Documents and Questions
Cyril Cleverdon
Jack Mills
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
Most authors included the weighting of the questions in their reply, over half of the questions had some alternative terms added, and 28 of the questions were submitted in rephrased form. (See Appendix 3B.)
A summary of the position regarding the questions is as follows:-
1. Total of questions received .............. 641
2. Questions discarded for various reasons .............. 280
3. Questions matched against complete document collection for relevance ((1) - (2)) .............. 361
4. Questions having no additional relevant references .............. 78
5. Questions resubmitted to authors for relevance decisions .............. 283
6. Questions returned by authors from stage (5) .............. 201
7. Questions available for test ((4) + (6)) .............. 279
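The stage-by-stage accounting above can be verified arithmetically; a minimal sketch (the variable names are paraphrases of the stages, not terms from the report):

```python
# Bookkeeping for the question-collection stages listed above.
received = 641                                   # stage (1)
discarded = 280                                  # stage (2)
matched = received - discarded                   # stage (3): (1) - (2)
no_additional_relevant = 78                      # stage (4)
resubmitted = matched - no_additional_relevant   # stage (5)
returned = 201                                   # stage (6)
available = no_additional_relevant + returned    # stage (7): (4) + (6)

print(matched, resubmitted, available)  # 361 283 279
```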
The relevance assessments
The basic data on the authors' relevance assessments are given in Tables 3.4, 3.5, 3.6 and 3.7. These tables highlight various aspects of the relevance assessments, and the figures given are taken from the 279 usable questions obtained. In each table, the documents that were submitted to the authors are split into three categories:-
1. Those cited in the author's own original paper;
2. Those the students found and judged as being relevant;
3. Those retrieved by bibliographic coupling at a strength of 7 plus, and which were
additional to the two categories above.
Each table also gives a figure for the total of all categories, the four divisions
being shown as the left hand parameter in each table. The relevance assessments
made are given in the body of the tables, these being split into several categories:-
1. Documents submitted (Tables 3.4 and 3.6)
2. Documents assessed as relevant, i.e. accepted:-
   (a) Totals (Tables 3.4, 3.5, 3.6 and 3.7)
   (b) Details of the four grades of relevance (Tables 3.5 and 3.7)
3. Documents assessed as not relevant, i.e. rejected (Tables 3.4 and 3.6)
4. Total documents assessed as relevant, expressed as a percentage of documents submitted (Tables 3.4 and 3.6).
The figures given are in two forms in each table:-
1. Grand totals of documents, resulting from the whole set of questions involved.
2. Figures for one average question, calculated by the arithmetic mean. These
averages are correct to one decimal place, but in a few cases a slight adjustment
has been made to preserve the correct totals.
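The report does not state the exact adjustment procedure used to make the rounded per-question averages still sum to the correct totals; a common approach that achieves this is the largest-remainder method, sketched here purely as an illustration:

```python
# Illustration only: round values to one decimal place while forcing the
# rounded parts to sum to the rounded total (largest-remainder method).
# The report's own adjustment procedure is not specified.

def round_preserving_total(values, decimals=1):
    scale = 10 ** decimals
    scaled = [v * scale for v in values]
    floors = [int(s) for s in scaled]          # round everything down first
    shortfall = round(sum(scaled)) - sum(floors)
    # hand the remaining tenths to the entries with the largest remainders
    order = sorted(range(len(values)),
                   key=lambda i: scaled[i] - floors[i], reverse=True)
    for i in order[:shortfall]:
        floors[i] += 1
    return [f / scale for f in floors]

parts = [1.23, 2.31, 3.46]                     # true values sum to 7.00
print(round_preserving_total(parts))           # [1.2, 2.3, 3.5]
```

Naive rounding of each value independently would give [1.2, 2.3, 3.5] summing correctly here, but in general per-entry rounding can drift from the rounded total by a few tenths, which is the "slight adjustment" the text alludes to.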
Tables 3.4 and 3.5, giving the figures for the whole set of 279 questions, will be examined first. The bottom section of Table 3.4 shows that 3,087 documents were submitted to the authors, of which 1,126 were rejected as not relevant and 1,961 (i.e. 63.5%) were accepted as relevant. Table 3.5 gives a breakdown of the 1,961 documents accepted, showing that 171 were graded relevance (1), 461 were relevance (2),