CRANV1P1
ASLIB Cranfield Research Project: Factors Determining the Performance of Indexing Systems: VOLUME 1. Design, Part 1. Text
Documents and Questions
chapter
Cyril Cleverdon
Jack Mills
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
2. To assess the relevance of each paper in the submitted list of cited
references, in relation to each of the questions given. The assessment
was to be based on the following scale of five definitions:
(i) References which are a complete answer to the question. Presumably
this would only apply for supplementary questions, since if they applied to the
main question there would have been no necessity for the research to be done.
(ii) References of a high degree of relevance, the lack of which either
would have made the research impracticable or would have resulted in a considerable
amount of extra work.
(iii) References which were useful, either as general background to the
work or as suggesting methods of tackling certain aspects of the work.
(iv) References of minimum interest, for example, those that have been
included from an historical viewpoint.
(v) References of no interest.
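For readers who want to handle these assessments programmatically, the five-point scale above can be encoded as an enumeration. This is only a minimal sketch: the class name `Relevance`, its member names and the helper `parse_grade` are labels invented here for illustration and do not appear in the original report.

```python
from enum import IntEnum


class Relevance(IntEnum):
    """Five-point relevance scale; member names are illustrative only."""
    COMPLETE_ANSWER = 1    # (i)   complete answer to the question
    HIGH = 2               # (ii)  high degree of relevance
    USEFUL = 3             # (iii) useful background or method
    MINIMUM_INTEREST = 4   # (iv)  minimum interest, e.g. historical
    NO_INTEREST = 5        # (v)   no interest


def parse_grade(roman: str) -> Relevance:
    """Map a roman-numeral grade such as '(iii)' to its scale member."""
    numerals = {"i": 1, "ii": 2, "iii": 3, "iv": 4, "v": 5}
    return Relevance(numerals[roman.strip("(){} ").lower()])
```

Using an integer-valued enumeration keeps the ordering of the scale, so a lower value always means a more relevant reference.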
An example of a completed sheet was included with each letter; this, the covering
letter and an example of the material sent are shown in Appendix 3A.
It was originally expected that half the authors would complete the form to our
requirements, and that there would be an average of two questions with each reply.
During early March, 1963, 82 letters were sent out and by the end of that month 47
replies had been received with an average of 3½ questions. Further letters were
despatched up to the middle of July, and then later one chase letter was sent to those
who had not replied. By the end of September we had received the excellent response
of 182 completed forms of the 271 sent (67.2%). Some authors wrote to say that they
could not spare the time; many other letters were returned because change of address
prevented delivery. The authors continued to supply an average of 3½ questions, and
the total of those received was 641.
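The response figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names are chosen here for illustration.

```python
# Figures as quoted in the text.
EARLY_LETTERS_SENT = 82      # letters sent during early March 1963
EARLY_REPLIES = 47           # replies received by the end of March
LETTERS_SENT = 271           # total letters sent
FORMS_RETURNED = 182         # completed forms by the end of September
QUESTIONS_RECEIVED = 641     # total questions received

early_rate = EARLY_REPLIES / EARLY_LETTERS_SENT
response_rate = FORMS_RETURNED / LETTERS_SENT
questions_per_form = QUESTIONS_RECEIVED / FORMS_RETURNED

print(f"Response by end of March: {early_rate:.1%}")
print(f"Final response rate: {response_rate:.1%}")
print(f"Questions per completed form: {questions_per_form:.2f}")
```

The final rate rounds to the 67.2% given in the text, and the questions-per-form figure is consistent with the stated average of about 3½.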
Most of these authors, 67.6%, lived in the U.S.A., with 26.9% in
Great Britain and 5.5% in other countries. Table 3.2 shows the figures from
each country, based on the 182 authors with whom we corresponded. A complete
list of the authors is given in Appendix 3F. It is an interesting sidelight on
publishing habits to notice that eight of the British authors published in American
sources, and nine out of ten of the other foreign authors did the same, but all the
authors residing in the U.S.A. published there. Figures are given in Table 3.3.
Some of the authors had changed their country of residence by the time of the test,
and the figures are based on the country of residence in which their particular
research paper was written.
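As a rough cross-check, the percentages above can be converted back into approximate author counts. The counts produced here are back-computed from the rounded percentages and are an assumption of this sketch; Table 3.2 remains the authoritative source for the actual figures.

```python
TOTAL_AUTHORS = 182

# Shares as quoted in the text (rounded to one decimal place there).
SHARES = {
    "U.S.A.": 0.676,
    "Great Britain": 0.269,
    "Other countries": 0.055,
}

# Implied head counts, rounded to the nearest whole author.
counts = {
    country: round(share * TOTAL_AUTHORS)
    for country, share in SHARES.items()
}

for country, n in counts.items():
    print(f"{country}: about {n} authors")
```

The implied count for other countries agrees with the text's reference to "nine out of ten of the other foreign authors", and the three implied counts sum to the 182 authors.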
As the forms were being received, the document collection was being made
up, and 1,018 unique documents resulted from the cited papers. The base
documents themselves were also included in the collection, adding 173 more
documents (9 were already included as cited papers), but in order to avoid any
possible bias in the results, these base documents are always completely deleted
from the results when the questions to which they gave rise are being tested.
A further 209 documents, taken from similar sources, brought the whole collection
to its final 1,400 documents. For the indexing, which was proceeding during this
time, single xerox copies of the documents were made. Full bibliographical
information concerning the document collection is given in Appendix 3C.
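The make-up of the collection, and the rule that a base document is deleted from the results for its own questions, can be sketched as follows. This is an illustration only: the figure of 182 base documents assumes one base paper per completed form, and the function `results_for_question` and its document identifiers are hypothetical, not part of the original test procedure.

```python
CITED_DOCUMENTS = 1018    # unique documents from the cited papers
BASE_DOCUMENTS = 182      # assumption: one base paper per completed form
BASE_ALREADY_CITED = 9    # base papers already present as cited papers
FURTHER_DOCUMENTS = 209   # extra documents taken from similar sources

new_base_documents = BASE_DOCUMENTS - BASE_ALREADY_CITED
collection_size = CITED_DOCUMENTS + new_base_documents + FURTHER_DOCUMENTS


def results_for_question(retrieved_doc_ids, base_doc_id):
    """Drop the question's own base document from a list of results."""
    return [doc_id for doc_id in retrieved_doc_ids if doc_id != base_doc_id]


print(new_base_documents, collection_size)
```

Filtering at the results stage, rather than removing the base document from the collection itself, lets the same document still be retrieved for questions that arose from other papers.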
To prepare for the next stage, 361 of the 640 questions were selected for use
in the test. Questions were selected if they had two or more documents assessed
as relevance grade 1, 2 or 3, with grammatically complete questions selected
first. Some questions were received abbreviated, although
the missing idea was quite clear from another of the author's questions. For example,