CRANV1P1 ASLIB Cranfield Research Project: Factors Determining the Performance of Indexing Systems: VOLUME 1. Design, Part 1. Text Documents and Questions chapter Cyril Cleverdon Jack Mills Michael Keen Cranfield An investigation supported by a grant to Aslib by the National Science Foundation. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

2. To assess the relevance of each of the submitted list of papers which had been cited as references, in relation to each of the questions given. The assessment was to be based on the following scale of five definitions:

(i) References which are a complete answer to the question. Presumably this would only apply for supplementary questions, since if they applied to the main question there would have been no necessity for the research to be done.

(ii) References of a high degree of relevance, the lack of which either would have made the research impracticable or would have resulted in a considerable amount of extra work.

(iii) References which were useful, either as general background to the work or as suggesting methods of tackling certain aspects of the work.

(iv) References of minimum interest, for example, those that have been included from an historical viewpoint.

(v) References of no interest.

An example of a completed sheet was included with each letter; this, the covering letter and example of material sent, are shown as Appendix 3A.

It was originally expected that half the authors would complete the form to our requirements, and that there would be an average of two questions with each reply. During early March, 1963, 82 letters were sent out and by the end of that month 47 replies had been received with an average of 3½ questions. Further letters were despatched up to the middle of July, and then later one chase letter was sent to those who had not replied.
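The five-grade scale given to the authors can be written down as a small data structure. The following is a minimal Python sketch for illustration only: the names `Relevance` and `tally_grades`, and the sample sheet, are hypothetical and do not come from the project's materials.

```python
from collections import Counter
from enum import IntEnum


class Relevance(IntEnum):
    """Grades (i)-(v) of the assessment scale; member names are illustrative."""
    COMPLETE_ANSWER = 1    # (i) a complete answer to the question
    HIGH_RELEVANCE = 2     # (ii) research impracticable or much harder without it
    USEFUL = 3             # (iii) general background or suggested methods
    MINIMUM_INTEREST = 4   # (iv) e.g. included from an historical viewpoint
    NO_INTEREST = 5        # (v) of no interest


def tally_grades(sheet):
    """Count how many cited references received each grade for one question."""
    return Counter(Relevance(grade) for grade in sheet.values())


# Hypothetical completed sheet: reference label -> grade assigned by the author.
sample_sheet = {"ref_a": 2, "ref_b": 3, "ref_c": 3, "ref_d": 5}
counts = tally_grades(sample_sheet)
for grade in Relevance:
    print(grade.name, counts[grade])
```

Using an integer-valued enumeration keeps the ordering of the grades explicit, so that a lower number always denotes a more relevant reference.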
By the end of September we had received the excellent response of 182 completed forms of the 271 sent (67.2%). Some authors wrote to say that they could not spare the time; many other letters were returned because change of address prevented delivery. The authors continued to supply an average of 3½ questions, and the total of those received was 641. Most of these authors, 67.6%, lived in the U.S.A., with 26.9% in Great Britain and 5.5% in other countries. Table 3.2 shows the figures from each country, based on the 182 authors with whom we corresponded. A complete list of the authors is given in Appendix 3F. It is an interesting sidelight on publishing habits to notice that eight of the British authors published in American sources, and nine out of ten of the other foreign authors did the same, but all the authors residing in the U.S.A. published there. Figures are given in Table 3.3. Some of the authors had changed their country of residence by the time of the test, and the figures are based on the country of residence in which their particular research paper was written.

As the forms were being received, the document collection was being made up, and 1,018 unique documents resulted from the cited papers. The base documents themselves were also included in the collection, adding 173 more documents (9 were already included as cited papers), but in order to avoid any possible bias in the results, these base documents are always completely deleted from the results when the questions to which they gave rise are being tested. A further 209 documents, taken from similar sources, brought the whole collection to its final 1,400 documents. For the indexing, which was proceeding during this time, single xerox copies of the documents were made. Full bibliographical information concerning the document collection is given in Appendix 3C.

To prepare for the next stage, 361 of the 640 questions were selected for use in the test.
The basis for this selection was that a question had two or more documents assessed as relevance grade 1, 2 or 3; questions that were grammatically complete were selected first. Some questions were received in abbreviated form, although the missing idea was quite clear from another of the author's questions. For example,