CRANV1P1 ASLIB Cranfield Research Project: Factors Determining the Performance of Indexing Systems: VOLUME 1. Design, Part 1. Text Documents and Questions chapter Cyril Cleverdon Jack Mills Michael Keen Cranfield An investigation supported by a grant to Aslib by the National Science Foundation. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

"... the 'assessment of relevance' categories seemed particularly difficult to interpret in relation to most of these additional documents. I believe that I have 'scored' the documents roughly in proportion to the degree of irritation I should feel if a librarian produced them in response to my original query. Whether this is a proper basis for measurement of relevance may be arguable!"

The relevance assessments that the authors made of their own cited papers reveal some information on the citation habits of authors, but any observations can only be made within the limits of this situation, in which in most cases only a selection of the cited papers was used. A few of the authors assessed all their cited papers as not relevant to the basic questions, and one explicitly stated that he did not find any relevant at all. An analysis of 174 of the basic questions, more than were ultimately used, shows that 36% of the cited papers submitted were assessed as not relevant, and if the marginally relevant papers, graded (4), are included, the figure is 52%. The 118 basic questions in Table 3.6 give figures of 28% and 46% respectively. It may be concluded that about half the references in an author's paper are not included in connection with the main problem of the paper, a fact which may assist examination of the possibilities, and limitations, of bibliographic coupling and citation indexing.
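The two percentages quoted for each question set are simple proportions over the grades the authors assigned to their cited papers. A minimal sketch of that arithmetic follows, assuming grade 5 means "not relevant" and grade 4 means "marginally relevant", as the surrounding text indicates. The function name and the sample grades are hypothetical, invented for illustration, and are not data from the project.

```python
from collections import Counter


def not_relevant_shares(grades):
    """Return (strict, lenient) fractions of cited papers judged not relevant.

    strict  counts only grade 5 (not relevant).
    lenient counts grades 4 (marginally relevant) and 5 together.
    """
    total = len(grades)
    if total == 0:
        raise ValueError("no grades supplied")
    counts = Counter(grades)
    strict = counts[5] / total
    lenient = (counts[4] + counts[5]) / total
    return strict, lenient


# Hypothetical grades for ten cited papers (illustration only, not project data).
example_grades = [1, 2, 2, 3, 3, 4, 4, 5, 5, 5]
strict, lenient = not_relevant_shares(example_grades)
print(f"not relevant: {strict:.0%}; not relevant or marginal: {lenient:.0%}")
```

Applied to the pooled grades for a whole set of questions, the first value would correspond to figures such as 36% or 28%, and the second to figures such as 52% or 46%.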
There were some cases where a cited document was not strictly relevant to any of the search questions at all, as one author honestly explained:-

"I have had some difficulty in classifying some of my references into the required categories: chiefly those which occur at the beginning of the report when I attempt to relate this report to my own previous work. It is difficult to know whether they should be categorised as 3, 4, or 5: from the librarian's point of view they should probably be in category 5, but it is not easy to admit that several of one's references are, strictly, irrelevant to all the questions discussed."

Another good explanation for this case was:-

"In the particular paper of mine a number of references are included, not to give information on the basic search question, nor do they arise from any subsidiary questions; rather they are included to amplify certain details in the text. For example the first three references of my paper are included purely to save time and words in the report, as I felt it completely unnecessary to describe experimental equipment which had been described fully elsewhere. Thus the first three references merit a 'five' rating."

One author supplied us with his reasons for inclusion of six of his references.

"My assessments of reference 3, 6 and 9 refer really to many papers of which these are typical examples; No. 8 was not located - it just happened to turn up at the right time; No. 4 did not come to hand until after the work was completed and the report nearly so; No. 11 was included merely in order to satisfy anyone who wanted a long list."