CRANV2 Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems. Volume 2: Test Environment.

Cyril Cleverdon and Michael Keen, Cranfield.

An investigation supported by a grant to Aslib by the National Science Foundation. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

is always given against any figures which have been calculated on a reduced set. At the single-term level, it can be seen that no searches were made, and therefore no figures can be estimated for the precision or fallout ratios. There were various possible procedures for estimating these figures, and these can be illustrated by reference to Fig. 2.15, which deals with the 35-question subset searched on 1,400 documents by Index Language I.5.a. Since all the questions had seven starting terms, z remains constant throughout. However, at a coordination level of 2, it is shown in column y that only 23 questions were searched. It was found that these 23 questions retrieved 8,565 non-relevant documents together with 157 relevant documents. The simplest way of estimating the total non-relevant for the complete subset of 35 questions would be to scale up the above figure of 8,565 in the ratio of 35/23, which would give a total of 13,033 non-relevant documents. On the basis of this figure the precision and fallout ratios* could now be calculated. A second method is first to determine the precision ratio for the 23 questions searched; in this case it works out at 1.8%. It is known that the 35 questions retrieved 253 relevant documents; to maintain the precision ratio of 1.8%, the total of non-relevant is scaled up by 253/157, namely the ratio of the totals of relevant documents retrieved in the full set and in the subset. This gives a figure of 13,803, and from this the fallout ratio can be calculated.
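As a worked check on the arithmetic above, the two scaling methods can be sketched as follows. This is a minimal illustration using only the figures quoted for Index Language I.5.a at coordination level 2; the variable names are ours, not the report's.

```python
# Figures quoted above for the 35-question subset (Index Language I.5.a,
# coordination level 2). Variable names are illustrative, not from the report.
questions_total = 35      # questions in the full subset
questions_searched = 23   # questions actually searched at this level
nonrel_searched = 8565    # non-relevant documents retrieved by the 23 questions
rel_searched = 157        # relevant documents retrieved by the 23 questions
rel_total = 253           # relevant documents retrieved by all 35 questions

# Method 1: scale the non-relevant total by the ratio of question counts,
# 35/23; truncating the fraction reproduces the report's 13,033.
est_1 = nonrel_searched * questions_total // questions_searched

# Method 2: hold the sample's precision ratio (157/8722, about 1.8%) constant
# for the full set. Working with the rounded 1.8% figure, as the report does,
# reproduces its 13,803 (the exact ratio 253/157 would give 13,802).
precision = round(rel_searched / (rel_searched + nonrel_searched), 3)  # 0.018
est_2 = round(rel_total / precision - rel_total)

print(est_1, est_2)  # 13033 13803
```

Note that the two estimates differ by some 770 documents even on the same data, since Method 2 weights the scaling by relevant documents retrieved rather than by question counts.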
The accuracy of these scaled-up results will depend on whether the sample of questions that were searched is typical of the whole set. It is unlikely that this was the case; as stated earlier, questions were not searched when they would retrieve an excessive number of non-relevant documents, so conversely the questions which were searched, and which are therefore in the sample, were those which had fewer non-relevant documents. Scaling up from the sample could therefore be expected to give a somewhat higher precision figure than was really the case. To check on this, we can consider the actual situation in regard to the same set of questions with Index Language I.1.a, on which, as previously mentioned, searches were made down to the single-term level. In this language, at a coordination level of 2, the 23 questions retrieved 3,871 documents. By the methods already suggested, the estimated figures would have been 6,043 and 6,476 respectively. In fact, the correct figure is 8,086, which bears out the expectation expressed in the previous paragraph. This was also checked at the coordination level of 3, and again it was found that the remaining 12 searches retrieved

*The method of calculating these ratios is discussed in Chapter 3.