CRANV2 Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2 Simulated ranking and document output cut-off chapter Cyril Cleverdon Michael Keen Cranfield An investigation supported by a grant to Aslib by the National Science Foundation. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

Such recall and precision figures can be plotted on a conventional graph as in Fig. 5.4P, which shows the results of index language I.1.a (as in Fig. 5.3T) and also index language I.9.a. These curves can be compared with Fig. 4.206P and show the same superiority of index language I.1.a over index language I.9.a.

There is, however, an important difference. The positions of the points in Fig. 4.206P were determined by coordination level cut-offs, and were therefore random in relation to each other. With Fig. 5.4P, if straight lines are drawn radiating from the point of origin, these will, as can be seen, pass through the corresponding points in each curve. This is because the cut-off is based on document output, so that the recall and precision ratios are now interdependent.

It is known that there are 198 documents relevant to the 42 questions, so, on average, 4.7 documents are relevant to each question. When only one document is retrieved for each question, even if every such document were relevant, the recall ratio could not possibly be higher than 42/198 = 21.2%, although it would, of course, represent a precision ratio of 100%. If any of the documents are not relevant, then the performance point will always fall somewhere along the line which goes from the point of origin to a recall of 21.2% at 100% precision. Therefore at any given document output cut-off, a drop in recall ratio with any one system as against any other system must also involve a drop in the precision ratio.
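The interdependence described above can be checked with a little arithmetic. The following is only a minimal sketch, assuming the totals quoted in the text (198 relevant documents, 42 questions); the function names are hypothetical and are not part of the original report. It illustrates that, at a fixed document output cut-off, recall divided by precision does not depend on how many relevant documents were actually retrieved, which is why every system's point lies on the same line through the origin.

```python
from fractions import Fraction

TOTAL_RELEVANT = 198  # relevant documents over all questions (from the text)
NUM_QUESTIONS = 42    # questions in the test set (from the text)


def recall_precision(hits, cutoff):
    """Return (recall, precision) when `cutoff` documents are retrieved for
    each question and `hits` of all retrieved documents are relevant."""
    retrieved = cutoff * NUM_QUESTIONS
    return Fraction(hits, TOTAL_RELEVANT), Fraction(hits, retrieved)


def theoretical_maximum(cutoff):
    """Upper bound on (recall, precision) at a cut-off, ignoring how the
    relevant documents are distributed over individual questions."""
    retrieved = cutoff * NUM_QUESTIONS
    best_hits = min(retrieved, TOTAL_RELEVANT)
    return recall_precision(best_hits, cutoff)


def line_slope(cutoff):
    """Recall / precision at a cut-off; identical for every system."""
    return Fraction(cutoff * NUM_QUESTIONS, TOTAL_RELEVANT)


if __name__ == "__main__":
    recall, precision = theoretical_maximum(1)
    print(f"cut-off 1: max recall {float(recall):.1%}, precision {float(precision):.1%}")

    # Two hypothetical systems with different numbers of relevant hits
    # still share the same recall/precision ratio at cut-off 1.
    for hits in (10, 30):
        r, p = recall_precision(hits, 1)
        print(hits, r / p == line_slope(1))
```

Exact fractions are used rather than floats so that the equality check on the slope is not disturbed by rounding; the hit counts of 10 and 30 are made-up illustrations, not results from the test.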
Similarly, when two documents are retrieved in each search, the maximum recall ratio is 42.4%, and with this particular document/question set, 100% recall cannot possibly be reached until at least five documents are retrieved for each question. This would, however, represent a total of 210 documents. Since there are only 198 relevant documents in the collection, the theoretical maximum precision ratio would then be 198/210 × 100 = 94.3%. As more documents are retrieved, so the maximum possible precision ratio must drop, and these document output cut-off performance lines can be calculated as has been done in Fig. 5.4P.

Because Question 141 had only one relevant document, it would not be possible in this collection to obtain the theoretically maximum figures for recall and precision beyond the single document cut-off level. Similarly, there are thirteen questions which have more than five relevant documents, and 100% recall could not possibly be obtained until twelve documents have been retrieved, this number representing the highest figure for documents relevant to a single question. This does not affect the position of the lines, which would be different, however, for other situations where there are more or fewer relevant documents per question.

As previously mentioned, it is not possible to obtain the theoretically maximum performance beyond the single document output cut-off, since Q141 has only one relevant document. As ten questions have only two relevant documents, there must be a further deviation from the theoretical maximum beyond this stage. Fig. 5.5P shows the maximum performance that could actually be obtained with this collection. Achieving this performance would imply that for each question all the relevant documents were retrieved before any non-relevant documents were retrieved. In Fig.
5.4P the lines radiating from the point of origin have been based on the document output cut-off for this particular test situation, but the performance curves could be drawn on a polar coordinate graph with the lines radiating at regular intervals as in Fig. 5.6P. The original purpose of using this type of graph was to investigate the possibility that