CRANV2 Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2 Citation indexing and bibliographic coupling chapter Cyril Cleverdon Michael Keen Cranfield An investigation supported by a grant to Aslib by the National Science Foundation. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

FIGURE 7.12T

Index Language                       Citation Indexing and Bibliographic Coupling (Weighted)
Documents Relevance                  1-4
Number of Documents in Collection    1400
Number of Questions                  42 (Subset 2)
Number of Relevant Documents         198
Generality Number                    3.4

Weighted     Documents Retrieved    Recall    Precision    Fallout
Coupling     Rel.      Non-rel.     Ratio     Ratio        Ratio
Strength
150+         131       5495*        66.1%     2.3%*        9.377%*
81-150       122       1758         61.6%     6.5%         3.000%
51-80        111       1387         56.0%     7.4%         2.367%
31-50         90        854         45.5%     9.5%         1.457%
21-30         67        432         33.6%     13.4%        0.737%
16-20         51        207         25.7%     19.7%        0.353%
11-15         36        126         19.1%     23.1%        0.221%
6-10          23         59         11.6%     26.0%        0.101%
3-5           12         20         6.0%      37.5%        0.034%
1-2            2          1         1.0%      66.7%        0.002%

final figure of relevant and non-relevant documents retrieved, but replaces the groups formed at the various coupling levels as given in earlier totals with new groups based on the weighted scores. The results on the 42 aerodynamic questions are shown in Fig. 7.12T; although different groups are formed, there appears to be little variation from the performance for the same document/question set presented in Fig. 7.2T. As stated in the opening chapter of this volume, we have considerable reservations in presenting these results, in particular when it comes to attempting to make comparison with the performance obtained by conventional methods. One thing that can be stated positively is that the same inverse relationship exists; bibliographic coupling is a precision device which has very much the same effect as coordination in a conventional system.
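The ratios in Fig. 7.12T can be checked from the document counts. The sketch below is illustrative only: the function names are hypothetical, and the definitions are assumptions rather than statements from this chapter. Recall is taken as relevant retrieved over total relevant, precision as relevant retrieved over total retrieved, fallout as non-relevant retrieved over total non-relevant, and the generality number as relevant pairs per 1000 question/document pairs. Under those assumptions the totals in the figure heading reproduce the printed values for the sample rows used here.

```python
# Totals taken from the heading of Fig. 7.12T.
QUESTIONS = 42
COLLECTION_SIZE = 1400
TOTAL_RELEVANT = 198

# Assumption: every question/document pair that is not a relevant pair
# counts as non-relevant when computing fallout.
TOTAL_NON_RELEVANT = QUESTIONS * COLLECTION_SIZE - TOTAL_RELEVANT


def ratios(relevant_retrieved, non_relevant_retrieved):
    """Return (recall, precision, fallout) as percentages (assumed definitions)."""
    retrieved = relevant_retrieved + non_relevant_retrieved
    recall = 100.0 * relevant_retrieved / TOTAL_RELEVANT
    precision = 100.0 * relevant_retrieved / retrieved
    fallout = 100.0 * non_relevant_retrieved / TOTAL_NON_RELEVANT
    return recall, precision, fallout


def generality_number():
    """Relevant pairs per 1000 question/document pairs (assumed definition)."""
    return 1000.0 * TOTAL_RELEVANT / (QUESTIONS * COLLECTION_SIZE)


if __name__ == "__main__":
    # Two rows of Fig. 7.12T, given as (relevant, non-relevant) retrieved.
    for rel, non_rel in [(122, 1758), (2, 1)]:
        recall, precision, fallout = ratios(rel, non_rel)
        print(f"{rel:4d} {non_rel:5d}  recall {recall:.1f}%  "
              f"precision {precision:.1f}%  fallout {fallout:.3f}%")
    print(f"generality number {generality_number():.1f}")
```

The two sample rows show the inverse relationship directly: the row retrieving 122 relevant documents has high recall and low precision, while the row retrieving 2 relevant documents has the reverse.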
Since approximately 12% of the documents did not contain any references, it was inevitable that the maximum recall ratio should fall well short of 100%. In the event it appears that, with this collection, something around 70% recall might be expected; for any recall ratio lower than this, the performance appears to compare quite favourably with conventional indexing,