MONO91 NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report Problems of Evaluation chapter Mary Elizabeth Stevens National Bureau of Standards

However, some of the findings are pertinent to our present questions of evaluation. Thus, of 492 items selected by Documentation, Inc., that ASTIA considered pertinent but had not selected, 98 were missed by them although the proper subject heading was searched and the catalog card had adequate selection clues, 89 were missed because not all applicable subject headings were searched, 21 were missed because the original subject heading assignments had been inadequate, 7 were missed because neither title nor abstract provided indication that the report itself was pertinent to the request, and 102 were missed "because the subject heading did not occur to the searcher or because there were so many cards under the subject heading that the searcher was discouraged." 1/ Similarly, Gull reports, of 318 items selected by ASTIA that Documentation, Inc. personnel considered relevant but had not themselves selected, 97 were missed because the searcher did not consult the proper terms.

7.2.1 The Cranfield Project

The inauguration of the Cranfield project is itself indicative of a prior lack of objective standards as applied to the measurement of effectiveness of information indexing, selection and retrieval systems. 2/ Beginning in 1957, and still continuing with respect to individual indexing devices such as synonym controls and role indicators, this work has attempted to compare different indexing systems (e.g., UDC, Uniterm, etc.) under different indexing conditions (e.g., type of training of indexer, length of time allowed to index) against proposed measures of "retrieval effectiveness".
These measures are, respectively, the recall ratio, or the percentage of relevant documents retrieved as against the total number of relevant documents known to be in the collection, and the relevance ratio, or the percentage of relevant documents among those actually retrieved. In the first Cranfield tests, on 18,000 documents, it is reported that the recall ratio ranged between 75 and 85 percent for all four indexing systems. 3/ These results are

1/ Gull, 1956 [246], p. 329.

2/ Compare, for example, Randall, 1962 [492], pp. 380-381: "Prior to 1957, the proponents of the various indexing and classification schemes, the universal decimal system, the alphabetic subject heading, the Uniterm system and faceted classification touted their own system on the bases of subjective evaluation and theoretical investigations. There were many claims and much supposition about the relative merits and benefits ... but there was no body of data from which an objective evaluation could be made. ... Many observers believe that the Cranfield study constitutes the most important work done in the field of cataloging in recent times."

3/ Cleverdon, et al, 1964 [130], p. 87.
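
The two measures defined in the text, the recall ratio and the relevance ratio, can be illustrated with a minimal sketch. The function names and the document numbers below are hypothetical, chosen only for illustration; they are not taken from the Cranfield reports, and the sketch assumes that the set of relevant documents for a request is known in advance.

```python
def recall_ratio(retrieved, relevant):
    """Percentage of the known relevant documents that were retrieved."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("no relevant documents are known for this request")
    return 100.0 * len(set(retrieved) & relevant) / len(relevant)


def relevance_ratio(retrieved, relevant):
    """Percentage of the retrieved documents that are relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        raise ValueError("no documents were retrieved")
    return 100.0 * len(retrieved & set(relevant)) / len(retrieved)


# Hypothetical request: documents 1-10 are relevant, and a search
# retrieves documents 3-18, so 8 of the 16 retrieved are relevant.
relevant = set(range(1, 11))
retrieved = set(range(3, 19))

recall = recall_ratio(retrieved, relevant)        # 8 of 10 relevant found
relevance = relevance_ratio(retrieved, relevant)  # 8 of 16 retrieved are relevant
print(f"recall ratio: {recall:.0f} percent")
print(f"relevance ratio: {relevance:.0f} percent")
```

In this made-up case the recall ratio is 80 percent and the relevance ratio is 50 percent. The two figures have the same numerator but different denominators, so one can be high while the other is low.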