MONO91 NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report Problems of Evaluation chapter Mary Elizabeth Stevens National Bureau of Standards

However, the question of the objectives of the system brings us back full circle to the questions of purpose in terms of particular requirements, of quality, and of how to measure either purpose or quality. Thus we may determine that an automatic indexing procedure produces a product at least as rapidly, at least as inexpensively, at least as consistently as human indexing operations would, and with substantially less investment of manpower resources. However, will this product be as useful or as "good" as the human product? 1/

In view of the many caveats about the present quality of indexing systems and the lack of standards for measuring quality, 2/ it is important to recognize that we should compare the products of automatic indexing methods "not with hand-crafted excellence, but with the average, the routine output of the over-burdened subject analyst working with the deficiencies of any other indexing system". 3/ Such deficiencies include the critical question of how well and how consistently the system, whatever it is, is applied in practice by the human analysts.

7.3 Findings with Respect to Inter-Indexer and Intra-Indexer Consistency

Very few objective studies, despite the obvious relationship to the general questions of quality, pertinency, and reliability of indexing, have as yet been made of inter-indexer and intra-indexer consistency. Perhaps the first investigation both to obtain experimental data and to analyze the observed types of failures to achieve correct assignments was that of Lilley. 4/ He took the answers made to 6 questions by 340 students entering a graduate library school, wherein they were asked to write down the subject headings which they would expect to be applied to other books on the same subject as 6 "sample books" in a system such as the Library of Congress card catalog.
Lilley reports:

1/ See, for example, in addition to comments by [OCRerr] and others previously quoted, Helyar, 1961 [262], p. 110: "The general current of feeling of the meeting as reflected both in the papers and in the discussion is that the standard of indexing is not nearly adequate;" Artandi, 1963 [22], p. 1: "... 'Good indexing' as such has not been defined satisfactorily and is the function of many variables, some known, others not yet identified"; Tritschler, 1963 [610], p. 5: "... 'Good' indexing is extremely difficult to describe and 'perfect' indexing is impossible to define or measure."

2/ See Cleverdon, 1960 [124], p. 429: "The most important requirement in information retrieval is a recognized standard of measurement and after that we need a satisfactory method of measuring. Only when these have been found will it be possible to know for certain whether any new system of indexing or retrieving information is an improvement on previous methods. At present all those trying to solve the problems of information retrieval are working very much in the dark, uncertain as to the real problems and quite unable to apply any measurements to their proposed solutions."

3/ Kennedy, 1962 [311], p. 126.

4/ Lilley, 1954 [360]. See also Vickery, 1960 [626], p. 157 4.