NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Appendix B: Progress and Prospects in Mechanized Indexing
Mary Elizabeth Stevens, National Bureau of Standards

We may thus conclude that the progress and prospects of automatic indexing, as of September 1966, are both provocative and challenging. They are "provocative" because so much in terms of both practical and theoretical accomplishment has already been demonstrated, and "challenging" because so much remains to be done. Further, what remains to be done will in all probability require serious, intensive, and imaginative investigations of a wide variety of questions, from the relative usage and acceptability of a KWIC index, through possible changes in author and editor practices, to the fundamental questions of semantics and human judgment.

Nevertheless, when the results of automatic classification or automatic indexing procedures reach levels of 70 percent or better mean agreement, either with human indexers or with potential users evaluating the relevance of items retrieved by such indexing, then the machine methods should be preferred to routine, run-of-the-mill manual indexing wherever the costs are at least commensurate. The technical feasibility of achieving such performance levels for a relatively small number of classification categories or a relatively small vocabulary of index terms has already been demonstrated experimentally. There remain unresolved questions of the extent to which it will be possible to apply such techniques to the larger vocabulary requirements and the practical operating considerations in actual collections.

Assuming that we can solve these problems, however, many advantages will accrue. First is the speed with which many items can be indexed: a few minutes, or hours at most, for, say, 10,000 items. Secondly, there are advantages of timeliness and the ease with which an entire collection can be re-indexed or re-classified.
A third advantage is the consistency of the machine procedures, especially as compared with the inconsistency to be noted in available data on tests of comparative performance among indexers.

The advantage of the ability to re-index quickly, easily, and inexpensively (because most input costs will have been incurred previously) is of major importance in terms of overcoming present barriers to the introduction of improvements in operating systems (since, as Kyle L'/ points out, "The most common reason for not trying new and/or improved techniques of classification and indexing is the difficulty of reclassifying and re-indexing large collections") and in terms of dynamic revision and up-dating (as Borko 37/ emphasizes).

Another advantage, particularly of methods using teaching samples, is (as suggested by Mooers as early as 1959 52/) the capability for making assignments of indexing terms in, say, an English language system to items whose texts are written in other languages: French, German, or Russian. This type of advantage can point the way to greater international collaboration in indexing and document control procedures.

A further possibility is suggested by the convergence of automatic indexing techniques based upon teaching samples with adaptive selective dissemination systems and client feedback possibilities, especially those involving "more-like-this" requests. If we assume a large-scale, multiple-access system with adequate personalized files for the typical client, the common data bank of document identificatory and selection criteria, condensed representations, and full text (if available) can be selectively accessed by him on the basis of automatic indexing generated by his own choice of selection criteria and his own choice of exemplar items for each such criterion.
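
The idea of indexing driven by a client's own exemplar items can be illustrated with a small sketch. It is not a method taken from this report: it assumes that each item is reduced to a bag of lowercased words, that the exemplars for a criterion are summed into one profile, and that a cosine similarity at or above an arbitrary threshold counts as "more like this". The function names, the threshold value, and the sample texts are all hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Lowercased word counts: a crude stand-in for a condensed representation."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profile(exemplars_by_criterion):
    """Sum the word counts of the client's exemplar items for each criterion."""
    profile = {}
    for criterion, texts in exemplars_by_criterion.items():
        total = Counter()
        for text in texts:
            total.update(term_vector(text))
        profile[criterion] = total
    return profile


def assign_criteria(document, profile, threshold=0.5):
    """Return the criteria whose exemplar profile the document resembles.

    The threshold is an assumed cut-off, not a value from the source.
    """
    vec = term_vector(document)
    return sorted(
        criterion
        for criterion, exemplar_counts in profile.items()
        if cosine(vec, exemplar_counts) >= threshold
    )


# Hypothetical client file: two criteria, each defined only by exemplar items.
profile = build_profile({
    "indexing": ["automatic indexing of documents", "machine indexing of text"],
    "translation": ["machine translation of russian text"],
})
print(assign_criteria("automatic indexing of russian documents", profile))
```

Because each client supplies different exemplars, the same common data bank would be indexed differently for each client. Re-indexing after a "more-like-this" request amounts to adding the new exemplar and rebuilding that client's profile. A practical system would need at least stop-word handling and a tested threshold, both of which this sketch omits.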