NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Mary Elizabeth Stevens
National Bureau of Standards

A state-of-the-art survey of automatic indexing systems and experiments has been conducted by the Research Information Center and Advisory Service on Information Processing, Information Technology Division, Institute for Applied Technology, National Bureau of Standards. Consideration is first given to indexes compiled by or with the aid of machines, including citation indexes. Automatic derivative indexing is exemplified by key-word-in-context (KWIC) and other word-in-context techniques. Advantages, disadvantages, and possibilities for modification and improvement are discussed. Experiments in automatic assignment indexing are summarized. Related research efforts in such areas as automatic classification and categorization, computer use of thesauri, statistical association techniques, and linguistic data processing are described. A major question is that of evaluation, particularly in view of evidence of human inter-indexer inconsistency. It is concluded that indexes based on words extracted from text are practical for many purposes today, and that automatic assignment indexing and classification experiments show promise for future progress.

1. INTRODUCTION

This report of the Research Information Center and Advisory Service on Information Processing (RICASIP) 1/ is one of a series intended as contributions to improved cooperation in the fields of information selection systems development, information retrieval research and mechanized translation. In each of these areas, automatic techniques for linguistic data processing are receiving increased attention.
This report covers a state-of-the-art survey of current progress in linguistic data processing as related to the possibilities of automatic mechanized indexing. Insofar as has been practical, the survey of the literature on which this report is based has been made through February 1964. It has concentrated on the major developments in and related demonstrations of automatic indexing potentialities. Examples are also given of indexes compiled by machine and of potentially related research efforts in such areas as natural language text searching, statistical association techniques used for search and retrieval, and proposed systems for concept processing. There are, undoubtedly, various omissions. Neither the inclusion of reports on various specific experiments and techniques nor the omission of others is intended to reflect an endorsement as such of those that are included or an adverse evaluation of those that are not mentioned.

1/ Initiated at the instigation of the National Science Foundation. RICASIP is jointly supported by NSF and NBS.