MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Automatic Indexing
chapter
Mary Elizabeth Stevens
National Bureau of Standards
AUTOMATIC INDEXING
A State-of-the-Art Report
Mary Elizabeth Stevens
A state-of-the-art survey of automatic indexing systems
and experiments has been conducted by the Research Informa-
tion Center and Advisory Service on Information Processing,
Information Technology Division, Institute for Applied Tech-
nology, National Bureau of Standards. Consideration is first
given to indexes compiled by or with the aid of machines,
including citation indexes. Automatic derivative indexing is
exemplified by key-word-in-context (KWIC) and other word-
in-context techniques. Advantages, disadvantages, and possi-
bilities for modification and improvement are discussed.
Experiments in automatic assignment indexing are summarized.
Related research efforts in such areas as automatic classifi-
cation and categorization, computer use of thesauri, statistical
association techniques, and linguistic data processing are
described. A major question is that of evaluation, particularly
in view of evidence of human inter-indexer inconsistency. It
is concluded that indexes based on words extracted from text
are practical for many purposes today, and that automatic
assignment indexing and classification experiments show
promise for future progress.
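
As an illustration of the derivative (word-extraction) indexing mentioned above, the following is a minimal sketch of a key-word-in-context listing. It is not taken from any system covered in this report: the function name kwic_index, the short stop list, and the fixed context width are all assumptions made for the example. Each significant word of a title becomes an index entry, shown with the words that surround it, and the entries are sorted by keyword.

```python
# Minimal KWIC sketch. STOP_WORDS is a hypothetical, deliberately short
# stop list; it is an assumption of this example, not a list from the report.
STOP_WORDS = {"a", "an", "and", "of", "the", "in", "for", "on", "to"}


def kwic_index(titles, width=30):
    """Return (keyword, line) pairs sorted by keyword.

    Each line shows the keyword with up to `width` characters of the
    title on either side of it.
    """
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            key = word.strip(".,;:()").lower()
            if not key or key in STOP_WORDS:
                continue  # skip non-significant words
            left = " ".join(words[:i])[-width:]
            right = " ".join(words[i + 1:])[:width]
            # Right-align the left context so keywords line up in a column.
            entries.append((key, f"{left:>{width}}  {word}  {right}"))
    entries.sort(key=lambda entry: entry[0])
    return entries


if __name__ == "__main__":
    sample = ["Automatic Indexing of Text", "Statistical Association Techniques"]
    for _, line in kwic_index(sample):
        print(line)
```

The sketch extracts words directly from the title text and assigns nothing from outside it, which is the sense in which such indexing is derivative rather than assignment indexing.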
1. INTRODUCTION
This report of the Research Information Center and Advisory Service on Information
Processing (RICASIP) 1/ is one of a series intended as contributions to improved co-
operation in the fields of information selection systems development, information re-
trieval research and mechanized translation. In each of these areas, automatic tech-
niques for linguistic data processing are receiving increased attention. This report
covers a state-of-the-art survey of current progress in linguistic data processing as
related to the possibilities of automatic mechanized indexing. Insofar as has been
practical, the survey of the literature on which this report is based has been made
through February 1964.
It has concentrated on the major developments in and related demonstrations of auto-
matic indexing potentialities. Examples are also given of indexes compiled by machine
and of potentially related research efforts in such areas as natural language text
searching, statistical association techniques used for search and retrieval, and proposed
systems for concept processing. There are, undoubtedly, various omissions. Neither
the inclusion of reports on various specific experiments and techniques nor the omission
of others is intended to reflect an endorsement as such of those that are included or an
adverse evaluation of those that are not mentioned.
1/ Initiated at the instigation of the National Science Foundation. RICASIP is jointly
supported by NSF and NBS.