MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Appendix B: Progress and Prospects in Mechanized Indexing
appendix
Mary Elizabeth Stevens
National Bureau of Standards
The term mechanized indexing can be interpreted in two different ways: as involving
the use of machines to produce indexes once the index entries have been pre-determined
manually, or as involving the use of machines to select the index entries as well as to
prepare the indexes.
The first interpretation, that of machine compilation of indexes, is perhaps best
represented by the progressively more sophisticated mechanization used for the production
of Index Medicus from manual "shingling", through sequential card camera operations, to
the computer-based system using a high-speed phototypesetter, the Photon GRACE 1, 2/.
As noted elsewhere in this report, machine capabilities have made practical the prepara-
tion of citation indexes. In general, however, machine-compiled indexes work with the
results of human intellectual efforts as applied in the subject content analysis of documents.
We also find machines used to provide aids to the indexer. Two different tools may be
employed to improve the quality of indexing. There are prescriptive aids in the sense of
limiting and rigorously defining the scope of index terms to be used, and there are
suggestive aids in the sense of provoking ideas about additional terms that might be used.
The first type may involve a mechanized authority list or thesaurus used to normalize
proposed index term entries, as has been demonstrated by Schultz 3/ and Schultz and
Shepherd 4/ from 1960 onward. The potential value of this technique is indicated by further
investigations of Schultz et al. 5/ in which it was found that index terms proposed by authors
agreed more with terms employed by more than one member of a typical user group than
did terms available in the document titles. Another example of developments in the use of
a mechanized thesaurus is the system at Lockheed Missiles and Space Division, Palo
Alto 6/.
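The normalizing role of a mechanized authority list can be illustrated with a minimal
sketch. It is not a reconstruction of the Schultz or Lockheed systems; the vocabulary
entries and the function name normalize_terms are hypothetical, chosen only to show
proposed terms being mapped to preferred terms or rejected as outside the vocabulary.

```python
# Hypothetical authority list: each accepted variant maps to its preferred term.
AUTHORITY = {
    "computers": "computers",
    "computing machines": "computers",
    "information retrieval": "information retrieval",
    "document retrieval": "information retrieval",
}


def normalize_terms(proposed, authority=AUTHORITY):
    """Split proposed terms into accepted preferred terms and rejected terms."""
    accepted, rejected = [], []
    for term in proposed:
        # Fold case and collapse runs of whitespace before the look-up.
        key = " ".join(term.lower().split())
        preferred = authority.get(key)
        if preferred is None:
            rejected.append(term)          # not in the system vocabulary
        elif preferred not in accepted:
            accepted.append(preferred)     # keep each preferred term once
    return accepted, rejected
```

Rejected terms are returned rather than silently dropped, so that a human indexer could
decide whether they point to a gap in the vocabulary.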
This type of tool is used to check proposed indexing terms against the terms of the
system vocabulary, to prescribe choices between synonyms and different levels of
specificity, and to supply syndetic devices such as "see also" references. Computer
manipulations of thesauri can also be used to diversify search questions and to provide
useful groupings of terms previously used in the system. The mechanized thesaurus can
thus serve as the second type of aid by suggesting to the human indexer additional terms he
might use. In effect, such a thesaurus provides a display of prior term-term, document-
term and document-document associations observed in a particular collection, such as was
demonstrated in the form of special purpose equipment in Taube's "EDIAC" 7/ and the
"ACORN" devices at A. D. Little 8/.
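The suggestive use of prior term-term associations can be sketched as follows. This is
an illustration under stated assumptions, not a description of the EDIAC or ACORN
equipment: associations are taken to be simple counts of how often two terms were
assigned to the same document, and the names build_associations and suggest_terms are
hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_associations(doc_terms):
    """Count, for each term, how often every other term shares a document with it."""
    assoc = defaultdict(Counter)
    for terms in doc_terms.values():
        for a, b in permutations(sorted(set(terms)), 2):
            assoc[a][b] += 1
    return assoc


def suggest_terms(chosen, assoc, limit=3):
    """Rank terms not yet chosen by their total co-occurrence with the chosen terms."""
    scores = Counter()
    for term in chosen:
        for other, count in assoc.get(term, {}).items():
            if other not in chosen:
                scores[other] += count
    # Descending count, then alphabetical order, so that ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:limit]]
```

Given the terms an indexer has already chosen, suggest_terms returns the terms most
often assigned alongside them in the collection, which the indexer may accept or ignore.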
The associational thesaurus can also be used to aid in the resolution of ambiguities of
natural language and to provide for updating in the light of changing terminologies or
changes in the subject scope of a collection. What are the prospects for automatic updating
and revision of a mechanized thesaurus? Luhn 9/ has suggested that a record of the
number of times words and groups are looked up would be "an indispensable part of the
system for making periodic adjustments based on the usage of words or notions as
mechanically established."
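A record of look-ups of the kind Luhn suggests might be kept as in the sketch below.
The class LookupRecord, its method names, and the min_uses threshold are assumptions
made for illustration; the source does not specify how the periodic adjustments would
be derived from the counts.

```python
from collections import Counter


class LookupRecord:
    """Tally look-ups so that the vocabulary can be reviewed periodically."""

    def __init__(self, vocabulary):
        self.vocabulary = set(vocabulary)
        self.counts = Counter()

    def look_up(self, term):
        """Record one look-up and report whether the term is in the vocabulary."""
        self.counts[term] += 1
        return term in self.vocabulary

    def review_candidates(self, min_uses=1):
        """Return (vocabulary terms used fewer than min_uses times,
        looked-up terms that are missing from the vocabulary)."""
        unused = sorted(t for t in self.vocabulary if self.counts[t] < min_uses)
        missing = sorted(t for t in self.counts if t not in self.vocabulary)
        return unused, missing
```

Terms that are never looked up are candidates for removal or merging, and terms that
are looked up but absent are candidates for addition; either decision is left to the
reviewer.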
Another suggestion for the development of mechanized aids in human indexing
procedures has been made by Markus 10/. This is to "explore the possibility of applying
programmed teaching to indexing, with or without machines."
Machine-compiled indexes rest upon the efficacy of human indexing and there is
increasing reason to doubt that this will be "good enough" for the future. It appears that
there is a growing consensus with respect to inadequacies of present scope and coverage
of indexing services. Cheydleur 11/ emphasizes that: "The cost of manual classification
and abstracting of all the articles in the world's hundred-thousand technical periodicals
would be fantastic. The practicality of carrying it out in a coordinated and timely way by