MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Indexes Generated by Machine-Automatic Derivative Indexing
chapter
Mary Elizabeth Stevens
National Bureau of Standards
3. INDEXES GENERATED BY MACHINE--AUTOMATIC DERIVATIVE INDEXING
We have noted, in the earlier statement of the scope of this survey, a distinction
between "derivative" and "assignment" indexing. This distinction is related directly to
the question: "Is what can be done by machine properly termed 'abstracting', 'indexing',
or 'classifying'?" It relates also, as we have remarked, to a continuing controversy far
older than any question of the introduction of machine techniques--that between "word"
and "concept" indexing, between "uniterms" if selected directly from the text and
"descriptors" in the sense of their being indexing terms selected so as to have "a care-
fully specified meaning for retrieval", to say nothing of contrasts with subject heading
schemes and classification schedules.
Some of the major arguments pro and con derivative (usually word) and assignment
(usually concept) indexing will be considered in a subsequent section of this report on the
problems of evaluating indexing methods. Nevertheless, the present popularity of
automatic derivative indexes of the KWIC type, while subject to all the disadvantages
typically cited for all purely derivative indexing systems, does show the actuality of
automatic indexing potentialities and may in fact hold the promise of solving some of the
present-day problems of subject control.
In this section, we shall consider first the straightforward word extraction tech-
niques used in KWIC type indexes. Possibilities for modified derivative indexing by
title augmentation, manipulation of word groups and use of special clues in keyword
selection are then discussed, including work by Baxendale, Luhn, and Artandi. Related
research and development efforts in automatic abstracting which lend themselves
to derivation of indexing terms include proposals and experiments by Luhn, Oswald,
Edmundson, Wyllys, Doyle, and Lesk and Storm, among others. Some comments will
be given on the quality of modified derivative indexing by machine. Automatic derivative
indexing at the time of search, as in the natural language text searching systems of
Swanson, Maron, Kuhns, and Ray, and Eldridge and Dennis, will be discussed in a later
section of this report.
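As a rough illustration of the straightforward word extraction on which KWIC type
indexes depend, the following minimal sketch in Python generates a permuted-title
keyword index. The stop list, the function name kwic_index, the column width, and the
sample titles are all assumptions introduced here for illustration; they are not taken
from any of the programs surveyed in this report.

```python
import re

# Hypothetical stop list of non-significant words (an assumption for this
# sketch; an operational index would use a much longer list).
STOP_WORDS = {"a", "an", "and", "by", "for", "in", "of", "on", "the", "to"}


def kwic_index(titles, width=30):
    """Return (keyword, line) pairs, one per significant word of each title.

    Each line places the keyword at a fixed column, with the words that
    precede and follow it in the title as context, then a title number.
    """
    entries = []
    for number, title in enumerate(titles, start=1):
        words = title.split()
        for i, word in enumerate(words):
            # Normalize the word to obtain the sort key.
            key = re.sub(r"[^a-z0-9]", "", word.lower())
            if not key or key in STOP_WORDS:
                continue
            left = " ".join(words[:i])[-width:]   # context before the keyword
            right = " ".join(words[i:])[:width]   # keyword and what follows
            line = f"{left:>{width}}  {right:<{width}}  {number:03d}"
            entries.append((key, number, line))
    # Alphabetical order by keyword, then by title number.
    entries.sort(key=lambda e: (e[0], e[1]))
    return [(key, line) for key, _, line in entries]


titles = [
    "Automatic Indexing of Chemical Titles",
    "The Indexing of Biological Abstracts",
]
for _, line in kwic_index(titles):
    print(line)
```

Every word not on the stop list becomes an index entry, so the result is purely
derivative: no term appears in the index that does not appear in a title.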
3.1 KWIC Indexes
The development of computer-generated permuted-title keyword indexes, especially
in the issuances of Chemical Titles and B.A.S.I.C. (Biological Abstracts-Subjects-In-
Context), has been hailed by some as "the miracle of the decade" and "the greatest thing
to happen in chemistry since the invention of the test tube". The major reason for
the optimistic enthusiasm is the speed with which the computer can produce
a complete index to some specific set of books, documents or papers so that publication
and dissemination of the index can be prompt and thus serve as an important tool in
1/ Mooers, 1963 [423], p. 3.
2/ See pp. 132-136.
3/ Quoted by D. R. Baker, statement in "U.S. Congress, Senate Committee on
Government Operations", 1960 [619], p. [OCRerr]69.