MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Indexes Generated by Machine-Automatic Derivative Indexing
chapter
Mary Elizabeth Stevens
National Bureau of Standards
3.2 Modified Derivative Indexing
Some of the more obvious disadvantages of KWIC indexing techniques can be
reduced, if not eliminated, by a variety of human and machine procedures. These include
augmentation of titles to provide additional clues to subject aspects, manual post-editing,
and synonym reduction through such devices as thesaurus lookups.
The ink was scarcely dry on the first issues of a KWIC index before a number of
suggestions for improvements, modifications, and augmentations were proffered in the
literature. In fact, both Luhn and Baxendale considered various possible refinements in
their original proposals. The first systematic review of work in the field of automatic
extracting, whether to produce indexes or abstracts or both, was made by Edmundson
and Wyllys in 1961 [181]. They covered not only the KWIC type indexes as such, but also
modifications suggested by Baxendale, Luhn, Oswald and others, and they themselves
advanced a number of additional possibilities. Of the various modifications and
refinements that have been suggested, the most obvious is that of title augmentation.
3.2.1 Title Augmentation
The machine-prepared index that was probably the first to go into productive
operation is actually one involving title and subject indicators rather than pure
keyword-from-title permutations. The CIA project, beginning in 1952, is based upon
manual pre-editing of the titles themselves, with the words to be picked up as index
entries being underlined. In addition, it involves assignment of other words,
descriptors or terms from a hierarchical classification schedule to indicate
additional access points (Veilleux, 1961 [624]).
In later KWIC type indexing, the possibilities of improving effectiveness by
pre-editing or post-editing to modify and expand titles have been suggested and explored by a
number of investigators. The semi-automatic indexing reported by Janaske adds
descriptive words or phrases in parentheses at the end of titles and uses them as
additional indexing points (Janaske, 1962 [299]). At Biological Abstracts Service,
improvements have been obtained (without sacrifice in the speed desired in order to index
5,000 abstracts twice a month) by title supplementation as well as by an improved stop
list and by post-editing word divisions and word recombinations. 1/ Titles for each of
two 12,000-item bibliographies in the field of radiobiology are reported as being edited
considerably before KWIC type processing. 2/ Other examples of modified derivative
indexing based on title augmentation include Chemical Patents, 3/ the Applied Physics
Letters indexing project at Oak Ridge National Laboratory, which provides for an
author-prepared form to describe features of property and method not covered in the
title, 4/ and the KWIC Index to Neurochemistry ([420]).
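The general idea of title augmentation can be illustrated with a minimal sketch. It is not a reconstruction of any of the systems cited above. It assumes only what the text describes: descriptive terms are appended in parentheses at the end of a title, a stop list suppresses non-significant words, and every remaining word becomes an index entry. The function name kwic_entries, the short stop list, the "/" end-of-title marker, and the sample title are all hypothetical choices made for illustration.

```python
import re

# Hypothetical, deliberately short stop list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "by", "for", "in", "of", "on", "the", "to", "with"}


def kwic_entries(title, augmentation=(), stop_words=STOP_WORDS):
    """Return sorted (keyword, rotated_context) pairs for one title.

    Augmentation terms are appended in parentheses at the end of the
    title and are then indexed like any other title word.
    """
    full = title
    if augmentation:
        full = f"{title} ({', '.join(augmentation)})"
    words = full.split()
    entries = []
    for i, word in enumerate(words):
        # Strip punctuation (keeping internal hyphens) to form the sort key.
        key = re.sub(r"[^\w-]", "", word).lower()
        if not key or key in stop_words:
            continue
        # Rotate the title so the keyword leads; "/" marks the title's end.
        context = " ".join(words[i:] + ["/"] + words[:i])
        entries.append((key, context))
    return sorted(entries)


if __name__ == "__main__":
    sample = kwic_entries("Effects of Radiation on Cells", ["radiobiology"])
    for key, context in sample:
        print(f"{key:15} {context}")
```

Without the augmentation term, the sample title yields entries only under "cells", "effects", and "radiation"; with it, the item is also reachable under "radiobiology", a subject word the author's title never used.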
1/ Parkins, 1963 [466], p. 27.
2/ Davis, 1963 [150], p. 238.
3/ See Markus, 1962 [394], p. 19, and ref. [662].
4/ Connolly, 1963 [136], p. 35.