MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Indexes Generated by Machine-Automatic Derivative Indexing
chapter
Mary Elizabeth Stevens
National Bureau of Standards
To some extent, however, the use of human editors to improve the product of KWIC
type indexing defeats the initial purpose of a quick and purely clerical or mechanical
process. Thus, Dowell and Marshall argue:
"...The basic permuted-title index can be substantially improved by editing and
re-writing the titles before they are submitted to the computer. . . . But this of course,
destroys the great advantage claimed for the permuted title index, 'that it is a
purely clerical process'. Intellectual effort has entered the picture again and we
are back where we started." 1/
In the extreme case, the re-introduction of intellectual effort is in effect the
re-introduction of conventional human indexing, with the machine's role limited to that of compilation,
as in the case of the "notation-of-content" statements prepared for NASA's STAR
System (Slamecka and Zunde, 1963 [561]; Newbaker and Savage, 1963 [430]).
Kennedy suggests instead, therefore, that the augmentation might be accomplished by
the authors themselves. However, it may then be pointed out, as by Bernier and Crane,
for example, that the supplementation of titles before publication in order to provide
suitable additional indexing words would be "awkward, space-consuming and difficult".
They continue:
"It would call for the attention of index experts at the manuscript stage, which would
delay publication and expand the total indexing effort. Furthermore, good, thorough
indexes are based on the full information of abstracts and papers, not on their titles
only." 2/
An alternative method for title augmentation to improve the quality of KWIC indexing
is therefore to establish procedures for machine selection of significant words from more
of the text than just the titles alone. In fact, Luhn himself did not limit his technique as
originally proposed to titles only but indicated that the process could be performed at
various levels: title, abstract, or full text. 3/ In the 1958 permuted index to the ICSI
preprints, entries were derived from titles, author's names, author affiliations, headings
within the paper, figure and table captions, and sentences and phrases taken directly
from text. 4/ Combinations of human and machine procedures based on sentences and
phrases selected from text are described by Herner who cites a two-fold advantage:
"First, it is not wholly dependent on the informativeness or lack of informativeness of
titles and bibliographic citations, and, second, it affords a greater depth of analysis than
is generally possible where titles or bibliographic descriptions alone are used." 5/
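The idea of deriving permuted-index entries from more than the title can be illustrated with a minimal sketch. It is not a reconstruction of Luhn's program or of the ICSI index procedure; the function name `kwic_entries`, the field names, the short stop list, and the context width are all assumptions made for illustration.

```python
# Hypothetical stop list; an operational system would use a much longer one.
STOP_WORDS = {"a", "an", "and", "by", "for", "from", "in", "of", "on", "the", "to"}


def kwic_entries(doc_id, fields, width=30):
    """Return sorted (keyword, context, doc_id, field_name) tuples.

    `fields` maps a field name (e.g. "title", "caption") to its text.
    Every word not on the stop list becomes an index entry, shown with
    up to `width` characters of context on each side.
    """
    entries = []
    for field_name, text in fields.items():
        words = text.split()
        for i, word in enumerate(words):
            keyword = word.strip(".,;:()\"'").lower()
            if not keyword or keyword in STOP_WORDS:
                continue
            left = " ".join(words[:i])[-width:]
            right = " ".join(words[i + 1:])[:width]
            # Right-justify the left context so keywords align in a column.
            context = f"{left:>{width}} {word} {right}".rstrip()
            entries.append((keyword, context, doc_id, field_name))
    # Alphabetical order by keyword, as in a printed permuted index.
    return sorted(entries)


if __name__ == "__main__":
    doc = {
        "title": "Automatic indexing of documents",
        "caption": "Frequency of index terms",
    }
    for keyword, context, doc_id, field_name in kwic_entries("D1", doc):
        print(f"{context}   [{doc_id}:{field_name}]")
```

Because each entry records the field it came from, an index built this way could still distinguish title-derived entries from those taken from captions, headings, or text.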
1/ Dowell and Marshall, 1962 [159], pp. 324-325.
2/ Bernier and Crane, 1962 [56], p. 117.
3/ Luhn, 1959 [381], p. 289.
4/ Citron et al., 1958 [120], p. i.
5/ Herner, 1963 [264], pp. 1-2.