MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Indexes Generated by Machine-Automatic Derivative Indexing
chapter
Mary Elizabeth Stevens
National Bureau of Standards
It is also of interest to note, moreover, that the very existence of machine-generated
permuted title indexes should greatly increase the likelihood that authors will use better
and more useful titles. 1/ At a seminar on word and vocabulary byproducts of permuted
title indexing held at Biological Abstracts headquarters on October 8, 1962, Rigby of
Meteorological and Geoastrophysical Abstracts reported informally that as of that time
there was already discernible improvement in titles covered by their KWIC index. In the
same year (1962), Tukey similarly stated that: "Chemical Titles has been heavily enough
used to affect the construction of titles of papers on chemical subjects." 2/ Instructions
to authors of the previously mentioned "Short Papers" 3/ for the A.D.I. 1963 Annual
Meeting specified that at least six significant words should be included in their titles and
nearly all authors did in fact comply. Two of the "Short Papers" are specifically directed
to the topic of improvements that authors can make in writing their titles (Brandenberg,
1963 [80]; Kennedy, 1963 [312]).
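A rule such as "at least six significant words" can be checked mechanically once a list of non-significant words is fixed. The following is a minimal sketch only: the stop list, the word pattern, and the function names are assumptions made for illustration and are not taken from the A.D.I. instructions or from any particular KWIC program.

```python
import re

# Hypothetical stop list; an actual KWIC program would supply its own.
STOP_WORDS = {"a", "an", "and", "the", "of", "on", "in", "for", "to",
              "with", "by", "among"}

def significant_words(title, stop_words=STOP_WORDS):
    """Return the words of a title that are not on the stop list."""
    # Treat hyphenated compounds as single words (an assumption).
    words = re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", title)
    return [w for w in words if w.lower() not in stop_words]

def meets_minimum(title, minimum=6):
    """True if the title has at least `minimum` significant words."""
    return len(significant_words(title)) >= minimum

short_title = "A Study of Indexing"
long_title = "Automatic Derivative Indexing of Chemical Journal Titles by Machine"
```

Under this stop list, `short_title` yields only two significant words and fails the check, while `long_title` yields seven and passes.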
Instructions of this type can be effectively used for situations where all authors are
under the same administrative control, as in the internal reports prepared in a single
organization. This type of situation, incidentally, is one for which KWIC proponents are
often most enthusiastic (Kennedy, 1962 [310]; Black, 1962 [65]; Linder, 1960 [362]).
Finally, there is considerable promise that pressures brought to bear by journal editors
of the publications of professional societies, notably the American Institute of Chemical
Engineers and other cooperating member societies of the Engineers Joint Council, will
result in improved adequacy of titles and thereby increased effectiveness of title word
indexes.
Certain other disadvantages of KWIC indexing techniques, however, relate specifically
to operational problems and requirements in the machine production of these indexes.
There is, first, the problem of the amount of context that is usually displayed--that is, the
question of line length--and the related problems of title truncation and wrap-around. As
Kennedy notes: "Progressive shifting of the title to bring a given word to the indexing
column frequently causes portions of the title to exceed the line space available, first at
the right margin, then the left, or even both simultaneously." 4/ A case in point is the
perhaps apocryphal "EROTIC TENDENCIES AMONG TRAPPIST MONKS" where
"ATHEROSCL" had been dropped off at the left.
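The shifting and truncation just described can be illustrated with a short sketch. It assumes a fixed line width and a fixed indexing column, and it simply drops whatever falls outside the line; it does not implement wrap-around. The function names, the stop list, and the layout parameters are hypothetical and do not reproduce any actual KWIC program.

```python
import re

def kwic_line(title, keyword_start, width=60, index_col=30):
    """Shift the title so that the character at keyword_start lands in
    column index_col, then drop anything outside the 0..width-1 window."""
    line = [" "] * width
    for i, ch in enumerate(title):
        pos = index_col + (i - keyword_start)
        if 0 <= pos < width:      # characters past either margin are lost
            line[pos] = ch
    return "".join(line)

def kwic_entries(title, stop_words=frozenset({"among", "of", "the", "and"}),
                 **layout):
    """Yield one shifted line per word of the title not on the stop list."""
    for match in re.finditer(r"\S+", title):
        if match.group().lower() not in stop_words:
            yield kwic_line(title, match.start(), **layout)

title = "ATHEROSCLEROTIC TENDENCIES AMONG TRAPPIST MONKS"
# A deliberately narrow left context (7 columns) to show left truncation.
narrow = kwic_line(title, title.index("TENDENCIES"), width=40, index_col=7)
```

With these assumed settings, `narrow.rstrip()` is `EROTIC TENDENCIES AMONG TRAPPIST MONKS`: the first nine characters of the title fall to the left of the line and are lost. A wrap-around variant would instead carry the overflowing characters to the opposite end of the line wherever space remains.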
For multi-column KWIC indexes, in particular, where the line length is typically
58-60 characters, "much of the relevance is lost because the reader sees the wrong slice
of the title". 5/ The Bell Laboratories KWIC index, 6/ Chemical-Biological Activities, 7/
1/ See for example, Black, 1962 [65], p. 317; Youden, 1963 [658], p. 332.
2/ Tukey, 1962 [611], pp. 9-10.
3/ Luhn, 1963 [376] and [377].
4/ Kennedy, 1961 [311], p. 117.
5/ Brandenberg, 1963 [80], p. 57.
6/ Kennedy, 1961 [311], p. 118.
7/ Figures 4 and 5.