MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Indexes Generated by Machine-Automatic Derivative Indexing
chapter
Mary Elizabeth Stevens
National Bureau of Standards
maintenance of truly current awareness. For example, Herner in his 1961 review of the
state-of-the-art of organizing information says:
"I am told that the American Chemical Society has never had a more successful
basic science publication. The key to the whole thing is, I believe, the extreme
currency of Chemical Titles. This in turn derives from the speed and simplicity
of the KWIC process." 1/
Conrad reports as follows:
"Reception of B.A.S.I.C. ... has been so extremely enthusiastic ... that we
are excited by the possibilities of producing permuted title indexes in one or
more additional languages. The creation of a B.A.S.I.C. index in any language
requires only that the titles be translated and punched on cards. Alphabetical
arrangement, permutation and 'type-setting' is completely automated and, for
5,000 titles takes only two hours to accomplish." 2/
3.1.1 Applications of KWIC Indexing Techniques
The KWIC type process is indeed simple and straightforward. The words of the
author's title are prepared for input to the computer by keystroking either to punched
cards or to punched paper tape. After being read by the computer, the text of a title is
normally processed against a "stop list" to eliminate from further processing the more
common words, such as "the", "and", prepositions, and the like, and words so general
as to be insignificant for indexing purposes, such as "demonstration", "typical",
"measurements", "steps", and the like. The remaining presumably "significant" or
"key" words are then, in effect, taken one at a time to an indexing position or window,
where they are sorted in alphabetical order. The result is a listing of each such word
together with its surrounding context, out to the limit of the line or lines permitted in a
given format. As each keyword is processed, the title itself is moved over so that the
next keyword occupies the indexing position, and this process is repeated until the entire
title has thus been cyclically permuted.
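
The process just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reconstruction of any of the programs cited in this report: the function name `kwic_index`, the abbreviated `STOP_WORDS` list, the column widths (assumed positive), and the sample titles are all hypothetical. Each title is checked word by word against the stop list, every surviving keyword is placed at a fixed indexing position with its surrounding context cut to the line limit, and the resulting entries are sorted alphabetically by keyword.

```python
# Illustrative stop list only (an assumption); a working list would be much longer.
STOP_WORDS = frozenset({
    "the", "and", "of", "in", "a", "an", "for", "to", "on", "by", "with",
    "demonstration", "typical", "measurements", "steps",
})


def kwic_index(titles, stop_words=STOP_WORDS, left_width=30, right_width=40):
    """Return (keyword, line) pairs sorted alphabetically by keyword.

    In each line the keyword starts at column left_width + 2 (the
    "indexing window"); context beyond the permitted widths is cut off.
    """
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            # Normalize the candidate keyword before the stop-list check.
            key = word.strip('.,;:"\'()').lower()
            if not key or key in stop_words:
                continue
            left = " ".join(words[:i])    # context preceding the keyword
            right = " ".join(words[i:])   # keyword plus following context
            # Keep only as much context as the line format permits.
            left = left[max(0, len(left) - left_width):]
            right = right[:right_width]
            entries.append((key, f"{left:>{left_width}}  {right}"))
    # Stable sort: entries sharing a keyword stay in input order.
    entries.sort(key=lambda entry: entry[0])
    return entries


if __name__ == "__main__":
    # Made-up titles, for illustration only.
    sample_titles = [
        "The Organization of Chemical Information",
        "Automatic Indexing of Titles",
    ]
    for _, line in kwic_index(sample_titles):
        print(line)
```

The sketch truncates context at the line limits rather than carrying the end of a title around to the start of the line, and it sorts on the keyword alone; both are simplifications chosen to keep the example short.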
A number of formats are available in which the length of the line, the position of
the indexing window, and the extent of "wrap-around" (bringing the end of a title in at the
beginning of a line to fill space that would otherwise be left blank) are major variables.
Current examples of KWIC type indexing output are shown in Figures 2 through 7.
Usually, the indexing window is located at or near the center of the line with several
extra spaces to the immediate left or with other devices such as the shading of
B.A.S.I.C. to aid the searcher in scanning down the keywords listed. This is
1/ Herner, 1962 [266], p. 10.
2/ Conrad, 1962 [137], p. 378A.