MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Indexes Generated by Machine-Automatic Derivative Indexing
chapter
Mary Elizabeth Stevens
National Bureau of Standards
essentially the original Luhn format, and it should be noted in this connection that while
Luhn recognized that the origin of the KWIC principle lay in the making of concordances,
he claimed in particular the use of machines to achieve speed, completeness, and accu-
racy, and a novel format. 1/
The most common alternative to the center position for the indexing window (or keyword
position) is the left, or beginning, of the line. Netherwood's selected bibliography of
logical machine design, which is probably the first of the modern permuted title indexes
to appear in the open literature, used the left-most positions for the index entry word in
each title listing. Slant marks were also printed to show the breaks in the normal order
of the title (Netherwood, 1958 [437]). A proposed subscription service, advertised in
1958 but never actually brought into operation, would also have used the left-hand
position.
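The left-position format described above can be sketched as follows. This is a minimal illustration, not a reconstruction of Netherwood's program: the function name, the stop list, and the sample title are assumptions made for the example. Each significant word is moved to the left-most position, and a slant mark shows where the normal order of the title was broken.

```python
# Hypothetical stop list: words that are not used as index entries.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def left_position_entries(title, stop_words=STOP_WORDS):
    """Return one rotated line per significant word, sorted by index word."""
    words = title.split()
    entries = []
    for i, word in enumerate(words):
        if word.lower() in stop_words:
            continue  # non-significant words never occupy the index position
        entry = " ".join(words[i:])  # index word first, followed by its right-hand context
        if i > 0:
            # The slant mark shows the break in the normal order of the title.
            entry += " / " + " ".join(words[:i])
        entries.append(entry)
    return sorted(entries, key=str.lower)


for line in left_position_entries("Logical design of digital computers"):
    print(line)
```

Note that in each rotated line the index word is followed directly by its right-hand context, while its left-hand context is displaced to the end of the line.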
In these left position examples, the keyword-in-context principle is kept only
partially intact since the word in the index position is directly adjacent to its most
specific right-hand context, not to its left-hand. In variations such as developed at
Stanford Research Institute, however, the index word is extracted from its context and
printed separately in the left-hand margin, with the title in its normal order printed to
the right. This type of variation has been called "KWOC", for keyword-out-of-context,
and is illustrated in Figure 6, which shows the format developed by C.E.I.R., Inc. for
the OTS index to U.S. Government Research Reports.
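The KWOC arrangement can be sketched in the same spirit. This is an illustrative sketch only, not the C.E.I.R. or Stanford Research Institute program: the function name, the stop list, the margin width, and the sample titles are assumptions. The index word is extracted and printed in a left-hand margin, and the full title follows in its normal order.

```python
# Hypothetical stop list: words that are not used as index entries.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def kwoc_lines(titles, stop_words=STOP_WORDS, width=12):
    """Return KWOC lines: keyword in the left margin, unpermuted title to the right."""
    rows = set()
    for title in titles:
        for word in title.split():
            key = word.strip(".,;:()\"'").lower()
            if key and key not in stop_words:
                rows.add((key, title))  # the set drops repeated keywords within a title
    # Sort by keyword, then by title, and pad the keyword to a fixed margin width.
    return [f"{key.upper():<{width}} {title}" for key, title in sorted(rows)]


sample = ["Automatic indexing of titles", "Indexing by machine"]
for line in kwoc_lines(sample):
    print(line)
```

Because the title is never rotated, the keyword is no longer adjacent to either its left-hand or its right-hand context, which is the sense in which the format is "out of context".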
Table 1 lists a number of KWIC index projects for which computer programs are or
might be made available to interested additional users. Computer programs have been
written specifically for the IBM 650, 704, 1620, 709, 7090, and 7094 data processing
systems, the G.E. 225 computer, the Deuce Computer in England, the UNIVAC 1103 and
1107 systems, and the Japanese computer JEIPAC, among others. In addition, some
permuted title indexes are produced manually, or with the use of simple business office
machine equipment. For example, an index to the AIBS Bulletin for 1951-1961 has been
so produced by the American Institute of Biological Sciences. 3/
1/
Private communication, excerpt of letter from H.P. Luhn to C. L. Bernier,
December 27, 1960: "With respect to the origin of the KWIC Index, you are,
of course, right that it is a form of concordance, as stated in my original
paper. Furthermore, keyword indexing has been practiced in various forms
as far back as a hundred years ago. All of these methods were, however, de-
pendent on manual effort. I would say that the significance of the present KWIC
Index is based on the fact that it is produced automatically by machine, affording
speed of compilation, accuracy and completeness. As far as the particular format
of the Index is concerned, this is novel to my knowledge, in accordance with in-
formation I have been able to ascertain from others."
2/
"PILOT--a permutation index to this month's literature", see p.8 and Figure 1.
A left-most window full-title format was developed at Stanford University in cooperation
with the IBM San Jose Laboratories. It has been applied by the Computation
Center to the titles of computer programs for the benefit of users of the
Program Library (Computation Center, Stanford University, "The KWIC Index",
1963). See also Marckworth, 1961 [393].
3/
National Science Foundation's CR&D Report No. 11, [430], p.10; Janaske, 1962
[299]; Shilling, 1963 [550] and [55?].