MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Indexes Compiled by Machine
chapter
Mary Elizabeth Stevens
National Bureau of Standards
In the Salton project, tests of the value of citation links for the assignment of index
terms have been made by comparing the citation pattern of an "unknown" document with
those of other documents in the collection to derive a set of five "related" documents,
where relatedness is decided on the basis of the magnitude of the similarity coefficients
for the citation links. Any index term that appears at least twice in the set of terms
previously assigned to the five related documents is then assigned to the new item. In
general, approximately 50% of the terms so assigned were also assigned to the same
"new" items by human indexing procedures. 1/
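
The assignment rule just described can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure actually programmed in the Salton project: the source does not say how the similarity coefficients were computed, so a simple set-overlap (Jaccard) coefficient is assumed here, and the names `similarity`, `assign_terms`, and the layout of `collection` are hypothetical.

```python
from collections import Counter


def similarity(cites_a, cites_b):
    """Assumed coefficient: Jaccard overlap between two sets of cited items."""
    a, b = set(cites_a), set(cites_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_terms(new_cites, collection, n_related=5, min_count=2):
    """Suggest index terms for a new document from its citation pattern.

    collection maps a document id to a pair (citations, index_terms),
    where index_terms are the terms previously assigned to that document.
    """
    # Rank the collection by citation similarity to the new document.
    ranked = sorted(
        collection.items(),
        key=lambda item: similarity(new_cites, item[1][0]),
        reverse=True,
    )
    related = ranked[:n_related]

    # Count, for each term, how many of the related documents carry it.
    counts = Counter()
    for _doc_id, (_cites, terms) in related:
        counts.update(set(terms))

    # Keep every term that occurs at least min_count times.
    return sorted(term for term, n in counts.items() if n >= min_count)
```

With `n_related=5` and `min_count=2` the function mirrors the thresholds quoted in the text; both are left as parameters because the choice of five documents and two occurrences is reported here without further justification.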
As we have previously noted, however, the advantages of citation indexing are likely
to be most effectively applied when used as part of an array of other tools. Tukey
suggests, in particular, that permutation indexes of titles, as in KWIC systems, would be
of great value as "starter" and "re-check" mechanisms for the use of citation indexes. 2/
Brownson reports:
"Consideration is now being given to the possibility of experimenting with a
'hybrid' type of index that would combine permuted titles, authors, and citation
data. Such an index might be more useful than any of the individual types of
indexes issued singly; and, since no human indexing judgment would be involved,
it could be prepared largely by machine and issued rapidly." 3/
Williams, while at ITEK, proposed a hybrid integrated index combining listings by
authors, corporate authors or author affiliations, keywords-in-context from titles, and
references to works cited by and to works citing an item, and she also developed a sample
format for selected items from several journals in the field of philosophy. 4/
Precisely such a hybrid tool was provided with the Short Papers for the A. D. I.
Annual Meeting 1963, and it was indeed issued rapidly. A brief period of only two or
three weeks elapsed between receipt of many of the manuscripts and the distribution of
two automatically typeset volumes. The second of these volumes contains a KWIC and
an author index to these papers themselves, a bibliography and citation index to all
papers referenced by them, and KWIC and author indexes to the cited papers, all
computer-compiled within this time period. 5/
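
As a rough illustration of the permuted-title (KWIC) component of such a hybrid index, the following sketch rotates each title around every significant word and sorts the results. It is not the program used for the A. D. I. volumes; the stop list, the "/" end-of-title marker, and the name `kwic_entries` are assumptions made for the example.

```python
# Assumed stop list of non-significant words; real KWIC systems used longer lists.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def kwic_entries(titles, stopwords=STOPWORDS):
    """Return sorted (keyword, rotated_title, doc_id) tuples.

    titles maps a document id to its title string.
    """
    entries = []
    for doc_id, title in titles.items():
        words = title.split()
        for i, word in enumerate(words):
            # Normalize the candidate keyword for sorting and stop-list lookup.
            key = word.strip(".,:;()\"'").lower()
            if not key or key in stopwords:
                continue
            # Rotate the title so the keyword leads; "/" marks the title's end.
            rotated = " ".join(words[i:] + ["/"] + words[:i])
            entries.append((key, rotated, doc_id))
    return sorted(entries)
```

An author index or a citation index could be produced in the same manner by sorting on author names or on cited references instead of title words, which is what makes a combined listing feasible without human indexing judgment.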
1/ Ibid. See also Lesk, 1963 [357], p. V-8.
2/ Tukey, 1962 [611], p. 12.
3/ Brownson, 1963 [82], p. 4.
4/ T. M. Williams, private communication, dated January 4, 1962.
5/ Luhn, 1963 [376], and [377], pp. 353-382.