MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Automatic Indexing
chapter
Mary Elizabeth Stevens
National Bureau of Standards
1.3 Derivative vs. Assignment Indexing
At least part of the provocation and controversy with respect to the possibilities for
the use of machines in indexing is due to confusion as to what type of indexing is meant.
This in turn relates to a much older and broader controversy--that between "word" or
"catchword" indexing on the one hand and "subject indexing", "concept indexing", or
"controlled indexing" on the other.
In terms of operational definition, the contrast is best expressed in Luhn's distinction
between index entries that are derived from the text of an item itself and those
that are assigned to it from a list or schedule of subject categories, descriptors and the
like, which exists independently of the text of the item (Luhn, 1962 [372]). 1/ In general,
the differentiations that are made for the broader controversy, and the claims and
counter-claims made by the enthusiasts of either school, provide background for the
distinctions that should be made between various automatic derivative indexing operations
and whatever possibilities may be demonstrated for assignment indexing by machine.
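As a rough illustration of Luhn's distinction, the following Python sketch contrasts the two operations on a single title. The stopword list, the subject schedule, and the function names are all hypothetical, invented for this example only; the sketch does not reproduce any system described in this report.

```python
import re

# Hypothetical stopword list (an assumption for illustration only).
STOPWORDS = {"a", "an", "and", "by", "for", "in", "of", "on", "the", "to", "with"}

# Hypothetical subject schedule that exists independently of any item:
# each heading is paired with cue words that suggest it.
SCHEDULE = {
    "Information retrieval": {"indexing", "retrieval", "search"},
    "Computers": {"machine", "computer", "automatic"},
}

def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def derive_entries(text):
    """Derivative indexing: entries are words taken from the text itself."""
    entries = []
    for word in tokenize(text):
        if word not in STOPWORDS and word not in entries:
            entries.append(word)
    return entries

def assign_entries(text):
    """Assignment indexing: entries are headings drawn from the schedule."""
    words = set(tokenize(text))
    return sorted(heading for heading, cues in SCHEDULE.items() if words & cues)

title = "Automatic indexing of documents by machine"
print(derive_entries(title))  # words that occur in the title
print(assign_entries(title))  # headings that never occur in the title
```

The derived entries can only be words the author happened to use, whereas the assigned entries are headings that need not appear anywhere in the item. That difference is the crux of the controversy described above.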
In his text on information storage and retrieval Kent (1962 [315]) contrasts word indexing
as used in permuted keyword indexes, concordances and "pure" uniterm systems with
controlled indexing which "implies a careful selection of terminology used in indexes in
order to avoid, as far as possible, the scattering of related subjects under different
headings." He notes elsewhere that word indexing requires little subject-matter training
on the part of the indexer and little skill in indexing as such, and adds: "It is this type of
indexing that a machine can perform well." 2/
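To suggest why word indexing of this kind is mechanically simple, here is a minimal Python sketch of a permuted keyword index. It is a simplification under stated assumptions: the stopword list and the function name are hypothetical, and each entry is printed as a rotated title with "/" marking the original start, rather than in the centered-keyword layout of a printed permuted index.

```python
# Hypothetical stopword list (an assumption for illustration only).
STOPWORDS = {"a", "an", "and", "by", "for", "in", "of", "on", "the", "to", "with"}

def permuted_index(titles):
    """Build one rotated entry per significant word of each title."""
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            if word.lower() in STOPWORDS:
                continue  # no entry is filed under a stopword
            # Rotate the title so the keyword leads; "/" marks the title's start.
            rotated = " ".join(words[i:] + ["/"] + words[:i])
            entries.append((word.lower(), rotated))
    # File the entries alphabetically by keyword.
    return [line for _, line in sorted(entries)]

for line in permuted_index(["Automatic Indexing of Documents"]):
    print(line)
```

No judgment about subject matter enters at any step; the procedure only rotates and sorts the author's own words. Related items titled with different words are therefore filed apart.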
Like Kent, Bernier thinks that true subject or assignment indexing requires highly
trained human indexers. He says further:
"The difference between subject and word indexing has been unclear at times.
Both types employ words, but only true subject indexing employs them with
discrimination. Word indexing leads to omission of entries, scattering of related
information, and a flood of unnecessary entries. Word indexing uses
words as they are found in the material indexed with a minimum regard for
standardized meaning..." 3/
Herner provides a further amplification of differences that are pertinent to consideration
of indexing by machine, as follows:
1/ See also Herner, 1962 [266], p. 5; Skaggs and Spangler, 1963 [557], p. 60; Slamecka,
1963 [558], p. 224. Mooers makes a similar distinction between "index terms which
are words or phrases extracted from the text and stylized conceptual terms--cliches
--which are assigned to the text", 1963 [423], p. 4.
2/ Kent, 1962 [314], p. 268.
3/ Bernier, 1956 [54], p. 23.
4/ Herner, 1963 [267], p. 183.