MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Automatic Classification and Categorization
chapter
Mary Elizabeth Stevens
National Bureau of Standards
The proposed "Multilindex" system is also based on micro-thesauri or small
vocabularies designed, by human analysis, for clue-indications to a relatively narrow
subject field, together with potential syntactic-semantic role indications built into the
dictionary, again by extensive human analysis, following the approaches previously taken
by A. L. (Lukjanow) Loewenthal in her suggestions for solutions to problems of mechanized
translation. An unpublished proposal-type brochure describing the system was
available as of December 1963. 1/ As of that date, also, demonstration printouts were
available from an IBM 1401 Fortran program, illustrating an index compiled from
abstract-text input and a 1,200-word dictionary for documents in the field of space antenna
tracking radar. 2/ A repertoire of 350 "concepts" or indexing terms was involved,
with an average of 10 assigned to each of 22 test documents, many of these assigned terms being
identical to words occurring in either the title or the text of the abstract of the item.
Slamecka and Zunde have investigated the extent to which the "notations-of-content"
in the system developed by Documentation, Inc. for NASA's STAR might be derived by
machine techniques from the text of the abstracts with enough normalization-standardization
via inclusion dictionary lookup to qualify as an assignment indexing technique.
These workers claim:
"This preliminary investigation indicates the possibility of using the computer
to index documents adequately for machine retrieval by matching their abstracts
against an authoritative subject-heading authority . . . The inconsistency inherent in
human indexing can be eliminated as the number of terms derived from any one
abstract will always be the same. The abstract and its automatically derived set of
index terms will always be equivalent. . . "3/
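
The lookup described in this passage can be sketched in a few lines. The following is a minimal illustration only, not the Slamecka and Zunde procedure itself: the dictionary entries, the function name `assign_index_terms`, and the tokenization rule are all assumptions made for the example. It shows why the derived term set for a given abstract is always the same: the output depends on nothing but the abstract text and the dictionary.

```python
import re

# Hypothetical inclusion dictionary: text variant -> authorized index term.
# These entries are invented for illustration; they are not taken from any
# actual subject-heading authority list.
INCLUSION_DICTIONARY = {
    "antenna": "ANTENNAS",
    "antennas": "ANTENNAS",
    "radar": "RADAR",
    "tracking": "TRACKING",
    "tracking radar": "TRACKING RADAR",
}

# Longest dictionary entry, measured in words, bounds the phrase window.
MAX_PHRASE_WORDS = max(len(key.split()) for key in INCLUSION_DICTIONARY)


def assign_index_terms(abstract):
    """Return the sorted authorized terms whose variants occur in the abstract."""
    words = re.findall(r"[a-z0-9]+", abstract.lower())
    terms = set()
    for start in range(len(words)):
        # Try every phrase of 1..MAX_PHRASE_WORDS words beginning at `start`.
        for length in range(1, MAX_PHRASE_WORDS + 1):
            phrase = " ".join(words[start:start + length])
            if phrase in INCLUSION_DICTIONARY:
                terms.add(INCLUSION_DICTIONARY[phrase])
    return sorted(terms)


example_terms = assign_index_terms("A tracking radar uses two antennas.")
```

Because both singular and plural variants map to one authorized heading, the lookup normalizes as well as extracts, which is the property that lets the technique count as assignment indexing rather than simple word extraction.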
A final example of other approaches to automatic assignment indexing research, not
yet reported in the open literature, is an NIH-sponsored project at Goodyear Aerospace, in
cooperation with the Universities of Minnesota and Rochester and Western Reserve
University, looking toward an automatic classification procedure based on word co-occurrences
for a set consisting of [OCRerr]00 four-to-five page documents in the field of diabetes
literature. Programs for statistical analyses of the full text of these documents, all of
which have previously been processed for the manual W.R.U. "telegraphic" abstracting
system, are being developed. 4/
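
The raw statistic on which a co-occurrence-based procedure rests can be illustrated as follows. This is a sketch under stated assumptions and not the Goodyear Aerospace programs, whose details are not reported here: the function name `cooccurrence_counts`, the tokenization rule, and the choice of the whole document as the co-occurrence window are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for each word pair, the number of documents containing both words."""
    counts = Counter()
    for text in documents:
        # Each distinct word is counted once per document; sorting makes the
        # pair order canonical, so (a, b) and (b, a) share one counter.
        vocabulary = sorted(set(re.findall(r"[a-z]+", text.lower())))
        for pair in combinations(vocabulary, 2):
            counts[pair] += 1
    return counts


sample_counts = cooccurrence_counts([
    "insulin glucose",
    "insulin glucose diet",
    "diet",
])
```

A classification procedure would go on to group documents or words by the strength of such associations; that later step is deliberately left out here because the source does not describe it.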
5. AUTOMATIC CLASSIFICATION AND CATEGORIZATION
In all the experimental work, to date, that has been directed toward the use of
computers and other machine-like techniques for the automatic indexing of documents, a
1/ "Description of MULTILINDEX. A mechanized system for indexing documents,
storing information, retrieving information", P.S. Shane, Dec. 4, 1963,
Information Systems, Inc., 7720 Wisconsin Avenue, Bethesda, Maryland.
2/ Private communications, A.L. Loewenthal and P.S. Shane, Dec. 11, 1963.
3/ Slamecka and Zunde, 1963, [561], pp. 139-140.
4/ F. Tuttle, private communication, Oct. 30, 1963.