MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Automatic Assignment Indexing Techniques
chapter
Mary Elizabeth Stevens
National Bureau of Standards
Swanson and his TRW associates have further proposed extensions of the prespecified
unique clue-word technique. For example, it is suggested that machine processes of
comparing words of titles, subtitles and chapter headings to lists of possible subject
headings can be extended in sophistication by machine lookups of synonym groups and of
characteristic subject-word associations. 1/ Frequency weightings may be taken into
account, and similar measures of association and subject-indicativeness may be
developed for phrases as well as for individual words. 2/ In general, however, the
apparent success of this clue-word technique in tests to date should be considered in the
light of the special character of the items, their extreme brevity, and the high probability
that the fact-word incidence involved in news reporting is not typical of less popular and
less factually oriented materials. 3/
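The extended clue-word lookup described above can be illustrated with a minimal sketch. It is not taken from the Swanson reports: the synonym table, the heading weights, and the function name score_headings are all hypothetical, and only single words (not phrases) are handled. Each title word is reduced to a synonym-group representative, and each representative contributes a frequency-style weight to the subject headings associated with it.

```python
from collections import defaultdict

# Hypothetical synonym groups: each clue word maps to one group representative.
SYNONYMS = {
    "aircraft": "aviation",
    "airplane": "aviation",
    "senate": "politics",
    "election": "politics",
}

# Hypothetical subject-word associations with illustrative weights.
HEADING_WEIGHTS = {
    "aviation": {"AVIATION": 1.0},
    "politics": {"POLITICS": 1.0, "GOVERNMENT": 0.5},
}


def score_headings(title):
    """Sum the weights of every subject heading suggested by the title words."""
    scores = defaultdict(float)
    for raw in title.lower().split():
        word = raw.strip(".,;:!?\"'()")
        group = SYNONYMS.get(word, word)  # fall back to the word itself
        for heading, weight in HEADING_WEIGHTS.get(group, {}).items():
            scores[heading] += weight
    return dict(scores)
```

Headings with the highest totals would be the candidates for assignment; how a cutoff should be chosen is left open here, as it is in the text.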
Continuing work along similar lines has been carried forward at Ramo-Wooldridge in
the "Word Correlation and Automatic Indexing Program" sponsored by the Council on
Library Resources (1959 [490] and [491]). Here, the objectives are to develop and apply
clue-word techniques to material that is much more representative of the scientific and
technical literature. The thesaurus groups, now called "indexonym" groups, are made up
of words and phrases selected by extensive human analysis as being significantly
"useful-for-retrieval-purposes".
New items would be processed in a word and phrase lookup operation, with each word
or phrase being initially assigned the identifier number codes of all groups to which it
belongs. However, unless a particular group's number is repeated several times within
the space of a few paragraphs, it is not used as the basis for the actual assignment of an
index tag. Provision would be made for calling human attention to items having a number
of words that are not deleted by processing against a "useless-for-retrieval purposes"
list, but that are not found in any of the "accepted" groups. It is suggested that in this way it
should be possible to "ascribe measures of automatically recognizable 'newness' to
technical articles". 4/
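The lookup procedure just described can be sketched as follows. This is an illustration under stated assumptions, not the Ramo-Wooldridge program: the group table, the exclusion list, the window of three paragraphs, the threshold of three repetitions, and all function names are hypothetical, and phrases are omitted for brevity. A group number becomes an index tag only if it recurs often enough within a few consecutive paragraphs, and an item is referred to a human when too many of its words are neither on the exclusion list nor in any accepted group.

```python
from collections import Counter

# Hypothetical "indexonym" groups: word -> identifier numbers of its groups.
GROUPS = {
    "transistor": {101},
    "diode": {101},
    "amplifier": {101, 205},
    "feedback": {205},
}

# Hypothetical "useless-for-retrieval purposes" list.
USELESS = {"the", "a", "of", "and", "is", "in", "with", "uses"}

PUNCTUATION = ".,;:!?\"'()"


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def assign_tags(paragraphs, window=3, min_repeats=3):
    """Return group numbers repeated at least min_repeats times
    within any run of `window` consecutive paragraphs."""
    per_paragraph = [
        Counter(code for w in tokenize(p) for code in GROUPS.get(w, ()))
        for p in paragraphs
    ]
    tags = set()
    for start in range(len(per_paragraph)):
        total = Counter()
        for counts in per_paragraph[start:start + window]:
            total.update(counts)
        tags.update(code for code, n in total.items() if n >= min_repeats)
    return tags


def unknown_words(paragraphs):
    """Words that survive the exclusion list but belong to no accepted group."""
    return sorted(
        {w for p in paragraphs for w in tokenize(p)
         if w not in USELESS and w not in GROUPS}
    )


def needs_human_review(paragraphs, max_unknown=2):
    """Flag an item whose count of unrecognized words exceeds max_unknown."""
    return len(unknown_words(paragraphs)) > max_unknown
```

The count of unrecognized words could also serve as a crude stand-in for the "newness" measure mentioned above, though the source does not specify how such a measure would be computed.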
4.2 Maron's Automatic Indexing Experiments
By April of 1959, the reports of work at Thompson Ramo-Wooldridge on automatic
indexing and related problems submitted for the Current Research and Development in
Scientific Documentation series included reference to Maron and a "probabilistic model for
the assignment of index tags", as well as to Swanson's continuing projects. 5/
1/ Swanson, 1962 [584], p. 469.
2/ Swanson, 1963 [580], pp. 1-2.
3/ See also Mooers, 1963 [424].
4/ Thompson Ramo Wooldridge, 1959 [491], p. 2A.
5/ National Science Foundation's CR&D report No. 5 [430], p. 34.