MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Other Potentially Related Research
chapter
Mary Elizabeth Stevens
National Bureau of Standards
possible definition involves the special case of devices or techniques which display or use
prior associations and co-occurrences of words, indexing terms, and related documents
to provide a guide or suggestive indexing and search-prescription-formulation or
renegotiation aid.
The idea of a mechanized authority list, following the restrictive first definition,
has been proposed by a number of investigators 1/ and has actually been used in computer
programs as discussed, for example, by Schultz and Shepherd (1960 [532]), Shepherd (1963
[545]) and Artandi (1963 [20]). It is the second definition of thesaurus with which we
shall be principally concerned. It is, as we have said, close to the conventional idea of
such a thesaurus as Roget's. It is based on the hypothesis that patterns of co-occurrences
of words in a new item or in a search request can be compared with patterns of prior co-
occurrences, as given by a thesaurus "head", in order to expand, clarify, or pin-point
"meaning" and thus provide a more effective indication of the true subject content. The
third definition will be considered as falling within the more general scope of statistical
association techniques, although as Giuliano points out, "a retrieval system embodying
an automatic thesaurus thus qualifies as being `associative'." 2/
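
The hypothesis behind the second definition can be illustrated with a minimal
sketch. It is not a reconstruction of any program cited here. The thesaurus
contents, the function names score_heads and best_heads, and the min_overlap
threshold are all assumptions made for illustration. The sketch counts how many
words of a new item fall under each thesaurus "head" and reports the heads
whose overlap is large enough to suggest the subject.

```python
# Hypothetical thesaurus: each "head" lists words previously seen together under it.
THESAURUS = {
    "computation": {"computer", "program", "machine", "processing"},
    "retrieval": {"search", "index", "request", "document"},
    "language": {"word", "syntax", "meaning", "translation"},
}


def score_heads(words, thesaurus=THESAURUS):
    """For each head, count the distinct words of the item that fall under it."""
    present = {w.lower() for w in words}
    return {head: len(present & members) for head, members in thesaurus.items()}


def best_heads(words, thesaurus=THESAURUS, min_overlap=2):
    """Return heads with at least min_overlap matches, strongest first.

    min_overlap is an assumed cutoff, not a value taken from the report.
    """
    scores = score_heads(words, thesaurus)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [head for head, n in ranked if n >= min_overlap]


item = "a computer program to index each document for later search".split()
print(best_heads(item))
```

The same comparison could be applied to the words of a search request rather
than a new item, with the matched heads used to expand or sharpen the request.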
The application of a thesaurus-like approach to indexing and searching problems is
again an area in which Luhn is one of the earliest proponents. In January 1953, he
proposed a new method of recording and searching information in which a special diction-
ary would be compiled for use in broadening the terms of a search request and in
normalizing word usage as between various indexers (recorders) and searchers. Al-
though he did not then use the term "Thesaurus" as such, he said in part:
"The process of broadening the concept involves the compilation of a dictionary
wherein key terms of desired broadness may be found to replace unduly specific
terms, the latter being treated as synonyms of a higher order than ordinarily
1/
See, for example, "Summary of discussions, Area 5," ICSI, 1959 [578],
p. 1263: "Two further complications arise from a mechanical index.
Some articles might deserve as an indexing term a word not contained
in the article. By an authority list, the product of the mechanized indexing
procedure might have such additional words added to it. Again, an article
might use a particular word but the vocabulary of the system might prefer
another one. This also can be handled by a mechanized authority list."
2/
Giuliano and Jones, 1962 [229], p. 4.