MONO91 NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report Other Potentially Related Research chapter Mary Elizabeth Stevens National Bureau of Standards

construction and revision of a mechanized thesaurus, as again Luhn has suggested. 1/ Schultz suggests that machine records should be maintained of what thesaurus terms are actually used for indexing and searching, the frequencies of term usage, the co-occurrences, the number of items described by particular combinations of terms, and the like. 2/

The potential combinations of natural text processing, automatic indexing, and thesaurus construction and updating are stressed in many current programs. For example, Eldridge and Dennis discuss:

"Indexing by machine from natural text in a fully automatic system, in which statistical analysis of the words is employed as a device for (a) building automatically a 'concept' thesaurus, (b) indexing incoming documents with reference to the thesaurus, and (c) continuously revising the thesaurus to reflect new word usages in currently incoming documents." 3/

Similarly, Giuliano and Jones suggest that, given a term-term statistical association matrix, a transformation can be arrived at with a unit vector assigning value only to index term Z that ranks every other index term according to degree of association with Z; then, by listing the higher ranked terms for each term Z, "a 'thesaurus' listing can be obtained completely automatically." 4/

6.2 Statistical Association Techniques

A special definition of the word "thesaurus" might, as we have noted, include the development of devices and techniques which either automatically or by man-machine interaction serve to suggest the amplification of a set of index terms.
We shall briefly consider here both devices that visually display associations between words, terms, and documents 5/ and techniques for machine use of coefficients of correlation for prior co-occurrences in a collection of word-word, word-term, term-term, term-document, and document-document associations, the statistical association factor technique as first developed by Stiles.

1/ Luhn, 1957 [385], p. 316: "Provision should be made to register the number of times each word is looked up in the index and the number of times each family number has been used for encoding. Such a record would be an indispensable part of the system for making periodic adjustments based on the usage of words or notions as mechanically established."
2/ Schultz, 1962 [529], p. 104.
3/ Eldridge and Dennis, 1962 [183], p. 6.
4/ Giuliano and Jones, 1962 [229], p. 12.
5/ It should be noted that Tabledex, the Scan-Column Index, and similar tools provide to some extent a display of prior associations between index terms. (See pp. 25-27 of this report.) Thus Cheydleur (1963 [113], p. 58) remarks: "Ledley has focussed on inter-item concepts in designing his economical TABLEDEX arrangement for displaying the connectivity of index terms and related file items."
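As a minimal illustrative sketch, not drawn from any of the cited programs, the machine records that Schultz proposes and the term ranking that Giuliano and Jones describe might be combined as follows. The function names, the sample items, and the cosine-style coefficient are all assumptions made for illustration; in particular, the coefficient is a stand-in and is not Stiles' association factor, whose formula is not given in this section.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def usage_records(indexed_items):
    """Tally how often each term, and each pair of terms, is assigned to an item."""
    term_freq = Counter()
    pair_freq = Counter()
    for terms in indexed_items:
        unique = sorted(set(terms))
        term_freq.update(unique)
        # Each unordered pair is stored once, as an alphabetically ordered tuple.
        pair_freq.update(combinations(unique, 2))
    return term_freq, pair_freq


def association_matrix(indexed_items):
    """Build a symmetric term-term matrix from co-occurrence counts.

    Assumed coefficient: n_ab / sqrt(n_a * n_b). Any other measure of
    association could be substituted here.
    """
    term_freq, pair_freq = usage_records(indexed_items)
    vocabulary = sorted(term_freq)
    position = {term: i for i, term in enumerate(vocabulary)}
    size = len(vocabulary)
    matrix = [[0.0] * size for _ in range(size)]
    for (a, b), together in pair_freq.items():
        score = together / sqrt(term_freq[a] * term_freq[b])
        matrix[position[a]][position[b]] = score
        matrix[position[b]][position[a]] = score
    return vocabulary, matrix


def associated_terms(term, vocabulary, matrix, top_k=3):
    """Rank the other terms by association with `term`.

    A unit vector with value only at `term` is multiplied by the matrix;
    the product holds the association of every term with `term`.
    """
    unit = [1.0 if t == term else 0.0 for t in vocabulary]
    scores = [
        sum(u * row[j] for u, row in zip(unit, matrix))
        for j in range(len(vocabulary))
    ]
    ranked = sorted(
        ((s, t) for s, t in zip(scores, vocabulary) if t != term and s > 0),
        key=lambda pair: (-pair[0], pair[1]),  # highest score first, ties alphabetical
    )
    return [t for _, t in ranked[:top_k]]


# Hypothetical collection: each set holds the index terms assigned to one item.
ITEMS = [
    {"indexing", "thesaurus", "statistics"},
    {"indexing", "thesaurus"},
    {"indexing", "retrieval"},
    {"thesaurus", "statistics"},
    {"retrieval", "searching"},
]

vocabulary, matrix = association_matrix(ITEMS)
listing = {t: associated_terms(t, vocabulary, matrix) for t in vocabulary}
```

Here `listing` maps each term Z to its higher ranked associates, which is one simple reading of the automatically obtained "thesaurus" listing; rebuilding the matrix as new items arrive would correspond to the continuous revision that Eldridge and Dennis mention.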