MONO91 NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report Other Potentially Related Research chapter Mary Elizabeth Stevens National Bureau of Standards

"Automatic indexing, based on the relative frequency of words used in a document, produces a partial vocabulary of the content words used to express its subject. Retrieval can then be accomplished by expanding the request vocabulary... This method tends to overcome the deficiencies and inconsistencies inherent in the use of terms derived automatically from a text." 1/

Conversely, Stiles also points out the possibility that the results of automatic derivative indexing procedures, which extract indexing words directly from the documents, might prove a more realistic or reliable basis for the development of his word co-occurrence correlation data than do the Uniterms assigned by human indexers. 2/

The work of Stiles has also stressed the importance of two factors that may well be critical for the improvement of automatic indexing techniques: the consensus of prior human indexing and the consensus of subject coverage of a particular collection. 3/

In his experimental investigations, Stiles began with an existing collection of approximately 100,000 items which had previously been indexed, over a period of time, with a Uniterm indexing vocabulary consisting of about 15,000 terms. The objective of the experiments was to determine how, given a specific search request, a more effective "net to catch documents" 4/ could be generated and how the responding items might be ranked in order of their probable relevance to the request.

The statistics of co-occurrence of terms used to index the same documents were first obtained. A modified chi-square formula was then applied to determine relative frequencies of use of co-occurring terms. 5/ Patterns of term co-occurrence could then be derived in the sense of term-profiles which show, for each term, the more significant of its associational values of pairing with other terms in the collection. The actual procedure for using these term-profiles in search prescription formulation and in document selection involves several steps, generally as follows: 6/

1/ Stiles, 1962 [573], pp. 12-13.
2/ Stiles, 1961 [572], p. 205.
3/ Stiles, 1962 [573], p. 6 and 1961 [572], pp. 273, 277.
4/ Stiles, 1961 [572], p. 192.
5/ In general, we shall not be concerned with the precise mathematical formulations. It is to be noted that in a recent report Giuliano and his colleagues have reviewed a number of the various mathematical formulas proposed in the literature for the computation of word, term, and document associations, including those of Parker-Rhodes and Needham, Maron and Kuhns, Stiles, Salton, Osgood, Bennett and Spiegel (Giuliano et al., 1963 [230], Appendix I).
6/ Stiles, 1961 [571], pp. 273-275.
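As an illustration of the co-occurrence statistics and term-profiles described above, the following is a minimal sketch only. The text does not give the modified chi-square formula, so the code substitutes the ordinary chi-square statistic for a 2x2 table with a continuity correction; this is an assumption and should not be read as Stiles' exact formulation. The function names (term_counts, association, term_profiles) and the toy index are hypothetical.

```python
from collections import Counter
from itertools import combinations


def term_counts(index):
    """Count, over all documents, how many carry each term and each term pair."""
    singles, pairs = Counter(), Counter()
    for terms in index.values():
        unique = sorted(set(terms))          # ignore repeated terms in one document
        singles.update(unique)
        pairs.update(combinations(unique, 2))  # unordered pairs, alphabetical order
    return singles, pairs


def association(a, b, f, n):
    """Continuity-corrected chi-square for two terms (stand-in formula, assumed).

    a, b: number of documents indexed by each term
    f:    number of documents indexed by both terms
    n:    number of documents in the collection
    Returns 0.0 when the terms co-occur no more often than chance predicts.
    """
    denom = a * b * (n - a) * (n - b)
    excess = f * n - a * b                   # observed minus expected, scaled by n
    if denom == 0 or excess <= 0:
        return 0.0
    corrected = max(excess - n / 2, 0.0)     # continuity correction
    return n * corrected ** 2 / denom


def term_profiles(index, top=5):
    """For each term, list its most strongly associated terms, strongest first."""
    n = len(index)
    singles, pairs = term_counts(index)
    profiles = {term: [] for term in singles}
    for (x, y), f in pairs.items():
        score = association(singles[x], singles[y], f, n)
        if score > 0:
            profiles[x].append((y, score))
            profiles[y].append((x, score))
    for term in profiles:
        profiles[term].sort(key=lambda item: (-item[1], item[0]))
        del profiles[term][top:]             # keep only the more significant pairings
    return profiles


# Toy collection: document identifier -> assigned index terms (invented data).
toy_index = {
    "d1": ["radar", "antenna"],
    "d2": ["radar", "antenna"],
    "d3": ["radar", "antenna", "noise"],
    "d4": ["noise", "filter"],
    "d5": ["filter"],
    "d6": ["filter", "noise"],
    "d7": ["radar"],
    "d8": ["optics"],
}
profiles = term_profiles(toy_index)
```

In this toy collection the profile for "radar" contains only "antenna", since the other terms co-occur with it no more often than chance would predict. A request vocabulary could then be expanded by adding the profile terms of each request term, which is the general idea of the "net to catch documents" quoted above.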