MONO91 NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report Indexes Generated by Machine-Automatic Derivative Indexing chapter Mary Elizabeth Stevens National Bureau of Standards

line, sentence number and other reference identifications. After re-processing against a stop list of common words, all other words in the edited text are selected as candidate index entries; these are then sorted into alphabetical order, with the subsequent printout giving each word occurrence followed by the entire sentence that contained it and the page and other location identifications. This computer output is then post-edited manually, not only to eliminate trivial entries but also to normalize the terms and phrases used.

3.2.3 Modified Derivative Indexing - Baxendale's Experiments

As has been previously noted in the introduction to this report, the name of Phyllis Baxendale, together with that of H. P. Luhn, is generally accorded credit for pioneering efforts in the entire area of automatic indexing. Baxendale in particular is generally credited with the first actual experiments in modified derivative indexing. In investigations beginning in the late 1950's, she has explored not only statistical approaches to automatic selection of index terms (based, for example, on word frequencies) but also the use of word pairs, word groups, contextual associations, and in particular the subject-indicating clues of prepositional phrases (Baxendale, 1958 [41], 1961 [40], 1962 [42]; Becker, 1960 [44]; Edmundson and Wyllys, 1961 [181]).

Baxendale began by considering the patterns of scanning that humans typically use to select "topic" sentences, phrases and words, and she then proceeded to simulate by computer program the selection of phrases consisting primarily of nouns and modifiers. In her first experiments (1958 [41]) she used two methods of automatic selection.
In the first procedure, words serving the grammatical functions of pronoun, article, auxiliary verb, conjunction and the like were deleted by stop list lookup. Frequency count statistics were then derived for the remaining words. In her second procedure, the computer was programmed to select prepositional phrases from text and to use the four words succeeding the preposition as index entries unless an additional preposition or a punctuation mark was first encountered.

In later experiments, Baxendale has explored possible grammatical models "which would select all and only nouns or adjective-noun combinations". 1/ Taking as an initial corpus a sample of document titles, rules were devised to reject for human analysis titles with question-marks and the like, to eliminate numeric information and single symbols, and to segment the title into its component clauses and phrases by the detection of commas, periods, and similar clues. By list lookup, certain words are identified as capable of serving the syntactic functions of being quantifiers, prepositions, or clause introducers. Special subscripts are then assigned to these words, and the subscripts are examined by machine to provide further segmentation; to delete quantifiers, auxiliary verbs, or words ending in "ed" or "ing" and preceded by an auxiliary verb; and to determine relationship functions between the remaining, presumably substantive, words.

Still other work by Baxendale has been directed toward the development of frequency of co-occurrence or textual association of candidate indexing terms. She reports as follows:

1/ Baxendale, 1961 [40], p. 209.
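
The two selection procedures of the 1958 experiments, as summarized above, may be illustrated by a minimal sketch in Python. It is not a reconstruction of Baxendale's program: the stop list, the preposition list, the tokenizing rule, and the function names are all hypothetical stand-ins chosen for illustration, and the treatment of prepositions as stop words in the first procedure is an assumption (the text says only "and the like").

```python
import re
from collections import Counter

# Hypothetical, deliberately small word lists; the original lists are not
# reproduced in the text, so these are assumptions for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "or", "but", "is", "are", "was",
              "were", "be", "it", "they", "this", "that", "has", "have"}
PREPOSITIONS = {"of", "in", "on", "by", "for", "with", "to", "from", "at"}

# A token is either a (possibly hyphenated) word or a single punctuation mark.
TOKEN = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+)*|[.,;:!?()]")


def tokenize(text):
    """Split text into lower-cased words and punctuation marks."""
    return [t.lower() for t in TOKEN.findall(text)]


def frequency_counts(text):
    """First procedure: delete function words by stop-list lookup,
    then count occurrences of the remaining words."""
    excluded = STOP_WORDS | PREPOSITIONS  # assumption: prepositions also deleted
    words = [t for t in tokenize(text) if t[0].isalpha()]
    return Counter(w for w in words if w not in excluded)


def prepositional_phrases(text, max_words=4):
    """Second procedure: after each preposition, take up to four following
    words, stopping early at another preposition or a punctuation mark."""
    tokens = tokenize(text)
    phrases = []
    for i, tok in enumerate(tokens):
        if tok not in PREPOSITIONS:
            continue
        phrase = []
        for nxt in tokens[i + 1 : i + 1 + max_words]:
            if nxt in PREPOSITIONS or not nxt[0].isalpha():
                break
            phrase.append(nxt)
        if phrase:
            phrases.append(" ".join(phrase))
    return phrases


if __name__ == "__main__":
    sample = ("The indexing of technical documents by computer is studied. "
              "The computer selects words from documents.")
    print(frequency_counts(sample).most_common(3))
    print(prepositional_phrases(sample))
```

Note that the second procedure, taken literally, admits non-substantive words such as auxiliary verbs into the selected phrase whenever they fall within the four-word window; this is consistent with the later work's search for grammatical models that select only nouns or adjective-noun combinations.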