MONO91 NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report Indexes Generated by Machine-Automatic Derivative Indexing chapter Mary Elizabeth Stevens National Bureau of Standards

Such obvious positional clues as occurrences of words in titles, chapter or section headings, and figure captions have already been mentioned. To these can be added first and last sentences of paragraphs, 1/ or first and last paragraphs as such. 2/ Wyllys observes that other criteria which are detectable in the text by straightforward machine procedures can be based on such features as italicization, capitalization, or punctuation. He notes, however, that such "editorial" criteria vary from journal to journal, so that their usefulness would need to be related to the particular practices of individual journals. 3/

Somewhat more difficult for machine implementation, but certainly feasible in the present state of the programming art, is the use of specific semantic or syntactic clues. Here again, Luhn, Baxendale, and Edmundson and Wyllys all anticipate their critics and later investigators. Luhn recognized the fact that in at least some applications the characterization of documents by isolated words alone would fail to provide an effective degree of discrimination. He therefore suggested operations to establish word relationships, whether based on co-occurrences or on combinations of specific parts of speech. 4/ Baxendale clearly uses both syntactic and semantic clues, detectable by built-in table lookups.

Representative suggestions by Edmundson or Wyllys, or both as co-authors, include the following:

"We have in mind a glossary or dictionary of perhaps one to two thousand words that act either as cue words which signal the importance of a sentence or as stigma words that signal the insignificance of a sentence for purposes of abstracting."
5/

1/ See, for example, Wyllys, 1963 [653], p. 27: "One of the first published studies in automatic document-content analysis, that of Miss Phyllis Baxendale, brought out the importance of the first and last sentences in a paragraph as bearers of a good deal of the content of the paragraph." See also Marthaler, 1963 [399], p. 25.
2/ Compare Swanson, 1963 [580], p. 1: ". . . Some evidence exists to show that for short homogeneous articles title and first paragraph are nearly as good as full text."
3/ Wyllys, 1963 [653], p. 28.
4/ Luhn, 1959 [384], p. 5.
5/ Edmundson, 1962 [178], p. 11.
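
The cue-word and stigma-word glossary quoted above, combined with the positional clue of first and last sentences in a paragraph, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny word lists, the function names, the naive sentence splitter, and the unit weights are hypothetical and are not taken from Edmundson, Wyllys, Luhn, or Baxendale, whose proposals envisage a glossary of one to two thousand words.

```python
import re

# Hypothetical miniature glossaries (illustrative only; the quoted proposal
# envisages one to two thousand entries).
CUE_WORDS = {"significant", "conclude", "results", "important"}
STIGMA_WORDS = {"perhaps", "incidentally", "hardly"}


def split_sentences(paragraph):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]


def score_sentence(sentence, index, total, position_bonus=1):
    """Cue-word hits minus stigma-word hits, plus a bonus for first/last position."""
    words = re.findall(r"[a-z]+", sentence.lower())
    score = sum(w in CUE_WORDS for w in words) - sum(w in STIGMA_WORDS for w in words)
    if index == 0 or index == total - 1:
        score += position_bonus  # assumed weight for the positional clue
    return score


def rank_sentences(paragraph):
    """Return (score, sentence) pairs, highest score first; ties keep text order."""
    sentences = split_sentences(paragraph)
    scored = [(score_sentence(s, i, len(sentences)), s) for i, s in enumerate(sentences)]
    return sorted(scored, key=lambda pair: -pair[0])


if __name__ == "__main__":
    text = (
        "We conclude that the results are significant. "
        "Perhaps the weather was incidentally poor. "
        "The method was applied to ten papers."
    )
    for score, sentence in rank_sentences(text):
        print(score, sentence)
```

In this sketch the highest-ranked sentences would be the candidates for extraction; how the weights should actually be set is left open, since the source gives no values.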