MONO91 NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report Indexes Generated by Machine-Automatic Derivative Indexing chapter Mary Elizabeth Stevens National Bureau of Standards

Taking more text as the basis for automatic derivative indexing adds, of course, the problems and costs of keystroking additional input material. At the same time, most of the major problems of scatter of references, synonymity, redundancy and exclusive reliance on the author's own language and terminology not only remain but may quite probably be intensified. The problems of establishing suitable rules for selection of significant words are aggravated, not only by the far larger number of different words to be processed, but because of unresolved problems in effectively relating length of index and depth of indexing to the length of the document. 1/

There are, however, a number of practical suggestions by which machine augmentation of titles might be accomplished. First is the invariant selection of words that are capitalized, other than those that begin a sentence. As Wyllys points out, this type of selection criterion would emphasize proper names, and these in turn might be particularly valuable clues, especially in a military intelligence situation. 3/ It has also been suggested that the selection criteria should depend on particular pre-specified contexts, such as being preceded by the words "the results were ...", "in conclusion ...", and the like.

A second type of machine selection procedure is the converse of the exclusion or stop list, namely, an inclusion list or dictionary which may involve especially significant words for a particular subject matter area or words that are of importance to a particular organization. In the discussions of the Area 5 ICSI papers it was remarked: "Another complication is that mechanized indexing finds in a paper what was important to the author.
What happens if there is something in the paper not important to the author but of importance to the indexer? One possibility is to have a list of words and phrases expressing the interests of a particular collection, which the machine looks for in the papers. If this word or phrase occurs even once, it should be picked up as an indexing term." 4/

1/ See, for example, Wyllys, 1963 [653], p. 22.
2/ See Luhn, 1959 [371], p. 52; [384], p. 8.
3/ Wyllys, 1963 [653], p. 15.
4/ See Ref. [578], p. [?]263. See also, among others, Luhn, 1959 [371], p. 52: "Just as common words have been eliminated by look-up in a special index, certain essential words may be looked up in another special index for the purpose of listing them under any circumstances".
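
The three selection criteria described above (capitalized words that do not begin a sentence, words following pre-specified cue contexts, and words found on an inclusion list) might be sketched as follows. This is only an illustrative sketch, not a procedure given in the report: the function name `select_terms`, the cue phrases, the inclusion list, the small stop list, and the three-word window are all assumptions chosen for the example, and the sentence splitter is deliberately naive. Only single words are matched against the inclusion list, although the quoted remark also mentions phrases.

```python
import re

# Hypothetical lists, chosen only for illustration (not taken from the report).
CUE_PHRASES = ("the results were", "in conclusion")
INCLUSION_LIST = {"radar", "telemetry"}
STOP_LIST = {"a", "an", "and", "of", "that", "the", "with"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Return alphabetic tokens; apostrophes and hyphens are kept inside words."""
    return re.findall(r"[A-Za-z][A-Za-z'-]*", sentence)


def _add_unique(bucket, term):
    """Append term to bucket unless it is already present (keeps first-seen order)."""
    if term not in bucket:
        bucket.append(term)


def select_terms(text, cue_phrases=CUE_PHRASES, inclusion_list=INCLUSION_LIST,
                 stop_list=STOP_LIST, window=3):
    """Collect candidate index terms under each of the three criteria."""
    selected = {"capitalized": [], "cue_context": [], "inclusion": []}
    for sentence in split_sentences(text):
        words = tokenize(sentence)
        lowered = [w.lower() for w in words]

        # Criterion 1: capitalized words other than the sentence-initial word.
        for word in words[1:]:
            if word[0].isupper():
                _add_unique(selected["capitalized"], word)

        # Criterion 2: up to `window` non-stop words following a cue phrase.
        for phrase in cue_phrases:
            parts = phrase.split()
            for i in range(len(lowered) - len(parts) + 1):
                if lowered[i:i + len(parts)] == parts:
                    rest = lowered[i + len(parts):]
                    following = [w for w in rest if w not in stop_list][:window]
                    for word in following:
                        _add_unique(selected["cue_context"], word)

        # Criterion 3: any word on the inclusion list, even if it occurs only once.
        for word in lowered:
            if word in inclusion_list:
                _add_unique(selected["inclusion"], word)
    return selected
```

Calling `select_terms` on a short passage returns a dictionary with one list of candidate terms per criterion, so the yield of each rule can be inspected separately before the lists are merged into index entries.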