NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Chapter: Automatic Assignment Indexing Techniques
Mary Elizabeth Stevens, National Bureau of Standards

terms previously assigned to the five most related documents, where "relatedness" is a function of the similarity in citation patterns between the new document and items already in the collection. The results of such index term assignments are reported as identical to those made by human judgment approximately 50 percent of the time. 1/

More specifically, in an experiment using documents drawn from a small collection in the fields of mathematical linguistics and machine translation, a new item was compared in terms of its citation data with the citation similarity data previously determined for earlier documents, and the set of five related documents was selected using the magnitude of the row similarity coefficients obtained from links of length one and two. All index terms occurring at least twice in the set of terms assigned to these related items were then assigned to the new items. For the ten "typical" new item cases for which comparative data are shown, the citation data assignment method correctly assigned, on average, 47.6 percent of the terms assigned manually to the same items. 2/

A slightly more sophisticated indexing term assignment formula, described by Lesk, was applied to additional test cases, but "failed to raise accuracy above fifty percent." 3/ For five typical new cases, the improved method correctly assigned 11 of the 20 terms manually assigned to these items, for an average accuracy of 55.5 percent. 4/

4.7 Similarities and Distinctions among Assignment Indexing Experiments.

In Table 2, some of the key points of the various automatic assignment indexing experiments discussed above are summarized. Certain similarities, distinctions, and differences are to be noted.
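The citation-similarity assignment rule described earlier in this section can be sketched in modern terms as follows. This is an illustrative reconstruction only: the function and variable names are invented, the similarity coefficients are assumed to have been computed beforehand from the citation-link data, and the neighborhood size (five) and occurrence threshold (two) follow the experiment as reported.

```python
from collections import Counter

def assign_terms(similarity, doc_terms, k=5, min_count=2):
    """Assign index terms to a new document from the terms of its
    k most citation-similar neighbors.

    similarity -- dict mapping each previously indexed document to its
                  citation-similarity coefficient with the new document
                  (assumed precomputed from links of length one and two)
    doc_terms  -- dict mapping each document to its manually assigned terms
    """
    # Select the k documents with the largest similarity coefficients.
    neighbors = sorted(similarity, key=similarity.get, reverse=True)[:k]
    # Pool the index terms previously assigned to those neighbors.
    counts = Counter(t for d in neighbors for t in doc_terms[d])
    # Keep every term occurring at least min_count times in the pool.
    return {t for t, c in counts.items() if c >= min_count}
```

Under this rule, a term is carried over to the new item only when at least two of the five related documents agree on it, which is what limits the assignments to terms with some consensus support.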
Borko and Bernick use the same corpus as did Maron and also re-apply Maron's formula to a different clue-word set for the same material. Williams uses material similar to the Maron-Borko computer corpus. The SADSACT tests also use some items that might be included in the Maron-Borko and Williams corpora. The Swanson experiments with newspaper clippings represent a quite different class of material, consisting of brief, terse, factual messages.

1/ Lesk, 1963 [357], p. V-8.
2/ Salton, 1962 [520], p. III-41, Table 9.
3/ Lesk, 1963 [357], p. V-7.
4/ Ibid., p. V-8, Table 3.