MONO91 NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Chapter: Indexes Compiled by Machine
Mary Elizabeth Stevens, National Bureau of Standards

Kessler himself and his associates have also conducted some experiments in comparative evaluation of indexing aids derived from citation data on the one hand and from conventional subject indexing on the other. The basis for evaluation was a total of 334 papers published in The Physical Review in 1958. The study involved detailed comparison of the ways in which these papers fell into related groups according to the "analytic subject index" used by the journal's editors and according to the method of "bibliographic coupling". The essentials of the latter method are described as follows:

"a. A single item of reference used by two papers is called one unit of coupling between them.
"b. A number of papers constitute a related group G_A, if each member of the group has at least one coupling unit to a given test paper P_0.
"c. The coupling strength between P_0 and any member of G_A is measured by the number of coupling units (n) between them." 1/

For the 334 papers, 73 categories of the Analytic Subject Index (ASI) had been used. For the bibliographic coupling method, each of the papers was in turn considered as the test paper and groups were formed for any of the 333 other papers that shared one or more citations with it. In general, it was concluded that there was good correlation between the groupings of papers achieved by the two methods. It should be noted, however, that 44 papers fell into no groups at all on the basis of the bibliographic coupling criterion. 2/

Salton and associates at the Harvard Computation Laboratory are also concerned with the citation indexing principle as a possible basis for grouping similar documents. They are also concerned with evaluation of results so obtained by comparison with document groups obtained by subject indexing means.
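Grouping by shared citations, as in the bibliographic coupling criterion quoted above (rules a to c), can be illustrated with a minimal sketch. The function names and the toy reference lists below are hypothetical and are not taken from Kessler's study; the sketch only assumes that each paper is represented by the set of items it cites.

```python
def coupling_strength(refs_a, refs_b):
    """Rule (a)/(c): one coupling unit per reference cited by both papers."""
    return len(set(refs_a) & set(refs_b))


def related_group(test_paper, references):
    """Rule (b): papers with at least one coupling unit to the test paper.

    Returns a dict mapping each group member to its coupling strength n.
    """
    test_refs = references[test_paper]
    group = {}
    for paper, refs in references.items():
        if paper == test_paper:
            continue
        n = coupling_strength(test_refs, refs)
        if n >= 1:
            group[paper] = n
    return group


# Hypothetical toy data: paper -> set of cited items.
references = {
    "P0": {"r1", "r2", "r3"},
    "P1": {"r2", "r3", "r9"},
    "P2": {"r3"},
    "P3": {"r7", "r8"},
}

# Each paper is taken in turn as the test paper, as in the procedure described.
groups = {paper: related_group(paper, references) for paper in references}

# Papers that share no citation with any other paper fall into no group.
ungrouped = [paper for paper, group in groups.items() if not group]
```

In this toy set, "P3" shares no reference with any other paper and so forms no group, which is the situation reported for 44 of the 334 papers.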
In the comparative experiments, data were first compiled for a closed document set of 62 items as to similarities with respect to both "citedness" and "citingness". The same items were manually indexed and similarity coefficients between these items were derived from overlappings of assigned index terms. When the two measures of similarity were compared with each other and with document associations obtained by random assignments of "citations" and "terms", the conclusions reached were as follows:

"The similarity coefficients obtained by comparing overlapping citations for a sample document collection with overlapping, manually generated index terms are much larger than those obtained by assuming a random assignment of citations and terms to the documents; relatively large similarity coefficients are generated for nearly all documents which exhibit at least a minimum number of citations; little seems to be gained by using citation links of length greater than two; for early documents, citedness furnishes a better indication than the amount of citing, and vice versa for recent documents; for documents which can both cite and be cited, equally good indications seem to be obtained by comparing citing and cited documents." 3/

1/ Kessler, 1963 [320], p. 1, footnote.
2/ Ibid., p. 5.
3/ Salton, 1962 [520], p. III-42.
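The text above does not state the formula for the similarity coefficients. As an illustration only, the sketch below assumes a cosine-style overlap coefficient on sets, applied once to citation sets and once to index-term sets for the same document pairs. The function name, the coefficient, and the toy data are all assumptions and are not drawn from Salton's report.

```python
from itertools import combinations
from math import sqrt


def overlap_similarity(a, b):
    """Assumed coefficient: |a & b| / sqrt(|a| * |b|), or 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


# Hypothetical toy data: documents described by what they cite
# and by manually assigned index terms.
cites = {
    "D1": {"x", "y"},
    "D2": {"x", "y"},
    "D3": {"z"},
}
terms = {
    "D1": {"indexing", "citation"},
    "D2": {"indexing", "retrieval"},
    "D3": {"physics"},
}

# Compute both similarity measures for every document pair.
pairs = list(combinations(sorted(cites), 2))
citation_sim = {p: overlap_similarity(cites[p[0]], cites[p[1]]) for p in pairs}
term_sim = {p: overlap_similarity(terms[p[0]], terms[p[1]]) for p in pairs}

for p in pairs:
    print(p, round(citation_sim[p], 3), round(term_sim[p], 3))
```

Comparing the two dictionaries pair by pair gives a simple view of how far citation overlap and index-term overlap agree; a random-assignment baseline of the kind mentioned in the text could be built by shuffling the sets among documents before recomputing.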