MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Indexes Compiled by Machine
chapter
Mary Elizabeth Stevens
National Bureau of Standards
2.5 Machine Conversion From One Index Set to Another
A final possibility in the general area of machine compilation of indexes and machine
use to improve the availability of indexes is as yet in a highly speculative stage. This is
the possibility of converting from one index set to another by machine look-up procedures.
In the Welch Medical Library project, mentioned earlier, use was made of punched card
techniques to convert from one index arrangement to another, 1/ but machine-
recognizable identifiers for both arrangements were explicitly encoded in the material.
In recent studies at Datatrol, however, preliminary investigations have been conducted
looking toward machine lookup of index-term equivalence tables in order to convert, for
example, DDC descriptors to corresponding subject headings used in the AEC vocabulary.
Hammond and Rosenborg (1962 [250] and [252]) report on the compilation of a unilateral
table of "indexing equivalents" between approximately 7,000 DDC descriptors and
those AEC subject headings judged by them to be identical, synonymous, or "usefully"
equivalent, such as one or the other being subsumed by a broader or more generic term.
Findings showed 23.8% of the terms of the DDC vocabulary presumably identical to those
of AEC, 38.1% of lower generic level, 7.4% of higher generic level, and 10.9% for which
no useful equivalents could be found. A sample table of indexing equivalents was prepared
for DDC-to-AEC conversion, but not in the opposite direction.
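
The machine look-up of a one-directional equivalence table described above might be sketched as follows. This is only an illustration, not the Datatrol procedure: the table entries, the relation labels, and the names `Equivalent`, `DDC_TO_AEC`, and `convert` are all hypothetical, and the sample terms are invented rather than taken from either vocabulary.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Relation labels loosely modelled on the kinds of equivalence named in the
# text (identical, synonymous, generically related, no useful equivalent).
IDENTICAL = "identical"
SYNONYMOUS = "synonymous"
GENERIC = "generic"          # one term is subsumed by a broader term
NO_EQUIVALENT = "none"


@dataclass(frozen=True)
class Equivalent:
    target: Optional[str]    # AEC subject heading, or None if no equivalent
    relation: str


# Hypothetical one-directional table (DDC -> AEC only). All entries are
# invented for illustration; they are not taken from either vocabulary.
DDC_TO_AEC: Dict[str, Equivalent] = {
    "NEUTRONS": Equivalent("NEUTRONS", IDENTICAL),
    "ATOMIC PILES": Equivalent("REACTORS", SYNONYMOUS),
    "GAMMA RAY SPECTROMETERS": Equivalent("SPECTROMETERS", GENERIC),
    "SHIP HULLS": Equivalent(None, NO_EQUIVALENT),
}


def convert(descriptors: List[str],
            table: Dict[str, Equivalent]) -> Tuple[List[str], List[str]]:
    """Look up each descriptor; return (converted headings, unresolved terms)."""
    converted: List[str] = []
    unresolved: List[str] = []
    for term in descriptors:
        entry = table.get(term.strip().upper())
        if entry is None or entry.target is None:
            # Either absent from the table or listed with no useful equivalent.
            unresolved.append(term)
        elif entry.target not in converted:
            converted.append(entry.target)
    return converted, unresolved


headings, leftover = convert(
    ["neutrons", "Atomic Piles", "Ship Hulls", "Unlisted Term"], DDC_TO_AEC
)
```

Because the table is keyed only on the source vocabulary, it cannot simply be inverted to convert in the opposite direction: several source terms may map to the same broader heading, and terms with no equivalent have no reverse entry at all.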
Since, in general, convertibility of indexing vocabularies would be desirable
wherever duplication of cataloging and indexing effort is likely to occur (that is, where
two or more different documentation organizations receive at least some of the same
material as inputs to their systems), the results of these preliminary studies are provocative
and appear to merit the further study that is being sponsored by an Interagency
Task Group on Vocabulary Study of the Committee on Scientific Information, under the
Federal Council for Science and Technology.
There are many substantial difficulties, however. When applied to actual indexing
of the same items by the two agencies, it was found that for 277 items indexed by both
AEC and DDC (then ASTIA):
"ASTIA used a total of 2,571 descriptors, and AEC 840 subject headings... of
these, 392, or roughly half of the AEC terms, were either completely or, for
all practical purpose, identical." 2/
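
The proportions behind "roughly half" can be checked with a few lines of arithmetic. The sketch below reads the quoted counts as 2,571 ASTIA descriptors, 840 AEC subject headings, and 392 matching terms; the share expressed against the ASTIA total is a derived figure that does not appear in the quoted source, and the variable names are illustrative.

```python
# Counts as quoted for the 277 items indexed by both agencies.
astia_descriptors = 2571
aec_headings = 840
matching_terms = 392

# Share of each agency's assigned terms that had a match in the other's.
share_of_aec = matching_terms / aec_headings
share_of_astia = matching_terms / astia_descriptors  # derived, not in the source

print(f"Share of AEC headings matched:      {share_of_aec:.1%}")    # 46.7%
print(f"Share of ASTIA descriptors matched: {share_of_astia:.1%}")  # 15.2%
```

The asymmetry follows from the difference in indexing depth: the same 392 matches are just under half of the AEC assignments but only about one in seven of the ASTIA assignments.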
Painter (1963 [460]) made further studies of equivalency in her investigations of
duplication and consistency of subject indexing at several Government agencies. For 200
items indexed by both AEC and DDC, she found 20% DDC equivalency, 67% AEC equivalency,
and 30% similarity of actual indexing. She concludes, in part:
"In considering these solutions and the statistics revealed by the studies it should
be concluded that with a maximum of only 69 percent equivalency, or convertibility,
and a minimum of 28 percent, there is still a large proportion of terms which will
1/ Garfield, 1959 [221], p. 471.
2/ Hammond, 1962 [250], p. 4.