MONO91 NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report Operational Considerations chapter Mary Elizabeth Stevens National Bureau of Standards

8. OPERATIONAL CONSIDERATIONS

Whatever the verdict of evaluation of one or more automatic indexing techniques, whether of the derivative, modified derivative, or assignment type, there are certain operational considerations and problems that typically affect any attempt to apply such techniques in actual production operations. These considerations, which also affect linguistic data processing operations in general, include input considerations, availability of methods or devices for converting text to machine-usable form, programming considerations, questions of format and content of output, and problems of customer acceptance of the machine products.

8.1 Questions of input

Input considerations include, first, questions of the extent and availability of material which can be handled directly by the machine. This may be limited to title only, to title plus abstract, title plus other material, 1/ preselected text or automatically generated extracts; or it may in a few cases extend to full running text. Possible future requirements may extend to the processing not only of full text but of interspersed graphic material (equations, charts, diagrams, drawings, photographs) as well. We have considered typical arguments for and against the limitation of input to titles only, to augmented titles, and to abstracts in other sections of this report. The points to be emphasized here are requirements for pre-editing or post-editing, provisions for error detection and error correction, the time and cost requirements of conversion equipment if material is not already available in machine-usable form, and the like. As Cornelius suggests: "Present day computers, if used for machine indexing, will be generally input limited and will require excessive data preparation.
Causes of these limitations are: time required for translation to machine language, verification of this machine language, and the capability or lack of capability of correction in the input media." 2/

Examples of pre-editing requirements, even for the simple case of keyword-in-title indexing, include the spelling out of chemical symbols, the encoding or the omission of subscripts and superscripts, insertions of hyphens to prevent indexing of a word, and substitutions of blanks for hyphens in compound words to assure indexing of each component. 3/ For full text, a far more extensive and elaborate set of rules and conventions must be developed and applied. 4/ Other editing may be required for format standard-

1/ This may specifically include cited titles, as suggested variously by Bohnert, 1962 [69], p. 19; Giuliano and Jones, 1962 [229], p. 10; Swanson, 1963 [580], p. 1; Gallagher and Toomey, 1963 [205], p. 53; and as used in the SADSACT method, see pp. 98-99 of this report.
2/ Cornelius, 1962 [140], p. 42.
3/ See, for example, Kennedy, 1961 [311], p. 120.
4/ See, for example, the sophisticated proposals of Nugent, 1959 [441], and Newman et al, 1960 [439].
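The four keyword-in-title pre-editing conventions named in this section (spelling out chemical symbols, omitting subscripts and superscripts, inserting hyphens to prevent indexing of a word, and replacing hyphens by blanks in compound words) can be illustrated with a minimal modern sketch in Python. It is not a reconstruction of any program cited in this report. The lookup tables, the function names `pre_edit_title` and `keyword_entries`, the parameters `join_pairs` and `split_compounds`, and the sample title are all hypothetical, and the sketch assumes that the indexing program treats each blank-delimited token, including a hyphenated one, as a single index word.

```python
import re

# Hypothetical lookup tables for illustration only; an operational
# system would need far larger, subject-specific lists.
CHEMICAL_SYMBOLS = {"NaCl": "sodium chloride"}
STOP_WORDS = {"a", "an", "and", "at", "in", "of", "on", "the", "to"}

# Unicode superscript and subscript characters to be omitted.
SUB_SUPERSCRIPTS = "[\u2070-\u209f\u00b2\u00b3\u00b9]"


def pre_edit_title(title, join_pairs=(), split_compounds=()):
    """Apply the four pre-editing conventions to one title string."""
    # 1. Spell out chemical symbols that stand alone as words.
    for symbol, name in CHEMICAL_SYMBOLS.items():
        pattern = r"(?<!\w)" + re.escape(symbol) + r"(?!\w)"
        title = re.sub(pattern, name, title)
    # 2. Omit subscripts and superscripts.
    title = re.sub(SUB_SUPERSCRIPTS, "", title)
    # 3. Insert a hyphen between two words so that the second word
    #    is not indexed on its own (assumed indexing behavior).
    for first, second in join_pairs:
        title = title.replace(f"{first} {second}", f"{first}-{second}")
    # 4. Replace hyphens by blanks in listed compounds so that each
    #    component is indexed separately.
    for compound in split_compounds:
        title = title.replace(compound, compound.replace("-", " "))
    return title


def keyword_entries(edited_title):
    """Return the index words: blank-delimited tokens minus stop words."""
    return [t for t in edited_title.split() if t.lower() not in STOP_WORDS]


if __name__ == "__main__":
    edited = pre_edit_title(
        "Diffusion of NaCl in Ion-Exchange Resins at Room Temperature",
        join_pairs=[("Room", "Temperature")],
        split_compounds=["Ion-Exchange"],
    )
    print(edited)
    print(keyword_entries(edited))
```

In this sketch the human pre-editor's decisions enter as the `join_pairs` and `split_compounds` arguments. This reflects the point made above that such choices must be made and keyed for each title before the machine can index it, which is part of the data preparation burden.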