MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Operational Considerations
chapter
Mary Elizabeth Stevens
National Bureau of Standards
"The expense of transcribing such documents in their entirety will be justifiable to
a limited extent only and it may, therefore, be assumed that automatic processing
will be mainly applied to future literature." 1/
"As long as we are limited to using the equipment that is available now, the
preparation of data for input will be an expensive procedure and a major cost factor in
automatic processing of natural language." 2/
"... In a discussion of indexing by machine, we must recognize the preparation of
input to the system as the major item of cost of operation." 3/
"Present inability to read documents automatically would make it necessary to punch
cards or tapes, an operation likely to be even more expensive than reading by
humans." 4/
In addition to the high costs of manual retranscription, it is also noted that keypunching
"tends to undermine the purpose of natural text retrieval by requiring human effort at the
input end of the process." 5/
In particular, keypunching or keystroking requirements undermine the purposes of
rapid indexing as well as filing for retrieval by virtue of the time required to transcribe
text. Horty and Walsh report, for example:
"Flexowriter operators can produce between 1400 and 1800 lines per day of statutory
text. Keypunch operators used in previous experiments could punch approximately
100 lines per hour of alphabetic materials, but could not maintain this rate for a
sustained period of time." 6/
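The two rates quoted above are stated in different units (lines per day and lines per hour), so a short worked comparison may help. The following is a minimal sketch only: the 8-hour working day and the function names are assumptions introduced for illustration and do not come from Horty and Walsh. Because the keypunch rate reportedly could not be sustained, the keypunch figure computed here should be read as an upper bound.

```python
# Rates as quoted from Horty and Walsh (1963).
FLEXOWRITER_LINES_PER_DAY = (1400, 1800)  # reported low and high daily output
KEYPUNCH_LINES_PER_HOUR = 100             # reported as not sustainable

# Assumption for illustration only; the source does not state a day length.
ASSUMED_HOURS_PER_DAY = 8


def keypunch_lines_per_day(hours=ASSUMED_HOURS_PER_DAY):
    """Upper bound on daily keypunch output if the hourly rate were sustained."""
    return KEYPUNCH_LINES_PER_HOUR * hours


def days_to_transcribe(total_lines, lines_per_day):
    """Operator-days needed to transcribe total_lines at a given daily rate."""
    return total_lines / lines_per_day


if __name__ == "__main__":
    low, high = FLEXOWRITER_LINES_PER_DAY
    print("Keypunch upper bound, lines/day:", keypunch_lines_per_day())
    print("Flexowriter range, lines/day:", low, "to", high)
    # Hypothetical corpus size, chosen only to make the arithmetic concrete.
    corpus_lines = 18000
    print("Flexowriter days at the high rate:", days_to_transcribe(corpus_lines, high))
    print("Keypunch days at the upper bound:",
          days_to_transcribe(corpus_lines, keypunch_lines_per_day()))
```

Under the assumed 8-hour day, the keypunch upper bound falls below the lower Flexowriter figure, so transcription time remains a matter of many operator-days for any sizable body of text.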
Thus, until such time as more versatile character recognition equipment is available,
even some of the most ardent advocates of full text processing are forced to the use of
considerably less than full text for other than research purposes. Swanson comments,
for example:
"... One must note that the manual recording of text may be exorbitantly expensive.
If so, a judicious selection process may permit a reasonable compromise between
the expense of input and the depth of indexing which results. For example, it is
reasonable to select the title, abstract, table of contents (if any), sub-headings, and
key sentences or paragraphs." 7/
1/ Luhn, 1959 [384], p. 2.
2/ Ray, 1961 [496], p. 51.
3/ Howerton, 1961 [282], p. 327.
4/ Levery, 1963 [359], p. 235.
5/ Doyle, 1959 [168], p. 2.
6/ Horty and Walsh, 1963 [280], p. 259.
7/ Swanson, 1963 [580], p. 1.