ISR11
Scientific Report No. ISR-11 Information Storage and Retrieval
Operating Instructions for the SMART Text Processing and Document Retrieval System
chapter
M. E. Lesk
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
11-28
constructed accordingly; but the foregoing are simple equivalents for
the special characters.
The natural language documents must be on SMART tape ATWO. Bulk
lookups from tape can be performed, however, by giving the specification
ATWO 11 (defining ATWO as physical A6). In this case no binary documents
can be submitted. Alternatively, ATWO can be defined as 12 (A7) or 13
(A8) in which case binary documents may still be submitted on A6 with a
DOCTAP specification. Tape ATWO should not be moved to A[OCRerr], A5, B1, B2,
or B6; these tapes are used in the lookup.
[OCRerr].2. Binary Documents
The binary documents are submitted on A2 and/or A6 following the English
documents. If there are any binary documents on A2, they must be preceded
with a card containing *ONLY in columns 1-5 and a blank in column 6.
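
The marker-card rule above can be illustrated with a minimal modern sketch.
It is not part of the SMART system: the function names is_only_card and
split_deck are hypothetical, and representing each card as a text string
of up to 80 columns is an assumption made for illustration only.

```python
def is_only_card(card: str) -> bool:
    """Return True if a card image is the *ONLY marker card (hypothetical helper)."""
    # Pad to 80 columns so a short line behaves like a blank-filled card.
    image = card.rstrip("\n").ljust(80)
    # Columns 1-5 must hold "*ONLY"; column 6 must be blank.
    return image[:5] == "*ONLY" and image[5] == " "


def split_deck(cards):
    """Split a deck at the first *ONLY card (hypothetical helper).

    Returns (english_cards, binary_cards); the marker card itself is dropped.
    If no marker is present, every card is treated as English input.
    """
    for i, card in enumerate(cards):
        if is_only_card(card):
            return list(cards[:i]), list(cards[i + 1:])
    return list(cards), []
```

A deck with no *ONLY card is returned unchanged as English input, matching
the statement that the marker is needed only when binary documents are
present on A2.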
As in the case of English documents, the pure requests are submitted
first, followed by requests which are also texts, and finally
the pure texts. The documents are preceded by *FIND, *LIKE, and
*TEXT cards exactly as in part [OCRerr].1. The *ONLY card suffices to distinguish
them from English documents. The binary cards which make up each document
contain the identification punched at the beginning of the text, and the
concept vectors obtained at lookup time. These documents are punched by
the PUNCH specification, after the lookup. Thus, when binary documents
only are being submitted, the library tape need not be mounted unless
hierarchical expansion is required. The binary documents may be used
with any title or phrase weighting (if STATPR was in effect at lookup
time), and with any expansion options. If it is desired to change the