ISR11
Scientific Report No. ISR-11 Information Storage and Retrieval
Operating Instructions for the SMART Text Processing and Document Retrieval System
M. E. Lesk
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
on [OCRerr], only the set on A6 is to be processed.
end of file
This marks the end of the job for the monitor system. Total cards
submitted: BCD 12, binary 2, total 14. Most of the production runs made
with SMART are of this size.
8. Miscellaneous
SMART is frequently changed and revised; thus writeups may differ.
The current writeup supersedes the ones included in reports ISR-7 and
ISR-8 completely, and it supersedes report ISR-9 wherever the present
writeup differs from the one in ISR-9. SMART is written in Fortran II
and in FAP. With all its subroutines, the programs include approximately
30,000 source cards and 5,000 binary cards.
8.1. Size Limits
The thesaurus may consist of any number of English stems; however,
only 32,000 significant concepts are allowed.
The program can address up to 250,000 documents; however, the capacity
of the intermediate tapes will be exceeded before this point. Assuming
that M[OCRerr]CTAP is used to prepare the input document tape, about 25,000-100,000
documents (depending on their length) is a probable practical limit. If
one insists on submitting documents on cards, about 1500-2000 documents are
all that will fit. The only truly significant limitation may be expected to be
the fact that only fifty requests can be processed in one computer run.
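
The limits above can be collected into a simple pre-run check. The following is a minimal sketch in Python, not part of SMART itself: the function name, constant names, and warning wording are all hypothetical, and only the numeric limits are taken from this section.

```python
# Limits quoted in Section 8.1. All names here are hypothetical
# illustrations; SMART itself provides no such routine.
MAX_CONCEPTS = 32_000               # significant thesaurus concepts
MAX_ADDRESSABLE_DOCUMENTS = 250_000  # addressing limit of the program
MAX_REQUESTS_PER_RUN = 50           # requests per computer run

# Practical document-count ranges (low, high); the exact figure
# depends on document length.
PRACTICAL_RANGE = {
    "tape": (25_000, 100_000),  # input document tape
    "cards": (1_500, 2_000),    # documents submitted on cards
}


def check_run(concepts, documents, requests, input_medium="tape"):
    """Return a list of warnings for a planned run (empty if none)."""
    if input_medium not in PRACTICAL_RANGE:
        raise ValueError(f"unknown input medium: {input_medium!r}")

    warnings = []
    if concepts > MAX_CONCEPTS:
        warnings.append("too many significant concepts")
    if documents > MAX_ADDRESSABLE_DOCUMENTS:
        warnings.append("more documents than the program can address")

    low, high = PRACTICAL_RANGE[input_medium]
    if documents > high:
        warnings.append("document count exceeds the practical limit")
    elif documents > low:
        warnings.append("document count may not fit, depending on length")

    if requests > MAX_REQUESTS_PER_RUN:
        warnings.append("too many requests for one run")
    return warnings


if __name__ == "__main__":
    # A card-input run with 1,800 documents and 60 requests.
    for line in check_run(5_000, 1_800, 60, input_medium="cards"):
        print(line)
```

The practical ranges are treated as a soft band: counts inside the band produce a "may not fit" warning because the section gives the limit only as depending on document length.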
8.2. Timing
The following timing estimates are approximations derived from our