Scientific Report No. ISR-11, Information Storage and Retrieval. Operating Instructions for the SMART Text Processing and Document Retrieval System. M. E. Lesk, Harvard University. Gerard Salton. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

Within the natural language documents, pure requests must be submitted first, requests which are also texts next, and texts last. Each document is preceded by an identifying card containing a * in column 1, a four character control word in cols 2-5, a blank in column 6, and a twelve character identifier in cols 7-18. Comments and identification are punched in 19-72, as desired. The control word is FIND for a pure request, LIKE for a request that is also a text, and TEXT for a pure text. The document follows the identifying card (the card with the asterisk in column 1) and continues on as many cards as are needed.

Basically, input text to SMART resembles typescript. For example, text may be punched anywhere in columns 1-72 of any number of cards. Any number of consecutive blanks are equivalent to one blank. A blank is assumed between column 72 of one card and column 1 of the next card. The major differences from typescript are as follows:

a) An asterisk (*) in column 1, or two consecutive dollar signs anywhere ($$) are end-of-text signals. These indicators should not be used unless an end-of-text occurs. Normally, the end of a text is indicated only by the * in column 1 of the control card beginning the next text.

b) Periods not preceded by blanks are taken to be parts of abbreviations. Thus, a period meant to indicate the end of a sentence should be preceded by a blank.

c) SMART provides for the inclusion of up to 355 characters of identification for each text. This is used in the ANSWER LONG output option (see part 3.6).
This identification should be punched at the beginning of the text on cards with a single $ in column 1. If a text begins with cards that contain a $