Scientific Report No. ISR-11
Information Storage and Retrieval
Operating Instructions for the SMART Text Processing and Document Retrieval System
M. E. Lesk, Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

frequently used for auxiliary functions. These are described in the present section.

6.1. THES

THES is a program to form null dictionaries from a collection of English text. As a by-product, it also prints frequency counts and listings. It includes a suffixing routine.

THES requires one control card. This card contains three integers. The first integer, punched right-adjusted in columns 1-5, specifies the maximum number of concepts (words) to be included in the null dictionary. The second integer, punched right-adjusted in columns 6-10, specifies the minimum number of occurrences in the collection required of any word in the dictionary. These two numbers permit the user to control the size of the null dictionary. For a complete null dictionary, the first number should be very large and the second number should be 1. The third integer, punched right-adjusted in columns 11-13, specifies the tape on which the document collection is located. If this field is blank, tape 5 (the input tape) is assumed.

The collection is placed on the specified tape in normal SMART format ([OCRerr].1), with documents preceded by *TEXT cards only. *FIND cards and *LIKE cards should not be used. Since no searches are made during thesaurus construction, requests may also be labeled *TEXT without problems if it is desired to include them in the counts. A *STOP card ends the collection.

6.2. [OCRerr]RVAL

[OCRerr]RVAL is a program to compute additional evaluation data for a set of