ISR11 Scientific Report No. ISR-11 Information Storage and Retrieval Operating Instructions for the SMART Text Processing and Document Retrieval System chapter M. E. Lesk Harvard University Gerard Salton Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government. II-3~ taming ZZZZZZ in columns 1-6. The suffix list may appear in any order, provided that the first suffix begins with `e". If the thesaurus is introduced from cards (control word BOTH or THE[OCRerr]), it follows the suffix list (if any) immediately. If there is no suffix list, it follows the first data card. The thesaurus may contain any number of words. Each word is punched on a separate card. Since a suffixing routine is available, only the word stems should be entered in the dictio- nary. The suffixing routine will accept imiltiple suffixes, and also recognizes that words may double their final consonant, change a final `y" to an `1i", or drop a final 11e11 before adding a suffix [7]. This permits a highlY accurate lookup in which almost all derivatives of most English words can be identified automatically [12]. of course, the extremely variant forms (such as ttmice", "geese1, etc.) must be entered in the dictio- nary as separate words. It is sometimes necessary to enter complete forms of words to preserve important distinctions. For example, `1programmer" is readily identified from "program" + t1m11 + "er", but if it is desired to distinguish a progrannmer from a program, both forms should be entered. The dictionary lookup procedure may always be used as a full paradigm dictionary by entering every form of a word in the thesaurus, at the expense of a great deal of keypunching. Words entered explicitly take precedence in the lookup over possible stem-and-suffix combinations. Each stem to be entered in the dictionary is punched left-adjusted in columns 1-24 of a card, with trailing blanks. With each word are associated between one and six concept numbers. These are punched right-adjusted in five column fields, starting in column 25 and extending to 54. The first concept nuzriber is punched in