Scientific Report No. IRS-13
Information Storage and Retrieval
Test Environment
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

The following types of dictionaries have been tested in retrieval runs:

1. Suffix 's' only, in which request and document words are matched as they stand, with only the terminal 's' denoting plurals being removed. See Section VI.

2. Stems (Null dictionary), in which matching is based on word stems as identified by an automatic suffix removal procedure. See Section VI.

3. Thesaurus, where words (mainly stems) are grouped together on the basis of synonymy or partial synonymy, normally using human judgment. See Section VII.

4. Statistical association (Concon), where synonyms or related words are identified automatically by using the cooccurrence frequency of words in the collection. Apart from the control parameters which may be varied, no human judgment is used. See Section IX.

5. Hierarchies, where subject notions are arranged in a series of subordinate relations, such as genera and species, or whole and part. Hierarchies tested so far use thesaurus groups, and tests include some of the many possible strategies of using hierarchies, such as going "up" in the hierarchy to parents, or going "down" to sons. See Section VII.

6. Phrases, in which recognition of pairs and larger sets of words is achieved. Phrases are used in conjunction with thesaurus groups, and phrase recognition takes place when words from the required thesaurus groups occur within one sentence of the document or request. See Section VII.
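
To make the distinctions between these dictionary types concrete, the following is a minimal Python sketch of four of them: suffix 's' removal, thesaurus grouping, cooccurrence counting, and phrase recognition within one sentence. It is an illustration only and not the procedure used in the tests reported here. The function names, the toy thesaurus, and its group numbers are all hypothetical, and the cooccurrence count is taken per document as a simplifying assumption.

```python
from collections import Counter
from itertools import combinations


def strip_plural_s(word):
    """Suffix 's' dictionary: remove only a terminal 's' (a crude plural test)."""
    return word[:-1] if len(word) > 1 and word.endswith("s") else word


# Hypothetical thesaurus: each word (or stem) maps to a concept group number.
TOY_THESAURUS = {"retrieval": 1, "search": 1, "document": 2, "text": 2, "index": 3}


def thesaurus_groups(words, thesaurus=TOY_THESAURUS):
    """Thesaurus dictionary: replace words by the set of groups they belong to."""
    return {thesaurus[w] for w in words if w in thesaurus}


def cooccurrence_counts(documents):
    """Statistical association: count the documents in which two words cooccur.

    Assumption: cooccurrence is measured per document; each pair is counted
    at most once per document and stored in alphabetical order.
    """
    counts = Counter()
    for words in documents:
        for pair in combinations(sorted(set(words)), 2):
            counts[pair] += 1
    return counts


def phrase_present(sentences, group_a, group_b, thesaurus=TOY_THESAURUS):
    """Phrase recognition: both required groups occur within a single sentence."""
    return any(
        {group_a, group_b} <= thesaurus_groups(sentence, thesaurus)
        for sentence in sentences
    )


if __name__ == "__main__":
    request = ["documents", "texts", "retrieval"]
    print([strip_plural_s(w) for w in request])
    print(thesaurus_groups(["search", "text"]))
    print(cooccurrence_counts([["a", "b"], ["b", "a", "c"]]))
    print(phrase_present([["search", "of", "text"], ["index"]], 1, 2))
```

In this sketch a request word and a document word match under the thesaurus when they fall in the same group, even if the word forms differ, whereas the suffix 's' procedure only matches words that are identical after the final 's' is dropped. Note that the suffix rule is deliberately crude: it would also shorten a non-plural word that happens to end in 's'.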