Scientific Report No. ISR-11
Information Storage and Retrieval
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

IV Information Analysis and Dictionary Construction

G. Salton and M. E. Lesk

1. Introduction

At the base of any information system must always be a system of information analysis, used to decide what a given information item, or a given search request, is all about. In a conventional library system, this analysis may be performed by a human agent who uses established classification schedules to decide what category, or categories, will most reasonably fit a given item. In certain other well-known indexing systems, keywords or index terms may be manually assigned to documents and search requests, to be used for the identification of information content.

Regardless of what type of analysis is performed, and in particular regardless of whether the analysis is done manually or automatically, it is necessary to start with a set of carefully prepared instructions specifying the allowable steps, and setting forth in detail the meanings and implications of choosing one or another of the permissible alternatives. These instructions often take the form of dictionaries of various types, listing the allowable information identifiers, and giving for each a definition which regularizes and controls its use.

As will be seen, such dictionaries may take a variety of forms, including almost always so-called "see" references which provide links for entries to be replaced by other preferred terms, and "see also" references which designate cross-references applicable to the dictionary items. Negative