ISR11
Scientific Report No. ISR-11 Information Storage and Retrieval
Information Analysis and Dictionary Construction
chapter
G. Salton
M. E. Lesk
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
IV Information Analysis and Dictionary Construction
G. Salton and M. E. Lesk
1. Introduction
At the base of any information system must always be a system of
information analysis, used to decide what a given information item, or
a given search request, is all about. In a conventional library system,
this analysis may be performed by a human agent who uses established
classification schedules to decide what category, or categories, will
most reasonably fit a given item. In certain other well-known indexing
systems, keywords or index terms may be manually assigned to documents
and search requests, to be used for the identification of information
content.
Regardless of what type of analysis is performed, and in particular
regardless of whether the analysis is done manually or automatically,
it is necessary to start with a set of carefully prepared instructions
specifying the allowable steps, and setting forth in detail the meanings
and implications of choosing one or another of the permissible
alternatives. These instructions often take the form of dictionaries of various
types, listing the allowable information identifiers, and giving for each
a definition which regularizes and controls its use. As will be seen,
such dictionaries may take a variety of forms, including almost always
so-called "see" references which provide links for entries to be
replaced by other preferred terms, and "see also" references which
designate cross-references applicable to the dictionary items. Negative