MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Problems of Evaluation
chapter
Mary Elizabeth Stevens
National Bureau of Standards
7.1 Core Problems
First and foremost of the core problems implicit in the question of evaluation of any
indexing scheme, whether applied by man, machine, or man-machine combinations, are
those of interpersonal communication itself, which in turn relate to fundamental problems
of epistemology. These are, first, the problems of language as a means of
communicating perceptions, apperceptions of relationships between present observations and
prior experience, and value judgments based thereon, and, secondly, even more
fundamentally, the question of the veridicality of language representations of real transactions
and events. Serious investigators in the field, including many who have themselves
contributed to automatic indexing techniques, have made such typical acknowledgments of the
difficulties as the following:
"The imprecision connected with discussion of retrieval effectiveness and of
relevance is not due to lack of understanding of the relatively straightforward
retrieval processes, but is due to our lack of basic understanding about language,
meaning and human communication itself." 1/
"Fundamentally, the study of inquiry procedures is a problem in the general
psychology of cognitive functioning. Relevant problems concern the way
problems are recognized and formulated into questions, the way a search plan
is developed to find answers to questions, and finally, the way it is decided
whether or not a possible answer matches the specifications of a question." 2/
A second core problem is the heterogeneous and somewhat arbitrary development of
natural languages themselves. It is much the same fundamental problem whether men or
machines are to read text and determine the "meaning" (at least, in the sense of
communication intent) of messages expressed in a natural language. However, the problems
are aggravated if men themselves must know enough about language and its conveyances
of message content to specify precisely to a machine what it is to look for and to use.
Salton enumerates some of these difficulties as follows:
"No well-defined set of rules is known by which the individual words in the
language are combined into meaningful word groups or sentences. Specifically,
the correct identification of the meaning of word groups depends at least in part
on the proper recognition of syntactic and semantic ambiguities, on the correct
interpretation of homographs, on the recognition of semantic equivalences, on
the detection of word relations, and on a general awareness of the background
and environment of a given utterance." 3/
1/ Giuliano, 1963 [230], p. 6.
2/ Stone, 1962 [576], p. 1.
3/ Salton, 1963 [519], p. 1-2.