CRANV1P1
ASLIB Cranfield Research Project: Factors Determining the Performance of Indexing Systems: VOLUME 1. Design, Part 1. Text
General Considerations
chapter
Cyril Cleverdon
Jack Mills
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
To consider these five user criteria from the viewpoint of their evaluation, 'time'
and 'presentation' offer few problems, for both are mainly influenced by management
decisions concerning hardware. To find the time factor it is only necessary to
record the time lapse between the request and the receipt of the output for a statistically
valid number of cases. To evaluate the presentation, one has merely to observe
whether the user receives a list of document numbers, a list of bibliographical references,
a list of titles, a set of abstracts or a set of complete documents, either readable text
or microform. To evaluate the effort demanded of the user in obtaining an answer to
his query is only slightly more complex because of the possibility, in certain systems,
that the effort can vary from the minimum of expressing the query in natural language
to the maximum of conducting the complete search unaided in, for instance, a citation
index. However, in any single system, evaluation of this point appears to require only
a straightforward observation of a number of cases.
This leaves only recall and precision, and the comment and the question by Bourne
can now be answered. The reason why so much attention has been given to recall
and precision is that these are the only two user criteria which demand any serious
intellectual effort in their measurement. They are concerned with whether the system
is capable of locating what is sought and are so fundamental that they can be said to
be on a different level to the other criteria. Whether they are "better" than any of
the other proposed criteria does not enter into the argument; it is certainly not
suggested that they are the criteria which are always uppermost in the mind of a user.
The unarguable fact, however, is that they are fundamental requirements of the users,
and it is quite unrealistic to try to measure how effectively a system or a subsystem
is operating without bringing in recall and precision.
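As an illustration of what measuring these two criteria involves, the following is a minimal sketch, assuming the usual definitions: recall is the proportion of the relevant documents that are retrieved, and precision is the proportion of the retrieved documents that are relevant. The function name and the document identifiers are hypothetical and are not taken from the test.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for a single search.

    retrieved: identifiers of the documents the system returned
    relevant:  identifiers of the documents judged relevant to the question
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were retrieved

    # Guard against empty sets so the ratios are always defined.
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical search: four documents retrieved, three judged relevant.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d7"]
recall, precision = recall_precision(retrieved_docs, relevant_docs)
```

Here two of the three relevant documents are retrieved, so recall is 2/3, and two of the four retrieved documents are relevant, so precision is 1/2. The arithmetic is trivial; the intellectual effort referred to above lies in deciding which documents belong in the relevant set.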
Cranfield I had attempted, as its original objective, to establish the hypothesis,
generally accepted at that time, that there were significant differences in the
operational performance of various types of index languages, but this it had most
definitely failed to do. It had appeared to show that all four indexing languages
were operating at about the same level of recall performance; more positively, it
had shown, by the analysis of search failures, that the decisions by the indexers in
recognising significant concepts in the documents were far more important than any
variations in the structures of the various index languages. The test of the Western
Reserve University index appeared to indicate that there was an optimum level of
exhaustivity of indexing, for a higher level of exhaustivity did not significantly improve
recall but it weakened precision, while a low level of exhaustivity inhibited maximum
recall. In these matters, the index language appeared to play a relatively
insignificant part, for these were intellectual decisions by the indexer and were made in
complete independence of the index language being used.
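The effect of exhaustivity described above can be sketched with a toy example. This is only an illustration under stated assumptions: the five documents, their index terms and the relevance decisions are hypothetical and were constructed to reproduce the pattern, and exhaustivity is modelled simply as the number of index terms (taken in decreasing order of importance) assigned to each document.

```python
# Hypothetical collection: each document has its index terms listed in
# decreasing order of importance, and a flag saying whether the document
# is relevant to a question about "flutter".
documents = {
    "d1": (["wing", "flutter", "panel"], True),
    "d2": (["flutter", "wing", "heat"], True),
    "d3": (["heat", "panel", "flutter"], False),
    "d4": (["boundary", "layer", "flutter"], False),
    "d5": (["wing", "panel", "heat"], False),
}


def search(term, depth):
    """Retrieve documents whose first `depth` index terms include `term`."""
    return {doc for doc, (terms, _) in documents.items() if term in terms[:depth]}


def recall_precision(retrieved):
    """Return (recall, precision) of a retrieved set against the relevance flags."""
    relevant = {doc for doc, (_, is_relevant) in documents.items() if is_relevant}
    hits = retrieved & relevant
    recall = len(hits) / len(relevant)
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Exhaustivity levels of one, two and three index terms per document.
results = {depth: recall_precision(search("flutter", depth)) for depth in (1, 2, 3)}
```

With one term per document only d2 is retrieved (recall 0.5, precision 1.0); with two terms both relevant documents are retrieved (recall 1.0, precision 1.0); with three terms recall cannot rise further, but d3 and d4 are now also retrieved and precision falls to 0.5. The choice of how many concepts to index is made here without any reference to the index language in which the terms would be expressed.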
It was then realized that theoretically there was no reason why, given the same
concept indexing, there should be any difference in the performance of two index
languages. It was recognised that in practice the physical form of the index might
affect the operating efficiency - and still more, of course, the economic efficiency -
but theoretically there is a possibility of matching performance. To understand this,
it is necessary to consider the fundamental aspects of index languages.
It should be made quite clear that we are concerned with index languages only
in their theoretically perfect form; even in Cranfield I, we endeavoured to optimise
each index language that was being used. Although in this process nothing was done
which any person or organization using a particular index language could not equally
well have done, this did not prevent a number of people from sending in critical
comments on this score. To quote from some of the letters,