CRANV1P1
ASLIB Cranfield Research Project: Factors Determining the Performance of Indexing Systems: VOLUME 1. Design, Part 1. Text
Formation of Index Languages
chapter
Cyril Cleverdon
Jack Mills
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
explained on p. 70. The others are now distributed under the more general categories
as explained: i.e., terms like Smoke, Vapour, Screen, Fog, etc. appear in other
contexts as well and are therefore placed in more general categories. Under these
conditions, as soon as reduction of the original full vocabulary begins, it becomes
very difficult to maintain the sensible boundaries of a class like Visualization tests.
For if a question on this were now programmed to include such terms as Fog, etc.,
it is very likely that these in turn have been swallowed up in the reduction of the general
categories and that their inclusion in the search programme can only be had at the
cost of bringing in a number of other terms, such as Cloud, Snow, etc. which are quite
irrelevant to the context of Visualization tests.
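
The effect described above can be illustrated with a minimal sketch. The class names, term assignments and document postings below are invented for the illustration and are not taken from the project's hierarchies; the sketch only assumes that reduction replaces each single term by the broader class which has absorbed it.

```python
# Illustrative sketch only: class names and postings are hypothetical,
# not taken from the Cranfield hierarchies.

# After reduction, each single term is represented by the broader class
# that absorbed it.
REDUCED_CLASS = {
    "Fog": "Atmospheric phenomena",
    "Cloud": "Atmospheric phenomena",
    "Snow": "Atmospheric phenomena",
    "Smoke": "Combustion products",
}

# Hypothetical postings: document id -> single terms assigned at indexing.
POSTINGS = {
    "doc1": {"Fog", "Flow"},         # fog used in a visualization test
    "doc2": {"Cloud", "Radiation"},  # irrelevant to visualization tests
    "doc3": {"Snow", "Icing"},       # irrelevant to visualization tests
}


def search_full(term):
    """Search on the full vocabulary: exact single-term match."""
    return {doc for doc, terms in POSTINGS.items() if term in terms}


def search_reduced(term):
    """Search on the reduced vocabulary: match on the absorbing class."""
    target = REDUCED_CLASS.get(term, term)
    return {
        doc
        for doc, terms in POSTINGS.items()
        if any(REDUCED_CLASS.get(t, t) == target for t in terms)
    }


print(sorted(search_full("Fog")))     # ['doc1']
print(sorted(search_reduced("Fog")))  # ['doc1', 'doc2', 'doc3']
```

In this sketch the term Fog can be had in the reduced vocabulary only through its absorbing class, so the documents on Cloud and Snow are drawn in with it.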
Another drawback, related to the foregoing, is the loss of connection suffered
by terms treated in isolation and not in coordination. For example, a search in
response to a question on 'Flow in channels' would fail to draw in documents indexed by
'Couette flow' or 'Poiseuille flow'. Although there is a clear connection between these
at the 'concept level' of types of flow, at the level of single terms there is no
connection between Channel (treated as a Structure affording a passage) and the personal
names Couette and Poiseuille. This situation reflects a practical difficulty in
post-coordinate systems which rely on single terms - that of indicating connections (in a
thesaurus, say) when these connections are dependent on particular conjunctions; e.g.
this would imply a reference of the kind:
    Channel: when coordinated with Flow
        see also: Couette Flow
                  Poiseuille Flow.
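
A reference of this kind might be sketched as follows. The table structure and the function name `expand_query` are assumptions made for the illustration; the point is only that the reference is keyed on a conjunction of two terms and not on either term alone.

```python
# Illustrative sketch only: the table layout and function name are hypothetical.

# A reference keyed on a conjunction (lead term, coordinated term), so that it
# applies only when both terms occur together in the question.
CONDITIONAL_SEE_ALSO = {
    ("Channel", "Flow"): ["Couette flow", "Poiseuille flow"],
}


def expand_query(terms):
    """Return the question terms plus any references triggered by a conjunction."""
    present = set(terms)
    expanded = list(terms)
    for (lead, companion), references in CONDITIONAL_SEE_ALSO.items():
        if lead in present and companion in present:
            for reference in references:
                if reference not in expanded:
                    expanded.append(reference)
    return expanded


print(expand_query(["Channel", "Flow"]))
# ['Channel', 'Flow', 'Couette flow', 'Poiseuille flow']
print(expand_query(["Channel", "Erosion"]))
# ['Channel', 'Erosion']
```

Channel coordinated with some other term draws in nothing further, which is the behaviour a plain 'see also' under Channel alone could not give.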
It is important, therefore, to remember that the performance results of the
single-term hierarchies reflect the use of one particular application of hierarchy as a recall
device - i.e., its expansion of classes by fixed reduction of vocabulary size. Also,
this procedure was determined largely by considerations of measurement rather
than by regard for the normal use of hierarchy as a recall device in practical indexing.
There seems little doubt now that it is a mistake to regard hierarchy as an obligatory
recall device. Its essential function is to act as a permissive device, allowing flexible
choice of class adjustment according to the demands of the question context in a way
which is not feasible within the artificial conditions of single-term hierarchies.
From this viewpoint, the performance figures for the concept hierarchies described
in the next section are a better guide to the value of generic hierarchy as an indexing
device.
Languages based on single terms and embodying recall devices
Before describing these in detail it may be noted that a certain artificiality
inevitably accompanies the application of recall devices to single terms in isolation,
simply because, in many cases, words make little sense when stripped of accompanying
qualifiers, etc. For example, the problem of synonymity in index languages
frequently demands recognition of phrases, as when 'Ground effect machine' is equated
with 'Air cushion vehicle' although at the single term level there is no synonymity
between the constituent terms; and a term like 'effect' on its own is practically
valueless as a retrieval handle (which is what any class, in indexing, aims to be).
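
The point about phrase-level synonymity can be shown in a short sketch. The synonym table and the helper names below are hypothetical; the sketch assumes only that synonyms are recorded against whole phrases, and it compares this with what is visible when each phrase is broken into its single terms.

```python
# Illustrative sketch only: the table and helper names are hypothetical.

# Synonymity recorded at the phrase level: variant phrase -> preferred phrase.
PHRASE_SYNONYMS = {
    "ground effect machine": "air cushion vehicle",
}


def normalize_phrase(phrase):
    """Map a phrase to its preferred form, if one is recorded."""
    key = phrase.lower()
    return PHRASE_SYNONYMS.get(key, key)


def single_terms(phrase):
    """Break a phrase into its constituent single terms."""
    return set(phrase.lower().split())


a = "Ground effect machine"
b = "Air cushion vehicle"

# At the phrase level the two are equated.
print(normalize_phrase(a) == normalize_phrase(b))  # True

# At the single-term level the two phrases share no term at all.
print(single_terms(a) & single_terms(b))  # set()
```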
Traditional, pre-coordinate indexing has always begun with some degree of
coordination. Even in analytico-synthetic classifications, where 'elementary constituent
terms' are separated out as far as possible, there is no rigid adherence to the single
term as the basis of the language; for example, 'Ground effect machine' would be
comfortably accommodated in a Vehicles facet. But coordination of terms is an extreme-