CRANV1P1
ASLIB Cranfield Research Project
Factors Determining the Performance of Indexing Systems
VOLUME 1. Design, Part 1. Text
Chapter: Formation of Index Languages
Cyril Cleverdon, Jack Mills, Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

explained on p. 70. The others are now distributed under the more general categories as explained: i.e., terms like Smoke, Vapour, Screen, Fog, etc. appear in other contexts as well and are therefore placed in more general categories. Under these conditions, as soon as reduction of the original full vocabulary begins, it becomes very difficult to maintain the sensible boundaries of a class like Visualization tests. For if a question on this were now programmed to include such terms as Fog, etc., it is very likely that these in turn have been swallowed up in the reduction of the general categories and that their inclusion in the search programme can only be had at the cost of bringing in a number of other terms, such as Cloud, Snow, etc., which are quite irrelevant to the context of Visualization tests.

Another drawback, related to the foregoing, is the loss of connection suffered by terms treated in isolation and not in coordination. For example, a search in response to a question on 'Flow in channels' would fail to draw in documents indexed by 'Couette flow' or 'Poiseuille flow'. Although there is a clear connection between these at the 'concept level' of types of flow, at the level of single terms there is no connection between Channel (treated as a Structure affording a passage) and the personal names Couette and Poiseuille.
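
The loss of connection in the 'Flow in channels' example can be illustrated with a minimal sketch. The document identifiers, the index entries, and the grouping of concepts into one class are all hypothetical and serve only to show the contrast; they are not taken from the test collection, and the project's searches were not carried out in this way.

```python
# Hypothetical single-term index: each document is reduced to isolated terms.
single_term_index = {
    "doc_a": {"couette", "flow"},
    "doc_b": {"poiseuille", "flow"},
    "doc_c": {"channel", "flow", "pressure"},
}


def coordinate_search(index, query_terms):
    """Return the documents whose term sets contain every query term."""
    return sorted(doc for doc, terms in index.items() if query_terms <= terms)


# The question 'Flow in channels', broken into single terms.
single_term_hits = coordinate_search(single_term_index, {"flow", "channel"})

# Hypothetical concept-level index: compound concepts are kept whole.
concept_index = {
    "doc_a": {"couette flow"},
    "doc_b": {"poiseuille flow"},
    "doc_c": {"channel flow"},
}

# Assumed class of flow types relevant to the question, following the text.
channel_flow_class = {"channel flow", "couette flow", "poiseuille flow"}


def class_search(index, concept_class):
    """Return the documents indexed by any concept in the class."""
    return sorted(doc for doc, concepts in index.items() if concepts & concept_class)


concept_hits = class_search(concept_index, channel_flow_class)

print(single_term_hits)  # only doc_c shares the term 'channel'
print(concept_hits)      # all three documents fall inside the class
```

At the single-term level nothing links Channel to Couette or Poiseuille, so only the document that happens to carry the term 'channel' is found; once the compound concepts are treated as members of one class, all three are retrieved.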
This situation reflects a practical difficulty in post-coordinate systems which rely on single terms - that of indicating connections (in a thesaurus, say) when these connections are dependent on particular conjunctions; e.g. this would imply a reference of the kind:

    Channel: when coordinated with Flow
        see also: Couette Flow
                  Poiseuille Flow

It is important, therefore, to remember that the performance results of the single-term hierarchies reflect the use of one particular application of hierarchy as a recall device - i.e., its expansion of classes by fixed reduction of vocabulary size. It should also be remembered that this was a procedure determined largely by considerations of measurement rather than regard for the normal use of hierarchy as a recall device in practical indexing.

There seems little doubt now that it is a mistake to regard hierarchy as an obligatory recall device. Its essential function is to act as a permissive device, allowing flexible choice of class adjustment according to the demands of the question context in a way which is not feasible within the artificial conditions of single-term hierarchies. From this viewpoint, the performance figures for the concept hierarchies described in the next section are a better guide to the value of generic hierarchy as an indexing device.

Languages based on single-terms and embodying recall devices

Before describing these in detail it may be noted that a certain artificiality inevitably accompanies the application of recall devices to single terms in isolation, simply because, in many cases, words make little sense when stripped of accompanying qualifiers, etc.
For example, the problem of synonymity in index languages frequently demands recognition of phrases, as when 'Ground effect machine' is equated with 'Air cushion vehicle' although at the single term level there is no synonymity between the constituent terms; and a term like 'effect' on its own is practically valueless as a retrieval handle (which is what any class, in indexing, aims to be).

Traditional, pre-coordinate indexing has always begun with some degree of coordination. Even in analytico-synthetic classifications, where 'elementary constituent terms' are separated out as far as possible, there is no rigid adherence to the single term as the basis of the language; for example, 'Ground effect machine' would be comfortably accommodated in a Vehicles facet. But coordination of terms is an extreme)-