CRANV1P1
ASLIB Cranfield Research Project
Factors Determining the Performance of Indexing Systems
Volume 1. Design, Part 1. Text
Chapter: General Considerations
Cyril Cleverdon, Jack Mills, Michael Keen
Cranfield

An investigation supported by a grant to Aslib by the National Science Foundation. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

apart from a fundamental characteristic, and whatever the type of index language, it can readily be provided with a complete list of sought terms, that is, a "lead-in vocabulary". The second requirement for index languages is a set of index terms, while a third requirement is a set of code terms. Before attempting to explain the differences, it must first be said that in many index languages there will be some terms which occur in the triple role of a lead-in term, an index term and a code term. Further, all index terms must be lead-in terms, and frequently the set of index terms will be the same as the set of code terms.

For examples of the three types of terms, the Thesaurus of the Engineers Joint Council can be considered (Ref. 27). A lead-in term represents a concept which is described by a term other than itself. This may represent a synonym, e.g. Speed use Velocity, or may be a subordination of a specific term to a more general term, e.g. Hexagonal use Shape. Code terms are those terms which are actually used in indexing, examples being Velocity, Rotation, Engine noise, Jet engines. Index terms are all Code terms, and additionally any combinations of Code terms which make up and express new concepts. For instance, the Index term 'Peripheral speed' is expressed by the use of the two Code terms Rotation and Velocity, while the Index term 'Jet engine noise' is expressed by the use of the Code terms Jet engines and Engine noise.

While these three types of terms, i.e.
lead-in terms, index terms and code terms, are normal ingredients of an index language, most index languages also make use of auxiliary devices or aids. In a completely simple system, lead-in terms would always be the index terms and the code terms, which is to say that terms would be used exactly as they appeared in the literature. As soon as the set of index terms is smaller than the set of lead-in terms, a measure of control has been introduced. This normally takes the form of combining terms which are synonyms, and is only the first of many devices which are used in various ways to make up different index languages.

There is nothing exclusive about such devices which restricts their use to any particular type of index language; precoordinate or post-coordinate, alphabetical or classified, any type of index language can potentially be given the same devices and thereby have the operational performance of any other index language. In his book "On retrieval system theory" (Ref. 9), Vickery identified seventeen devices, and acknowledgement must be made that in the original project proposal these formed the basis of our argument. Vickery lists these devices as follows.

Means of control                           Field of use

1. No control.                             Some amateur alphabetical indexes.
2. Rigid control - fixed vocabulary        Some mechanized systems with limited
   of descriptors.                         coding capacity.
3. Confounding of variant word forms.      Professional alphabetical indexes,
                                           including Uniterm, and most other
                                           systems.
4. Confounding of true synonyms.           Ditto.
5. Confounding of near synonyms.           Some subject heading lists, some
                                           classifications, and systems based
                                           on thesauri.