CRANV2
Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2
Test Environment
Cyril Cleverdon, Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation. Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

Index languages

As described in Vol. I, Chapter 5, the languages tested fall into three main groups:

I Single Terms, with the base being the natural language concept indexing split into unit terms,

II Simple Concepts, with the base also being the natural language concept indexing, with some of the more complex pre-coordinated concepts split into simple concepts,

III Controlled Terms, with the base being the controlled vocabulary derived from the E.J.C. Thesaurus, and indexing performed by translating the natural language concepts into the controlled vocabulary.

In defining any particular index language, these three main types will be denoted by the Roman numerals I, II and III; the various sets of recall devices tested are denoted by Arabic numerals and the precision devices by lower case letters.

Recall devices

The starting point of each series of tests is the use of the basic terms as indexed. From this base, various recall and precision devices are added, both separately and in different aggregates.

In the single term languages, four different recall devices were tested, namely control of synonyms, confounding of word forms, control of quasi-synonyms and control of clusters of terms by means of reduced vocabularies based on hierarchies. A total of eight aggregates was tested, and a tree diagram giving details of the eight languages is given in Fig. 2.5.

FIGURE 2.5 SINGLE TERM INDEX LANGUAGES (tree diagram not reproduced; legible node labels: I.1 Natural language; + Word forms; + First hierarchical reduction; + Third hierarchical reduction)
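The designation scheme described above (a Roman numeral for the language type, an Arabic numeral for the set of recall devices, a lower case letter for the precision device) can be sketched as a small parser. This is only an illustration: the written form "I.1" follows the labels legible in Fig. 2.5, but the exact string format with a trailing letter (for example "III.2b"), the names `IndexLanguage` and `parse_designation`, and the sample labels are assumptions made here and are not taken from the report.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Group names as given in the text. The "I.1a"-style string layout is an
# assumption made for this sketch, not a format defined by the report.
GROUPS = {
    "I": "Single Terms",
    "II": "Simple Concepts",
    "III": "Controlled Terms",
}

# Roman numeral (I, II or III), a dot, an Arabic numeral, optional lower case letter.
_PATTERN = re.compile(r"^(III|II|I)\.(\d+)([a-z]?)$")


@dataclass(frozen=True)
class IndexLanguage:
    """Hypothetical record for one index language designation."""
    group: str                 # Roman numeral: main language type
    recall_set: int            # Arabic numeral: set of recall devices
    precision: Optional[str]   # lower case letter: precision device, if any

    @property
    def group_name(self) -> str:
        return GROUPS[self.group]


def parse_designation(label: str) -> IndexLanguage:
    """Split a label such as 'I.1' into its three components."""
    match = _PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"not a recognised designation: {label!r}")
    group, number, letter = match.groups()
    return IndexLanguage(group, int(number), letter or None)
```

For example, `parse_designation("I.1")` yields a record whose `group_name` is "Single Terms", with recall set 1 and no precision device. Labels outside the three groups named in the text are rejected with a `ValueError` rather than guessed at.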