CRANV2
Aslib Cranfield Research Project: Factors Determining the Performance of Indexing Systems: Volume 2
Test Environment
chapter
Cyril Cleverdon
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
Index languages
As described in Vol. I, Chapter 5, the languages tested fall into
three main groups:
I Single Terms, with the base being the natural language concept
indexing split into unit terms,
II Simple Concepts, with the base also being the natural language
concept indexing, with some of the more complex pre-coordinated concepts
split into simple concepts,
III Controlled Terms, with the base being the controlled vocabulary
derived from the E.J.C. Thesaurus, and indexing performed by translating
the natural language concepts into the controlled vocabulary.
In defining any particular index language, these three main types
will be denoted by the Roman numerals I, II and III; the various sets
of recall devices tested are denoted by Arabic numerals and the
precision devices by lower case letters.
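The notation just described can be sketched as a small parser. This is only an illustration, assuming a label is written as a Roman numeral, a full stop, an Arabic numeral and an optional lower case letter (for example "I.3" or "III.2a"); that written form, and the names IndexLanguage and parse_designation, are assumptions made here and are not taken from the report.

```python
import re
from typing import NamedTuple, Optional

# The three main groups named in the text, keyed by Roman numeral.
GROUPS = {"I": "Single Terms", "II": "Simple Concepts", "III": "Controlled Terms"}

# Assumed written form: Roman numeral, full stop, Arabic numeral,
# then an optional lower case letter for a precision device.
_DESIGNATION = re.compile(r"^(III|II|I)\.(\d+)([a-z]?)$")


class IndexLanguage(NamedTuple):
    group: str                       # main type of index language
    recall_set: int                  # set of recall devices (Arabic numeral)
    precision_device: Optional[str]  # precision device (lower case letter), if any


def parse_designation(label: str) -> IndexLanguage:
    """Split a label such as 'I.3' or 'III.2a' into its three parts."""
    match = _DESIGNATION.match(label.strip())
    if match is None:
        raise ValueError(f"not a recognised index language label: {label!r}")
    roman, number, letter = match.groups()
    return IndexLanguage(GROUPS[roman], int(number), letter or None)
```

Under these assumptions, parse_designation("III.2a") yields the Controlled Terms group, recall set 2 and precision device "a", while a label with no trailing letter yields no precision device.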
Recall devices
The starting point of each series of tests is the use of the basic
terms as indexed. From this base, various recall and precision devices
are added, both separately and in different aggregates. In the single term
languages, four different recall devices were tested, namely control of
synonyms, confounding of word forms, control of quasi-synonyms and
control of clusters of terms by means of reduced vocabularies based on
hierarchies. A total of eight aggregates was tested, and a tree diagram
giving details of the eight languages is given in Fig. 2.5.
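The idea of adding recall devices to the basic terms, separately and in aggregates, can be sketched as composable term mappings. This is a minimal illustration and not the project's procedure: the synonym and word-form tables are invented toy data, and counting shared terms between a query and a document is an assumed stand-in for matching.

```python
from typing import Callable, Dict, FrozenSet, Iterable

Terms = FrozenSet[str]
Device = Callable[[Terms], Terms]

# Hypothetical toy tables, invented for illustration only.
SYNONYMS: Dict[str, str] = {"aerofoil": "airfoil"}
WORD_FORMS: Dict[str, str] = {"heated": "heat", "heating": "heat"}


def mapping_device(table: Dict[str, str]) -> Device:
    """Build a recall device that replaces each term by its table entry, if any."""
    def apply(terms: Terms) -> Terms:
        return frozenset(table.get(term, term) for term in terms)
    return apply


def aggregate(*devices: Device) -> Device:
    """Combine devices in order; with no devices the terms stay as indexed."""
    def apply(terms: Terms) -> Terms:
        for device in devices:
            terms = device(terms)
        return terms
    return apply


def shared_terms(query: Iterable[str], document: Iterable[str], device: Device) -> int:
    """Count terms common to query and document after applying the device to both."""
    return len(device(frozenset(query)) & device(frozenset(document)))


base = aggregate()  # basic terms as indexed
with_synonyms = aggregate(mapping_device(SYNONYMS))
with_synonyms_and_forms = aggregate(mapping_device(SYNONYMS), mapping_device(WORD_FORMS))

query = ["aerofoil", "heating"]
document = ["airfoil", "heated"]
print(shared_terms(query, document, base))                     # 0
print(shared_terms(query, document, with_synonyms))            # 1
print(shared_terms(query, document, with_synonyms_and_forms))  # 2
```

In this toy case each added device raises the number of shared terms, which is the sense in which such devices favour recall.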
[Figure 2.5: tree diagram of the eight single term index languages. Legible labels: I.1 NATURAL LANGUAGE; + WORD FORMS; + FIRST HIERARCHICAL REDUCTION; + THIRD HIERARCHICAL REDUCTION.]
FIGURE 2.5 SINGLE TERM INDEX LANGUAGES