CRANV1P1
ASLIB Cranfield Research Project: Factors Determining the Performance of Indexing Systems: VOLUME 1. Design, Part 1. Text
Formation of Index Languages
chapter
Cyril Cleverdon
Jack Mills
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
In the same way, there were numerous examples of terms which appeared to
represent operations or processes (if one regarded only the single terms in isola-
tion) but which represented an integral part of the specification of a particular kind
of thing; e.g. Settling chamber, Driving gas, Non-lifting wing, Geared elevator.
Wherever such a term had appeared only in that particular context and its function
as a class determinant had been to characterize the entity and not the operation,
property, etc., as such, it was subordinated in the hierarchy to the entity which it
specified.
The exact status of these variants on insertion into the hierarchies created a
slight theoretical problem. The confounding of synonyms in an earlier programme
had already established what terms were exactly synonymous and it would have been
inconsistent now to add these variants as synonyms (the weakness of a synonym
programme derived before the establishment of a classification has already been noted).
So they were simply clustered together as though coordinate in relation to each other.
Had the measurements of single-term hierarchical linkage taken the same form as
in the later 'concept hierarchies', whereby various hierarchical trails were followed
in order to distinguish sharply between different relations (subordinate, superordinate,
coordinate, etc.), this might have produced a very slight distortion of the performance
figures. However, the measurement of single-term hierarchies only took the form of
block-reductions in vocabulary size (in the manner discussed earlier in this chapter), so
no harm was done.
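
As a rough illustration of why a block-reduction is insensitive to the exact relation among clustered variants, the following minimal sketch collapses each term into the class one step above it and counts the resulting vocabulary. The parent links, the term 'Lifting wing', and the function name reduce_vocabulary are assumptions made for illustration only; they are not taken from the project's actual hierarchies or procedures.

```python
# Hypothetical parent links: each term points to the class it is subordinated to.
# These links are invented for illustration and are not the project's hierarchy.
PARENT = {
    "Settling chamber": "Chamber",
    "Non-lifting wing": "Wing",
    "Lifting wing": "Wing",      # assumed coordinate variant, not from the source
    "Geared elevator": "Elevator",
}

TERMS = ["Settling chamber", "Non-lifting wing", "Lifting wing",
         "Geared elevator", "Wing"]


def reduce_vocabulary(terms, parent, steps=1):
    """Replace every term by the class reached after climbing `steps` links."""
    reduced = set()
    for term in terms:
        current = term
        for _ in range(steps):
            # A term with no recorded parent is treated as a root and stays put.
            current = parent.get(current, current)
        reduced.add(current)
    return reduced


if __name__ == "__main__":
    for steps in (0, 1):
        vocab = reduce_vocabulary(TERMS, PARENT, steps)
        print(steps, len(vocab), sorted(vocab))
```

In this sketch the two wing variants fall into the same block whether they are treated as coordinate or as synonyms, so the reduced vocabulary size is the same either way.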
It must be admitted that a few errors crept in, when unjustified violence was
done to a category by the subordination of one of its members to another category.
For example, in the overwhelming majority of cases, the term Revolution occurred
in indexing as part of 'Body of Revolution'; so, according to the reasoning above, it
was located in the category of Shape, since its function was to designate a particular
kind of shape. However, its synonym, Rotation, occurred once or twice in its funda-
mental guise of a process; it is therefore misplaced under Shape. It is not thought
that these occasional lapses were serious. We have already seen that, in making single
term hierarchies, relegating a term to a fundamental category sometimes draws in
classes whose association with it is unhelpful; a lapse like the one above has the
same effect.
Construction of single term hierarchies
Having settled on the various solutions to the problems described above, the
formidable task of organizing the 3094 terms of the natural language proceeded. The
basic operation was one of facet analysis (a facet being a hierarchy). A useful framework
for the initial sorting was the Facet Classification compiled for the first Aslib-Cranfield
Project by J. Farradane and B. C. Vickery, although high speed aerodynamics
(the subject of this test collection) tended to concentrate itself in only a few of the
areas covered by the scheme, and was in far greater detail than had been handled
before. Particularly large categories were those relating to Bodies, to Shapes, to
various Spatial and general relations, and to Fluid dynamics proper, with particular
clusters of detail under such topics as Compressors, Upper atmosphere studies,
and Astronautics. The speed with which the last subject has developed in recent
years was reflected in the fact that whereas the Facet Classification barely mentioned
it, in this test collection it was a major theme.
Because no attempt was made to establish 'fundamental' categories as such,
the common categories which were formed tended to be residual ones in that they
contained only those terms which had not found a place in a more limited context.