CRANV1P1
ASLIB Cranfield Research Project: Factors Determining the Performance of Indexing Systems: VOLUME 1. Design, Part 1. Text
Testing Techniques
chapter
Cyril Cleverdon
Jack Mills
Michael Keen
Cranfield
An investigation supported by a grant to Aslib by the National Science Foundation.
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
CHAPTER 6
Testing Techniques
The choice of the physical method to be used for searching was important,
but difficult to make. Since the work was entirely concerned with index languages,
it was essential that the physical form of the index should in no way impede the
investigation by introducing any controls or restrictions of its own. Although it
was not possible to forecast exactly the many different tests that would be made,
it was clear that for each question it would be necessary to obtain
several hundred sets of performance figures.
It was decided that a small test should be made soon after the project had
commenced; this was to be done partly to check the indexing procedures but also
to validate the proposed design of the tests and to provide experience that would
assist in deciding on the physical form of the index. For this pilot test, 116
documents had been indexed, and fourteen questions were available for searching,
for which there were 26 known relevant documents. It was planned to investigate
five sets of recall devices and four sets of precision devices, based on the single-
term, natural language indexing. These variables alone appeared likely to result
in some 80 searches for every question, and when other variables were added in
the main test, the potential number of different searches could run into several
hundreds. It was unlikely that every combination of the various devices would be
required, but the method used had to be flexible enough to provide for all possible
variations of searches, since it would only be after some searches had been made
that it would be known which were unnecessary.
Co-ordination was certainly the basic precision device, and some form of
post-coordinate index was clearly required. For the pilot test, the decision was
taken to prepare a peek-a-boo type index. This was done in a conventional way,
but a complication arose due to the fact that, at this stage of the work, six different
indexing weights were being used, and, to investigate the effect of these, it was
necessary to have, for every term, six cards each of which represented a different
weighting.
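
The card arrangement described above can be illustrated with a minimal sketch. It is not the project's procedure, only a model of it under stated assumptions: each card is represented as an integer bitmask in which bit d stands for a hole punched at the position of document d, weights are numbered 1 to 6 (the numbering, the class name PeekABooIndex and its methods are all hypothetical), and a search at selected weights is modelled by combining a term's cards for those weights before co-ordinating across terms.

```python
N_WEIGHTS = 6  # the pilot test used six indexing weights


class PeekABooIndex:
    """Toy model of a peek-a-boo index with one card per (term, weight)."""

    def __init__(self):
        # cards[(term, weight)] is an int; bit d is set when document d
        # is posted under `term` with exactly that weight.
        self.cards = {}

    def post(self, doc, term, weight):
        """Punch the hole for document `doc` on the card for (term, weight)."""
        if not 1 <= weight <= N_WEIGHTS:
            raise ValueError("weight must be between 1 and %d" % N_WEIGHTS)
        key = (term, weight)
        self.cards[key] = self.cards.get(key, 0) | (1 << doc)

    def term_mask(self, term, weights):
        """Union of a term's cards over the chosen weights (assumed rule)."""
        mask = 0
        for w in weights:
            mask |= self.cards.get((term, w), 0)
        return mask

    def coordinate(self, terms, weights=range(1, N_WEIGHTS + 1)):
        """Documents posted under every term at one of the chosen weights."""
        terms = list(terms)
        weights = tuple(weights)
        if not terms:
            return []
        mask = self.term_mask(terms[0], weights)
        for term in terms[1:]:
            # Superimposing cards: only holes common to all remain.
            mask &= self.term_mask(term, weights)
        return [d for d in range(mask.bit_length()) if (mask >> d) & 1]


# Hypothetical postings, invented purely for illustration.
index = PeekABooIndex()
index.post(1, "boundary", 6)
index.post(1, "layer", 3)
index.post(2, "boundary", 2)
index.post(2, "layer", 6)
index.post(3, "boundary", 6)

print(index.coordinate(["boundary", "layer"]))               # [1, 2]
print(index.coordinate(["boundary", "layer"], weights=[6]))  # []
```

The sketch makes the complication visible: a single co-ordination of two terms may touch up to twelve cards, because each term is spread over six weight cards.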
The first search for a given question was carried out on the natural
language terms. Subsequent searches were made bringing in the various recall
devices and precision devices; the nature of these searches is considered in more
detail later in this chapter. The results of this test were interesting in themselves,
but the main objective had been to obtain information on the techniques being used.
In this respect, the test showed that the general test theory was reasonable and that
the indexing was satisfactory for the objectives of the test. Quite definitely, however,
it showed that a peek-a-boo index would be quite unsatisfactory for the main test.
This was because much of the testing involved use of increasingly large
numbers of terms in the search as the recall devices were tested, with the continual
need for co-ordination of all the different combinations. For example, if a question
had five terms searched on initially, and each of the five terms had one synonym,
two word forms and four quasi-synonyms, then in coordination of all five terms
using all the recall devices, 32,768 different combinations (eight alternative forms for each of the five terms, or 8^5) are possible. After this,
it would be necessary to search for any four of the five sets of terms, then any three
and so on. It is true that by use of the lowest posted terms first, the number of
coordinations to be done can be reduced considerably, but the use of natural language