IRE Information Retrieval Experiment Laboratory tests of manual systems chapter E. Michael Keen Butterworth & Company Karen Sparck Jones

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

8 Laboratory tests of manual systems

E. Michael Keen

8.1 Introduction

The essence of manual retrieval systems is that all operations of storage and retrieval are carried out by humans directly, with no more aid than the time-honoured record sheets, index cards, printed pages, and so on. Tests of in-house systems have covered several kinds of library catalogue, pre-coordinate index, and post-coordinate index. Tests of published systems have compared many styles of printed subject index. In many cases the computer is now used in the construction of both in-house and published systems, but the characteristics of the final product and its use have been affected very little, with the indexing and searching processes still manual in character. Even in systems where a computer is used in matching a formulated search against indexed documents, just as much human skill will be involved, and interactive searching requires human judgement of the highest quality. Though this chapter will concentrate on fully manual systems, some work in these semi-automated areas will be referred to.

Laboratory evaluation testing of manual systems began with the first Cranfield project¹,². With the special library catalogue or index in mind, the traditional index languages (Universal Decimal Classification and Alphabetical Subject Headings) were being challenged by Faceted Classification and the Uniterm system of post-coordinate indexing.
So a four-way comparison was mounted. Using a realistically large document collection, four test indexes were constructed along practical lines, then laboratory controls were introduced to generate the search requests, identify relevant documents, conduct the test searches, and score the search results. It can be seen now that this approach represents a mid-point in laboratory test techniques between a deeply artificial test of highly controlled subsystems and the testing of real-world systems under conditions of controlled laboratory searching. The deep laboratory approach was soon to be exemplified in the second Cranfield project³,⁴, where the index language variants tested covered many linguistic forms, and where the searching was so rigorously controlled as to be machine-like or 'unintelligent'. The other extreme of testing had begun already with a comparison of two systems by D. R. Swanson⁵, in which one system in particular was analogous to real subject indexes and was subject to