Information Retrieval Experiment (IRE)
Karen Sparck Jones
Butterworth & Company

Chapter: The pragmatics of information retrieval experimentation
Jean M. Tague

The approach to information retrieval testing in this chapter will be to step through an information retrieval test procedure, indicating, at each step, the choices that will face the experimenter. Suggestions will be made for resolving these in ways that take into account the validity, reliability, and efficiency of the experiment. It is assumed that the experimenter has decided what is to be tested, bearing in mind the problems discussed in the three preceding chapters, and can clearly distinguish this from the assumptions she/he is making.

5.1 Decision 1: To test or not to test?

It should be unnecessary to point out to information scientists the necessity of a thorough literature search before embarking on any experimentation at all. Unfortunately, even in this field one finds attempts to reinvent the wheel. Library and Information Science Abstracts, the Annual Review of Information Science and Technology, Information Science Abstracts, Library Literature, Computing Reviews, Computer and Control Abstracts, and Dissertation Abstracts are required reading prior to planning. Although the actual experiment may not have been attempted previously, some partial or suggestive results may be available. Previous papers frequently bring to the attention of the investigator useful methodology or even sets of queries and evaluations.
Many writers have pointed out the need for cumulative studies in information retrieval. Only a thorough grounding in previous research will make this possible.

5.2 Decision 2: What kind of test?

This decision relates to the broad category of test. Will it be a laboratory or operational test? Will it be a complete or partial test? Cleverdon¹ made these distinctions clearly in Cranfield 2. Operational tests normally involve an evaluation of an existing system, whereas laboratory tests attempt to advance knowledge about individual variables of information retrieval. Complete tests involve three aspects: a collection of documents, a set of search requests, and relevance judgements relating the search requests to the documents. Partial tests, as the name implies, usually are concerned with aspects of the document set other than retrieval. The characteristics of the various types of test are discussed in more detail in Chapter 2.

Although the purpose of the test will influence the choice of laboratory versus operational system, other factors must be considered. If a laboratory test is to be mounted, does the experimenter have the time, the people, and the funding to carry it out? Laboratory tests tend to be more expensive than operational tests of the same size. On the other hand, tests of operational systems should not be attempted unless there is an assurance of co-operation from the operational personnel, not just top-level management. In this regard, face-to-face conversation or at least telephone contact is more effective than written correspondence.
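
As an illustration only, the three components of a complete test named above (a document collection, a set of search requests, and relevance judgements linking the two) could be held in a small data structure. The sketch below is not a procedure from this chapter; the class name `CompleteTest`, its fields, and its methods are hypothetical, and the check it performs (every request has at least a recorded judgement set) is an assumed minimal criterion for distinguishing a complete from a partial test.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class CompleteTest:
    """Hypothetical container for the three aspects of a complete test."""
    documents: Dict[str, str]                 # document id -> text or surrogate
    requests: Dict[str, str]                  # request id -> search request
    # request id -> ids of documents judged relevant to that request
    relevant: Dict[str, Set[str]] = field(default_factory=dict)

    def missing_judgements(self) -> List[str]:
        """Request ids for which no relevance judgements are recorded."""
        return sorted(rid for rid in self.requests if rid not in self.relevant)

    def unknown_documents(self) -> List[str]:
        """Judged document ids that are not in the collection."""
        judged = set().union(*self.relevant.values()) if self.relevant else set()
        return sorted(judged - set(self.documents))

    def is_complete(self) -> bool:
        """True if all three aspects are present and mutually consistent."""
        return (bool(self.documents) and bool(self.requests)
                and not self.missing_judgements()
                and not self.unknown_documents())
```

A collection with requests but without judgements for some of them would report those requests through `missing_judgements()`, which is one simple way to see before the experiment starts that only a partial test is possible with the material in hand.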