<DOC> <DOCNO> IRE </DOCNO> <TITLE> Information Retrieval Experiment </TITLE> <SUBTITLE> The methodology of information retrieval experiment </SUBTITLE> <TYPE> chapter </TYPE> <PAGE CHAPTER="2" NUMBER="10"> <AUTHOR1> Stephen E. Robertson </AUTHOR1> <PUBLISHER> Butterworth & Company </PUBLISHER> <EDITOR1> Karen Sparck Jones </EDITOR1> <COPYRIGHT MTH="" DAY="" YEAR="1981" BY="Butterworth & Company"> All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature. </COPYRIGHT> <BODY>
vast majority of experiments have used the scientific paper as the normal unit.) Request (or query) has usually been taken to mean the statement by the requester describing his/her information need, but recently (particularly with the development of on-line systems, which allow immediate feedback) it has come to mean simply the act of requesting. It is usually assumed that this act is stimulated by an underlying need for information, which in some sense remains invariant, though the requester's perception and/or description of it may change in the course of his/her interaction with the system. User and requester are synonymous.

The notion of testing has already been discussed, as has the distinction between experiment and investigation, in the editor's introduction.

A distinction is usually made between systems for current awareness or the selective dissemination of information (SDI) and those for retrospective retrieval. In terms of the mechanics of the system, in retrospective retrieval a request is made to a system as a one-off occurrence, and searched against the current collection of documents; in SDI, repeated searches are made against successive additions to the document collection, over a period of time (a short illustrative sketch of this distinction is given below).

What is the purpose or function of an information retrieval system: what is it supposed to do? The simple answer to this question is to retrieve documents in response to requests; but this is too simple: any arbitrary gadget could do that. The documents must be of a particular kind: that is, they must serve the user's purpose. Since (we assume) the user's purpose is to satisfy an information need, we might describe the function of an information retrieval system as 'leading the user to those documents that will best enable him/her to satisfy his/her need for information'.

There are many different aspects or properties of a system that one might want to measure or observe, but most of them are concerned with the effectiveness of the system, or its benefits, or its efficiency. Effectiveness is how well the system does what it is supposed to do; its benefits are the gains deriving from what the system does in some wider context; its efficiency is how cheaply it does what it does. In this book, we are mainly, but not exclusively, concerned with the effectiveness or (synonymously) the performance of information retrieval systems.

Why test information retrieval systems?

This book is mainly about the 'how' of testing. But before we launch into the technicalities of how best to conduct a test, we should (at least briefly) consider the prior question of why.
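To make the mechanical distinction between retrospective retrieval and SDI concrete, here is a minimal Python sketch. It is illustrative only: the simple term-overlap matching rule, the function names, and the data are hypothetical stand-ins, not taken from this chapter or from any system it discusses.

# Illustrative sketch only: the term-overlap "matches" rule and every name
# below are hypothetical, not drawn from the chapter or any real system.

def matches(request_terms: set[str], document_terms: set[str]) -> bool:
    # Hypothetical matching rule: request and document share at least one term.
    return bool(request_terms & document_terms)

def retrospective_search(request: set[str], collection: list[set[str]]) -> list[int]:
    # Retrospective retrieval: a one-off request is searched against the
    # collection as it stands at the moment of asking.
    return [i for i, doc in enumerate(collection) if matches(request, doc)]

def sdi_run(profiles: dict[str, set[str]], new_batch: list[set[str]]) -> dict[str, list[int]]:
    # SDI / current awareness: standing requests (profiles) held by the
    # system are re-run against each successive addition to the collection.
    return {user: [i for i, doc in enumerate(new_batch) if matches(profile, doc)]
            for user, profile in profiles.items()}

collection = [{"retrieval", "evaluation"}, {"indexing", "languages"}]
print(retrospective_search({"evaluation"}, collection))     # one-off search -> [0]

profiles = {"requester-1": {"indexing"}}
for batch in ([{"indexing", "automatic"}], [{"thesauri"}]):  # arrivals over time
    print(sdi_run(profiles, batch))                          # repeated matching -> [0], then []

The only point of the sketch is the control flow: retrospective retrieval performs one search per request against the collection as it stands, while SDI holds requests as standing profiles and re-runs them against each batch of newly added documents.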
Starting from the simplest situation, suppose that we have a specified clientele and document collection, and two existing information retrieval systems working in parallel, and we wish to decide which of the two to drop. Then we could imagine conducting a formal experiment to help us make this particular decision. In principle, such testing would be relatively straightforward: with a well-defined, specific question to answer, we would have the ideal experimental situation.

The problems become more complex if, instead of two alternative systems, we have one system which we think might be capable of improvement. In this situation, we might for instance want to evaluate how well it performs </BODY> </PAGE> </DOC>