<DOC> <DOCNO> IRE </DOCNO> <TITLE> Information Retrieval Experiment </TITLE> <SUBTITLE> Laboratory tests: automatic systems </SUBTITLE> <TYPE> chapter </TYPE> <PAGE CHAPTER="9" NUMBER="170"> <AUTHOR1> Robert N. Oddy </AUTHOR1> <PUBLISHER> Butterworth & Company </PUBLISHER> <EDITOR1> Karen Sparck Jones </EDITOR1> <COPYRIGHT MTH="" DAY="" YEAR="1981" BY="Butterworth & Company"> All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature. </COPYRIGHT> <BODY> tests? It is that, in principle, they are inconclusive. An example should illustrate the point. Experiments on relevance feedback8-10, which exhibit relatively outstanding effectiveness, make use of the same set of relevance judgements for two separate purposes. First, they are used to simulate the user's feedback, that is his reactions to documents retrieved, and thus they determine the query modification. Second, they are used to evaluate the effectiveness of the technique. In real life, two distinct sets of relevance judgements would be used for a corresponding experiment. As he sits at the terminal, the enquirer would make instant judgements, according to his perception of the documents during the search session. His evaluation of the search would be made at a later time, on reflection. Theories are unrealistic in this respect, and experimental arrangements fulfil their unrealistic assumptions, and are thus inconclusive in relation to real life systems. What laboratory tests do is to isolate small portions of large, complex systems for independent study. 
This is a procedure which, when applied to systems with human components, has been strongly criticized49,50. General systems theorists point out that the interactions between the components of a system are profound and cannot be ignored or artificially controlled if the system's behaviour is to be understood; 'the whole is greater than the sum of its parts', they are wont to say. Simon51 describes artifacts (information retrieval programs, for instance) as relatively simple organisms whose behaviour is nevertheless complex because they react to a complex environment. Knowing how they react to a simple or controlled environment does not necessarily help us very much. Modern mathematical theories of information retrieval have arisen out of extensive experimental experience mainly with laboratory programs. (That is perfectly reasonable, and represents a worthy benefit of the earlier laboratory tests.) Unfortunately, tests of the assumptions and predictions of the theories have been mostly confined to the same laboratory environment. Partridge's52 comment on difficult software engineering tasks applies to information retrieval: 'A more or less "wicked" problem is often the initial given and, although it is top priority to transform this problem into a formal analogue before proceeding further, the subsequent implementation swims or sinks in the light of real world application: that is, back in the domain of "wicked" problems' (p. 244). My view is that in this field, the type of laboratories I have described are not the best places in which to put theories to the test, unless we are only interested in theories of test collections! They do, however, have a valuable role as information retrieval engineers' and theoreticians' workshops, where ideas can develop and tentative or exploratory tests can be made. 
Coupled with this, I would advocate the development of methodologies which facilitate real life testing of the assumptions and consequences of promising theories53. I feel uneasy about solutions to the realism problem which involve larger and better planned test collections23 and repetition of tests on a number of different test collections6,8.

9.6 Model building

The automatic information retrieval laboratory environment is used for another type of computer-based activity: that of model building. Because </BODY> </PAGE> </DOC>